Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Hi, Could see intermittent GCs and mutation drops. *System log reports:* INFO [Service Thread] GCInspector.java:252 - ParNew GC in 3816ms. CMS Old Gen: 4663180720 -> 5520012520; Par Eden Space: 1718091776 -> 0; Par Survivor Space: 0 -> 214695936 INFO [ScheduledTasks:1]

Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Hi Nitan, A rolling restart did not help. Same compaction status after restart. No other processes are running here. These are dedicated Cassandra nodes. Sent from my iPhone > On 16-May-2017, at 7:16 PM, Nitan Kainth wrote: > > Have you tried rolling restart? > Any agent or

Re: Bootstraping a Node With a Newer Version

2017-05-16 Thread daemeon reiydelle
What makes you think you cannot upgrade the kernel? “All men dream, but not equally. Those who dream by night in the dusty recesses of their minds wake up in the day to find it was vanity, but the dreamers of the day are dangerous men, for they may act their dreams with open eyes, to make it

Re: Decommissioned node cluster shows as down

2017-05-16 Thread Hannu Kröger
That’s weird. I thought decommission would ultimately remove the node from the cluster because the token(s) should be removed from the ring and data should be streamed to new owners. “DN” is IMHO not a state where the node should end up in. Hannu > On 16 May 2017, at 19:05, suraj pasuparthy

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
That is another way to see the question: are reverse iterators range tombstone aware? Yes. That is why I am puzzled by this afore-mentioned behavior. I would expect them to handle this case more gracefully. Cheers, Stefano On Tue, May 16, 2017 at 3:29 PM, Nitan Kainth wrote:

Re: Long running compaction on huge hint table.

2017-05-16 Thread Nitan Kainth
Yes, but it means data has to be replicated using repair. Hints are the outcome of unhealthy nodes; focus on finding why you have mutation drops: is it the node, I/O, the network, etc.? Ideally you shouldn't see hints increasing all the time. Sent from my iPhone > On May 16, 2017, at 7:58 AM, varun saluja

Re: Long running compaction on huge hint table.

2017-05-16 Thread Nitan Kainth
Have you tried rolling restart? Any agent or other process hogging system? Sent from my iPhone > On May 16, 2017, at 7:58 AM, varun saluja wrote: > > Hi Nitan, > > Thanks for response. > > Yes, I could see mutation drops and increase count in system.hints. Is there > any

Re: Non-zero nodes are marked as down after restarting cassandra process

2017-05-16 Thread Andrew Jorgensen
Thanks for the info! When you say "overall stability problems due to some bugs", can you elaborate on if those were bugs in cassandra that were fixed due to an upgrade or bugs in your own code and how you used cassandra. If the latter would it be possible to highlight what the most impactful fix

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Hannu Kröger
Hello, If you mean how to construct a query like that: you use ORDER BY clause with SELECT which is reverse to the default just like in the example below? If the table is constructed with "clustering order by (timeid ASC)” and you query “SELECT ... ORDER BY timeid DESC”, then the partition is

Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Thanks a lot Jeff. You have explained it very well here. We have consistency set to local quorum. We will truncate hints and run repair thereafter. I hope this brings the cluster to a stable state. Thanks again. Regards, Varun Saluja Sent from my iPhone > On 16-May-2017, at 8:42 PM, Jeff Jirsa

Re: Bootstraping a Node With a Newer Version

2017-05-16 Thread Jeff Jirsa
On 2017-05-16 05:27 (-0700), Shalom Sagges wrote: > Hi All, > > Hypothetically speaking, let's say I want to upgrade my Cassandra cluster, > but I also want to perform a major upgrade to the kernel of all nodes. > In order to upgrade the kernel, I need to reinstall the

Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
Hi all, I am seeing inconsistencies when mixing range tombstones, wide partitions, and reverse iterators. I still have to understand if the behaviour is to be expected hence the message on the mailing list. The situation is conceptually simple. I am using a table defined as follows: CREATE

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Nitan Kainth
Hannu, How can you read a partition in reverse? Sent from my iPhone > On May 16, 2017, at 9:20 AM, Hannu Kröger wrote: > > Well, I’m guessing that Cassandra doesn't really know if the range tombstone > is useful for this or not. > > In many cases it might be that the

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
Hi Hannu, the piece of data in question is older. In my example the tombstone is the newest piece of data. Since a range tombstone has information re the clustering key ranges, and the data is clustering key sorted, I would expect a linear scan not to be necessary. On Tue, May 16, 2017 at 3:46

Re: Long running compaction on huge hint table.

2017-05-16 Thread Jeff Jirsa
In Cassandra versions up to 3.0, hints are stored within a table, where the partition key is the host ID of the server for which the hints are stored. In such a data model, accumulating 800GB of hints is almost certain to cause very wide rows, which will in turn cause GC pressure when you

Read timeouts

2017-05-16 Thread Nitan Kainth
Hi, We see read timeouts intermittently. Mostly after they have occurred. Timeouts are not consistent and do not occur in the hundreds at a time. 1. Is a read timeout counted as a dropped mutation? 2. What is the best way to nail down the exact cause of scattered timeouts? Thank you.

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
Yes, that was my intention but I wanted to cross-check with the ML and the devs keeping an eye on it first. On Tue, May 16, 2017 at 5:10 PM, Hannu Kröger wrote: > Well, > > sstables contain some statistics about the cell timestamps and using that > information and the

Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Thanks Nitan. Appreciate your help. Can anyone suggest a parameter change or something else that could help in this situation? Regards, Varun Sent from my iPhone > On 16-May-2017, at 7:31 PM, Nitan Kainth wrote: > > If target table is dropped then you can remove its hints but

Re: Long running compaction on huge hint table.

2017-05-16 Thread Nitan Kainth
You can throttle compaction with nodetool setcompactionthroughput, but that will only slow compaction down and free up resources for the application; it's not a fix. Sent from my iPhone > On May 16, 2017, at 9:15 AM, varun saluja wrote: > > Thanks Nitan. > Appreciate your help.

Re: Reg:- Data Modelling Concepts

2017-05-16 Thread @Nandan@
Hi Jon, We need to keep track of all updates so that a 'User' of our platform can check what changes were made before. I am thinking of it this way: CREATE TABLE book_info ( book_id uuid, book_title text, author_name text, updated_at timestamp, PRIMARY KEY(book_id)); This table will contain details about

Re: Long running compaction on huge hint table.

2017-05-16 Thread Jeff Jirsa
You could also try stopping compaction, but that'll probably take a very long time as well. Manually stopping each node (one at a time) and removing the sstables from only system.hints may be a better option. You may want to take a snapshot first if you're very concerned about that data. -- Jeff

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Hannu Kröger
This is a bit of guessing but it probably reads sstables in some sort of sequence, so even if sstable 2 contains the tombstone, it still scans through the sstable 1 for possible data to be read. BR, Hannu > On 16 May 2017, at 19:40, Stefano Ortolani wrote: > > Little

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
But it should skip those records since they are sorted. My understanding would be something like: 1) read sstable 2 2) read the range tombstone 3) skip records from sstable2 and sstable1 within the range boundaries 4) read remaining records from sstable1 5) no records, return On Tue, May 16,
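The skip Stefano describes can be sketched on a toy model. This is not Cassandra's read path: `Cell`, `RangeTombstone`, and `reverse_scan` are hypothetical names, and the "sstable" is just a Python list sorted by clustering key. The sketch assumes a jump over the deleted range is only safe when no cell in the sstable is newer than the tombstone (the max-timestamp condition raised elsewhere in this thread); otherwise it falls back to checking cells one by one.

```python
from bisect import bisect_left, bisect_right
from typing import List, NamedTuple, Tuple


class Cell(NamedTuple):
    timeid: int    # clustering key (the toy sstable is sorted on this)
    write_ts: int  # write timestamp of the cell
    value: str


class RangeTombstone(NamedTuple):
    start: int     # inclusive lower clustering bound
    end: int       # inclusive upper clustering bound
    write_ts: int  # timestamp of the deletion


def reverse_scan(cells: List[Cell], max_write_ts: int,
                 tombstone: RangeTombstone) -> Tuple[List[Cell], int]:
    """Return (live cells in DESC order, number of cells actually visited)."""
    keys = [c.timeid for c in cells]
    lo = bisect_left(keys, tombstone.start)   # first index inside the deleted range
    hi = bisect_right(keys, tombstone.end)    # first index past the deleted range
    # Assumption: jumping is safe only if nothing here is newer than the tombstone.
    can_jump = max_write_ts <= tombstone.write_ts

    live, visited = [], 0
    i = len(cells) - 1
    while i >= 0:
        if can_jump and lo <= i < hi:
            i = lo - 1                        # skip the whole deleted range at once
            continue
        visited += 1
        cell = cells[i]
        shadowed = (tombstone.start <= cell.timeid <= tombstone.end
                    and cell.write_ts <= tombstone.write_ts)
        if not shadowed:
            live.append(cell)
        i -= 1
    return live, visited


old = [Cell(t, 1, f"v{t}") for t in range(10)]
rt = RangeTombstone(start=3, end=9, write_ts=5)

live_fast, visited_fast = reverse_scan(old, max_write_ts=1, tombstone=rt)

# One cell rewritten after the deletion forces the cell-by-cell path.
mixed = [c._replace(write_ts=10) if c.timeid == 5 else c for c in old]
live_slow, visited_slow = reverse_scan(mixed, max_write_ts=10, tombstone=rt)
```

In the first call only the three cells below the tombstone are visited; in the second, all ten cells are visited and the newer cell at timeid 5 survives, which is the case Hannu raises.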

Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Hi Jeff, I ran nodetool truncatehints on all nodes. It has been running for more than 30 minutes now. compactionstats still reports the same status. pending tasks: 1 compaction type keyspace table completed total unit progress Compaction system hints 11189118129

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Hannu Kröger
Yes, I agree. I would say it cannot skip those cells because it doesn’t check the max timestamp of the cells of the sstable and therefore scans them one by one. Hannu > On 16 May 2017, at 19:48, Stefano Ortolani wrote: > > But it should skip those records since they are

RE: Decommissioned node cluster shows as down

2017-05-16 Thread Mark Furlong
I thought the same that the decommission would complete the removal of a node. I have heard something said about a 72 hour window, I’m not sure if that pertains to this version. Thanks Mark 801-705-7115 office From: Hannu Kröger [mailto:hkro...@gmail.com] Sent: Tuesday, May 16, 2017 10:09 AM

Re: RE: Decommissioned node cluster shows as down

2017-05-16 Thread Jeff Jirsa
On 2017-05-16 09:28 (-0700), Mark Furlong wrote: > I thought the same that the decommission would complete the removal of a > node. I have heard something said about a 72 hour window, I’m not sure if > that pertains to this version. > We keep a record of it in

Re: Read timeouts

2017-05-16 Thread Nitan Kainth
Thank you Jeff. We are on Cassandra 3.0.10. We will look into upgrading or enabling driver logging. > On May 16, 2017, at 11:44 AM, Jeff Jirsa wrote: > > > > On 2017-05-16 08:53 (-0700), Nitan Kainth wrote: >> Hi, >> >> We see read timeouts

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
Little update: the following query also times out, which is weird since the range tombstone should have been read by then... SELECT * FROM test_cql.test_cf WHERE hash = 0x963204d451de3e611daf5e340c3594acead0eaaf AND timeid < the_oldest_deleted_timeid ORDER BY timeid DESC; On Tue, May 16, 2017

Re: Read timeouts

2017-05-16 Thread Jeff Jirsa
On 2017-05-16 08:53 (-0700), Nitan Kainth wrote: > Hi, > > We see read timeouts intermittently. Mostly after they have occurred. > Timeouts are not consistent and does not occur in 100s at a moment. > > 1. Does read timeout considered as Dropped Mutation? No, a dropped

Re: Decommissioned node cluster shows as down

2017-05-16 Thread Jeff Jirsa
On 2017-05-16 09:08 (-0700), Hannu Kröger wrote: > That’s weird. I thought decommission would ultimately remove the node from > the cluster because the token(s) should be removed from the ring and data > should be streamed to new owners. “DN” is IMHO not a state

Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Thanks for the update. I can see a lot of I/O wait, which is causing GC pauses and mutation drops. But as I mentioned, we do not have high load right now; hint replays are creating this high disk I/O. compactionstats shows very high hint bytes, around 780 GB. Is this normal? Just mentioning we are using flash

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Nitan Kainth
If the data is stored in ASC order and the query asks for DESC, then wouldn’t it read the whole partition first and then pick the data in reverse order? > On May 16, 2017, at 10:03 AM, Stefano Ortolani wrote: > > Hi Hannu, > > the piece of data in question is older. In my

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Stefano Ortolani
No, because C* has reverse iterators. On Tue, May 16, 2017 at 4:47 PM, Nitan Kainth wrote: > If the data is stored in ASC order and query asks for DESC, then wouldn’t > it read whole partition in first and then pick data from reverse order? > > > On May 16, 2017, at 10:03 AM,

Re: Decommissioned node cluster shows as down

2017-05-16 Thread suraj pasuparthy
Yes, you have to run nodetool removenode to decommission completely. This will also allow another node with the same IP but a different HashId to join the cluster. Thanks -suraj On Tue, May 16, 2017 at 9:01 AM Mark Furlong wrote: > I have a

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Hannu Kröger
Well, sstables contain some statistics about the cell timestamps and using that information and the tombstone timestamp it might be possible to skip some data but I’m not sure that Cassandra currently does that. Maybe it would be worth a JIRA ticket and see what the devs think about it. If

Re: Long running compaction on huge hint table.

2017-05-16 Thread Nitan Kainth
Do you see mutation drops? Run SELECT count(*) FROM system.hints; is the count increasing? Sent from my iPhone > On May 16, 2017, at 5:52 AM, varun saluja wrote: > > Hi Experts, > > We are facing issue on production cluster. Compaction on system.hint table is > running from last 2

Re: Reg:- DSE 5.1.0 Issue

2017-05-16 Thread DuyHai Doan
Nandan, Since you have asked questions about DSE on this OSS mailing list many times, I suggest you contact DataStax directly if you're using their enterprise edition. Every DataStax customer has access to their support. If you're a sub-contractor for a final customer that is using DSE, ask

Bootstraping a Node With a Newer Version

2017-05-16 Thread Shalom Sagges
Hi All, Hypothetically speaking, let's say I want to upgrade my Cassandra cluster, but I also want to perform a major upgrade to the kernel of all nodes. In order to upgrade the kernel, I need to reinstall the server, hence lose all data on the node. My question is this, after reinstalling the

Re: Long running compaction on huge hint table.

2017-05-16 Thread Jason Brown
Varun, This message is better suited to the user@ ML. Thanks, -Jason On Tue, May 16, 2017 at 3:41 AM, varun saluja wrote: > Hi Experts, > > We are facing issue on production cluster. Compaction on system.hint table > is running from last 2 days. > > > pending tasks: 1 >

Re: Bootstraping a Node With a Newer Version

2017-05-16 Thread Mateusz Korniak
On Tuesday 16 of May 2017 15:27:11 Shalom Sagges wrote: > My question is this, after reinstalling the server with the new kernel, can > I first install the upgraded Cassandra version and then bootstrap it to the > cluster? No. Bootstrap/repair may/will not work between nodes with different major

Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Hi Experts, We are facing an issue on our production cluster. Compaction on the system.hints table has been running for the last 2 days. pending tasks: 1 compaction type keyspace table completed total unit progress Compaction system hints 20623021829

Reg:- DSE 5.1.0 Issue

2017-05-16 Thread @Nandan@
Hi, Sorry in advance if this is the wrong place to post. I am stuck on a particular step. I was using DSE 4.8 on a single DC with 3 nodes. Today I upgraded all 3 nodes to DSE 5.1. The issue is that when I try SERVICE DSE RESTART, I get an error message saying Hadoop functionality has been removed

Re: Reg:- DSE 5.1.0 Issue

2017-05-16 Thread Hannu Kröger
Hello, DataStax is probably more than happy to answer your DataStax Enterprise related questions here (I don’t know if that is 100% the right place but…): https://support.datastax.com/hc/en-us This mailing list is for open source Cassandra and DSE

Re: LCS, range tombstones, and eviction

2017-05-16 Thread Stefano Ortolani
That makes sense. I see however some unexpected performance data on my test, but I will start another thread for that. Thanks again! On Fri, May 12, 2017 at 6:56 PM, Blake Eggleston wrote: > The start and end points of a range tombstone are basically stored as > special

Re: Long running compaction on huge hint table.

2017-05-16 Thread Nitan Kainth
If the target table is dropped then you can remove its hints, but there could be more hints from other tables. If they are for tables of interest to you, then I won't comment on truncating hints. Size of hints depends on Kafka load; looks like you had overloaded the cluster during data load and not hints

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Hannu Kröger
Well, I’m guessing that Cassandra doesn't really know if the range tombstone is useful for this or not. In many cases it might be that the partition contains data that is within the range of the tombstone but is newer than the tombstone and therefore it might be still be returned. Scanning

Replication issue with Multi DC setup in cassandra

2017-05-16 Thread suraj pasuparthy
Hello, I am trying to find a way to PREVENT just one of my keyspaces from syncing to the other datacenter. I have 2 datacenters set up this way: Datacenter: DC:4.4.4.4 == Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns

Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Hi Nitan, Thanks for the response. Yes, I can see mutation drops and an increasing count in system.hints. Is there any way I can proceed to truncate hints, for example using nodetool truncatehints? Regards, Varun Saluja On 16 May 2017 at 17:52, Nitan Kainth wrote: > Do you see

Reg:- Data Modelling Concepts

2017-05-16 Thread @Nandan@
The requirement is to create a DB in which we keep the updated values as well as which user updated a particular book's details and what they changed. We would like to create a schema which stores book info, as well as the history of the updates made, based on book_title, author, publisher,

Re: Reg:- Data Modelling Concepts

2017-05-16 Thread Jonathan Haddad
I don't understand why you need to store the old value a second time. If you know that the value went from A -> B -> C, just store the new value, not the old. You can see that it changed from A->B->C without storing it twice. On Tue, May 16, 2017 at 6:36 PM @Nandan@

Re: Reg:- Data Modelling Concepts

2017-05-16 Thread Jonathan Haddad
Sorry, I hit return a little early. What you want is called "event sourcing": https://martinfowler.com/eaaDev/EventSourcing.html Think of it as time series applied to state (instead of mutable state). CREATE TABLE book ( bookid uuid, name text, ts timeuuid, author text, primary key(bookid, ts) ); for
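As a rough illustration of the event-sourcing idea, here is a minimal in-memory sketch. It assumes one appended event per change, keyed by a book identifier and ordered by timestamp, and it stores only the new value each time. `BookHistory` and its methods are hypothetical names, not part of any Cassandra API; the old value of each change is derived from the previous event rather than stored a second time.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class BookHistory:
    """Append-only log of author values per book; nothing is updated in place."""

    def __init__(self) -> None:
        # book_id -> list of (ts, author) events, analogous to rows clustered by ts
        self._events: Dict[str, List[Tuple[int, str]]] = defaultdict(list)

    def record(self, book_id: str, ts: int, author: str) -> None:
        """Store only the new value; the previous one is already in the log."""
        self._events[book_id].append((ts, author))

    def current(self, book_id: str) -> str:
        """The latest event is the current state."""
        return max(self._events[book_id])[1]

    def changes(self, book_id: str) -> List[Tuple[int, str, str]]:
        """Derive (ts, old, new) transitions by pairing consecutive events."""
        ordered = sorted(self._events[book_id])
        return [(ts, old, new)
                for (_, old), (ts, new) in zip(ordered, ordered[1:])]


history = BookHistory()
history.record("book-1", 1, "A")
history.record("book-1", 2, "B")
history.record("book-1", 3, "C")

latest = history.current("book-1")
transitions = history.changes("book-1")
```

Here `transitions` shows the A -> B -> C progression even though each value was written once.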

Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Hi, truncatehints has been running on the nodes for more than 7 hours now. There is nothing about it in the system logs either, and compactionstats reports an increase in total hint bytes. pending tasks: 1 compaction type keyspace table completed total unit progress

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Hannu Kröger
Well, as mentioned, probably Cassandra doesn’t have logic and data to skip bigger regions of deleted data based on range tombstone. If some piece of data in a partition is newer than the tombstone, then it cannot be skipped. Therefore some partition level statistics of cell ages would need to

Re: Non-zero nodes are marked as down after restarting cassandra process

2017-05-16 Thread Jeff Jirsa
On 2017-05-16 07:07 (-0700), Andrew Jorgensen wrote: > Thanks for the info! > > When you say "overall stability problems due to some bugs", can you > elaborate on if those were bugs in cassandra that were fixed due to an > upgrade or bugs in your own code and how

Decommissioned node cluster shows as down

2017-05-16 Thread Mark Furlong
I have a node I decommissioned on a large ring using 2.1.12. The node completed the decommission process and is no longer communicating with the rest of the cluster. However when I run a nodetool status on any node in the cluster it shows the node as ‘DN’. Why is this and should I just run a

Re: Range deletes, wide partitions, and reverse iterators

2017-05-16 Thread Nitan Kainth
Thank you Stefano > On May 16, 2017, at 10:56 AM, Stefano Ortolani wrote: > > No, because C* has reverse iterators. > > On Tue, May 16, 2017 at 4:47 PM, Nitan Kainth > wrote: > If the data is stored in ASC order and query asks

Re: Replication issue with Multi DC setup in cassandra

2017-05-16 Thread suraj pasuparthy
So I thought the same. I see the data via cqlsh in both datacenters. Consistency is set to LQ. Thanks -Suraj On Tue, May 16, 2017 at 2:19 PM, Nitan Kainth wrote: > Do you see data on other DC or just directory structure? Directory > structure would populate because it

Re: Replication issue with Multi DC setup in cassandra

2017-05-16 Thread Nitan Kainth
check for datafiles on filesystem in both DCs. > On May 16, 2017, at 4:42 PM, suraj pasuparthy > wrote: > > So i though the same, > I see the data via the CQLSH in both the datacenters. consistency is set to LQ > > thanks > -Suraj > > On Tue, May 16, 2017 at 2:19

Re: Replication issue with Multi DC setup in cassandra

2017-05-16 Thread suraj pasuparthy
Yes, I see them in the datacenter's data directories. In fact I see them even after I bring down the interface between the 2 DCs, which further confirms that a local copy is maintained in the DC that was not configured in the strategy. It's quite important that we block the info for this keyspace

Re: Replication issue with Multi DC setup in cassandra

2017-05-16 Thread Nitan Kainth
Do you see data on other DC or just directory structure? Directory structure would populate because it is DDL but inserts shouldn’t populate, ideally. > On May 16, 2017, at 3:19 PM, suraj pasuparthy > wrote: > > elp me fig