Re: Cassandra taking very long to start and server under heavy load

2019-05-01 Thread Evgeny Inberg
Using a single data disk. Also, it is performing mostly heavy read operations according to the metrics collected. On Wed, 1 May 2019, 20:14 Jeff Jirsa wrote: > Do you have multiple data disks? > Cassandra 6696 changed behavior with multiple data disks to make it safer > in the situation that one

Re: Bootstrapping to Replace a Dead Node vs. Adding a New Node: Consistency Guarantees

2019-05-01 Thread Alok Dwivedi
CASSANDRA-2434 ensures that when we add a new node, it streams data from the source that it will replace, taking over the range only once the data has been completely streamed. This is explained in detail in the blog post you shared. This ensures that one continues to get the same consistency as before the new node was

RE: Bootstrapping to Replace a Dead Node vs. Adding a New Node: Consistency Guarantees

2019-05-01 Thread Fd Habash
Appreciate your response. As for extending the cluster while keeping the default range movement = true, C* won’t allow me to bootstrap multiple nodes anyway. But the question I’m still posing, and have not gotten an answer for, is whether the fix in CASSANDRA-2434 disallows bootstrapping multiple nodes

RE: Bootstrapping to Replace a Dead Node vs. Adding a New Node: Consistency Guarantees

2019-05-01 Thread ZAIDI, ASAD A
The article you mentioned here clearly says: “For new users to Cassandra, the safest way to add multiple nodes into a cluster is to add them one at a time. Stay tuned as I will be following up with another post on bootstrapping.” When extending a cluster, it is indeed recommended to go slow &

RE: Bootstrapping to Replace a Dead Node vs. Adding a New Node: Consistency Guarantees

2019-05-01 Thread Fd Habash
Probably, I needed to be clearer in my inquiry. I’m investigating a situation where our diagnostic data is telling us that C* has lost some of the application data. I mean, getsstables for the data returns zero on all nodes in all racks. The Last Pickle article below & Jeff Jirsa had

Re: Joining a node to the cluster when streaming hosts die

2019-05-01 Thread Jeff Jirsa
There is "resumable bootstrap" in 3.0 and newer (maybe 2.2 and newer?), but I've never used it and have no opinion about whether or not I'd trust it myself. I'd personally stop the joining instance, clear the data, and start again. On Wed, May 1, 2019 at 10:44 AM Nick Hatfield wrote: > Hello,
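Jeff's suggestion (stop, wipe, retry) can be sketched roughly as below. The paths assume the default package layout, and the destructive and service commands are left as comments; this is a sketch, not a drop-in script — adapt it to your install before clearing anything.

```shell
# Sketch of restarting a failed bootstrap from scratch.
# Default package data directory assumed; adjust for your install.
CASSANDRA_HOME=${CASSANDRA_HOME:-/var/lib/cassandra}

# 1. Stop the joining instance (systemd layout assumed):
#      sudo systemctl stop cassandra
# 2. Clear data, commitlog, saved caches and hints so bootstrap starts fresh:
for d in data commitlog saved_caches hints; do
  echo "would clear: $CASSANDRA_HOME/$d"   # replace echo with: rm -rf "$CASSANDRA_HOME/$d"/*
done
# 3. Start the node again; it will re-bootstrap from an empty state:
#      sudo systemctl start cassandra
```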

RE: cassandra node was put down with oom error

2019-05-01 Thread ZAIDI, ASAD A
Is there any chance partition size has grown over time and is taking up much of the allocated memory? If yes, that could also affect compaction threads, as they'll take more heap and be kept in heap longer, leaving less for other processes. You can check whether partition sizes are manageable using
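One way to do the partition-size check Asad describes is with `nodetool tablehistograms`; a sketch below parses its output for the maximum partition size. The keyspace/table names and all numbers in the sample are made up for illustration.

```shell
# On a live node you would run (hypothetical keyspace/table):
#   nodetool tablehistograms my_ks my_table > tablehistograms.txt
# Sample output fragment (values are invented); the fifth column is
# partition size in bytes:
cat > tablehistograms.txt <<'EOF'
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)
50%             1.00             51.01            182.79             17084               258
99%             3.00            263.21           2816.16           1955666             24601
Max             4.00            654.95           4055.27         322381140            545791
EOF
# Pull out the max partition size; hundreds of MB per partition is a red flag:
awk '$1 == "Max" { print "max partition bytes:", $5 }' tablehistograms.txt
```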

Re: Bootstrapping to Replace a Dead Node vs. Adding a New Node: Consistency Guarantees

2019-05-01 Thread Fred Habash
I, probably, should've been clearer in my inquiry... I'm investigating a scenario where our diagnostic data is telling us that a small portion of application data has been lost. I mean, getsstables for the keys returns zero on all cluster nodes. The Last Pickle article below (which includes a case

Re: Cassandra taking very long to start and server under heavy load

2019-05-01 Thread Jeff Jirsa
Do you have multiple data disks? Cassandra 6696 changed behavior with multiple data disks to make it safer in the situation that one disk fails. It may be copying data to the right places on startup; can you see if sstables are being moved on disk? -- Jeff Jirsa > On May 1, 2019, at 6:04
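A rough way to check Jeff's "are sstables being moved on disk" question is to watch for recently modified data files under the data directories during startup. The path below is the default package location; adjust for your layout.

```shell
# List sstable data files modified in the last 10 minutes.
# /var/lib/cassandra/data is the default path; adjust for your install.
DATA_DIR=${DATA_DIR:-/var/lib/cassandra/data}
find "$DATA_DIR" -name '*-Data.db' -mmin -10 2>/dev/null | head -20
# Re-running this while the node starts and seeing new files appear under
# different data directories suggests startup-time sstable relocation.
```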

Re: cassandra node was put down with oom error

2019-05-01 Thread Steve Lacerda
First, you have to find out where the memory is going. So, you can use the MBeans in jconsole or something like that. You'll have to look at the different caches and off-heap usage in the cache and metrics MBean types. Once you've figured that out, then you can start working on tuning things. Yes, your heap is 32G,
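Before diving into jconsole, a quick first pass at "where is the memory going" is parsing `nodetool info`, which reports heap and off-heap totals. The numbers in the sample below are invented for illustration.

```shell
# On a live node you would run:
#   nodetool info > info.txt
# Sample fragment (made-up numbers):
cat > info.txt <<'EOF'
Heap Memory (MB)       : 18231.54 / 32768.00
Off Heap Memory (MB)   : 4120.77
EOF
# Print heap used/max and off-heap usage:
awk -F': ' '/Heap Memory/ { print $1 "=>", $2 }' info.txt
```

If off-heap is large, the jconsole MBeans (caches, bloom filters, memtables) are where to drill down next.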

Cassandra taking very long to start and server under heavy load

2019-05-01 Thread Evgeny Inberg
I have upgraded a Cassandra cluster from version 2.0.x to 3.11.4, going through 2.1.14. After the upgrade, I noticed that each node is taking about 10-15 minutes to start, and the server is under a very heavy load. Did some digging around and got a few leads from the debug log. Messages like:

Re: cassandra node was put down with oom error

2019-05-01 Thread Sandeep Nethi
I think 3.11.3 has a bug which can cause OOMs on nodes during full repairs. Just check if there is any correlation between the OOMs and the repair process. Thanks, Sandeep On Wed, 1 May 2019 at 11:02 PM, Mia wrote: > Hi Sandeep. > > I'm not running any manual repair and I think there is no running
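The correlation check Sandeep suggests can be done by grepping `system.log` for repair sessions around the OOM timestamps. The log lines below are invented samples; the real log usually lives at `/var/log/cassandra/system.log`.

```shell
# Made-up sample of a node's system.log around an OOM:
cat > system.log <<'EOF'
INFO  [Repair#1] 2019-05-01 10:02:11 RepairSession.java - [repair #abc] new session
INFO  [CompactionExecutor:3] 2019-05-01 10:15:40 CompactionTask.java - Compacted 4 sstables
ERROR [Native-Transport-1] 2019-05-01 10:21:03 JVMStabilityInspector.java - OutOfMemoryError
EOF
# Show repair activity and OOM events with line numbers, to eyeball timing:
grep -n -E 'repair|OutOfMemory' system.log
```

If repair sessions consistently precede the OOMs, that supports the repair-related theory; if not, look elsewhere (large partitions, caches).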

Re: cassandra node was put down with oom error

2019-05-01 Thread Mia
Hi Sandeep. I'm not running any manual repair and I think there is no full repair running. I cannot see any log about repair in system.log these days. Does a full repair have anything to do with using a large amount of memory? Thanks. On 2019/05/01 10:47:50, Sandeep Nethi wrote: > Are you by any

Re: cassandra node was put down with oom error

2019-05-01 Thread Sandeep Nethi
Are you by any chance running the full repair on these nodes? Thanks, Sandeep On Wed, 1 May 2019 at 10:46 PM, Mia wrote: > Hello, Ayub. > > I'm using apache cassandra, not dse edition. So I have never used the dse > search feature. > In my case, all the nodes of the cluster have the same

Re: Bootstrapping to Replace a Dead Node vs. Adding a New Node: Consistency Guarantees

2019-05-01 Thread Fred Habash
Thank you. Range movement is one reason this is enforced when adding a new node. But what about forcing a consistent bootstrap, i.e. bootstrapping from the primary owner of the range and not a secondary replica? How is consistent bootstrap enforced when replacing a dead node? - Thank you.

Re: cassandra node was put down with oom error

2019-05-01 Thread Mia
Hello, Ayub. I'm using Apache Cassandra, not the DSE edition, so I have never used the DSE Search feature. In my case, all the nodes of the cluster have the same problem. Thanks. On 2019/05/01 06:13:06, Ayub M wrote: > Do you have search on the same nodes or is it only cassandra. In my case it

Re: Exception while running two CQL queries in Parallel

2019-05-01 Thread Stefan Miklosovic
What are your replication factors for that keyspace? Why are you using EACH_QUORUM? This might be handy: https://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlConfigSerialConsistency.html On Wed, 1 May 2019 at 17:57, Bhavesh Prajapati wrote: > > I had two queries run on same row in parallel
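Stefan's question matters because EACH_QUORUM requires a quorum of replicas in *every* datacenter (floor(RF/2) + 1 per DC), so one struggling DC fails the whole write. A quick arithmetic sketch, assuming a hypothetical two-DC keyspace with RF=3 in each DC:

```shell
# Per-DC quorum size for EACH_QUORUM: floor(RF/2) + 1.
# Hypothetical topology: two datacenters, each with RF=3.
for rf in 3 3; do
  echo "per-DC quorum for RF=$rf: $(( rf / 2 + 1 ))"
done
# With RF=3 per DC, every EACH_QUORUM write must reach 2 replicas in
# BOTH datacenters before the coordinator reports success.
```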

Exception while running two CQL queries in Parallel

2019-05-01 Thread Bhavesh Prajapati
I had two queries run on the same row in parallel (that's a use case). While Batch Query 2 completed successfully, Query 1 failed with an exception. Following are the driver logs and the sequence of log events. QUERY 1: STARTED 2019-04-30T13:14:50.858+ CQL update "EACH_QUORUM" "UPDATE dir SET bid='value'

Re: cassandra node was put down with oom error

2019-05-01 Thread Ayub M
Do you have search on the same nodes or is it only Cassandra? In my case it was due to a memory-leak bug in DSE Search that consumed more memory, resulting in an OOM. On Tue, Apr 30, 2019, 2:58 AM yeomii...@gmail.com wrote: > Hello, > > I'm suffering from similar problem with OSS cassandra