mx4j Issues Following Upgrade to 2.1.6

2016-11-25 Thread Biscuit Ninja
We run some checks from our monitoring software that rely on mx4j. The checks typically grab some XML via an HTTP request and parse it. For example, CF stats on 'MyKeySpace' and 'MyColumnFamily' are retrieved using:

Bootstrap fails on 3.10

2016-11-25 Thread Benjamin Roth
Hi! Today I wanted a new node to join the cluster. When looking at netstats on all the old nodes, it seemed like the streaming sessions did complete. They all said that all files have been transferred. But looking at the debug.log the stream sessions finished with an error. Also after all streams

Java GC pauses, reality check

2016-11-25 Thread S Ahmed
Hello! From what I understand, Java GC pauses are pretty much a fact of life, but you can tune the JVM to reduce the frequency and length of GC pauses. When using Cassandra, how frequent or long are these pauses known to be? Even with tuning, is it safe to assume they cannot

Re: Bootstrap fails on 3.10

2016-11-25 Thread Paulo Motta
If you have an MV table, it seems you're hitting https://issues.apache.org/jira/browse/CASSANDRA-12905. I will bump its priority to critical since it can prevent or hinder bootstrap. Did you try resuming bootstrap with "nodetool bootstrap resume" after the failure? It may eventually succeed,

Re: Bootstrap fails on 3.10

2016-11-25 Thread Benjamin Roth
I proposed a fairly simple fix for https://issues.apache.org/jira/browse/CASSANDRA-12905. Sorry that I don't supply a patch. I am good at analysing code but totally inexperienced with the workflows here. 2016-11-25 19:57 GMT+01:00 Benjamin Roth : > Yes, I have MVs. > >

Re: Java GC pauses, reality check

2016-11-25 Thread Chris Lohfink
No tuning will eliminate GCs. 20-30 seconds is horrific and out of the ordinary. Most likely you are implementing antipatterns and/or are poorly configured. Sub 1s is realistic, but some workloads may still require tuning to maintain it. Some workloads are very unfriendly to GCs though (i.e. heavy

Re: Bootstrap fails on 3.10

2016-11-25 Thread Benjamin Roth
Yes, I have MVs. It is also interesting that in the middle of bootstrapping (I cannot tell when exactly) it seemed like other nodes started to send hints to the bootstrapping node. When that happened, it seems that every single HintVerb also failed with a WTE. At least the logs are completely flooded

Re: Java GC pauses, reality check

2016-11-25 Thread Kant Kodali
+1 to Chris Lohfink's response. I would also restate the following sentence "java GC pauses are pretty much a fact of life" to "Any GC based system pauses are pretty much a fact of life". I would be more than happy to see if someone can counter prove. On Fri, Nov 25, 2016 at 1:41 PM, Chris Lohfink

Re: Java GC pauses, reality check

2016-11-25 Thread Graham Sanderson
If you are seeing 25-30 second GC pauses then (unless you are very badly configured) you are seeing full GC under CMS (though G1 may have similar problems). With CMS, eventual fragmentation causing promotion failure is inevitable (unless you cycle your nodes before it happens). Either your heap has way

Re: Java GC pauses, reality check

2016-11-25 Thread Martin Schröder
2016-11-25 23:38 GMT+01:00 Kant Kodali : > I would also restate the following sentence "java GC pauses are pretty much > a fact of life" to "Any GC based system pauses are pretty much a fact of > life". > > I would be more than happy to see if someone can counter prove. Azul

Re: Java GC pauses, reality check

2016-11-25 Thread Benjamin Roth
Lol. The counter proof is to use another memory model like ARC. That's why I personally think Java is NOT the first choice for server applications. But that's a philosophical discussion. On 25.11.2016 at 23:38, "Kant Kodali" wrote: > +1 Chris Lohfink response > > I would also

Re: Java GC pauses, reality check

2016-11-25 Thread Harikrishnan Pillai
The Zing JVM reduces pauses to under 10 ms for most use cases. Sent from my iPhone On Nov 25, 2016, at 2:44 PM, Kant Kodali wrote: +1 Chris Lohfink response I would also restate the following sentence "java GC pauses are pretty much a fact of life" to

Re: Java GC pauses, reality check

2016-11-25 Thread Oleksandr Shulgin
On Nov 25, 2016 23:47, "Graham Sanderson" wrote: If you are seeing 25-30 second GC pauses then (unless you are so badly configured) seeing full GC under CMS (though G1 may have similar problems). With CMS eventual fragmentation causing promotion failure is inevitable (unless

Re: Java GC pauses, reality check

2016-11-25 Thread Vladimir Yudovin
Hi Ahmed, obviously, a 20-30 sec. pause is unacceptable. I suggest checking the following: - disable swapping completely - check the Java version, v8 is desirable (depending on Cassandra version) - use a multiprocessor machine (it allows concurrent GC) Best regards, Vladimir Yudovin, Winguzone

Re: Java GC pauses, reality check

2016-11-25 Thread Benjamin Roth
Thanks! But getting back to the original issue: I think the GC itself is not the root cause of such a long pause. I remember having had issues with 1 minute GCs in the beginning. I also experimented with larger and smaller heap sizes, different GCs (G1, CMS), and different settings, but what

Re: Java GC pauses, reality check

2016-11-25 Thread Jan
https://www.azul.com/products/zing/order-zing/ At least I found a list price for Zing there: $3k per year. - Original Message - From: "Work" Sent: 26.11.2016 07:53 To: "user@cassandra.apache.org" Subject: Re: Java GC pauses,

Re: Java GC pauses, reality check

2016-11-25 Thread Harikrishnan Pillai
We are running Azul Zing in prod with 1 million reads/s and 100K writes/s. We never had a major GC above 10 ms. Sent from my iPhone > On Nov 25, 2016, at 3:49 PM, Martin Schröder wrote: > > 2016-11-25 23:38 GMT+01:00 Kant Kodali : >> I would

Re: Does recovery continue after truncating a table?

2016-11-25 Thread Ben Slater
Nice detective work! Seems to me that it's at best an undocumented limitation and potentially could be viewed as a bug - maybe log another JIRA? One note - there is a nodetool truncatehints command that could be used to clear out the hints (

Re: Java GC pauses, reality check

2016-11-25 Thread Benjamin Roth
This sounds amazing but also expensive - I don't see pricing on their page. Are you able and allowed to share a rough pricing range? On 26.11.2016 at 04:33, "Harikrishnan Pillai" wrote: > We are running azul zing in prod with 1 million reads/s and 100 K writes/s > with

Re: Java GC pauses, reality check

2016-11-25 Thread Work
I'm not affiliated with them, I've just been impressed by them. They have done amazing work in performance measurement. They discovered a major flaw in most performance testing ... I've never seen their pricing. But, recently, they made their product available for testing by developers. And the

Re: repair -pr in crontab

2016-11-25 Thread Artur Siekielski
Hi, yes, I read about how the repairing works, but the docs/blog posts lack practical recommendations and "best practices". For example, I found people having issues with running "repair -pr" simultaneously on all nodes, but it isn't clear it shouldn't be allowed. In the end I implemented

Re: repair -pr in crontab

2016-11-25 Thread Benjamin Roth
It is absolutely OK to run parallel repair -pr if 1. the ranges do not overlap and 2. your cluster can handle the pressure - do not underestimate that. In Reaper you can tweak some settings like repair intensity to give your cluster some time to breathe between repair slices. 2016-11-25 11:34