I caught a node in an undesirable state, and many of its threads look
like this:
"SharedPool-Worker-5" #875 daemon prio=5 os_prio=0 tid=0x7f3e14196800
nid=0x96ce waiting on condition [0x7f3ddb835000]
java.lang.Thread.State: WAITING (parking)
at
>
>
> Forgive me, but what is CMS?
>
Sorry - ConcurrentMarkSweep garbage collector.
>
> No. I’ve tried some mitigations since tuning thread pool sizes and GC, but
> the problem begins with only an upgrade of Cassandra. No other system
> packages, kernels, etc.
>
>
>
From what 2.0 version did
> On Nov 2, 2015, at 11:35 AM, Nate McCall wrote:
> Forgive me, but what is CMS?
>
> Sorry - ConcurrentMarkSweep garbage collector.
Ah, my brain was trying to think in terms of something Cassandra specific. I
have full GC logging on and since moving to G1, I haven’t
Does tpstats show unusually high counts for blocked flush writers?
As Sebastian suggests, running ttop will paint a clearer picture about what
is happening within C*. I would however recommend going back to CMS in this
case as that is the devil we all know and more folks will be able to offer
Using DSE 4.8.1 / 2.1.11.872, Java version 1.8.0_66
We upgraded our cluster this weekend and have been having issues with dropped
mutations since then. Intensely investigating a single node and toying with
settings has revealed that GC stalls don’t make up enough time to explain the
10 seconds
The thing about the CASSANDRA-9504 theory is that it was solved in 2.1.6
and Jeff's running 2.1.11.
@Jeff
How often does this happen? Can you watch ttop as soon as you notice
increased read/write latencies?
wget https://bintray.com/artifact/download/aragozin/generic/sjk-plus-0.3.6.jar
java
Upgraded from 2.0.x. We use the other commit log sync method (periodic), set to 10 seconds.
Enabling batch mode is like swallowing a grenade.
It’s starting to look to me like it’s possibly related to brief IO spikes that
are smaller than my usual graphing granularity. It feels surprising to me that
these
Only if you actually change cassandra.yaml. That was the change in 2.1.6, which
is why it matters what version he upgraded from.
> On Oct 29, 2015, at 10:06 PM, Sebastian Estevez
> wrote:
>
> The thing about the CASSANDRA-9504 theory is that it was solved in
You didn’t say what you upgraded from, but if it is 2.0.x, then look at
CASSANDRA-9504.
If so, and you use
commitlog_sync: batch
then you probably want to set
commitlog_sync_batch_window_in_ms: 1 (or 2)
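
For reference, here is roughly how the two commit log sync modes discussed in this thread would sit in cassandra.yaml. This is only a sketch: the batch window of 2 is one of the two values suggested above, and the periodic value of 10000 ms is an assumption based on the "10 seconds" mentioned elsewhere in the thread, so check it against your own file before copying anything.

```yaml
# cassandra.yaml (fragment). Only one commitlog_sync mode can be active at a time.

# Option A: batch mode with the short window suggested above (1 or 2 ms).
commitlog_sync: batch
commitlog_sync_batch_window_in_ms: 2

# Option B: periodic mode, shown commented out.
# 10000 ms is assumed here to match the "10 seconds" mentioned in this thread.
# commitlog_sync: periodic
# commitlog_sync_period_in_ms: 10000
```

Changes to cassandra.yaml take effect only after the node is restarted.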
Note I’m only slightly convinced this is the cause because of your READ_REPAIR
issues (though