To elaborate a bit on what Marcin said:
* Once a node starts to believe that a few other nodes are down, it seems
to stay that way for a very long time (hours). I'm not even sure it will
recover without a restart.
* I've tried to stop then start gossip with nodetool on the node that thinks
Hey Marcin,
Are they actually going up and down repeatedly (flapping) or just down and
they never come back?
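
One way to tell the two cases apart, as a minimal sketch only: record what one observer node reports for each peer over time and count the state changes. The samples could come from something like repeated `nodetool status` runs, but how they are collected is left out here. The function name `classify_nodes`, the 'U'/'D' encoding, and the threshold of two transitions are assumptions made for illustration and are not part of Cassandra.

```python
from typing import Dict, List


def classify_nodes(samples: List[Dict[str, str]]) -> Dict[str, str]:
    """Classify each peer as 'up', 'down', or 'flapping'.

    `samples` is a time-ordered list; each entry maps a peer address to
    'U' (seen as up) or 'D' (seen as down) from one observer's view.
    """
    # Build the per-peer state history in observation order.
    history: Dict[str, List[str]] = {}
    for sample in samples:
        for node, state in sample.items():
            history.setdefault(node, []).append(state)

    result: Dict[str, str] = {}
    for node, states in history.items():
        # Count how often the observed state changed between samples.
        transitions = sum(1 for a, b in zip(states, states[1:]) if a != b)
        if transitions >= 2:  # hypothetical threshold: changed state at least twice
            result[node] = "flapping"
        elif states[-1] == "D":
            result[node] = "down"
        else:
            result[node] = "up"
    return result


samples = [
    {"10.0.0.1": "U", "10.0.0.2": "U", "10.0.0.3": "U"},
    {"10.0.0.1": "U", "10.0.0.2": "D", "10.0.0.3": "D"},
    {"10.0.0.1": "U", "10.0.0.2": "D", "10.0.0.3": "U"},
    {"10.0.0.1": "U", "10.0.0.2": "D", "10.0.0.3": "D"},
]
status = classify_nodes(samples)
```

In this made-up example the second peer went down once and stayed down, while the third keeps changing state, which is the distinction that matters for the possible causes.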
There might be different reasons for flapping nodes, but to list what I
have off the top of my head right now:
1. Network issues. I don't think it's your case, but you can read about the
Marcin,
Are all your nodes within the same region? If not in the same region,
what is the snitch type that you are using?
Jan/
On Thursday, April 2, 2015 3:28 AM, Michal Michalski
michal.michal...@boxever.com wrote:
Hey Marcin,
Are they actually going up and down
Do you happen to be using a tool like Nagios or Ganglia that is able to
report utilization (CPU, load, disk I/O, network)? There are plugins for
both that will also notify you (depending on whether you enabled the
intermediate GC logging) about what is happening.
On Thu, Apr 2, 2015 at 8:35