Re: Bootstrap failures: unable to find sufficient sources for streaming range

2014-08-15 Thread Mark Reddy
Hi Peter,
At the time of the IllegalStateException, do you see the node that it should be streaming from marked as down by the failure detector?
Mark
On Fri, Aug 15, 2014 at 5:45 AM, Peter Haggerty peter.hagge...@librato.com wrote:
When adding nodes via bootstrap to a 27 node 2.0.9 cluster

Re: Bootstrap failures: unable to find sufficient sources for streaming range

2014-08-15 Thread Peter Haggerty
Neither of the two nodes identified as having the range that the IllegalStateException reports is mentioned by FailureDetector.java. There are 5 endpoints that FailureDetector says are 'unknown endpoint', but all of them are reported as UP by Gossiper.java before the schema complete, ready to

Re: Bootstrap failures: unable to find sufficient sources for streaming range

2014-08-15 Thread Peter Haggerty
I'll add that while we don't see either of the nodes with the range that failed marked as down, one of them *does* appear to have a late arrival of the cluster of UP messages that happens just after the bootstrap fails:
$ egrep 'JOIN|Illegal|10.11.12.14|10.11.12.13' system.log | grep -v ^DEBUG
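The egrep/grep pipeline above can also be written as a small script, which may be handy when the same filter has to be run over logs collected from several nodes. This is only a sketch: the function name `filter_log_lines` is hypothetical, the patterns are copied from the command above, and the dots in the IP addresses are escaped here (the egrep pattern leaves them unescaped, so it would also match other characters in those positions).

```python
import re

# Patterns copied from the egrep command above; the dots are escaped so that
# they match literal dots only (an intentional difference from the original).
KEEP = re.compile(r"JOIN|Illegal|10\.11\.12\.14|10\.11\.12\.13")


def filter_log_lines(lines):
    """Yield lines that match KEEP and do not start with DEBUG.

    Hypothetical helper: the first condition mirrors the egrep stage,
    the second mirrors `grep -v ^DEBUG`.
    """
    for line in lines:
        if KEEP.search(line) and not line.startswith("DEBUG"):
            yield line
```

Passing an open file object for system.log to `filter_log_lines` yields the same kind of output as the shell pipeline, one line at a time.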

Bootstrap failures: unable to find sufficient sources for streaming range

2014-08-14 Thread Peter Haggerty
When adding nodes via bootstrap to a 27-node 2.0.9 cluster with a cluster-wide phi_convict_threshold of 12, the nodes fail to bootstrap. This worked a half dozen times in the past few weeks as we've scaled this cluster from 21 to 24 and then to 27 nodes. There have been no configuration or