Thank you, Rob, this is very helpful.  I'll keep you posted on any progress.

Are others running somewhat large nodes on CentOS 6.4 or similar?  Using Java 
7?  We are also hosted through SoftLayer.  Any help is much appreciated.

In general I think Cassandra meets our needs, but this is a blocker for us.

Thanks

On Feb 5, 2014 7:04 PM, Robert Coli <rc...@eventbrite.com> wrote:
On Wed, Feb 5, 2014 at 11:18 AM, Keith Wright <kwri...@nanigans.com> wrote:
Hi Rob, thanks for the response!  Interestingly if we run a repair we don’t see 
the bootstrap issue so I am considering doing the empty node repair methodology.

Weird. Bootstrap should not be more fragile than repair.

 *   Update our JRE, we are using 1.7.0_17 and I believe we’re up to 1.7.0_54

Unlikely to be the cause, but couldn't hurt.

 *   GC tuning as it does appear that we’re suffering from GC issues.  We could 
just allocate more eden space and then revert after the bootstrap succeeds

This is a generalized cause of streaming failures, so sure. I'm not so sure 
about the specific proposed solution, but yes, it's possible that tuning your 
GC will make bootstrap possible.

 *   As I mentioned, don’t load data via bootstrap but instead use repair.  
With bootstrap disabled in Vnodes, will the node still assign itself tokens?

My belief is yes, and I just re-read the code and that's what it appears to do 
in the auto_bootstrap:false-with-num_tokens_set case.

You can verify for yourself by reading the code here:

https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob;f=src/java/org/apache/cassandra/service/StorageService.java;hb=HEAD

There are other methods of doing this which would be available to you if you 
were not using vnodes. Unfortunately the use of vnodes seems to preclude any 
copy-the-sstables method of cluster shifting short of copying all sstables to 
all nodes, globally uniquing their filenames first, and then running cleanup.

***** IMPORTANT WARNING ******

https://issues.apache.org/jira/browse/CASSANDRA-6615

Affects versions of Cassandra 1.2.x before 1.2.14, including the version of 
Cassandra you are running. It WILL REMOVE NODES FROM YOUR CLUSTER AND MAKE IT 
HARD TO GET THEM BACK IN IF YOU USE AUTO_BOOTSTRAP:FALSE UNDER CERTAIN 
CIRCUMSTANCES.

If you plan to use auto_bootstrap:false to deal with your issue, I VERY 
STRONGLY RECOMMEND UPGRADING TO 1.2.14 BEFORE DOING SO.

(The above warning applies to anyone using auto_bootstrap:false in 1.2.x: 
either stop doing that or upgrade to 1.2.14 ASAP.)

***** IMPORTANT WARNING ******
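
As a rough sketch of the version range named in the warning above, the check 
below flags 1.2.x releases earlier than 1.2.14. The helper names are 
hypothetical, it assumes plain numeric major.minor.patch version strings, and 
it only encodes the range stated here; it is not taken from Cassandra itself 
and says nothing about other release lines.

```python
# Hypothetical helper: encodes only the range stated in the warning
# (1.2.x releases before 1.2.14). Not part of Cassandra or its tooling.
FIXED_IN = (1, 2, 14)


def parse_version(version):
    """Turn a plain 'major.minor.patch' string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def in_affected_1_2_range(version):
    """Return True if version is a 1.2.x release earlier than 1.2.14."""
    parsed = parse_version(version)
    return parsed[:2] == FIXED_IN[:2] and parsed < FIXED_IN


if __name__ == "__main__":
    for v in ("1.2.13", "1.2.14"):
        print(v, in_affected_1_2_range(v))
```

Version strings with suffixes (for example a release-candidate tag) would need 
extra parsing that this sketch does not attempt.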

=Rob
