Thanks Jens! This was helpful. Also, to avoid a buildup of pending compactions, the compaction throughput throttle can be raised. In our case, however, the batchlog replay throttle (batchlog_replay_throttle_in_kb) was the bottleneck: increasing it from the default of 1024 KB to 10240 KB reduced node addition time by at least a factor of 3. (From this it could be inferred that the batchlog is used during node addition.)
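For reference, both throttles mentioned above are set in cassandra.yaml. A minimal sketch using the Cassandra 3.x property names: the batchlog value is the one that worked for us, while the compaction value is only an illustrative assumption, not something we measured.

```yaml
# cassandra.yaml (Cassandra 3.x property names)

# Batchlog replay throttle, in KB per second, total for the cluster;
# it is reduced proportionally to the number of nodes. Default: 1024.
batchlog_replay_throttle_in_kb: 10240

# Compaction throttle, in MB per second. Default: 16; 0 disables throttling.
# 64 is an assumed, illustrative value; tune it against your own disks.
compaction_throughput_mb_per_sec: 64
```

Changes to cassandra.yaml take effect after a node restart, so the values need to be in place before the new node starts bootstrapping.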
An interesting hack for people on public cloud could be to use a machine with more CPU capacity during bootstrap and then downgrade it once the data is in place, since CPU essentially becomes the bottleneck during node addition.

I found a property in the docs, consistent.rangemovement; setting it to false allows multiple nodes to be added simultaneously:

*JVM_OPTS="$JVM_OPTS -Dcassandra.consistent.rangemovement=false"*

I tried this property in a test cluster and things appeared fine. This constraint seems to have been introduced in CASSANDRA-2434 <https://issues.apache.org/jira/browse/CASSANDRA-2434>.

As of today, what could be the possible implications of adding multiple nodes simultaneously, supposing the 2 minute rule is observed and we have a repair that works?

Regards,
Bhuvan

On Mon, Sep 12, 2016 at 2:56 AM, Jens Rantil <jens.ran...@tink.se> wrote:

> Yes. `nodetool setstreamthroughput` is your friend.
>
>
> On Sunday, September 11, 2016, sai krishnam raju potturi <
> pskraj...@gmail.com> wrote:
>
>> Make sure there is no spike in the load-avg on the existing nodes, as
>> that might affect your application read request latencies.
>>
>> On Sun, Sep 11, 2016, 17:10 Jens Rantil <jens.ran...@tink.se> wrote:
>>
>>> Hi Bhuvan,
>>>
>>> I have done such expansion multiple times and can really recommend
>>> bootstrapping a new DC and pointing your clients to it. The process is so
>>> much faster and the documentation you referred to has worked out fine for
>>> me.
>>>
>>> Cheers,
>>> Jens
>>>
>>>
>>> On Sunday, September 11, 2016, Bhuvan Rawal <bhu1ra...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> We are running Cassandra 3.6 and want to bump up Cassandra nodes in an
>>>> existing datacenter from 3 to 12 (plan to move to r3.xlarge machines to
>>>> leverage more memory instead of m4.2xlarge). Bootstrapping a node would
>>>> take 7-8 hours.
>>>>
>>>> If this activity is performed serially then it will take 5-6 days. I
>>>> had a look at CASSANDRA-7069
>>>> <https://issues.apache.org/jira/browse/CASSANDRA-7069> and a bit of
>>>> discussion in the past at -
>>>> http://grokbase.com/t/cassandra/user/147gcqvybg/adding-more-nodes-into-the-cluster.
>>>> Wanted to know
>>>> if the limitation is still applicable and race condition could occur in 3.6
>>>> version.
>>>>
>>>> If this is not the case can we add a new datacenter as mentioned here
>>>> opsAddDCToCluster
>>>> <https://docs.datastax.com/en/cassandra/3.x/cassandra/operations/opsAddDCToCluster.html>
>>>> and
>>>> bootstrap multiple nodes simultaneously by keeping auto_bootstrap false in
>>>> cassandra.yaml and rebuilding nodes simultaneously in the new dc?
>>>>
>>>>
>>>> Thanks & Regards,
>>>> Bhuvan
>>>>
>>>
>>>
>>> --
>>> Jens Rantil
>>> Backend engineer
>>> Tink AB
>>>
>>> Email: jens.ran...@tink.se
>>> Phone: +46 708 84 18 32
>>> Web: www.tink.se
>>>
>>> Facebook <https://www.facebook.com/#!/tink.se> Linkedin
>>> <http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_photo&trkInfo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary>
>>> Twitter <https://twitter.com/tink>