Nodetool repair will take much more time than nodetool rebuild. How much data do you have in your original data center? Repair should be run to make the data consistent when a node has been down longer than the hinted handoff window, or when mutations have been dropped. But as a rule of thumb, we generally run repair through OpsCenter (if using DataStax) most of the time.
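For reference, the rebuild invocation and the throughput tuning might look like the sketch below. The throughput values are illustrative assumptions, not recommendations, and "AWS_Sgp" is the source data center name from later in this thread:

```shell
# Optionally raise streaming/compaction limits before rebuilding
# (illustrative values; tune for your hardware and network)
nodetool setcompactionthroughput 64       # MB/s
nodetool setstreamthroughput 400          # Mb/s
nodetool setinterdcstreamthroughput 400   # Mb/s

# Run on EACH node in the new data center, naming the source DC:
nodetool rebuild -- AWS_Sgp
```

Note these settings take effect immediately but do not persist across a restart; the permanent equivalents are the cassandra.yaml options compaction_throughput_mb_per_sec, stream_throughput_outbound_megabits_per_sec, and inter_dc_stream_throughput_outbound_megabits_per_sec.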
So in your case run "nodetool rebuild <original data center>" on all the nodes in the new data center. To make the rebuild process fast, increase three parameters: compaction throughput, stream throughput, and inter-DC stream throughput.

Thanks
Surbhi

On Tue, Oct 30, 2018 at 11:29 PM Akshay Bhardwaj <akshay.bhardwaj1...@gmail.com> wrote:

> Hi Jonathan,
>
> That makes sense. Thank you for the explanation.
>
> Another quick question: as the cluster is still operative and the data for
> the past 2 weeks (since updating the replication factor) is present in both
> data centres, should I run "nodetool rebuild" or "nodetool repair"?
>
> I read that nodetool rebuild is faster but is only useful while the new
> data centre is empty and no partition keys are present. So when is the right
> time to use either command, and what impact can each have on data centre
> operations?
>
> Thanks and Regards
> Akshay Bhardwaj
> +91-97111-33849
>
> On Wed, Oct 31, 2018 at 2:34 AM Jonathan Haddad <j...@jonhaddad.com> wrote:
>
>> You need to run "nodetool rebuild -- <existing-dc-name>" on each node in
>> the new DC to get the old data to replicate. It doesn't do it
>> automatically because Cassandra has no way of knowing whether you're done
>> adding nodes, and if it were to migrate automatically, it could cause a
>> lot of problems. Imagine streaming 100 nodes' data to 3 nodes in the new
>> DC, not fun.
>>
>> On Tue, Oct 30, 2018 at 1:59 PM Akshay Bhardwaj <
>> akshay.bhardwaj1...@gmail.com> wrote:
>>
>>> Hi Experts,
>>>
>>> I previously had 1 Cassandra data centre in AWS Singapore region with 5
>>> nodes, with my keyspace's replication factor as 3 in Network topology.
>>>
>>> After this cluster had been running smoothly for 4 months (500 GB of
>>> data on each node's disk), I added a 2nd data centre in AWS Mumbai
>>> region, again with 5 nodes in Network topology.
>>>
>>> After updating my keyspace's replication factor to
>>> {"AWS_Sgp":3,"AWS_Mum":3}, my expectation was that the data present in
>>> the Sgp region would immediately start replicating onto the Mum region's
>>> nodes. However, even after 2 weeks I do not see the historical data
>>> replicated, although new data being written in the Sgp region is present
>>> in the Mum region as well.
>>>
>>> Any help or suggestions to debug this issue will be highly appreciated.
>>>
>>> Regards
>>> Akshay Bhardwaj
>>> +91-97111-33849
>>
>> --
>> Jon Haddad
>> http://www.rustyrazorblade.com
>> twitter: rustyrazorblade
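Once a rebuild recommended above is streaming, its progress can be watched with standard nodetool commands. A sketch (run on a node in the new data center; the keyspace name is whatever keyspace had its replication factor updated):

```shell
# Active streaming sessions: peers, files, and bytes received so far
nodetool netstats

# Per-node load and ownership; the new DC's "Load" column should grow
# toward the source DC's figures as rebuild progresses
nodetool status <keyspace_name>

# Compactions triggered by the incoming SSTables
nodetool compactionstats
```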