Nodetool repair will take far more time than nodetool rebuild.
How much data do you have in your original data center?
Repair should be run to make the data consistent when a node has been down
longer than the hinted handoff window, or when mutations have been dropped.
As a rule of thumb, we generally run repair through OpsCenter (if using
DataStax) most of the time.

So in your case, run "nodetool rebuild -- <original data center>" on all the
nodes in the new data center.
To make the rebuild process faster, increase three parameters: compaction
throughput, stream throughput, and inter-DC stream throughput.
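As a concrete sketch, the rebuild and throttle commands might look like the following. The data-center name and throughput values here are illustrative assumptions, not recommendations; tune them to your hardware and network, and note that setcompactionthroughput takes MB/s while the two streaming throttles take megabits/s:

```shell
# Run on every node in the new (Mumbai) data center, streaming data from
# the existing DC. The DC name must match what "nodetool status" reports.
nodetool rebuild -- AWS_Sgp

# Raise the throttles that gate rebuild speed.
nodetool setcompactionthroughput 64       # MB/s; default is 16
nodetool setstreamthroughput 400          # megabits/s; default is 200
nodetool setinterdcstreamthroughput 400   # megabits/s; default is 200
```

The nodetool settings above apply only until the node restarts; to make them permanent, set the corresponding options (compaction_throughput_mb_per_sec, stream_throughput_outbound_megabits_per_sec, inter_dc_stream_throughput_outbound_megabits_per_sec) in cassandra.yaml.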

Thanks
Surbhi
On Tue, Oct 30, 2018 at 11:29 PM Akshay Bhardwaj <
akshay.bhardwaj1...@gmail.com> wrote:

> Hi Jonathan,
>
> That makes sense. Thank you for the explanation.
>
> Another quick question, as the cluster is still operative and the data for
> the past 2 weeks (since updating replication factor) is present in both the
> data centres, should I run "nodetool rebuild" or "nodetool repair"?
>
> I read that nodetool rebuild is faster but is applicable only while the new
> data centre is empty, with no partition keys present. So when is the right
> time to use each of the commands, and what impact can each have on the data
> centre's operations?
>
> Thanks and Regards
>
> Akshay Bhardwaj
> +91-97111-33849
>
>
> On Wed, Oct 31, 2018 at 2:34 AM Jonathan Haddad <j...@jonhaddad.com> wrote:
>
>> You need to run "nodetool rebuild -- <existing-dc-name>" on each node in
>> the new DC to get the old data to replicate.  It doesn't do it
>> automatically because Cassandra has no way of knowing if you're done adding
>> nodes and if it were to migrate automatically, it could cause a lot of
>> problems. Imagine streaming 100 nodes' data to 3 nodes in the new DC, not
>> fun.
>>
>> On Tue, Oct 30, 2018 at 1:59 PM Akshay Bhardwaj <
>> akshay.bhardwaj1...@gmail.com> wrote:
>>
>>> Hi Experts,
>>>
>>> I previously had 1 Cassandra data centre in AWS Singapore region with 5
>>> nodes, with my keyspace's replication factor as 3 in Network topology.
>>>
>>> After this cluster has been running smoothly for 4 months (500 GB of
>>> data on each node's disk), I added 2nd data centre in AWS Mumbai region
>>> with yet again 5 nodes in Network topology.
>>>
>>> After updating my keyspace's replication factor to
>>> {"AWS_Sgp":3,"AWS_Mum":3}, my expectation was that the data present in Sgp
>>> region will immediately start replicating on the Mum region's nodes.
>>> However even after 2 weeks I do not see historical data to be replicated,
>>> but new data being written on Sgp region is present in Mum region as well.
>>>
>>> Any help or suggestions to debug this issue will be highly appreciated.
>>>
>>> Regards
>>> Akshay Bhardwaj
>>> +91-97111-33849
>>>
>>>
>>>
>>
>> --
>> Jon Haddad
>> http://www.rustyrazorblade.com
>> twitter: rustyrazorblade
>>
>>
>>
>
>
