Re: Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

2016-10-15 Thread Anuj Wadehra
Hi Leena,

Do you have a firewall between the two DCs? If so, the connection reset can be 
caused by Cassandra trying to use a TCP connection that the firewall has 
already closed. Please make sure the connection timeout on the firewall is set 
high enough. Also, make sure your servers are not overloaded. Please see 
https://developer.ibm.com/answers/questions/231996/why-do-we-get-the-error-connection-reset-by-peer-d.html
for general causes of connection resets. Also, as I mentioned earlier, the 
Cassandra troubleshooting guide explains this well: 
https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html
Make sure the firewall and node TCP settings are in sync, so that the nodes 
close a TCP connection before the firewall does.
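
As a rough illustration of keeping node and firewall settings in sync, the 
sketch below compares a node's TCP keepalive idle time with the firewall's idle 
timeout. It assumes a Linux node, where the idle time is exposed as 
net.ipv4.tcp_keepalive_time (7200 seconds by default). The function names, the 
3600-second firewall timeout and the 60-second safety margin are hypothetical 
values chosen only for the example.

```python
from pathlib import Path

# Linux exposes the keepalive idle time (in seconds) here; other platforms differ.
KEEPALIVE_TIME_PATH = Path("/proc/sys/net/ipv4/tcp_keepalive_time")


def read_keepalive_time(path=KEEPALIVE_TIME_PATH):
    """Return the node's keepalive idle time in seconds, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def keepalive_beats_firewall(keepalive_time_s, firewall_idle_timeout_s, margin_s=60):
    """True if the first keepalive probe goes out before the firewall would
    treat the connection as idle, leaving margin_s seconds to spare."""
    return keepalive_time_s + margin_s <= firewall_idle_timeout_s


if __name__ == "__main__":
    firewall_idle_timeout_s = 3600  # hypothetical; use your firewall's real value
    current = read_keepalive_time()
    if current is None:
        print("could not read tcp_keepalive_time (not a Linux node?)")
    else:
        ok = keepalive_beats_firewall(current, firewall_idle_timeout_s)
        print(f"tcp_keepalive_time={current}s safe={ok}")
```

With the Linux default of 7200 seconds and a one-hour firewall timeout, the 
check fails, which is the situation the troubleshooting page above describes.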

With a firewall timeout, we generally see Merkle tree requests/responses 
failing between nodes in the two DCs, after which the repair hangs forever. I 
am not sure how Merkle tree creation, which is node-specific, would be affected 
by a multi-DC setup. Do repairs with the -local option complete without 
problems?

Thanks
Anuj


Re: Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

2016-10-14 Thread Leena Ghatpande
Thank you for the update.


The repair fails with the error 'Failed Creating merkle tree' but does not give 
any additional details.


With -pr running on all DC nodes, we see a peer connection reset error, which 
then results in a hung repair process, even though the TCP connection settings 
look good on all nodes.








Re: Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

2016-10-13 Thread kurt Greaves
Don't run -pr repairs when using incremental repair; you'll just end up with
loads of anti-compactions.



-- 
Kurt Greaves
k...@instaclustr.com
www.instaclustr.com


Re: Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

2016-10-12 Thread Harikrishnan Pillai
In my experience, DC-local repair run node by node with the -pr and -par
options works best. Full repair increased the number of SSTables a lot, and it
took days to compact them back. Another easy option is to use a Spark job that
reads all data at consistency level ALL, with read repair chance increased to
100%, or to use the Netflix Tickler.






Re: Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

2016-10-12 Thread Anuj Wadehra
Hi Leena,

The first thing to be concerned about is: why doesn't the repair -pr operation 
complete?
The second question is: which repair option is best?


One probable cause of stuck repairs: if the firewall between the DCs is closing 
TCP connections and Cassandra then tries to use those connections, repairs will 
hang. Please refer to 
https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html
We faced that ourselves.

Also make sure you meet the basic bandwidth requirement between DCs. The 
recommendation is 1000 Mb/s (1 gigabit) or greater.

Answers to your specific questions:
1. As per my understanding, not all replicas participate in DC-local repairs, 
so such a repair would be ineffective. You need to make sure that all replicas 
of the data in all DCs are in sync.

2. Each DC is not a ring of its own. All DCs together form a single token ring. 
So, I think yes, you should run repair -pr on all nodes.

3. Yes. I don't have experience with incremental repairs, but you can run 
repair -pr on all nodes of all DCs.
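
To make "repair -pr on all nodes of all DCs" concrete, here is a minimal 
sketch that builds one nodetool command per node. The host names, the keyspace 
name and the helper names are hypothetical. The -full flag is added because 
repairs on 3.x are incremental by default, and combining -pr with incremental 
repair is advised against elsewhere in this thread; drop it if that does not 
apply to your setup.

```python
def repair_command(host, keyspace, primary_range=True, full=True):
    """Build the nodetool argument list for repairing one node."""
    args = ["nodetool", "-h", host, "repair"]
    if full:
        args.append("-full")  # force a full (non-incremental) repair
    if primary_range:
        args.append("-pr")    # only the ranges this node is primary for
    args.append(keyspace)
    return args


def cluster_repair_plan(hosts_by_dc, keyspace):
    """Return one command per node, DC by DC, to be run one after another."""
    plan = []
    for dc in sorted(hosts_by_dc):
        for host in hosts_by_dc[dc]:
            plan.append(repair_command(host, keyspace))
    return plan


if __name__ == "__main__":
    # Hypothetical topology: two DCs with two nodes each.
    hosts = {"dc1": ["10.0.1.1", "10.0.1.2"], "dc2": ["10.0.2.1", "10.0.2.2"]}
    for cmd in cluster_repair_plan(hosts, "my_keyspace"):
        print(" ".join(cmd))
```

Because each node repairs only its primary ranges, the whole ring is covered 
only once every node in every DC has been run.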

Regarding the best approach to repair, you should look at the repair 
presentations from Cassandra Summit 2016. All are online now.

I attended the summit, and people running large clusters generally use subrange 
repairs to repair their clusters. But such large deployments are on older 
Cassandra versions and generally don't use vnodes, so people can easily tell 
which nodes hold which token range.
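
A minimal sketch of the subrange idea: split a token range into smaller pieces 
and repair each piece with nodetool's -st/-et (start/end token) options. It 
assumes the default Murmur3Partitioner, whose tokens run from -2^63 to 2^63-1. 
The range boundaries, the keyspace name and the helper names are hypothetical. 
On a real cluster the ranges would come from the ring layout, and negative 
token values may need care when passed on a command line, which is not handled 
here.

```python
MIN_TOKEN = -(2 ** 63)     # lowest Murmur3Partitioner token
MAX_TOKEN = 2 ** 63 - 1    # highest Murmur3Partitioner token


def split_range(start, end, parts):
    """Split the token range (start, end] into `parts` contiguous subranges."""
    if parts < 1 or end <= start:
        raise ValueError("need parts >= 1 and start < end")
    width = end - start
    bounds = [start + width * i // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def subrange_repair_commands(keyspace, start, end, parts):
    """Build one nodetool argument list per subrange."""
    return [
        ["nodetool", "repair", "-st", str(lo), "-et", str(hi), keyspace]
        for lo, hi in split_range(start, end, parts)
    ]


if __name__ == "__main__":
    # Hypothetical range owned by one node, repaired in four steps.
    for cmd in subrange_repair_commands("my_keyspace", 0, 1000, 4):
        print(" ".join(cmd))
```

Smaller subranges keep each repair session short, so a single failed or hung 
session costs less to retry.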



Thanks
Anuj


RE: Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

2016-10-12 Thread Anubhav Kale
Agree.

However, if we go from a world where repairs don’t run (or run so unreliably 
that C* can’t mark the SSTables as repaired anyway) to a world where repairs 
run more reliably (the Spark / Tickler approach), the impact on tombstone 
removal doesn’t become any worse (because SSTables aren’t marked either way).


Re: Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

2016-10-12 Thread Jeff Jirsa
Note that the Tickler approach doesn’t mark SSTables as repaired (it’s a 
simpler version of mutation-based repair in a sense), so Cassandra has no way 
to prove that the data has been repaired. 

With tickets like https://issues.apache.org/jira/browse/CASSANDRA-6434, this 
has implications for tombstone removal.

 

 


 

 




RE: Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

2016-10-12 Thread Anubhav Kale
The default repair process doesn't usually work at scale, unfortunately.

Depending on your data size, you have the following options.


Netflix Tickler: https://github.com/ckalantzis/cassTickler (Read at CL.ALL via 
CQL continuously :: Python)

Spotify Reaper: https://github.com/spotify/cassandra-reaper (Subrange repair, 
provides a REST endpoint and calls APIs through JMX :: Java)

List subranges: https://github.com/pauloricardomg/cassandra-list-subranges 
(Tool to get subranges for a given node. :: Java)

Subrange Repair: https://github.com/BrianGallew/cassandra_range_repair (Tool to 
subrange repair :: Python)

Mutation Based Repair (Not ready yet): 
https://issues.apache.org/jira/browse/CASSANDRA-8911 (C* is thinking of doing 
this - hot off the press)

If you have Spark in your system, you could use that to do what Netflix Tickler 
does. We're experimenting with it, and it seems to be the best fit for our 
datasets over all the other options.
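
The "read at CL.ALL" idea can be sketched roughly as below. This is not the 
Tickler's actual code. The loop takes a caller-supplied read_at_all function so 
that it does not depend on any particular driver; the function names, the 
pause_s parameter and the table in the comment are hypothetical.

```python
import time


def tickle(keys, read_at_all, pause_s=0.0):
    """Read every key once through read_at_all and report what happened.

    read_at_all(key) is expected to perform a single-partition read at
    consistency level ALL, so that replicas which disagree get reconciled.
    Returns (number_of_successful_reads, list_of_failed_keys).
    """
    ok = 0
    failed = []
    for key in keys:
        try:
            read_at_all(key)
            ok += 1
        except Exception:
            # e.g. a replica was down, so ALL could not be satisfied
            failed.append(key)
        if pause_s:
            time.sleep(pause_s)  # crude throttle to limit cluster load
    return ok, failed


# With the DataStax Python driver, read_at_all could be built along these
# lines (assumed table my_keyspace.my_table with partition key "id"):
#
#   from cassandra import ConsistencyLevel
#   from cassandra.query import SimpleStatement
#   stmt = SimpleStatement("SELECT * FROM my_keyspace.my_table WHERE id = %s",
#                          consistency_level=ConsistencyLevel.ALL)
#   read_at_all = lambda key: session.execute(stmt, (key,))
```

Keys that fail can be retried later, once all replicas are reachable again.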







Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

2016-10-12 Thread Leena Ghatpande
Please advise. I cannot find any clear documentation on the best strategy for 
repairing nodes on a regular basis when multiple datacenters are involved.


We are running Cassandra 3.7 in a multi-datacenter setup with 4 nodes in each 
datacenter. We are trying to run repairs every other night to keep the nodes in 
a good state. We currently run repair with the -pr option, but the repair 
process hangs and does not complete gracefully. We don't see any errors in the 
logs either.


What is the best way to perform repairs across multiple datacenters on large 
tables?

1. Can we run a datacenter repair using the -dc option for each datacenter? Do 
we need to run the repair on each node in that case, or will it repair all 
nodes within the datacenter?

2. Is running repair with -pr across all nodes required if we perform step 1 
every night?

3. Is cross-datacenter repair required, and if so, what's the best option?


Thanks


Leena