-D does not do what you think it does. I've quoted the relevant documentation from the README:
> <https://github.com/BrianGallew/cassandra_range_repair#multiple-datacenters>
>
> Multiple Datacenters
>
> If you have multiple datacenters in your ring, then you MUST specify the
> name of the datacenter containing the node you are repairing as part of
> the command-line options (--datacenter=DCNAME). Failure to do so will
> result in only a subset of your data being repaired (approximately
> data/number-of-datacenters). This is because nodetool has no way to
> determine the relevant DC on its own, which in turn means it will use the
> tokens from every ring member in every datacenter.

On 11 August 2016 at 12:24, Paulo Motta <pauloricard...@gmail.com> wrote:

> > if we want to use the -pr option (which I suppose we should, to prevent
> > duplicate checks) in 2.0, then if we run the repair on all nodes in a
> > single DC it should be sufficient and we should not need to run it on
> > all nodes across DCs?
>
> No, because the primary ranges of the nodes in other DCs will be missing
> repair, so you should either run with -pr on all nodes in all DCs, or
> restrict repair to a specific DC with -local (and have duplicate checks).
> Combined -pr and -local is only supported on 2.1.
>
> 2016-08-11 1:29 GMT-03:00 Anishek Agarwal <anis...@gmail.com>:
>
>> ok thanks, so if we want to use the -pr option (which I suppose we
>> should, to prevent duplicate checks) in 2.0, then if we run the repair
>> on all nodes in a single DC it should be sufficient and we should not
>> need to run it on all nodes across DCs?
>>
>> On Wed, Aug 10, 2016 at 5:01 PM, Paulo Motta <pauloricard...@gmail.com>
>> wrote:
>>
>>> On 2.0 the repair -pr option is not supported together with -local,
>>> -hosts, or -dc, since it assumes you need to repair all nodes in all
>>> DCs, and it will throw an error if you try to run it with nodetool, so
>>> perhaps there's something wrong with range_repair's option parsing.
>>>
>>> On 2.1, support for simultaneous -pr and -local options was added in
>>> CASSANDRA-7450, so if you need that you can either upgrade to 2.1 or
>>> backport that patch to 2.0.
>>>
>>> 2016-08-10 5:20 GMT-03:00 Anishek Agarwal <anis...@gmail.com>:
>>>
>>>> Hello,
>>>>
>>>> We have a 2.0.17 Cassandra cluster (*DC1*) in a cross-DC setup with a
>>>> smaller cluster (*DC2*). After reading various blogs about
>>>> scheduling/running repairs, it looks like it's good to run them with
>>>> the following:
>>>>
>>>> -pr for primary range only
>>>> -st/-et for sub-ranges
>>>> -par for parallel
>>>> -dc to make sure we can schedule repairs independently on each data
>>>> centre we have.
>>>>
>>>> I have configured the above using the repair utility at
>>>> https://github.com/BrianGallew/cassandra_range_repair.git
>>>>
>>>> which leads to the following command:
>>>>
>>>> ./src/range_repair.py -k [keyspace] -c [columnfamily name] -v -H
>>>> localhost -p -D *DC1*
>>>>
>>>> but it looks like the Merkle tree is being calculated on nodes which
>>>> are part of the other *DC2*.
>>>>
>>>> Why does this happen? I thought it should only look at the nodes in
>>>> the local cluster. However, in nodetool the *-pr* option cannot be
>>>> used with *-local* according to the docs at
>>>> https://docs.datastax.com/en/cassandra/2.0/cassandra/tools/toolsRepair.html
>>>>
>>>> so I may be missing something; can someone help explain this please.
>>>>
>>>> thanks
>>>> anishek

--
Kurt Greaves
k...@instaclustr.com
www.instaclustr.com
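Paulo's point above can be made concrete with a small sketch. The tokens, node names, and DC assignments below are purely hypothetical, invented for illustration: a node's primary range is determined by its position in the single shared ring across *all* datacenters, so running repair with -pr on only DC1's nodes leaves the primary ranges owned by DC2's nodes untouched.

```python
# Hypothetical four-node ring spanning two datacenters (tokens invented
# for illustration; not output from any real cluster).
ring = [
    (0,  "node1", "DC1"),
    (25, "node4", "DC2"),
    (50, "node2", "DC1"),
    (75, "node5", "DC2"),
]

def primary_ranges(ring):
    """Map each node to its primary range: (previous token, own token, dc).

    The primary range depends on ring position alone -- the node's
    datacenter plays no part, which is why -pr in a single DC is not
    enough to cover the whole ring.
    """
    tokens = sorted(ring)
    return {
        node: (tokens[i - 1][0], tok, dc)  # i == 0 wraps to the last token
        for i, (tok, node, dc) in enumerate(tokens)
    }

ranges = primary_ranges(ring)

# Running repair -pr only on DC1's nodes repairs these primary ranges...
repaired = sorted(n for n, (_, _, dc) in ranges.items() if dc == "DC1")
# ...and silently skips the primary ranges owned by DC2's nodes.
missed = sorted(n for n, (_, _, dc) in ranges.items() if dc == "DC2")

print(repaired)  # ['node1', 'node2']
print(missed)    # ['node4', 'node5']
```

With the DCs evenly interleaved as here, roughly data/number-of-datacenters goes unrepaired, matching the README's warning quoted at the top.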