[ https://issues.apache.org/jira/browse/CASSANDRA-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013260#comment-14013260 ]

Nick Bailey commented on CASSANDRA-7317:
----------------------------------------

Perhaps it is a difference in mental models, but let me try to explain my 
reasoning. In general, I think datacenters should be considered separate 
rings. Replication is configured separately for each datacenter, tokens are 
balanced separately in each datacenter (for non-vnodes), and consistency levels 
can be specified with datacenter-specific requirements.
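
To make that concrete, here is a rough Python sketch of the "separate rings" 
view (purely illustrative, using the tokens from the ring output quoted below; 
it is not anything Cassandra does internally):

{noformat}
# Illustration only: the four tokens from the example cluster, viewed either
# as one global ring or as one ring per datacenter.
nodes = {
    "127.0.0.1": ("dc1", -9223372036854775808),
    "127.0.0.2": ("dc1", -10),
    "127.0.0.3": ("dc2", 0),
    "127.0.0.4": ("dc2", -9223372036854775798),
}

# One global ring: every token participates.
global_ring = sorted(token for _, token in nodes.values())

# "Separate rings" view: each datacenter only sees its own tokens.
per_dc_rings = {}
for addr, (dc, token) in nodes.items():
    per_dc_rings.setdefault(dc, []).append(token)
for dc in per_dc_rings:
    per_dc_rings[dc].sort()

print(global_ring)   # [-9223372036854775808, -9223372036854775798, -10, 0]
print(per_dc_rings)  # {'dc1': [-9223372036854775808, -10], 'dc2': [-9223372036854775798, 0]}
{noformat}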

To further illustrate my point, Cassandra agrees with me when you modify the 
schema of the system_traces keyspace to only exist in dc1:

{noformat}
[Nicks-MacBook-Pro:22:26:41 cassandra-2.0] cassandra$ bin/nodetool -p 7100 repair -pr system_traces
[2014-05-29 22:26:46,958] Starting repair command #3, repairing 2 ranges for keyspace system_traces
[2014-05-29 22:26:47,148] Repair session 3ae1cce0-e7aa-11e3-aaee-5f8011daec21 for range (0,-9223372036854775808] finished
[2014-05-29 22:26:47,149] Repair session 3afc80d0-e7aa-11e3-aaee-5f8011daec21 for range (-10,0] finished
[2014-05-29 22:26:47,149] Repair command #3 finished
[Nicks-MacBook-Pro:22:26:47 cassandra-2.0] cassandra$ bin/nodetool -p 7300 repair -pr system_traces
[2014-05-29 22:26:54,907] Nothing to repair for keyspace 'system_traces'
[Nicks-MacBook-Pro:22:34:55 cassandra-2.0] cassandra$ bin/nodetool -p 7100 repair -st -9223372036854775808 -et 10 system_traces
[2014-05-29 22:35:02,604] Starting repair command #6, repairing 1 ranges for keyspace system_traces
[2014-05-29 22:35:02,604] Starting repair command #6, repairing 1 ranges for keyspace system_traces
[2014-05-29 22:35:02,604] Repair command #6 finished
{noformat}

Repairing the 'primary range' of node1 actually repairs two ranges (although 
those two ranges are really just one). Repairing the primary range of node3 
does nothing. And asking C* to repair the entire range that it just repaired as 
two separate ranges still fails.
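
For what it's worth, here is a rough Python sketch of that arithmetic (purely 
illustrative, under the "dc1 as its own ring" assumption; it is not the actual 
repair code):

{noformat}
# Sketch only: "primary range" = (predecessor token, own token] on a ring.
# Tokens are taken from the nodetool ring output quoted below.
MIN_TOKEN = -2**63          # Murmur3Partitioner minimum token

def primary_range(token, ring_tokens):
    ring = sorted(ring_tokens)
    predecessor = ring[ring.index(token) - 1]   # index -1 wraps to the last token
    return (predecessor, token)

dc1_ring = [MIN_TOKEN, -10]                     # 127.0.0.1 and 127.0.0.2 only
print(primary_range(MIN_TOKEN, dc1_ring))       # (-10, -9223372036854775808)

# That single span (-10, -9223372036854775808] is what the log above reports
# as the two ranges (0, -9223372036854775808] and (-10, 0], split at dc2's
# token 0; stitched together they are the same range.
{noformat}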

> Repair range validation and calculation is off
> ----------------------------------------------
>
>                 Key: CASSANDRA-7317
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7317
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nick Bailey
>             Fix For: 2.0.9
>
>         Attachments: Untitled Diagram(1).png
>
>
> From what I can tell, the calculation (using the -pr option) and validation of 
> tokens for repairing ranges is broken, or at least should be improved. Using 
> an example with ccm:
> Nodetool ring:
> {noformat}
> Datacenter: dc1
> ==========
> Address    Rack        Status State   Load            Owns                Token
>                                                                           -10
> 127.0.0.1  r1          Up     Normal  188.96 KB       50.00%              -9223372036854775808
> 127.0.0.2  r1          Up     Normal  194.77 KB       50.00%              -10
> Datacenter: dc2
> ==========
> Address    Rack        Status State   Load            Owns                Token
>                                                                           0
> 127.0.0.4  r1          Up     Normal  160.58 KB       0.00%               -9223372036854775798
> 127.0.0.3  r1          Up     Normal  139.46 KB       0.00%               0
> {noformat}
> Schema:
> {noformat}
> CREATE KEYSPACE system_traces WITH replication = {
>   'class': 'NetworkTopologyStrategy',
>   'dc2': '2',
>   'dc1': '2'
> };
> {noformat}
> Repair -pr:
> {noformat}
> [Nicks-MacBook-Pro:21:35:58 cassandra-2.0] cassandra$ bin/nodetool -p 7100 repair -pr system_traces
> [2014-05-28 21:36:01,977] Starting repair command #12, repairing 1 ranges for keyspace system_traces
> [2014-05-28 21:36:02,207] Repair session f984d290-e6d9-11e3-9edc-5f8011daec21 for range (0,-9223372036854775808] finished
> [2014-05-28 21:36:02,207] Repair command #12 finished
> [Nicks-MacBook-Pro:21:36:02 cassandra-2.0] cassandra$ bin/nodetool -p 7200 repair -pr system_traces
> [2014-05-28 21:36:14,086] Starting repair command #1, repairing 1 ranges for keyspace system_traces
> [2014-05-28 21:36:14,406] Repair session 00bd45b0-e6da-11e3-98fc-5f8011daec21 for range (-9223372036854775798,-10] finished
> [2014-05-28 21:36:14,406] Repair command #1 finished
> {noformat}
> Note that repairing both nodes in dc1 leaves very small ranges unrepaired, 
> for example (-10,0]. Repairing the 'primary range' in dc2 will repair those 
> small ranges. Maybe that is the behavior we want, but it seems 
> counterintuitive.
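> A small Python sketch of where those leftover ranges come from (illustration 
> only, using the tokens from the ring output above, not the actual -pr code):
> {noformat}
> # Global primary range of each node: (predecessor token, own token].
> tokens = {
>     "127.0.0.1": -9223372036854775808,   # dc1
>     "127.0.0.4": -9223372036854775798,   # dc2
>     "127.0.0.2": -10,                    # dc1
>     "127.0.0.3": 0,                      # dc2
> }
> ring = sorted(tokens.values())
>
> for addr, tok in tokens.items():
>     predecessor = ring[ring.index(tok) - 1]   # wraps around for the smallest token
>     print(addr, (predecessor, tok))
>
> # Printed ranges:
> # 127.0.0.1 -> (0, -9223372036854775808]                     repaired by -pr on node1
> # 127.0.0.4 -> (-9223372036854775808, -9223372036854775798]  never repaired if only dc1 runs -pr
> # 127.0.0.2 -> (-9223372036854775798, -10]                   repaired by -pr on node2
> # 127.0.0.3 -> (-10, 0]                                      never repaired if only dc1 runs -pr
> {noformat}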
> The behavior when manually trying to repair the full range of 127.0.0.1 
> definitely needs improvement, though.
> Repair command:
> {noformat}
> [Nicks-MacBook-Pro:21:50:44 cassandra-2.0] cassandra$ bin/nodetool -p 7100 repair -st -10 -et -9223372036854775808 system_traces
> [2014-05-28 21:50:55,803] Starting repair command #17, repairing 1 ranges for keyspace system_traces
> [2014-05-28 21:50:55,804] Starting repair command #17, repairing 1 ranges for keyspace system_traces
> [2014-05-28 21:50:55,804] Repair command #17 finished
> [Nicks-MacBook-Pro:21:50:56 cassandra-2.0] cassandra$ echo $?
> 1
> {noformat}
> system.log:
> {noformat}
> ERROR [Thread-96] 2014-05-28 21:40:05,921 StorageService.java (line 2621) Repair session failed:
> java.lang.IllegalArgumentException: Requested range intersects a local range but is not fully contained in one; this would lead to imprecise repair
> {noformat}
> * The actual output of the repair command doesn't really indicate that there 
> was an issue, although the command does return a non-zero exit status.
> * The error here is invisible if you are using the synchronous JMX repair 
> API; it will appear as though the repair completed successfully.
> * Personally, I believe that should be a valid repair command. For the 
> system_traces keyspace, 127.0.0.1 is responsible for this range (and I would 
> argue it is the 'primary range' of the node); see the sketch below.
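> A hedged sketch of the kind of containment check the error message above 
> describes (it is not the actual StorageService code; wrap-around is 
> simplified by treating a range that ends at the minimum token as running to 
> +infinity):
> {noformat}
> INF = float("inf")
>
> def contained_in_one(requested, local_ranges):
>     # The requested (start, end] range must fit inside a SINGLE local range;
>     # being covered by the union of two local ranges is not enough.
>     rs, re = requested
>     return any(ls <= rs and re <= le for ls, le in local_ranges)
>
> local_ranges = [(0, INF), (-10, 0)]   # two of 127.0.0.1's local ranges, from the ring above
> requested    = (-10, INF)             # -st -10 -et -9223372036854775808
>
> print(contained_in_one(requested, local_ranges))   # False -> repair rejected
> {noformat}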


