Nick Bailey created CASSANDRA-7317:
--------------------------------------
Summary: Repair range validation and calculation is off
Key: CASSANDRA-7317
URL: https://issues.apache.org/jira/browse/CASSANDRA-7317
Project: Cassandra
Issue Type: Bug
Reporter: Nick Bailey
Fix For: 1.2.17, 2.0.8, 2.1 rc1
From what I can tell, the calculation (using the -pr option) and validation of
tokens for repairing ranges are broken, or at least should be improved. Using
an example with ccm:
Nodetool ring:
{noformat}
Datacenter: dc1
==========
Address    Rack  Status  State   Load       Owns    Token
                                                    -10
127.0.0.1  r1    Up      Normal  188.96 KB  50.00%  -9223372036854775808
127.0.0.2  r1    Up      Normal  194.77 KB  50.00%  -10
Datacenter: dc2
==========
Address    Rack  Status  State   Load       Owns    Token
                                                    0
127.0.0.4  r1    Up      Normal  160.58 KB  0.00%   -9223372036854775798
127.0.0.3  r1    Up      Normal  139.46 KB  0.00%   0
{noformat}
Schema:
{noformat}
CREATE KEYSPACE system_traces WITH replication = {
'class': 'NetworkTopologyStrategy',
'dc2': '2',
'dc1': '2'
};
{noformat}
Repair -pr:
{noformat}
[Nicks-MacBook-Pro:21:35:58 cassandra-2.0] cassandra$ bin/nodetool -p 7100 repair -pr system_traces
[2014-05-28 21:36:01,977] Starting repair command #12, repairing 1 ranges for keyspace system_traces
[2014-05-28 21:36:02,207] Repair session f984d290-e6d9-11e3-9edc-5f8011daec21 for range (0,-9223372036854775808] finished
[2014-05-28 21:36:02,207] Repair command #12 finished
[Nicks-MacBook-Pro:21:36:02 cassandra-2.0] cassandra$ bin/nodetool -p 7200 repair -pr system_traces
[2014-05-28 21:36:14,086] Starting repair command #1, repairing 1 ranges for keyspace system_traces
[2014-05-28 21:36:14,406] Repair session 00bd45b0-e6da-11e3-98fc-5f8011daec21 for range (-9223372036854775798,-10] finished
[2014-05-28 21:36:14,406] Repair command #1 finished
{noformat}
Note that repairing both nodes in dc1 leaves very small ranges unrepaired, for
example (-10,0]. Repairing the 'primary range' in dc2 will repair those small
ranges. Maybe that is the behavior we want, but it seems counterintuitive.
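To make the gap concrete, here is a minimal Python sketch that reproduces the
ranges seen in the repair output above. It assumes that -pr assigns each node
the range (previous token, own token] on the single ring formed by the tokens
of both datacenters. That assumption is inferred from the log output, not taken
from the Cassandra source, and the function name primary_ranges is made up for
this illustration.

```python
def primary_ranges(tokens_by_node):
    """Map each node to (previous_token, own_token] on the global ring.

    Assumption: all datacenters share one ring, and a node's primary
    range ends at its own token and starts at the preceding token.
    """
    ring = sorted((token, node) for node, token in tokens_by_node.items())
    ranges = {}
    for i, (token, node) in enumerate(ring):
        prev_token = ring[i - 1][0]  # for i == 0 this wraps to the last token
        ranges[node] = (prev_token, token)
    return ranges


# Tokens copied from the nodetool ring output above.
ring = {
    "127.0.0.1": -9223372036854775808,
    "127.0.0.2": -10,
    "127.0.0.3": 0,
    "127.0.0.4": -9223372036854775798,
}

ranges = primary_ranges(ring)
dc1 = {"127.0.0.1", "127.0.0.2"}

# Primary ranges that belong to dc2 nodes, so "repair -pr" on dc1 skips them.
unrepaired = sorted(r for node, r in ranges.items() if node not in dc1)

for node in sorted(ranges):
    left, right = ranges[node]
    print(f"{node}: ({left},{right}]")
print("not covered by dc1 -pr repairs:", unrepaired)
```

Under that assumption, 127.0.0.1 and 127.0.0.2 get the two ranges shown in the
repair sessions above, and the two remaining ranges are primary only for the
dc2 nodes.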
The behavior when manually trying to repair the full range of 127.0.0.1
definitely needs improvement, though.
Repair command:
{noformat}
[Nicks-MacBook-Pro:21:50:44 cassandra-2.0] cassandra$ bin/nodetool -p 7100 repair -st -10 -et -9223372036854775808 system_traces
[2014-05-28 21:50:55,803] Starting repair command #17, repairing 1 ranges for keyspace system_traces
[2014-05-28 21:50:55,804] Starting repair command #17, repairing 1 ranges for keyspace system_traces
[2014-05-28 21:50:55,804] Repair command #17 finished
[Nicks-MacBook-Pro:21:50:56 cassandra-2.0] cassandra$ echo $?
1
{noformat}
system.log:
{noformat}
ERROR [Thread-96] 2014-05-28 21:40:05,921 StorageService.java (line 2621) Repair session failed:
java.lang.IllegalArgumentException: Requested range intersects a local range but is not fully contained in one; this would lead to imprecise repair
{noformat}
* The actual output of the repair command doesn't really indicate that there
was an issue, although the command does return a nonzero exit status.
* The error here is invisible if you are using the synchronous JMX repair API.
It will appear as though the repair completed successfully.
* Personally, I believe that should be a valid repair command. For the
system_traces keyspace, 127.0.0.1 is responsible for this range (and, I would
argue, it is the 'primary range' of the node).
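
The rejection can be modeled with a short Python sketch. This is a guess at the
shape of the check based only on the error message, not the actual Cassandra
code. It assumes the local ranges of 127.0.0.1 for system_traces are the four
adjacent ring intervals between the tokens listed above, and that a requested
range must sit entirely inside one of them. The helpers unwrap, contains,
intersects and validate are hypothetical names.

```python
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def unwrap(left, right):
    """Split the ring range (left, right] into closed, non-wrapping pieces."""
    if left == right:
        return [(MIN_TOKEN, MAX_TOKEN)]  # treated as the full ring
    if left < right:
        return [(left + 1, right)]
    pieces = [(MIN_TOKEN, right)]
    if left < MAX_TOKEN:
        pieces.append((left + 1, MAX_TOKEN))
    return pieces


def contains(outer, inner):
    """True if every piece of inner lies inside a single piece of outer."""
    return all(
        any(lo <= a and b <= hi for lo, hi in unwrap(*outer))
        for a, b in unwrap(*inner)
    )


def intersects(x, y):
    """True if the two ring ranges share at least one token."""
    return any(
        a <= hi and lo <= b
        for a, b in unwrap(*x)
        for lo, hi in unwrap(*y)
    )


def validate(requested, local_ranges):
    """Return the local range containing requested, or raise on a partial overlap."""
    for local in local_ranges:
        if contains(local, requested):
            return local
        if intersects(local, requested):
            raise ValueError(
                "Requested range intersects a local range "
                "but is not fully contained in one"
            )
    return None


# Assumed local ranges: the four intervals between adjacent ring tokens.
local_ranges = [
    (0, MIN_TOKEN),
    (MIN_TOKEN, MIN_TOKEN + 10),
    (MIN_TOKEN + 10, -10),
    (-10, 0),
]

try:
    validate((-10, MIN_TOKEN), local_ranges)  # the -st -10 -et ... request
    outcome = "accepted"
except ValueError as err:
    outcome = f"rejected: {err}"
print(outcome)
```

In this model the requested range (-10,-9223372036854775808] spans (-10,0] and
(0,-9223372036854775808], so no single local range contains it and it is
rejected, while the range chosen by -pr passes.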