[
https://issues.apache.org/jira/browse/CASSANDRA-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635389#comment-13635389
]
Ryan McGuire commented on CASSANDRA-5178:
-----------------------------------------
I'm trying to simplify this process a bit from what you've described; so far I
have not been able to reproduce this behaviour on 1.1.7. Here's my process:
Bring up a 4-node cluster with two datacenters:
{code}
Address        DC   Rack  Status  State   Load      Owns    Token
                                                            85070591730234615865843651857942052964
192.168.1.141  dc1  r1    Up      Normal  11.13 KB  50.00%  0
192.168.1.145  dc2  r1    Up      Normal  11.1 KB   0.00%   100
192.168.1.143  dc1  r1    Up      Normal  11.11 KB  50.00%  85070591730234615865843651857942052864
192.168.1.133  dc2  r1    Up      Normal  11.1 KB   0.00%   85070591730234615865843651857942052964
{code}
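For reference, a minimal sketch of the per-node settings that would produce a
layout like the one above (assuming PropertyFileSnitch; the IPs and tokens are
taken from the ring output, everything else is illustrative):
{code}
# cassandra-topology.properties (assuming PropertyFileSnitch)
192.168.1.141=dc1:r1
192.168.1.143=dc1:r1
192.168.1.145=dc2:r1
192.168.1.133=dc2:r1
default=dc1:r1

# cassandra.yaml, per node:
#   endpoint_snitch: PropertyFileSnitch
#   initial_token: 0                                        (192.168.1.141)
#   initial_token: 85070591730234615865843651857942052864   (192.168.1.143)
#   initial_token: 100                                      (192.168.1.145)
#   initial_token: 85070591730234615865843651857942052964   (192.168.1.133)
{code}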
Manually shut down dc2.
{code}
Address        DC   Rack  Status  State   Load      Owns    Token
                                                            85070591730234615865843651857942052964
192.168.1.141  dc1  r1    Up      Normal  11.13 KB  50.00%  0
192.168.1.145  dc2  r1    Down    Normal  15.53 KB  0.00%   100
192.168.1.143  dc1  r1    Up      Normal  15.88 KB  50.00%  85070591730234615865843651857942052864
192.168.1.133  dc2  r1    Down    Normal  15.53 KB  0.00%   85070591730234615865843651857942052964
{code}
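(A sketch of one way to stop the dc2 nodes; the exact mechanism doesn't matter,
and the pkill pattern is just an assumption about how the daemon was started:)
{code}
# on each dc2 node (192.168.1.145, 192.168.1.133)
pkill -f CassandraDaemon
{code}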
Create schema:
{code}
CREATE KEYSPACE ryan WITH strategy_class = 'NetworkTopologyStrategy' AND
strategy_options:dc1 = '2';
CREATE TABLE ryan.test (n int primary key, x int);
{code}
Create data to import:
{code}
seq 500000 | sed 's/$/,1/' | split -l 250000 - data_
{code}
Write the first data set to dc1:
{code}
COPY ryan.test FROM 'data_aa';
{code}
Verify dc1 has all the data written:
{code}
SELECT count(*) FROM ryan.test limit 99999999;
count
--------
250000
{code}
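The per-datacenter counts here and below come from pointing cqlsh at a node in
the datacenter being checked, so a CL ONE read is served by replicas in that
DC. A sketch (exact cqlsh flags vary by version):
{code}
# count as seen from dc1 (coordinator in dc1)
echo "SELECT count(*) FROM ryan.test limit 99999999;" | cqlsh 192.168.1.141

# count as seen from dc2 (coordinator in dc2)
echo "SELECT count(*) FROM ryan.test limit 99999999;" | cqlsh 192.168.1.145
{code}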
Bring up dc2, then add it to the replication strategy:
{code}
ALTER KEYSPACE ryan WITH strategy_class = 'NetworkTopologyStrategy' AND
strategy_options:dc1 = '2' AND strategy_options:dc2 = '2';
{code}
Verify dc2 has no data written:
{code}
SELECT count(*) FROM ryan.test limit 99999999;
count
--------
0
{code}
Verify dc1 has all the data written:
{code}
SELECT count(*) FROM ryan.test limit 99999999;
count
--------
250000
{code}
Write the second data set to dc1 with local_quorum consistency:
{code}
COPY ryan.test FROM 'data_ab';
{code}
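The COPY line above doesn't show the consistency level; one way to get
LOCAL_QUORUM for the import is to set it for the cqlsh session first (where the
CONSISTENCY command is available; older CQL used per-statement USING
CONSISTENCY instead). A sketch of the session-level form:
{code}
CONSISTENCY LOCAL_QUORUM;
COPY ryan.test FROM 'data_ab';
{code}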
{code}
Address        DC   Rack  Status  State   Load      Effective-Ownership  Token
                                                                         85070591730234615865843651857942052964
192.168.1.141  dc1  r1    Up      Normal  12.39 MB  100.00%              0
192.168.1.145  dc2  r1    Up      Normal  6.33 MB   100.00%              100
192.168.1.143  dc1  r1    Up      Normal  12.72 MB  100.00%              85070591730234615865843651857942052864
192.168.1.133  dc2  r1    Up      Normal  6.33 MB   100.00%              85070591730234615865843651857942052964
{code}
Verify dc1 has all the data written:
{code}
SELECT count(*) FROM ryan.test limit 99999999;
count
--------
500000
{code}
Verify dc2 has only half the data written:
{code}
SELECT count(*) FROM ryan.test limit 99999999;
count
--------
250000
{code}
Run repair from dc1:
{code}
nodetool repair
{code}
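Concretely, against one of the dc1 nodes; something like the following
(repairing just the test keyspace, optionally with -pr on each node to limit
each run to that node's primary range):
{code}
nodetool -h 192.168.1.141 repair ryan
{code}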
{code}
Address        DC   Rack  Status  State   Load      Effective-Ownership  Token
                                                                         85070591730234615865843651857942052964
192.168.1.141  dc1  r1    Up      Normal  27.12 MB  100.00%              0
192.168.1.145  dc2  r1    Up      Normal  22.78 MB  100.00%              100
192.168.1.143  dc1  r1    Up      Normal  12.72 MB  100.00%              85070591730234615865843651857942052864
192.168.1.133  dc2  r1    Up      Normal  16.44 MB  100.00%              85070591730234615865843651857942052964
{code}
Verify that dc2 has all the data:
{code}
SELECT count(*) FROM ryan.test limit 99999999;
count
--------
500000
{code}
I'll try adding more nodes and settings to better approximate your setup.
> Sometimes repair process doesn't work properly
> ----------------------------------------------
>
> Key: CASSANDRA-5178
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5178
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 1.1.7
> Reporter: Vladimir Barinov
> Assignee: Ryan McGuire
> Priority: Minor
>
> Pre-conditions:
> 1. We have two separate datacenters, called "DC1" and "DC2" respectively.
> Each of them contains 6 nodes.
> 2. DC2 is disabled.
> 3. Tokens for DC1 are calculated via
> https://raw.github.com/riptano/ComboAMI/2.2/tokentoolv2.py. Tokens for DC2
> are the same as for DC1 but with an offset of +100, so token 0 in DC1
> corresponds to token 100 in DC2, and so on (see the sketch below the list).
> 4. We have a test data set (1 million keys).
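> (Illustrative sketch of how those tokens work out: evenly spaced
> RandomPartitioner tokens for 6 nodes, with DC2 shifted by +100.)
> {noformat}
> python -c 'print([i * 2**127 // 6 for i in range(6)])'        # DC1 tokens
> python -c 'print([i * 2**127 // 6 + 100 for i in range(6)])'  # DC2 tokens
> {noformat}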
> *Steps to reproduce:*
> *Step 1:*
> Let's check the current configuration.
> nodetool ring:
>
> {quote}
> {noformat}
> <ip>  DC1  RAC1  Up  Normal  44,53 KB  33,33%  0
> <ip>  DC1  RAC1  Up  Normal  51,8 KB   33,33%  28356863910078205288614550619314017621
> <ip>  DC1  RAC1  Up  Normal  21,82 KB  33,33%  56713727820156410577229101238628035242
> <ip>  DC1  RAC1  Up  Normal  21,82 KB  33,33%  85070591730234615865843651857942052864
> <ip>  DC1  RAC1  Up  Normal  51,8 KB   33,33%  113427455640312821154458202477256070485
> <ip>  DC1  RAC1  Up  Normal  21,82 KB  33,33%  141784319550391026443072753096570088106
> {noformat}
> {quote}
> *Current schema:*
> {quote}
> {noformat}
> create keyspace benchmarks
> with placement_strategy = 'NetworkTopologyStrategy'
> *and strategy_options = \{DC1 : 2};*
> use benchmarks;
> create column family test_family
> with compaction_strategy =
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
> ...
> and compaction_strategy_options = \{'sstable_size_in_mb' : '20'}
> and compression_options = \{'chunk_length_kb' : '32',
> 'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
> {noformat}
> {quote}
> *STEP 2:*
> Write the first part of the test data set (500 000 keys) to DC1 with the
> LOCAL_QUORUM consistency level.
> *STEP 3:*
> Update cassandra.yaml and cassandra-topology.properties with the new IPs from
> DC2, and update the current keyspace schema with *strategy_options = \{DC1 :
> 2, DC2 : 0};*
> *STEP 4:*
> Start all nodes from DC2.
> Check that nodes are started successfully:
> {quote}
> {noformat}
> <ip>  DC1  RAC1  Up  Normal  11,4 MB   33,33%  0
> <ip>  DC2  RAC2  Up  Normal  27,7 KB   0,00%   100
> <ip>  DC1  RAC1  Up  Normal  11,34 MB  33,33%  28356863910078205288614550619314017621
> <ip>  DC2  RAC2  Up  Normal  42,69 KB  0,00%   28356863910078205288614550619314017721
> <ip>  DC1  RAC1  Up  Normal  11,37 MB  33,33%  56713727820156410577229101238628035242
> <ip>  DC2  RAC2  Up  Normal  52,02 KB  0,00%   56713727820156410577229101238628035342
> <ip>  DC1  RAC1  Up  Normal  11,4 MB   33,33%  85070591730234615865843651857942052864
> <ip>  DC2  RAC2  Up  Normal  42,69 KB  0,00%   85070591730234615865843651857942052964
> <ip>  DC1  RAC1  Up  Normal  11,43 MB  33,33%  113427455640312821154458202477256070485
> <ip>  DC2  RAC2  Up  Normal  42,69 KB  0,00%   113427455640312821154458202477256070585
> <ip>  DC1  RAC1  Up  Normal  11,39 MB  33,33%  141784319550391026443072753096570088106
> <ip>  DC2  RAC2  Up  Normal  42,69 KB  0,00%   141784319550391026443072753096570088206
> {noformat}
> {quote}
> *STEP 5:*
> Update keyspace schema with *strategy_options = \{DC1 : 2, DC2 : 2};*
> *STEP 6:*
> Write the last 500 000 keys of the test data set to DC1 with the
> *LOCAL_QUORUM* consistency level.
> *STEP 7:*
> Check that the first part of the test data set (the first 500 000 keys) was
> written correctly to DC1.
> Check that the last part of the test data set (the last 500 000 keys) was
> written correctly to both datacenters.
> *STEP 8:*
> Run *nodetool repair* on each node of DC2 and wait until it completes.
> *STEP 9:*
> Current nodetool ring:
> {quote}
> {noformat}
> <ip>  DC1  RAC1  Up  Normal  21,45 MB  33,33%  0
> <ip>  DC2  RAC2  Up  Normal  23,5 MB   33,33%  100
> <ip>  DC1  RAC1  Up  Normal  20,67 MB  33,33%  28356863910078205288614550619314017621
> <ip>  DC2  RAC2  Up  Normal  23,55 MB  33,33%  28356863910078205288614550619314017721
> <ip>  DC1  RAC1  Up  Normal  21,18 MB  33,33%  56713727820156410577229101238628035242
> <ip>  DC2  RAC2  Up  Normal  23,5 MB   33,33%  56713727820156410577229101238628035342
> <ip>  DC1  RAC1  Up  Normal  23,5 MB   33,33%  85070591730234615865843651857942052864
> <ip>  DC2  RAC2  Up  Normal  23,55 MB  33,33%  85070591730234615865843651857942052964
> <ip>  DC1  RAC1  Up  Normal  21,44 MB  33,33%  113427455640312821154458202477256070485
> <ip>  DC2  RAC2  Up  Normal  23,46 MB  33,33%  113427455640312821154458202477256070585
> <ip>  DC1  RAC1  Up  Normal  20,53 MB  33,33%  141784319550391026443072753096570088106
> <ip>  DC2  RAC2  Up  Normal  23,55 MB  33,33%  141784319550391026443072753096570088206
> {noformat}
> {quote}
> Check that the full test data set has been written to both datacenters.
> Result:
> The full test data set was successfully written to DC1, but *24448* keys are
> not present on DC2.
> Repeating *nodetool repair* doesn't help.
> Conclusion:
> It seems that the problem is related to the process of identifying which keys
> must be repaired when the target datacenter already has some keys.
> If we start the empty DC2 nodes after DC1 has received all 1 000 000 keys,
> *nodetool repair* works fine, with no missing keys.