Re: Unbalanced ring mystery multi-DC issue with 1.1.11

2013-10-01 Thread Aaron Morton
Check the logs for messages about nodes going up and down, and also look at the
MessagingService MBean for timeouts. If a node in DC2 times out replying to
DC1, the DC1 node will store a hint.
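
As far as I know nodetool does not print the per-host timeout counters (those
live on the MessagingService MBean, JMX only), but the related dropped-message
counters are easy to poll from a script. A rough sketch, assuming nodetool is
on the PATH and with a placeholder host name:

    import subprocess

    # Rough sketch: print the per-verb "Dropped" counters that nodetool
    # tpstats reports after the thread pool table. Host is a placeholder.
    HOST = "dc1.5"

    out = subprocess.check_output(["nodetool", "-h", HOST, "tpstats"])
    for line in out.decode().splitlines():
        # 1.1-era tpstats ends with lines like "MUTATION        1234",
        # one per message verb.
        if line.startswith(("MUTATION", "READ", "REQUEST_RESPONSE", "RANGE_SLICE")):
            print("%s %s" % (HOST, line))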

Also, when hints are stored they are TTL'd to the gc_grace_seconds for the CF
(IIRC). If that's low, the hints may have expired before they could be delivered.
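
You can read that window straight off the schema. A pycassa sketch (server and
keyspace names below are placeholders):

    from pycassa.system_manager import SystemManager

    # Sketch: print gc_grace_seconds for every CF in the keyspace, since
    # stored hints are TTL'd to this value (IIRC). Names are placeholders.
    sys_mgr = SystemManager("dc1.5:9160")
    for name, cf_def in sys_mgr.get_keyspace_column_families("MyKeySpace").items():
        print("%s %d" % (name, cf_def.gc_grace_seconds))
    sys_mgr.close()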

I'm not aware of any specific tracking for failed hints other than log messages.

A

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 28/09/2013, at 12:01 AM, Oleg Dulin oleg.du...@gmail.com wrote:

 Here is some more information.
 
 I am running a full repair on one of the nodes and I am observing strange
 behavior.
 
 Both DCs were up during the data load, but repair is reporting a lot of
 out-of-sync data. Why would that be? Is there a way for me to tell that the
 WAN may be dropping hinted handoff traffic?
 
 Regards,
 Oleg
 



Unbalanced ring mystery multi-DC issue with 1.1.11

2013-09-27 Thread Oleg Dulin

Consider this output from nodetool ring:

Address  DC   Rack  Status  State   Load      Effective-Ownership  Token
                                                                   127605887595351923798765477786913079396
dc1.5    DC1  RAC1  Up      Normal  32.07 GB  50.00%               0
dc2.100  DC2  RAC1  Up      Normal  8.21 GB   50.00%               100
dc1.6    DC1  RAC1  Up      Normal  32.82 GB  50.00%               42535295865117307932921825928971026432
dc2.101  DC2  RAC1  Up      Normal  12.41 GB  50.00%               42535295865117307932921825928971026532
dc1.7    DC1  RAC1  Up      Normal  28.37 GB  50.00%               85070591730234615865843651857942052864
dc2.102  DC2  RAC1  Up      Normal  12.27 GB  50.00%               85070591730234615865843651857942052964
dc1.8    DC1  RAC1  Up      Normal  27.34 GB  50.00%               127605887595351923798765477786913079296
dc2.103  DC2  RAC1  Up      Normal  13.46 GB  50.00%               127605887595351923798765477786913079396


I concealed IPs and DC names for confidentiality.

All of the data loading was happening against DC1 at a pretty brisk rate of,
say, 200K writes per minute.


Note how my DC2 tokens are offset by 100 from the DC1 tokens. Shouldn't that
mean that the load on each node should be roughly identical? In DC1 it is
roughly 30 GB per node. In DC2 each node holds almost a third of what the
nearest DC1 node by token range holds.


To verify that the nodes are in sync, I ran nodetool -h localhost repair
MyKeySpace --partitioner-range on each node in DC2. Watching the logs, I see
that the repair went really quickly and all column families are in sync!
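
For reference, here is a quick check of how much of the ring those
--partitioner-range runs actually covered, assuming a node's primary range is
just the span back to its predecessor's token on the global ring (tokens
copied from the nodetool ring output above):

    # Sketch: compute each node's primary-range span from the ring tokens.
    # RandomPartitioner token space is 0 .. 2**127.
    RING = 2 ** 127

    tokens = {
        "dc1.5": 0,
        "dc2.100": 100,
        "dc1.6": 42535295865117307932921825928971026432,
        "dc2.101": 42535295865117307932921825928971026532,
        "dc1.7": 85070591730234615865843651857942052864,
        "dc2.102": 85070591730234615865843651857942052964,
        "dc1.8": 127605887595351923798765477786913079296,
        "dc2.103": 127605887595351923798765477786913079396,
    }

    ordered = sorted(tokens.items(), key=lambda kv: kv[1])
    predecessors = [ordered[-1]] + ordered[:-1]
    for (node, tok), (_, prev) in zip(ordered, predecessors):
        span = (tok - prev) % RING
        print("%-8s primary range: %d tokens (%.6f%% of ring)"
              % (node, span, 100.0 * span / RING))

If that assumption holds, each DC2 node's primary range is only 100 tokens
wide, so a --partitioner-range repair run only on the DC2 nodes would compare
almost none of the data -- which would at least be consistent with how quickly
it finished.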


I need help making sense of this. Is this because DC1 is not fully compacted?
Is it because DC2 is not fully synced and I am not checking it correctly? How
can I tell whether replication is still in progress? (Note: I started my load
yesterday at 9:50am.)


--
Regards,
Oleg Dulin
http://www.olegdulin.com




Re: Unbalanced ring mystery multi-DC issue with 1.1.11

2013-09-27 Thread Oleg Dulin

Wanted to add one more thing:

I can also tell that the numbers are not consistent across the DCs this way
-- I have a column family with really wide rows (a couple of million
columns).


DC1 reports higher column counts than DC2. DC2 only becomes consistent after
I run the count a couple of times and trigger a read repair. But why would
the nodetool repair logs show that everything is in sync?
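
A deterministic way to compare the two sides, rather than re-running the count
until read repair kicks in, would be to count at a consistency level that has
to consult both DCs. A rough pycassa sketch -- the host, keyspace, CF, and row
key below are placeholders, not my real names:

    import pycassa
    from pycassa import ConsistencyLevel

    # Sketch: count columns in one wide row at CL ALL, which must consult
    # replicas in both DCs before answering. All names are placeholders.
    pool = pycassa.ConnectionPool("MyKeySpace", server_list=["dc1.5:9160"])
    cf = pycassa.ColumnFamily(pool, "WideRowCF")

    count = cf.get_count("some_wide_row_key",
                         read_consistency_level=ConsistencyLevel.ALL)
    print("columns: %d" % count)
    pool.dispose()

    # Note: on a row with a couple of million columns this count can be
    # slow or time out; counting a bounded column slice is a gentler variant.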


Regards,
Oleg



Re: Unbalanced ring mystery multi-DC issue with 1.1.11

2013-09-27 Thread Oleg Dulin

Here is some more information.

I am running a full repair on one of the nodes and I am observing strange
behavior.


Both DCs were up during the data load, but repair is reporting a lot of
out-of-sync data. Why would that be? Is there a way for me to tell that the
WAN may be dropping hinted handoff traffic?


Regards,
Oleg
