Re: Unbalanced ring mystery multi-DC issue with 1.1.11

2013-10-01 Thread Aaron Morton
Check the logs for messages about nodes going up and down, and also look at the MessagingService MBean for timeouts. If the node in DR2 times out replying to DR1, the DR1 node will store a hint. Also, when hints are stored they are TTL'd to the gc_grace_seconds for the CF (IIRC). If that's low
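
A rough sketch of the log check suggested above, in Python. The message wording it matches ("InetAddress ... is now UP" / "is now dead" / "is now DOWN") is an assumption about what the Gossiper prints and may differ between versions; `count_state_changes` and `STATE_PATTERN` are hypothetical names, not part of any Cassandra tooling.

```python
import re
from collections import Counter

# Assumed message shapes; adjust the pattern to whatever your system.log prints.
STATE_PATTERN = re.compile(
    r"InetAddress\s+(?P<addr>\S+)\s+is now\s+(?P<state>UP|DOWN|dead)",
    re.IGNORECASE,
)


def count_state_changes(lines):
    """Count up/down gossip messages per node address.

    Returns a Counter keyed by (address, "UP" or "DOWN").
    """
    counts = Counter()
    for line in lines:
        match = STATE_PATTERN.search(line)
        if match is None:
            continue
        # Treat both "dead" and "DOWN" as a down transition.
        state = "UP" if match.group("state").upper() == "UP" else "DOWN"
        counts[(match.group("addr"), state)] += 1
    return counts


# Usage (the log path is an assumption, point it at your own log file):
# with open("/var/log/cassandra/system.log") as log:
#     for (addr, state), n in sorted(count_state_changes(log).items()):
#         print(addr, state, n)
```

Many down/up pairs for nodes in the remote data centre around the time of the data load would be consistent with hints having been stored, and possibly expired, during that window.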

Re: Unbalanced ring mystery multi-DC issue with 1.1.11

2013-09-27 Thread Oleg Dulin
Here is some more information. I am running a full repair on one of the nodes and I am observing strange behavior. Both DCs were up during the data load, but repair is reporting a lot of out-of-sync data. Why would that be? Is there a way for me to tell that WAN may be dropping hinted handoff

Re: Unbalanced ring mystery multi-DC issue with 1.1.11

2013-09-27 Thread Oleg Dulin
Wanted to add one more thing: I can also tell that the numbers are not consistent across DRs this way -- I have a column family with really wide rows (a couple million columns). DC1 reports higher column counts than DC2. DC2 only becomes consistent after I do the command a couple of times an

Unbalanced ring mystery multi-DC issue with 1.1.11

2013-09-27 Thread Oleg Dulin
Consider this output from nodetool ring: Address DC Rack Status State Load Effective-Ownership Token 127605887595351923798765477786913079396 dc1.5 DC1 RAC1