Check the logs for messages about nodes going up and down, and also look at the
MessagingService MBean for timeouts. If the node in DR2 times out replying to
DR1, the DR1 node will store a hint.
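
One rough way to do the log check is to count up/down transitions per node. The sketch below is only an illustration. It assumes the gossip state changes appear in system.log as lines containing "InetAddress /<ip> is now UP" or "is now DOWN" (older versions say "is now dead"), and the function name is made up for this example. Adjust the pattern to whatever your version actually logs.

```python
import re
from collections import Counter

# Assumed wording of the gossip state-change lines; adjust to match your logs.
STATE_CHANGE = re.compile(r"InetAddress (\S+) is now (UP|DOWN|dead)", re.IGNORECASE)


def count_state_changes(lines):
    """Count UP and DOWN transitions per node address (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = STATE_CHANGE.search(line)
        if match is None:
            continue
        address = match.group(1).lstrip("/")
        # Treat the older "dead" wording as DOWN.
        state = "UP" if match.group(2).upper() == "UP" else "DOWN"
        counts[(address, state)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_flaps.py /path/to/system.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for (address, state), n in sorted(count_state_changes(log).items()):
            print(address, state, n)
```

A node in the other DC that shows many DOWN/UP pairs during the load window is a likely candidate for having had hints stored for it.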
Also, when hints are stored they are TTL'd to the gc_grace_seconds for the CF
(IIRC). If that's low
Here is some more information.
I am running full repair on one of the nodes and I am observing strange
behavior.
Both DCs were up during the data load, but repair is reporting a lot of
out-of-sync data. Why would that be? Is there a way for me to tell
that the WAN may be dropping hinted handoff
Wanted to add one more thing:
I can also tell that the numbers are not consistent across DCs another way:
I have a column family with really wide rows (a couple of million
columns).
DC1 reports higher column counts than DC2. DC2 only becomes consistent
after I do the command a couple of times an
Consider this output from nodetool ring:
Address         DC          Rack        Status State   Load            Effective-Ownership Token
                                                                                           127605887595351923798765477786913079396
dc1.5           DC1         RAC1