My findings are below; it would be nice if somebody could please verify them. It is critical for our evaluation to verify that HintedHandOff, ReadRepair and AntiEntropy work as we think they do.

Setup: Node 1, 2, 3, 4 with RF=3

All nodes up - Node 2 is responsible for key 1005
  Write CL=ONE: insert key 1005, value=123 in Node 1
    Node 2, 3, 4 get the data
  Read CL=ONE
    Read on all 4 nodes gets value 123

Node 2 is down now
  Write CL=ONE: insert key 1005, value=A123 in Node 1
    Node 3, 4 get the data
    Node 4 adds a Hint for Node 2
  Read CL=ONE
    Read on all 3 remaining nodes (Node 1, 3, 4) gets value A123

Node 2 is brought back up
  Hint for 1 row is handed off by Node 4
  Read CL=ONE
    Read on Node 1, 2 gets the old value 123. See log entries:
      Received responses in DataRepairHandler : ID : 72 FROM:/Node 3 TYPE:RESPONSE_STAGE VERB:READ_RESPONSE
      Received responses in DataRepairHandler : ID : 72 FROM:/Node 4 TYPE:RESPONSE_STAGE VERB:READ_RESPONSE
  Read CL=ONE
    Read on Node 3, 4 gets the new value A123

Most of the time, after a read on Node 3 or 4, Node 1 and 2 start showing the latest value A123 (the one written while Node 2 was down).

My expectation was that even though Node 2 was down, the key written to Node 3 and 4 would be updated on Node 2 using the Hint, and that subsequent reads on Node 1 or on Node 2 itself would return the latest value.

On Mon, Nov 1, 2010 at 4:06 PM, Joe Alex <joe.m.a...@gmail.com> wrote:
> To keep the question simple,
> If an insert or remove Key happens when the responsible Node is down
> (RF=3) what is the expected behavior when the Node comes back up ?
>
> For example Key 1005 was removed when Node 2 was down. When Node 2
> came back up it started showing back ?
>
> On Mon, Nov 1, 2010 at 2:22 PM, Joe Alex <joe.m.a...@gmail.com> wrote:
>> I am running cassandra 0.6.6
>> 4 nodes with RF=3
>> Have set the InitialTokens manually
>> Loaded around 4 million records
>>
>> Had a question why the following is happening
>>
>> Node 4 was down when a new key 1005 was added (value 123).
>> Node 2 which is responsible for the key added a Hint for Node 4
>> Node 4 was brought back up and noticed the Hints Handed off and data
>> started showing up in Node 4
>> Noticed a ReadRepair also happening
>> All fine so far
>>
>> did a get and the value is 123
>> Node 2 returned the data, with background digest checks on Node 3 and
>> Node 4 (RF=3)
>>
>> Now Node 2 (responsible for key 1005) was taken down
>> Key 1005 value was updated to A123 (ApplyRowMutation on Node 3 and Node 4).
>> Node 4 added a hint for Node 2
>>
>> did a get and the value is A123
>> Node 3 returned the data, with background digest checks on Node 4
>> (RF=3 and Node 2 is down)
>>
>> Now Node 2 is back up
>> Hints were handed off by Node 4
>>
>> did a get and the value is the old value 123
>> Node 2 returned the data, with background digest checks on Node 3 and
>> Node 4 (RF=3)
>>
>> Was expecting the latest write wins - A123 written on Node 4 to be in Node 2.
>> Any ideas ?
>>
>> Now if Node 2 is down the old value A123 will be returned
>> Tried a repair when Node 2 was up and all Nodes got updated to the old data
>>
>
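
To make the expectation in this thread concrete, here is a toy Python model of the Node 2 scenario. It is only a sketch of the behaviour being assumed (timestamped last-write-wins, a hint stored on a live replica, and a read that reconciles the replicas), not Cassandra code: the `Replica` class and the `write`, `deliver_hints` and `read_with_repair` functions are hypothetical names invented for illustration, and the integer timestamps stand in for the client-supplied column timestamps.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one replica node holding key 1005."""
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)    # key -> (timestamp, value)
    hints: list = field(default_factory=list)   # (target, key, timestamp, value)

    def apply(self, key, ts, value):
        # Last write wins: only a strictly newer timestamp replaces the value.
        current = self.data.get(key)
        if current is None or ts > current[0]:
            self.data[key] = (ts, value)


def write(replicas, key, ts, value):
    """Apply the write on live replicas; store a hint for each down replica."""
    live = [r for r in replicas if r.up]
    for r in live:
        r.apply(key, ts, value)
    for r in replicas:
        if not r.up and live:
            # Assumption: the last live replica keeps the hint (Node 4 here).
            live[-1].hints.append((r.name, key, ts, value))


def deliver_hints(replicas):
    """Replay stored hints to targets that are back up."""
    by_name = {r.name: r for r in replicas}
    for r in replicas:
        pending = []
        for target, key, ts, value in r.hints:
            if by_name[target].up:
                by_name[target].apply(key, ts, value)
            else:
                pending.append((target, key, ts, value))
        r.hints = pending


def read_with_repair(replicas, key):
    """Return the newest value among live replicas and push it to the rest."""
    live = [r for r in replicas if r.up]
    versions = [r.data[key] for r in live if key in r.data]
    if not versions:
        return None
    newest = max(versions, key=lambda v: v[0])
    for r in live:
        r.apply(key, *newest)
    return newest[1]


# Replicas for key 1005 with RF=3, as in the thread.
node2, node3, node4 = Replica("node2"), Replica("node3"), Replica("node4")
replicas = [node2, node3, node4]

write(replicas, 1005, ts=1, value="123")    # all nodes up
node2.up = False
write(replicas, 1005, ts=2, value="A123")   # Node 4 stores a hint for Node 2
node2.up = True
deliver_hints(replicas)                     # hint handed off by Node 4

print(node2.data[1005])
print(read_with_repair(replicas, 1005))
```

In this model Node 2 holds A123 as soon as the hint is replayed, which is the behaviour expected above. A real cluster returning 123 from Node 2 after the handoff means one of these assumptions does not hold there, for example if the timestamp on the second write is not greater than the first.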