[GitHub] [ozone] ChenSammi edited a comment on pull request #2538: HDDS-5619. Ozone data corruption issue on follower node.

GitBox Mon, 23 Aug 2021 03:00:04 -0700


ChenSammi edited a comment on pull request #2538:
URL: https://github.com/apache/ozone/pull/2538#issuecomment-903594803



   > 
   > 
   > The issue eventually turned out to be a race among writeChunk/readChunk 
threads all using the same file channel. With concurrent writers and readers on 
the same file, the behaviour seems unpredictable and leads to block file 
sparseness on disk.
   
   @bshashikant, for the ContainerStateMachine#read function, I'm not clear about 
one thing: when stateMachineDataCache.get(entry.getIndex()) returns null, why 
can't we assume that the data has already been written to disk?
   
   Thanks @ChenSammi. Usually, the entry stays in the stateMachine cache until 
the write chunk is applied, which means that if a readStateMachine call comes 
in before the write chunk has finished, the entry should be in the cache. But 
the cache has the entry added only on the leader. Now assume a case like this: 
three servers, S1, S2 and S3, with S1 the leader, S2 a fast follower and S3 a 
slow follower. S1 sends entries 1 - 100 to S2 and S3, and then S1 dies. S2 has 
received all the log entries (but has yet to complete the write-to-disk part) 
and, being a follower, does not update the data cache. Once S2 takes over as 
leader, it will be sending appendEntries to S3. S3 might ask S2 for entries 
for which the write to disk has not completed and which are not in the cache 
either, because S2 was a follower when it received those entries. 
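
The scenario above can be sketched as a toy model. This is not Ozone or Ratis 
code: the `Node` class, its `receive`, `finish_write` and `read` methods, and 
the dict-based "disk" are all hypothetical names made up for illustration. The 
model assumes only what the reply states: the data cache is filled on the 
leader only, and an entry is readable from disk only after its write has 
finished.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for one Raft server; not the real Ozone classes."""
    name: str
    is_leader: bool = False
    cache: Dict[int, bytes] = field(default_factory=dict)    # leader-only data cache
    pending: Dict[int, bytes] = field(default_factory=dict)  # writes still in flight
    disk: Dict[int, bytes] = field(default_factory=dict)     # completed writes

    def receive(self, index: int, data: bytes) -> None:
        # Assumption from the reply: only the leader adds the entry to the cache.
        if self.is_leader:
            self.cache[index] = data
        self.pending[index] = data  # the write to disk has started but not finished

    def finish_write(self, index: int) -> None:
        # Simplification: completing the write and evicting the cache entry
        # happen in one step here.
        self.disk[index] = self.pending.pop(index)
        self.cache.pop(index, None)

    def read(self, index: int) -> Optional[bytes]:
        cached = self.cache.get(index)
        if cached is not None:
            return cached
        # A cache miss does NOT imply the data is on disk yet.
        return self.disk.get(index)


def run_scenario() -> Tuple[Optional[bytes], Optional[bytes], Optional[bytes]]:
    s1, s2 = Node("S1", is_leader=True), Node("S2")
    data = b"chunk-1"
    for node in (s1, s2):          # S3 is slow and has not received the entry
        node.receive(1, data)
    leader_read = s1.read(1)       # served from the leader's cache

    s2.is_leader = True            # S1 dies, S2 takes over
    early_read = s2.read(1)        # what S3's catch-up request would see
    s2.finish_write(1)
    late_read = s2.read(1)         # readable once the write has finished
    return leader_read, early_read, late_read
```

In this model the read on S2 right after the leadership change finds the entry 
neither in the cache nor on disk, which is why a null from the cache cannot be 
taken to mean that the data has already been written.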


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


