ChenSammi edited a comment on pull request #2538: URL: https://github.com/apache/ozone/pull/2538#issuecomment-903594803
> > > The issue eventually turned out to be a race among writeChunk/readChunk threads all using the same file channel. With concurrent writers and readers on the same file, the behaviour is unpredictable and leads to block file sparseness on disk.

@bshashikant, for the ContainerStateMachine#read function, I'm not clear about one thing: when stateMachineDataCache.get(entry.getIndex()) returns null, why can't we assume the data has already been written to disk?

Thanks @ChenSammi. Usually, the entry stays in the state machine cache until the write chunk is applied, which means that if a readStateMachineData call comes in before the write chunk finishes, the entry should be in the cache. But the entry is added to the cache only on the leader. Now assume a case like this: three servers s1, s2 and s3, with s1 the leader, s2 a fast follower and s3 a slow follower. s1 sends entries 1-100 to s2 and s3, and then s1 dies. s2 has received all the log entries (but has yet to complete the write-to-disk part) and, being a follower, did not update the data cache. After s1 dies, s2 starts sending appendEntries to s3. s3 might ask s2 for entries whose write to disk is not yet complete, and those entries are not in the cache either, because s2 was a follower when it received them.

BTW, if readChunk executes on a thread different from the one that did the writeChunk, it does not get the same view of the underlying file, because the file handle is not closed on write. So even if readChunk executes after the writeChunk completes, it may not see the same data unless it runs on the same thread. That's why it is important to remove the concurrency across multiple chunk writes/reads for the same block.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
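To illustrate the direction of the fix described above, here is a minimal, hypothetical sketch (not the actual Ozone implementation): reads and writes for the same block file are serialized through a per-file lock, each operation opens its own FileChannel, and the writer forces data to disk before releasing the lock, so a reader on a different thread sees a consistent view. The class name, lock map, and method signatures are all assumptions for illustration only.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ConcurrentHashMap;

public class ChunkFileDemo {
    // Hypothetical per-block lock map; keyed by file path for this sketch.
    private static final ConcurrentHashMap<String, Object> BLOCK_LOCKS =
            new ConcurrentHashMap<>();

    static void writeChunk(Path file, long offset, byte[] data) throws IOException {
        synchronized (BLOCK_LOCKS.computeIfAbsent(file.toString(), k -> new Object())) {
            try (FileChannel ch = FileChannel.open(
                    file, StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
                ch.write(ByteBuffer.wrap(data), offset);
                ch.force(false); // flush so a later reader observes the data
            } // channel is closed before the lock is released
        }
    }

    static byte[] readChunk(Path file, long offset, int len) throws IOException {
        synchronized (BLOCK_LOCKS.computeIfAbsent(file.toString(), k -> new Object())) {
            ByteBuffer buf = ByteBuffer.allocate(len);
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                ch.read(buf, offset);
            }
            return buf.array();
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("chunk", ".blk");
        writeChunk(f, 0, "hello".getBytes());
        System.out.println(new String(readChunk(f, 0, 5))); // prints hello
        Files.delete(f);
    }
}
```

Because the writer closes and forces its channel while still holding the block lock, a reader on any thread that subsequently acquires the same lock cannot observe a half-written or stale view, which is the race the comment describes.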
