Re: [PR] KAFKA-14467:Fixed an issue where local incorrect snapshot files might occur due to first pulling the snapshot file and then truncate [kafka]

via GitHub Wed, 01 Nov 2023 04:09:23 -0700


hudeqi commented on PR #14652:
URL: https://github.com/apache/kafka/pull/14652#issuecomment-1788778264


   > Heya @hudeqi, thank you for the contribution! I have been reviewing this
   > code change and I am a bit uncertain as to its purpose, so I wanted to ask
   > some follow-up questions. As far as I understand, the current code flow
   > roughly does the following:
   >
   > 1. Download a snapshot file from remote storage
   > 2. Reset data structures within ProducerStateManager and delete snapshots
   >    present in the _snapshots_ data structure via `truncateFullyAndStartAt`
   > 3. Read all snapshot files on disk and repopulate the data structures
   >    inside the ProducerStateManager
   >
   > However, since downloading the snapshot from remote storage does not
   > update the _snapshots_ data structure, I do not see how the new file will
   > be deleted as part of the call to `truncateFullyAndStartAt`.
   >
   > I also found the JIRA description a bit confusing because it kept on
   > linking to comments people made, but none of them detailed how this could
   > be happening.
   >
   > Could you elaborate on how the call to `truncateFullyAndStartAt` will
   > delete the newly downloaded file? Alternatively, have I misunderstood what
   > you mean to do with this pull request?
   
   Sorry for not stating this clearly in the JIRA. The JIRA was originally
about adding a unit test to validate the transactional state after handling the
OFFSET_MOVED_TO_TIERED_STORAGE error. While adding unit tests for this logic, I
found that a snapshot file pulled from remote storage may subsequently be
cleaned up, which made the test fail. This issue has not yet been recorded in
the JIRA.
   
   As for how this issue occurs in the original logic: `snapshots` is a map
whose key is the offset (a long). If the name (offset value) of the snapshot
file pulled from remote storage happens to already be in the key set of the
local `snapshots` map, the newly created file may be deleted by the subsequent
cleanup. @clolov
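
   To make the collision concrete, here is a minimal in-memory sketch. It is
not the actual ProducerStateManager code: the class, its methods, and the
"disk" set are hypothetical stand-ins, and I am assuming that truncation
deletes files by the names recorded in the offset-keyed map.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model, not Kafka code: "disk" is a set of file names and
// "snapshots" maps an offset to the snapshot file name for that offset.
public class SnapshotCollisionSketch {
    private final Set<String> disk = new HashSet<>();
    private final Map<Long, String> snapshots = new TreeMap<>();

    // Assumed naming scheme: the file name is derived from the offset alone.
    private static String fileName(long offset) {
        return String.format("%020d.snapshot", offset);
    }

    // A locally taken snapshot is written to disk and tracked in the map.
    void takeLocalSnapshot(long offset) {
        String name = fileName(offset);
        disk.add(name);
        snapshots.put(offset, name);
    }

    // A remote download only writes the file; the map is not updated.
    void downloadFromRemote(long offset) {
        disk.add(fileName(offset));
    }

    // Truncation deletes every file tracked in the map, then clears the map.
    void truncateFully() {
        for (String name : snapshots.values()) {
            disk.remove(name);
        }
        snapshots.clear();
    }

    // Returns true if the downloaded file is still on disk afterwards.
    public static boolean downloadedFileSurvives(boolean truncateFirst) {
        SnapshotCollisionSketch s = new SnapshotCollisionSketch();
        long offset = 100L;
        s.takeLocalSnapshot(offset); // a stale local snapshot at the same offset
        if (truncateFirst) {
            s.truncateFully();
            s.downloadFromRemote(offset);
        } else {
            s.downloadFromRemote(offset); // overwrites the same file name
            s.truncateFully();            // removes it via the stale map entry
        }
        return s.disk.contains(fileName(offset));
    }

    public static void main(String[] args) {
        System.out.println("download then truncate: " + downloadedFileSurvives(false));
        System.out.println("truncate then download: " + downloadedFileSurvives(true));
    }
}
```

   In this model the downloaded file is lost only when a local entry with the
same offset already exists and the download happens before the truncation;
truncating first avoids it.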
   
   I am not sure whether my understanding and handling are correct. @satishd,
please help confirm. Thanks.


