vcrfxia commented on code in PR #13364:
URL: https://github.com/apache/kafka/pull/13364#discussion_r1152353341


##########
streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBVersionedStoreSegmentValueFormatter.java:
##########
@@ -341,8 +345,10 @@ public void insertAsLatest(final long validFrom, final long validTo, final byte[
                 // detected inconsistency edge case where older segment has [a,b) while newer store
                 // has [a,c), due to [b,c) having failed to write to newer store.
                 // remove entries from this store until the overlap is resolved.

Review Comment:
   I actually think the examples you gave are great illustrations of why we _need_ this logic to be more generic -- if those are "valid" cases which could be encountered during processing/re-processing, then we need the store to be able to handle them properly after encountering a failure.
   
   The cleanup logic is safe because the only type of failure we can have is duplicated data, i.e., two segments (or one segment and the latest value store) contain overlapping records. When this happens, we know it is safe to truncate from the older segment because the data being truncated is also present in the newer segment.
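   
   To make the safety argument concrete, here is a minimal sketch of the truncation idea in plain Java, not the actual store code -- `SegmentRecord` and `resolveOverlap` are hypothetical names, and the real implementation operates on the serialized segment-value format rather than a deque of records:
   
   ```java
   import java.util.ArrayDeque;
   import java.util.Deque;
   
   public class OverlapCleanupSketch {
   
       // hypothetical stand-in for a versioned record valid over [validFrom, validTo)
       record SegmentRecord(long validFrom, long validTo, byte[] value) {}
   
       // Drop records from the tail of the older segment until its latest
       // validTo no longer overlaps the newer segment's earliest validFrom.
       // No data is lost: after a partial failure, any range truncated here
       // is duplicated in the newer segment (or the latest value store).
       static void resolveOverlap(final Deque<SegmentRecord> olderSegment,
                                  final long newerSegmentMinValidFrom) {
           while (!olderSegment.isEmpty()
                   && olderSegment.peekLast().validTo() > newerSegmentMinValidFrom) {
               olderSegment.pollLast();
           }
       }
   
       public static void main(final String[] args) {
           final Deque<SegmentRecord> older = new ArrayDeque<>();
           older.add(new SegmentRecord(0, 5, new byte[0]));  // [0,5)
           older.add(new SegmentRecord(5, 10, new byte[0])); // [a,b) = [5,10)
   
           // the newer store holds [a,c) = [5,12) because [b,c) = [10,12)
           // failed to write; the older segment's [5,10) is now a duplicate
           resolveOverlap(older, 5);
   
           older.forEach(r -> System.out.printf("[%d,%d)%n", r.validFrom(), r.validTo()));
           // prints only [0,5): the duplicated [a,b) range was truncated
       }
   }
   ```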


