vcrfxia commented on code in PR #13364:
URL: https://github.com/apache/kafka/pull/13364#discussion_r1152370089


##########
streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBVersionedStoreSegmentValueFormatter.java:
##########
@@ -495,6 +501,41 @@ private boolean isLastIndex(final int index) {
             return unpackedReversedTimestampAndValueSizes.get(index).timestamp 
== minTimestamp;
         }
 
+        private void truncateRecordsToTimestamp(final long timestamp) {
+            if (timestamp <= minTimestamp) {
+                // delete everything in this current segment by replacing it 
with a degenerate segment
+                initializeWithRecord(new ValueAndValueSize(null), timestamp, 
timestamp);
+                return;
+            }
+
+            final SegmentSearchResult searchResult = find(timestamp, false);
+            // all records with later timestamps should be removed
+            int fullRecordsToTruncate = searchResult.index();

Review Comment:
   As the code stands today, it's safe yeah. The safety relies on some 
properties of the current usage of this class, though, such as the fact that 
`insertAsLatest()` is only called in certain cases, that we always insert into 
the older segment first when two segments must both be updated, etc. If these 
change in the future then it could become unsafe. So I guess the 
"future-proofing" that I intended with these more generic updates creates its 
own risk for un-future-proof changes as well. 
   
   Maybe the question then is what we think is worse -- if we don't allow 
generic truncation but instead require that only one full record can be 
truncated (which is what we expect to happen if there is a failure and then the 
same record(s) are re-processed right afterward), then the store will throw 
IllegalStateException when the assumption is violated. Users will have to 
delete local instances of the store and restore from changelog to recover. 
Alternatively, if we keep the generic changes, then that is not necessary (and 
is safe for now), but something could change in the future to make it unsafe 
(unlikely but possible).
   
   I see the merits of both. Maybe it's better to start with the stricter 
requirement (do not support generic truncation) first and relax the requirement 
if users report hitting IllegalStateException. WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to