vvcephei commented on code in PR #13364:
URL: https://github.com/apache/kafka/pull/13364#discussion_r1153437348


##########
streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBVersionedStoreSegmentValueFormatter.java:
##########
@@ -495,6 +501,41 @@ private boolean isLastIndex(final int index) {
             return unpackedReversedTimestampAndValueSizes.get(index).timestamp 
== minTimestamp;
         }
 
+        private void truncateRecordsToTimestamp(final long timestamp) {
+            if (timestamp <= minTimestamp) {
+                // delete everything in this current segment by replacing it 
with a degenerate segment
+                initializeWithRecord(new ValueAndValueSize(null), timestamp, 
timestamp);
+                return;
+            }
+
+            final SegmentSearchResult searchResult = find(timestamp, false);
+            // all records with later timestamps should be removed
+            int fullRecordsToTruncate = searchResult.index();

Review Comment:
   Agreed; it's a tricky question. It seems like either way, we need to clearly 
document/enforce the contract so that future generations can have a reasonably 
good chance of not breaking the system.
   
   Ideally, there would be at one test verifying the contract is followed (eg, 
the order of inserts into the underlying segments), so that when someone breaks 
it, they'll also break the test, which can have a name and comment explaining 
the contract. The next-best thing would be to structure the code in a way 
that's pretty self-explanatory, though a sequence of refactors can destroy this 
property without any specific step being obviously wrong.
   
   I don't think I like the idea of just throwing an IllegalStateException in 
production code. It puts the user in a really bad position where they've 
upgraded to a new version that is now crashing in production, and they have to 
decide whether it's safe to downgrade or whether they need to wait for a fix, 
and all the while they're incurring downtime. Note that the IllegalState 
condition may take a while to manifest; imagine what you'd be going through if 
your app started crashing with this error a month after your last upgrade.
   
   OTOH, we do have recoverable exceptions in Streams, where you can signal to 
the runtime that the task is corrupted, triggering the 
purge-and-rebuild-from-changelog automatically. Perhaps that, coupled with an 
ERROR log telling the user to file a bug ticket, would be the best path forward.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to