dulu98Kurz commented on code in PR #15090:
URL: https://github.com/apache/druid/pull/15090#discussion_r1350578049


##########
processing/src/main/java/org/apache/druid/timeline/partition/OvershadowableManager.java:
##########
@@ -418,9 +420,11 @@ private Iterator<Entry<RootPartitionRange, 
Short2ObjectSortedMap<AtomicUpdateGro
       TreeMap<RootPartitionRange, Short2ObjectSortedMap<AtomicUpdateGroup<T>>> 
stateMap
   )
   {
-    final RootPartitionRange lowFench = new RootPartitionRange(partitionId, partitionId);
+    // remediate submap `fromKey > toKey` issue when partitionId overflows
+    final short partitionIdLowFence = partitionId < 0 ? Short.MAX_VALUE : partitionId;

Review Comment:
   Hi @abhishekagarwal87, thanks for checking on this!
   You are right: our investigation points to both late messages from upstream
   and compaction falling behind. Specifically, we found random late messages
   mixed into the Kafka topics. They kept adding tiny segments to an
   already-finalized time chunk until the partition ID went beyond the `short`
   range and broke the live ingestion tasks for new data. Setting a rejection
   period was not ideal because it means we would lose data, and because
   compaction was falling behind we couldn't afford to wait for it to catch up.
   I ended up hard-deleting the problematic time chunk, and then realized that
   relying solely on compaction seems inadequate.
   
   Admittedly, handling random late messages is not an ideal use case for
   Druid, but it was a really difficult choice when the user had to choose
   between letting ingestion break and deleting the problematic time chunk.
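
For anyone following along, here is a minimal, standalone sketch of the failure mode as I understand it. It is not the actual `OvershadowableManager` code: the class and method names are hypothetical, and a plain `TreeMap<Short, String>` stands in for the real state map. It only shows that narrowing a partition number past 32767 to `short` wraps to a negative value, that `TreeMap.subMap` then rejects the range because `fromKey > toKey`, and that a clamp like the one in the diff avoids the exception.

```java
import java.util.TreeMap;

// Hypothetical demo class; not part of Druid.
class PartitionIdOverflowSketch {
    // Narrowing an int to short wraps around once the value exceeds 32767.
    static short toShort(int partitionNum) {
        return (short) partitionNum;
    }

    // Same shape as the clamp in the diff: a negative (overflowed) id
    // falls back to Short.MAX_VALUE.
    static short lowFence(short partitionId) {
        return partitionId < 0 ? Short.MAX_VALUE : partitionId;
    }

    // Returns true if TreeMap.subMap rejects the range [from, to],
    // which it does with IllegalArgumentException when from > to.
    static boolean subMapThrows(short from, short to) {
        TreeMap<Short, String> map = new TreeMap<>();
        map.put((short) 0, "first");
        try {
            map.subMap(from, true, to, true);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        short overflowed = toShort(32768); // one past Short.MAX_VALUE
        System.out.println("overflowed id: " + overflowed);
        System.out.println("unclamped subMap throws: "
            + subMapThrows((short) 0, overflowed));
        System.out.println("clamped subMap throws: "
            + subMapThrows((short) 0, lowFence(overflowed)));
    }
}
```

The clamp only keeps the lookup from throwing; it does not make more than `Short.MAX_VALUE` partitions addressable, so compaction or some other cleanup is still needed for the affected time chunk.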
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

