dulu98Kurz commented on code in PR #15090:
URL: https://github.com/apache/druid/pull/15090#discussion_r1348855469


##########
processing/src/main/java/org/apache/druid/timeline/partition/OvershadowableManager.java:
##########
@@ -418,9 +420,11 @@ private Iterator<Entry<RootPartitionRange, 
Short2ObjectSortedMap<AtomicUpdateGro
       TreeMap<RootPartitionRange, Short2ObjectSortedMap<AtomicUpdateGroup<T>>> 
stateMap
   )
   {
-    final RootPartitionRange lowFench = new RootPartitionRange(partitionId, 
partitionId);
+    // remediate submap `fromKey > toKey` issue when partitionId overflows
+    final short partitionIdLowFence = partitionId < 0 ? Short.MAX_VALUE : 
partitionId;

Review Comment:
   Thanks for checking on this! @kfaraz , the 
[constructor](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/timeline/partition/OvershadowableManager.java#L1055C19-L1055C19)
 of `RootPartitionRange` that takes short type of `startPartitionId` was 
defined as `private` and we are forced to use :
   ```
   static RootPartitionRange of(int startPartitionId, int endPartitionId)
       {
         return new RootPartitionRange((short) startPartitionId, (short) 
endPartitionId);
       }
   ```
   Since `startPartitionId` is in Integer range, when `startPartitionId` > 
Short.MAV_VALUE ( 32767 ), the casting from int -> short start producing 
negative number, for example 
   <img width="304" alt="image" 
src="https://github.com/apache/druid/assets/14854898/26e91ef5-6f1e-4a5f-acbc-d99ea5ceb791";>
   And the loop repeats when `startPartitionId` continued going because casting 
from Int to Short losses precision:
   <img width="201" alt="image" 
src="https://github.com/apache/druid/assets/14854898/9f6da102-ad6c-41d5-958c-817962ff6794";>
   We happen to ran into this short overflow scenario described in 
https://github.com/apache/druid/issues/15091 and our ingestion task for new 
data was completely broken because of this, throwing exception would still make 
the ingestion fail.
   Here I'm making `startPartitionId` to be Short.MAX_VALUE so that it won't 
produce `java.lang.IllegalArgumentException: fromKey > toKey` and broke 
ingestion when we do `stateMap.subMap(lowFence, false, highFence, false)` , 
this is just an remediation.
   
   I believe a better way to handle this is to allow `startPartitionId` and 
`endPartitionId` to be integer and avoid the problematic precision-loss 
casting, I can send another PR to this solution if you can confirm why we had 
this `short` limit originally and it is appropriate to do so.
   
   Best,
   Dun



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to