Copilot commented on code in PR #18541:
URL: https://github.com/apache/druid/pull/18541#discussion_r2357220533


##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/RowBasedGrouperHelper.java:
##########
@@ -1619,6 +1627,11 @@ public boolean putToKeyBuffer(RowBasedKey key, int idx)
         final Object obj = key.getKey()[idx];
         int id = getReverseDictionary().getInt(obj);
         if (id == DimensionDictionary.ABSENT_VALUE_ID) {
+          int size = estimatedKeySize(key.getKey());
+          if (currentEstimatedSize + size > maxDictionarySize) {
+            return false;
+          }
+          currentEstimatedSize += size;

Review Comment:
   The size estimation is calculated for the entire key array, but only the 
current dimension (at index `idx`) should be added to the dictionary. This 
leads to incorrect size accounting as it includes sizes of dimensions that may 
already be in the dictionary.



##########
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/RowBasedGrouperHelper.java:
##########
@@ -1692,6 +1707,24 @@ private void validateColumnType(TypeSignature<ValueType> 
columnType)
         }
       }
 
+      @Override
+      int estimatedKeySize(Object[] key)
+      {
+        int size = 0;
+        for (Object obj : key) {
+          if (obj instanceof StructuredData) {
+            Object o = ((StructuredData) obj).getValue();
+            if (o instanceof String) {
+              size += (int) estimateStringKeySize((String) o);
+            } else {
+              // fall back to string representation for other types, this 
might be under-estimating for map
+              size += (int) estimateStringKeySize(String.valueOf(o));
+            }

Review Comment:
   The method only processes StructuredData objects but ignores other types in 
the key array. This could lead to underestimating the total size when the key 
contains non-StructuredData objects.
   ```suggestion
               }
             } else if (obj != null) {
               // For non-StructuredData objects, estimate size using their 
string representation
               size += (int) estimateStringKeySize(String.valueOf(obj));
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to