imply-cheddar commented on code in PR #14422:
URL: https://github.com/apache/druid/pull/14422#discussion_r1229122715


##########
processing/src/main/java/org/apache/druid/segment/AutoTypeColumnIndexer.java:
##########
@@ -557,7 +613,14 @@ public FieldTypeInfo.MutableTypeSet getTypes()
 
     public boolean isSingleType()
     {
-      return typeSet.getSingleType() != null;
+      ColumnType singleType = typeSet.getSingleType();

Review Comment:
   If the only thing seen is the empty array, then I believe that this will 
still return null, no?



##########
processing/src/main/java/org/apache/druid/segment/nested/FieldTypeInfo.java:
##########
@@ -283,6 +295,10 @@ private static ColumnType getSingleType(byte types)
       singleType = ColumnType.DOUBLE_ARRAY;
       count++;
     }
+    // if type isn't an array and the empty array bit is set, increment the 
count
+    if (((types & EMPTY_ARRAY_MASK) > 0) && !(singleType != null && 
singleType.isArray())) {
+      count++;
+    }

Review Comment:
   I think the `getSingleType()` implementation might be simplified by using 
`Integer.bitCount` and `Integer.numberOfTrailingZeros()`.  I.e. you can do
   
   ```
   int intType = 0xff & types;
   if (Integer.bitCount(intType) == 1) {
     switch (Integer.numberOfTrailingZeros(intType)) {
       case 0: return ColumnType.STRING;
       case 1: return ColumnType.LONG;
       ...
     }
   } else {
     return null;
   }
   ```
   
   You can also eliminate the switch by either making the masks an enum (and 
then use index lookup on the enum) OR build a static array of the types and use 
the value from the switch as the index into that array.



##########
processing/src/main/java/org/apache/druid/segment/AutoTypeColumnIndexer.java:
##########
@@ -557,7 +613,14 @@ public FieldTypeInfo.MutableTypeSet getTypes()
 
     public boolean isSingleType()
     {
-      return typeSet.getSingleType() != null;
+      ColumnType singleType = typeSet.getSingleType();
+      if (singleType != null) {
+        if (typeSet.hasEmptyArray()) {

Review Comment:
   I looked at FieldTypeInfo and convinced myself that this should be checked 
when `singleType` is null, not when it's not null.  Did I just confuse myself?



##########
processing/src/main/java/org/apache/druid/segment/nested/FieldTypeInfo.java:
##########
@@ -169,6 +167,20 @@ public MutableTypeSet add(ColumnType type)
       return this;
     }
 
+    public MutableTypeSet addEmptyArray()
+    {
+      if (types < 0) {
+        return this;
+      }

Review Comment:
   random nit (and perhaps over-thinking), but given that the `|` operation is 
idempotent and fast, I wonder if the branch added by this if really buys us 
anything



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to