imply-cheddar commented on code in PR #14422:
URL: https://github.com/apache/druid/pull/14422#discussion_r1229122715
##########
processing/src/main/java/org/apache/druid/segment/AutoTypeColumnIndexer.java:
##########
@@ -557,7 +613,14 @@ public FieldTypeInfo.MutableTypeSet getTypes()
public boolean isSingleType()
{
- return typeSet.getSingleType() != null;
+ ColumnType singleType = typeSet.getSingleType();
Review Comment:
If the only thing seen is the empty array, then I believe that this will
still return null, no?
##########
processing/src/main/java/org/apache/druid/segment/nested/FieldTypeInfo.java:
##########
@@ -283,6 +295,10 @@ private static ColumnType getSingleType(byte types)
singleType = ColumnType.DOUBLE_ARRAY;
count++;
}
+ // if type isn't an array and the empty array bit is set, increment the
count
+ if (((types & EMPTY_ARRAY_MASK) > 0) && !(singleType != null &&
singleType.isArray())) {
+ count++;
+ }
Review Comment:
I think the `getSingleType()` implementation might be simplified by using
`Integer.bitCount` and `Integer.numberOfTrailingZeros()`. I.e. you can do
```
int intType = 0xff & types;
if (Integer.bitCount(intType) == 1) {
switch (Integer.numberOfTrailingZeros(intType)) {
case 0: return ColumnType.STRING;
case 1: return ColumnType.LONG;
...
}
} else {
return null;
}
```
You can also eliminate the switch by either making the masks an enum (and
then use index lookup on the enum) OR build a static array of the types and use
the value from the switch as the index into that array.
##########
processing/src/main/java/org/apache/druid/segment/AutoTypeColumnIndexer.java:
##########
@@ -557,7 +613,14 @@ public FieldTypeInfo.MutableTypeSet getTypes()
public boolean isSingleType()
{
- return typeSet.getSingleType() != null;
+ ColumnType singleType = typeSet.getSingleType();
+ if (singleType != null) {
+ if (typeSet.hasEmptyArray()) {
Review Comment:
I looked at FieldTypeInfo and convinced myself that this should be checked
when `singleType` is null, not when it's not null. Did I just confuse myself?
##########
processing/src/main/java/org/apache/druid/segment/nested/FieldTypeInfo.java:
##########
@@ -169,6 +167,20 @@ public MutableTypeSet add(ColumnType type)
return this;
}
+ public MutableTypeSet addEmptyArray()
+ {
+ if (types < 0) {
+ return this;
+ }
Review Comment:
random nit (and perhaps over-thinking), but given that the `|` operation is
idempotent and fast, I wonder if the branch added by this if really buys us
anything
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]