clintropolis commented on code in PR #15012:
URL: https://github.com/apache/druid/pull/15012#discussion_r1330923205


##########
processing/src/main/java/org/apache/druid/query/dimension/ListFilteredDimensionSpec.java:
##########
@@ -92,15 +92,17 @@ public static IdMapping buildAllowListIdMapping(
       IndexedGetter<String> fn
   )
   {
-    final IdMapping.Builder builder = 
IdMapping.Builder.ofCardinality(values.size());
+    IdMapping.Builder builder;
     if (idLookup != null) {
+      builder = IdMapping.Builder.ofCardinality(values.size());
       for (String value : values) {
         int i = idLookup.lookupId(value);
         if (i >= 0) {
           builder.addMapping(i);
         }
       }
     } else {
+      builder = IdMapping.Builder.ofCardinality(cardinality);

Review Comment:
   Hmm, I don't think we probably want to do this like this, rather, I think we 
wan to switch [`IdMapping.Builder` to use a growable collection like `IntList` 
instead of `int[]` for the 
`reverseMapping`](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/segment/IdMapping.java#L102)
 to handle this case where the ids are a many to one mapping because of 
something like a lookup extraction fn being used. The reason is because for 
something like `MV_FILTER_ONLY`, the cardinality of the column might be like 1 
million, while the cardinality of the values supplied to the function might 
just be a few values, so allocating a whole `int[]` for the complete 
cardinality of the column seems pretty dramatic.
   
   We could potentially optimize this in some manner by checking for 
extractionFn, and [if the extractionFn extraction type is one to one, and only 
do this path if it is many to one, but I don't think its really 
needed](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/query/extraction/ExtractionFn.java#L118).
   
   This matters a lot less for a deny list, which likely will still have a 
cardinality close to the base column.



##########
processing/src/main/java/org/apache/druid/query/dimension/ListFilteredDimensionSpec.java:
##########
@@ -92,15 +92,17 @@ public static IdMapping buildAllowListIdMapping(
       IndexedGetter<String> fn
   )
   {
-    final IdMapping.Builder builder = 
IdMapping.Builder.ofCardinality(values.size());
+    IdMapping.Builder builder;
     if (idLookup != null) {
+      builder = IdMapping.Builder.ofCardinality(values.size());

Review Comment:
   i think this could also be impacted the same way as the other branch, so I 
don't really think this fixes things (though it does the virtual column since 
it uses the 'else' branch here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to