clintropolis commented on code in PR #15012:
URL: https://github.com/apache/druid/pull/15012#discussion_r1330923205
##########
processing/src/main/java/org/apache/druid/query/dimension/ListFilteredDimensionSpec.java:
##########
@@ -92,15 +92,17 @@ public static IdMapping buildAllowListIdMapping(
IndexedGetter<String> fn
)
{
- final IdMapping.Builder builder =
IdMapping.Builder.ofCardinality(values.size());
+ IdMapping.Builder builder;
if (idLookup != null) {
+ builder = IdMapping.Builder.ofCardinality(values.size());
for (String value : values) {
int i = idLookup.lookupId(value);
if (i >= 0) {
builder.addMapping(i);
}
}
} else {
+ builder = IdMapping.Builder.ofCardinality(cardinality);
Review Comment:
Hmm, I don't think we probably want to do this like this, rather, I think we
wan to switch [`IdMapping.Builder` to use a growable collection like `IntList`
instead of `int[]` for the
`reverseMapping`](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/segment/IdMapping.java#L102)
to handle this case where the ids are a many to one mapping because of
something like a lookup extraction fn being used. The reason is because for
something like `MV_FILTER_ONLY`, the cardinality of the column might be like 1
million, while the cardinality of the values supplied to the function might
just be a few values, so allocating a whole `int[]` for the complete
cardinality of the column seems pretty dramatic.
We could potentially optimize this in some manner by checking for
extractionFn, and [if the extractionFn extraction type is one to one, and only
do this path if it is many to one, but I don't think its really
needed](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/query/extraction/ExtractionFn.java#L118).
This matters a lot less for a deny list, which likely will still have a
cardinality close to the base column.
##########
processing/src/main/java/org/apache/druid/query/dimension/ListFilteredDimensionSpec.java:
##########
@@ -92,15 +92,17 @@ public static IdMapping buildAllowListIdMapping(
IndexedGetter<String> fn
)
{
- final IdMapping.Builder builder =
IdMapping.Builder.ofCardinality(values.size());
+ IdMapping.Builder builder;
if (idLookup != null) {
+ builder = IdMapping.Builder.ofCardinality(values.size());
Review Comment:
i think this could also be impacted the same way as the other branch, so I
don't really think this fixes things (though it does the virtual column since
it uses the 'else' branch here
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]