gianm commented on PR #15626:
URL: https://github.com/apache/druid/pull/15626#issuecomment-1878149526
> this is great optimization to avoid extra lookup calls. We can take one
> step further and could potentially reverse indexing of lookup values (Value->
> List) at lookup creation and use it to in non injective cases in filters.
> `LOOKUP(sku, 'sku_to_name') = 'WhizBang Sprocket'` could be rewritten to
> `sku in ('WB00013', 'WB00014', 'WB00015')`

The patch does do this rewrite -- check out the documentation table and some
of the cases in the new test file `CalciteLookupFunctionQueryTest`. Because
there isn't a reverse index built at lookup creation time, in most cases the
rewrite finds the matching keys by iterating the lookup entries (see
`MapLookupExtractor#unapplyAll` -- AFAIK this is the most popular lookup impl
in prod). At first I thought this might perform poorly, but in some
benchmarking I found that the impact wasn't that high. (Even a 5 million row
lookup didn't take much time to iterate, relative to the rest of SQL planning.)
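
For anyone skimming, here is a rough sketch of the iterate-the-entries idea. It is not the code in the patch and not the real `MapLookupExtractor#unapplyAll`; the class name, the method `keysForValue`, and the extra `WB00016` entry are made up for illustration, and the SKU-to-name mapping is only assumed from the example quoted above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReverseLookupSketch {
    // Hypothetical helper: collect every key whose mapped value equals `value`
    // by scanning all entries. No reverse index is built ahead of time.
    static List<String> keysForValue(Map<String, String> lookup, String value) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, String> entry : lookup.entrySet()) {
            if (value.equals(entry.getValue())) {
                keys.add(entry.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // Assumed contents of a non-injective 'sku_to_name' lookup.
        // LinkedHashMap keeps insertion order so the output is deterministic.
        Map<String, String> skuToName = new LinkedHashMap<>();
        skuToName.put("WB00013", "WhizBang Sprocket");
        skuToName.put("WB00014", "WhizBang Sprocket");
        skuToName.put("WB00015", "WhizBang Sprocket");
        skuToName.put("WB00016", "Some Other Part"); // made-up extra entry

        List<String> keys = keysForValue(skuToName, "WhizBang Sprocket");

        // Render the keys as the IN list from the quoted example.
        String filter = "sku IN ('" + String.join("', '", keys) + "')";
        System.out.println(filter);
        // prints: sku IN ('WB00013', 'WB00014', 'WB00015')
    }
}
```

The scan is linear in the number of lookup entries, which is the cost the benchmarking above refers to.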