tarun11Mavani commented on code in PR #18368:
URL: https://github.com/apache/pinot/pull/18368#discussion_r3259118236
##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/datasource/MapDataSource.java:
##########
@@ -33,6 +33,19 @@ public interface MapDataSource extends DataSource {
/// Returns the DataSource for the given map key's values.
DataSource getDataSource(String key);
+ /// Returns whether this segment has per-key index data for the given key.
Columnar segments
+ /// return an exact answer (O(1) lookup into the materialized key set).
Blob-only segments
+ /// return {@code true} conservatively because determining key presence
requires deserialization.
Review Comment:
`containsKey` is used as a fast-path guard in `MapFilterOperator` and
`AggregationPlanNode`. For columnar segments it's O(1) lookup — skips
`DataSource` construction and returns
`EmptyFilterOperator`/`MatchAllFilterOperator` directly. For IS_NULL predicates
on absent keys, it return MatchAll without touching any index.
Making `getDataSource` return `@Nullable` would work but is more expensive —
callers already need to handle the absent-key case with different operators
(MatchAll vs Empty), not just null-check. containsKey keeps that decision
cheap. Updating `getDataSource` also means that I need to update existing
callers to handle null vs `NullDataSource`.
For blob segments, returning true is conservative and correct — the caller
falls through to the expression filter path which handles it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]