924060929 commented on code in PR #64440:
URL: https://github.com/apache/doris/pull/64440#discussion_r3434163226
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AccessPathExpressionCollector.java:
##########
@@ -380,20 +380,62 @@ private static boolean
isFunctionNullCheckPath(List<String> suffixPath) {
@Override
public Void visitMapContainsKey(MapContainsKey mapContainsKey,
CollectorContext context) {
Review Comment:
A note on documenting the *why* here: these two branches encode two
different query shapes, and a short SQL example in the comments would make them
self-explanatory (the same reasoning applies to `visitMapContainsValue` /
`visitMapContainsEntry`).
**Main body — must visit the search arg.** When `map_contains_key(...)` is
used as a value, the key's value determines the result, so its data access path
must be collected:
```sql
-- s STRUCT<a:STRING,b:INT>, m MAP<STRING,INT>; row: s.a='hello',
m={'hello':1}
SELECT map_contains_key(m, element_at(s,'a')) FROM t; -- expected: true
```
Without visiting `getArgument(1)`, `s.a` gets no data path. If
`element_at(s,'a') IS NULL` is also in the query, its `[s,a,NULL]` becomes the
only path for `s.a` and pruning collapses it to null-flag-only → at runtime
`element_at(s,'a')` no longer reads `'hello'` → returns `false`. (The bug this
PR fixes.)
**`isFunctionNullCheckPath` branch — intentionally skips the search arg.**
When the whole call sits under `IS NULL`, the result is NULL iff the map is
NULL: these functions declare `nullable() == child(0).nullable()`, and on BE
the result null flag is taken solely from the map's null map (a NULL search key
just matches NULL keys — it is never SQL-unknown). So the key's value is
irrelevant:
```sql
SELECT map_contains_key(m, element_at(s,'a')) IS NULL FROM t; -- equivalent
to (m IS NULL)
```
Hence we route the NULL to the map (`[m, NULL]`) and skip the key. This
stays safe across other SELECT items: the collector visits every expression
independently and unions paths per slot, so if `s.a` is needed as data
elsewhere, that expression registers `[s,a]` on its own — this branch only
omits its own contribution, it never removes another's. Without this branch we
would emit a malformed `[m, KEYS, NULL]` (the null flag of the *keys*
sub-column); routing to the map mirrors `visitMapKeys` / `visitMapValues`.
Could a trimmed version of these two examples be added to the code comments
for the three functions? It would make the "visit vs. skip the search arg"
split obvious to future readers.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]