924060929 commented on code in PR #64440:
URL: https://github.com/apache/doris/pull/64440#discussion_r3434163226


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AccessPathExpressionCollector.java:
##########
@@ -380,20 +380,62 @@ private static boolean 
isFunctionNullCheckPath(List<String> suffixPath) {
 
     @Override
     public Void visitMapContainsKey(MapContainsKey mapContainsKey, 
CollectorContext context) {

Review Comment:
   A note on documenting the *why* here: these two branches encode two 
different query shapes, and a short SQL example in the comments would make them 
self-explanatory (the same reasoning applies to `visitMapContainsValue` / 
`visitMapContainsEntry`).
   
   **Main body — must visit the search arg.** When `map_contains_key(...)` is 
used as a value, the key's value determines the result, so its data access path 
must be collected:
   ```sql
   -- s STRUCT<a:STRING,b:INT>, m MAP<STRING,INT>; row: s.a='hello', 
m={'hello':1}
   SELECT map_contains_key(m, element_at(s,'a')) FROM t;  -- expected: true
   ```
   Without visiting `getArgument(1)`, `s.a` gets no data path. If 
`element_at(s,'a') IS NULL` is also in the query, its `[s,a,NULL]` becomes the 
only path for `s.a` and pruning collapses it to null-flag-only → at runtime 
`element_at(s,'a')` no longer reads `'hello'` → returns `false`. (The bug this 
PR fixes.)
   
   **`isFunctionNullCheckPath` branch — intentionally skips the search arg.** 
When the whole call sits under `IS NULL`, the result is NULL iff the map is 
NULL: these functions declare `nullable() == child(0).nullable()`, and on BE 
the result null flag is taken solely from the map's null map (a NULL search key 
just matches NULL keys — it is never SQL-unknown). So the key's value is 
irrelevant:
   ```sql
   SELECT map_contains_key(m, element_at(s,'a')) IS NULL FROM t;  -- equivalent 
to (m IS NULL)
   ```
   Hence we route the NULL to the map (`[m, NULL]`) and skip the key. This 
stays safe across other SELECT items: the collector visits every expression 
independently and unions paths per slot, so if `s.a` is needed as data 
elsewhere, that expression registers `[s,a]` on its own — this branch only 
omits its own contribution, it never removes another's. Without this branch we 
would emit a malformed `[m, KEYS, NULL]` (the null flag of the *keys* 
sub-column); routing to the map mirrors `visitMapKeys` / `visitMapValues`.
   
   Could a trimmed version of these two examples be added to the code comments 
for the three functions? It would make the "visit vs. skip the search arg" 
split obvious to future readers.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to