airborne12 opened a new pull request, #61092:
URL: https://github.com/apache/doris/pull/61092

   ### What problem does this PR solve?
   
   Issue Number: close #xxx
   
   Problem Summary:
   
   In FULL OUTER JOIN queries, MATCH expressions in the SELECT list cannot be 
pushed down as filters, because filtering rows before the join would violate 
FULL OUTER JOIN semantics. As a result, the inverted index cannot be used for 
MATCH evaluation, and the expression falls back to slow-path evaluation.
   
   This PR enables MATCH expressions used as **projections** to be pushed down 
as virtual columns on OlapScan, allowing the BE to evaluate them via inverted 
index using the existing `fast_execute()` caching mechanism.
   
   **Example:**
   ```sql
   -- Before: MATCH evaluated via slow path (no index)
   SELECT A.k1, A.content MATCH_ANY 'hello' as match_result
   FROM A FULL OUTER JOIN B ON A.k1 = B.k1;
   
   -- After: same query, MATCH pushed down as a virtual column and evaluated via inverted index
   ```
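
   The semantic problem with filter pushdown can be sketched in a small,
   self-contained model. This is an illustration only, not Doris code: the
   `RowA`/`RowB` shapes, the nested-loop join, and the substring-based
   `match_hello` stand-in for `MATCH_ANY 'hello'` are all assumptions made
   for the example.

   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <optional>
   #include <string>
   #include <vector>

   // Hypothetical row shapes for tables A and B from the example query.
   struct RowA { int k1; std::string content; };
   struct RowB { int k1; };

   // One output row of A FULL OUTER JOIN B; a missing side is std::nullopt (SQL NULL).
   struct Joined { std::optional<RowA> a; std::optional<RowB> b; };

   // Stand-in for MATCH_ANY 'hello': a plain substring test, not tokenized matching.
   bool match_hello(const RowA& r) { return r.content.find("hello") != std::string::npos; }

   // Naive nested-loop FULL OUTER JOIN on k1.
   std::vector<Joined> full_outer_join(const std::vector<RowA>& as_rows,
                                       const std::vector<RowB>& bs_rows) {
       std::vector<Joined> out;
       std::vector<bool> b_matched(bs_rows.size(), false);
       for (const RowA& a : as_rows) {
           bool found = false;
           for (std::size_t j = 0; j < bs_rows.size(); ++j) {
               if (a.k1 == bs_rows[j].k1) {
                   out.push_back({a, bs_rows[j]});
                   b_matched[j] = true;
                   found = true;
               }
           }
           if (!found) out.push_back({a, std::nullopt});  // unmatched A row is preserved
       }
       for (std::size_t j = 0; j < bs_rows.size(); ++j) {
           if (!b_matched[j]) out.push_back({std::nullopt, bs_rows[j]});  // unmatched B row
       }
       return out;
   }

   // Incorrect rewrite: apply MATCH as a scan filter on A before joining.
   std::size_t rows_with_filter_pushdown(const std::vector<RowA>& as_rows,
                                         const std::vector<RowB>& bs_rows) {
       std::vector<RowA> kept;
       for (const RowA& a : as_rows) {
           if (match_hello(a)) kept.push_back(a);
       }
       return full_outer_join(kept, bs_rows).size();
   }

   // Projection: every joined row is kept; MATCH only produces an extra column.
   std::vector<std::optional<bool>> project_match(const std::vector<Joined>& rows) {
       std::vector<std::optional<bool>> col;
       for (const Joined& r : rows) {
           if (r.a) col.push_back(match_hello(*r.a));
           else col.push_back(std::nullopt);  // A side is NULL, so the result is NULL
       }
       return col;
   }
   ```

   With A = {(1, "hello world"), (2, "goodbye")} and B = {(1), (3)}, the join
   has three rows, while the filter-first variant has only two because the
   A row with k1 = 2 disappears. Computing MATCH as a projected column keeps
   the row count unchanged.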
   
   **FE changes:**
   - `Match.java`: Mark `Match` with the `PreferPushDownProject` interface so 
the `PushDownProject` rule moves MATCH from the join output into scan projections
   - `PushDownMatchProjectionAsVirtualColumn.java`: New rewrite rule converting 
MATCH projections to virtual columns on OlapScan
   - `RuleType.java` + `Rewriter.java`: Rule registration
   
   **BE changes (segment_iterator.cpp):**
   - `_construct_compound_expr_context()`: Set shared `IndexExecContext` on 
virtual column exprs
   - `_apply_index_expr()`: Evaluate inverted index for virtual column MATCH 
(bitmap only, no row filtering)
   - `_output_index_result_column_for_expr()`: Convert bitmap to UInt8 column 
for all index contexts (common exprs + virtual column exprs)
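
   The bitmap-to-column step can be illustrated with a minimal sketch. It
   assumes the index result is a sorted list of matching row ids (a stand-in
   for the real bitmap type) and that the current batch is described by its
   row ids. The function name `bitmap_to_uint8_column` is hypothetical and
   is not the signature used in `segment_iterator.cpp`.

   ```cpp
   #include <algorithm>
   #include <cassert>
   #include <cstdint>
   #include <vector>

   // Convert an index hit set into a 0/1 column aligned with the batch rows.
   // matched_rowids: sorted row ids that satisfied MATCH (bitmap stand-in).
   // batch_rowids:   row ids of the rows currently being materialized.
   std::vector<std::uint8_t> bitmap_to_uint8_column(
           const std::vector<std::uint32_t>& matched_rowids,
           const std::vector<std::uint32_t>& batch_rowids) {
       // binary_search below requires sorted input.
       assert(std::is_sorted(matched_rowids.begin(), matched_rowids.end()));
       std::vector<std::uint8_t> column;
       column.reserve(batch_rowids.size());
       for (std::uint32_t rowid : batch_rowids) {
           bool hit = std::binary_search(matched_rowids.begin(), matched_rowids.end(), rowid);
           column.push_back(hit ? 1 : 0);  // no row is dropped, only labeled
       }
       return column;
   }
   ```

   The output always has one entry per batch row, which is the "bitmap
   only, no row filtering" behavior the list above describes.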
   
   The bitmap result is cached in `IndexExecContext`, and when 
`_materialization_of_virtual_column()` calls `VirtualSlotRef::execute_column()` 
→ MATCH's `fast_execute()`, it returns the pre-computed column directly.
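
   A rough model of this caching flow, under stated assumptions:
   `ExecContextSketch`, `MatchExpr`, and `execute_match` are hypothetical
   names invented for this sketch, and the slow path is a plain substring
   scan. The real `IndexExecContext` and `fast_execute()` interfaces may
   differ.

   ```cpp
   #include <cassert>
   #include <cstdint>
   #include <string>
   #include <unordered_map>
   #include <vector>

   // Hypothetical MATCH expression: one search term.
   struct MatchExpr { std::string term; };

   // Hypothetical shared context: per-expression cached index results.
   struct ExecContextSketch {
       std::unordered_map<const MatchExpr*, std::vector<std::uint8_t>> index_results;
       int slow_path_calls = 0;  // counts fallbacks, for illustration
   };

   // Slow path: evaluate the expression against every row's content.
   std::vector<std::uint8_t> slow_match(const MatchExpr& expr,
                                        const std::vector<std::string>& contents,
                                        ExecContextSketch& ctx) {
       ++ctx.slow_path_calls;
       std::vector<std::uint8_t> column;
       column.reserve(contents.size());
       for (const std::string& text : contents) {
           column.push_back(text.find(expr.term) != std::string::npos ? 1 : 0);
       }
       return column;
   }

   // Fast path first: return the pre-computed column if the index step cached one.
   std::vector<std::uint8_t> execute_match(const MatchExpr& expr,
                                           const std::vector<std::string>& contents,
                                           ExecContextSketch& ctx) {
       auto it = ctx.index_results.find(&expr);
       if (it != ctx.index_results.end()) {
           assert(it->second.size() == contents.size());  // cached column must align with rows
           return it->second;
       }
       return slow_match(expr, contents, ctx);
   }
   ```

   In this model, once the index step has stored a column for an
   expression, later calls to `execute_match` for that expression never
   reach `slow_match`.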
   
   ### Release note
   
   Support MATCH expressions as projections pushed down to OlapScan as virtual 
columns, enabling inverted index evaluation for MATCH in contexts where it 
cannot be pushed as a filter (e.g., FULL OUTER JOIN).
   
   ### Check List (For Author)
   
   - Test
       - [x] Regression test
       - [x] Unit Test
       - [x] Manual test (add detailed scripts or steps below)
           - Deployed locally, verified with EXPLAIN VERBOSE showing 
`virtualColumn=id MATCH_ANY 'hello'`
           - Tested 7 query scenarios: simple projection, FULL OUTER JOIN, 
multiple MATCH projections, projection with filter, MATCH_PHRASE, filter 
regression, INNER JOIN filter
       - [ ] No need to test or manual test.
   
   - Behavior changed:
       - [x] No.
   
   - Does this need documentation?
       - [x] No.
   
   ### Check List (For Reviewer who merges this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

