xiangfu0 opened a new pull request, #18114:
URL: https://github.com/apache/pinot/pull/18114

   ## Summary
   - **Backend capability model**: `VectorBackendCapabilities` declares 5 
query-time capabilities per backend (topKAnn, filterAwareSearch, 
approximateRadius, exactRerank, runtimeSearchParams), wired into 
`VectorBackendType.getCapabilities()`
   - **Execution mode enum**: `VectorExecutionMode` defines 8 explicit modes 
(ANN_TOP_K, ANN_TOP_K_WITH_RERANK, ANN_THEN_FILTER, 
ANN_THEN_FILTER_THEN_RERANK, FILTER_THEN_ANN, ANN_THRESHOLD_SCAN, 
ANN_THRESHOLD_THEN_FILTER, EXACT_SCAN) with centralized selection logic
   - **Filtered ANN semantics**: `FilterPlanNode` detects 
AND(VECTOR_SIMILARITY, ...) patterns and over-fetches ANN candidates (2x) to 
compensate for post-filter loss; execution mode is explicit in explain output
   - **Threshold/radius search**: New `vectorDistanceThreshold` query option 
enables distance-based filtering via ANN candidate generation followed by exact 
threshold refinement from the forward index; works in both the indexed and 
exact-scan fallback paths
   - **Compound retrieval**: Filter + top-K, filter + threshold, and top-K + 
threshold patterns are all wired up and report the correct execution mode
   - **Explain/debug**: Execution mode now visible in both human-readable and 
structured explain output for all vector queries
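
The filtered-ANN and threshold bullets above can be illustrated with a minimal sketch. It is not the PR's code: `FilteredAnnSketch` and every method in it are hypothetical names, the sketch assumes the 2x over-fetch multiplies the requested top-K, and it assumes L2 distance with a plain `float[][]` standing in for the forward index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical sketch; none of these names are Pinot APIs.
final class FilteredAnnSketch {
  // The PR description states a 2x over-fetch for filtered ANN.
  static final int OVER_FETCH_FACTOR = 2;

  // Assumption: the factor is applied to the requested top-K, with overflow clamped.
  static int overFetchedTopK(int requestedTopK) {
    return (int) Math.min((long) requestedTopK * OVER_FETCH_FACTOR, Integer.MAX_VALUE);
  }

  // ANN_THEN_FILTER: walk candidates nearest-first, keep those that pass
  // the metadata filter, and stop once topK survivors are collected.
  static List<Integer> annThenFilter(int[] annCandidates, IntPredicate filter, int topK) {
    List<Integer> result = new ArrayList<>();
    for (int docId : annCandidates) {
      if (result.size() >= topK) {
        break;
      }
      if (filter.test(docId)) {
        result.add(docId);
      }
    }
    return result;
  }

  // Exact L2 distance, standing in for whatever distance function the column uses.
  static double l2Distance(float[] a, float[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  // Threshold refinement: recompute the exact distance for each ANN candidate
  // from the stored vectors and keep only those within the threshold.
  static List<Integer> refineByThreshold(int[] annCandidates, float[][] forwardIndex,
      float[] query, double threshold) {
    List<Integer> result = new ArrayList<>();
    for (int docId : annCandidates) {
      if (l2Distance(forwardIndex[docId], query) <= threshold) {
        result.add(docId);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    int[] candidates = {5, 2, 8, 1}; // pretend ANN output, nearest first
    System.out.println(overFetchedTopK(10));
    System.out.println(annThenFilter(candidates, docId -> docId % 2 == 0, 1));
  }
}
```

Over-fetching only reduces the chance that the filter leaves fewer than top-K results; a highly selective filter can still return fewer rows than requested.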
   
   ## Design
   See `docs/design/vector-backends-phase3.md` for the full design note 
covering execution modes, capability model, mode selection rules, query 
options, and limitations.
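
As a rough illustration of what centralized mode selection could look like, the sketch below maps a query shape and a backend's capabilities to one of the eight modes. The mode and capability names come from this PR; the class `ModeSelectionSketch`, the `select` signature, and the selection rules themselves are assumptions made for illustration, and the design note remains the authority on the real rules.

```java
// Hypothetical sketch; the selection rules below are assumed, not taken from the PR.
final class ModeSelectionSketch {
  // The eight modes named in the PR description.
  enum Mode {
    ANN_TOP_K, ANN_TOP_K_WITH_RERANK, ANN_THEN_FILTER, ANN_THEN_FILTER_THEN_RERANK,
    FILTER_THEN_ANN, ANN_THRESHOLD_SCAN, ANN_THRESHOLD_THEN_FILTER, EXACT_SCAN
  }

  // The five capability flags named in the PR description.
  static final class Capabilities {
    final boolean topKAnn;
    final boolean filterAwareSearch;
    final boolean approximateRadius;
    final boolean exactRerank;
    final boolean runtimeSearchParams;

    Capabilities(boolean topKAnn, boolean filterAwareSearch, boolean approximateRadius,
        boolean exactRerank, boolean runtimeSearchParams) {
      this.topKAnn = topKAnn;
      this.filterAwareSearch = filterAwareSearch;
      this.approximateRadius = approximateRadius;
      this.exactRerank = exactRerank;
      this.runtimeSearchParams = runtimeSearchParams;
    }
  }

  static Mode select(boolean hasVectorIndex, boolean hasFilter, boolean hasThreshold,
      boolean rerankRequested, Capabilities caps) {
    // Assumed rule: without a usable ANN index, fall back to a brute-force scan.
    if (!hasVectorIndex || !caps.topKAnn) {
      return Mode.EXACT_SCAN;
    }
    // Assumed rule: a distance threshold takes precedence over plain top-K.
    if (hasThreshold) {
      return hasFilter ? Mode.ANN_THRESHOLD_THEN_FILTER : Mode.ANN_THRESHOLD_SCAN;
    }
    boolean rerank = rerankRequested && caps.exactRerank;
    if (hasFilter) {
      // Assumed rule: backends that can search within a filter do so directly.
      if (caps.filterAwareSearch) {
        return Mode.FILTER_THEN_ANN;
      }
      return rerank ? Mode.ANN_THEN_FILTER_THEN_RERANK : Mode.ANN_THEN_FILTER;
    }
    return rerank ? Mode.ANN_TOP_K_WITH_RERANK : Mode.ANN_TOP_K;
  }

  public static void main(String[] args) {
    Capabilities caps = new Capabilities(true, false, false, true, false);
    System.out.println(select(true, true, false, false, caps));
  }
}
```

Keeping the decision in one function means the explain output can report the returned mode directly, instead of each operator inferring it.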
   
   ## Backward Compatibility
   All existing VECTOR_SIMILARITY queries work unchanged. No SQL, schema, table 
config, or wire protocol changes. The new `vectorDistanceThreshold` query 
option is purely additive.
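
To make "purely additive" concrete, here is a minimal sketch of how an optional threshold could be read from a map of query options. Only the option name `vectorDistanceThreshold` comes from this PR; `ThresholdOptionSketch`, `parseThreshold`, and the validation rules are hypothetical. An absent option yields an empty result, so existing queries take the same path as before, and negative values are accepted because the test plan mentions negative thresholds for dot-product.

```java
import java.util.Map;
import java.util.OptionalDouble;

// Hypothetical sketch; not the parsing code from the PR.
final class ThresholdOptionSketch {
  static final String KEY = "vectorDistanceThreshold";

  static OptionalDouble parseThreshold(Map<String, String> queryOptions) {
    String raw = queryOptions.get(KEY);
    // Option not set: behave exactly as before the option existed.
    if (raw == null || raw.trim().isEmpty()) {
      return OptionalDouble.empty();
    }
    double value;
    try {
      value = Double.parseDouble(raw.trim());
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException("Invalid " + KEY + ": " + raw, e);
    }
    // Assumption: NaN and infinite thresholds are rejected. Negative values
    // are allowed, since a dot-product score can legitimately be negative.
    if (Double.isNaN(value) || Double.isInfinite(value)) {
      throw new IllegalArgumentException("Invalid " + KEY + ": " + raw);
    }
    return OptionalDouble.of(value);
  }

  public static void main(String[] args) {
    System.out.println(parseThreshold(Map.of()));
    System.out.println(parseThreshold(Map.of(KEY, "-0.25")));
  }
}
```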
   
   ## Test plan
   - [x] `VectorBackendCapabilitiesTest` — capability model for all backends (8 
tests)
   - [x] `VectorExecutionModeTest` — mode properties and flag consistency (6 
tests)
   - [x] `VectorBackendTypeTest` — existing + new capability integration (8 
tests)
   - [x] `VectorQueryExecutionContextTest` — mode selection logic for all query 
shapes (16 tests)
   - [x] `VectorSearchParamsTest` — threshold parsing, negative thresholds for 
dot-product (19 tests)
   - [x] `VectorSimilarityFilterOperatorTest` — filtered ANN over-fetch, 
threshold refinement, execution mode reporting (21 tests)
   - [x] `VectorCompoundQueryTest` — compound patterns: filter+topK, 
filter+threshold, backward compat (9 tests)
   - [x] Checkstyle, spotless, and license checks pass on all modified modules
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

