JingsongLi opened a new pull request, #8271: URL: https://github.com/apache/paimon/pull/8271

## Summary

Replaces the unreleased multi-vector search API with a more general hybrid search API that can combine multiple vector and full-text routes. Spark exposes the new `hybrid_search(...)` table-valued function and returns ranked results with `__paimon_search_score`.

## Changes

- Rename the multi-vector search model, table wrapper, builder, and ranker to HybridSearch equivalents.
- Add vector and full-text route support to the hybrid search builder and route model.
- Wire hybrid search through core table scans and Spark scan planning.
- Add Spark SQL parsing for `hybrid_search(table_name, vector_routes, full_text_routes, limit[, ranker])`, including vector-only and vector plus full-text routes.
- Update tests and multimodal global-index docs, including two-vector route examples under Hybrid Search.

## Testing

- [x] `git diff --cached --check`
- [x] `mvn -pl paimon-common -Pfast-build -Dtest=HybridSearchRankerTest test`
- [x] `mvn -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=VectorSearchBuilderTest#testHybridSearchBuilderExposesRouteBuilders,FullTextSearchBuilderTest#testHybridSearchBuilderWithFullTextRoute test`
- [x] `mvn -pl paimon-spark/paimon-spark-common -am -Pspark3,fast-build -DfailIfNoTests=false -DwildcardSuites=org.apache.paimon.spark.catalyst.plans.logical.VectorSearchQueryTest -Dtest=none test`
- [x] `mvn -pl paimon-spark/paimon-spark3-common,paimon-spark/paimon-spark-ut -am -Pspark3,fast-build -DfailIfNoTests=false -DwildcardSuites=org.apache.paimon.spark.sql.HybridSearchTest -Dtest=none test`
- [x] `mvn -pl paimon-spark/paimon-spark-3.2,paimon-spark/paimon-spark-3.3 -am -Pspark3,fast-build -DskipTests -DfailIfNoTests=false compile`

## Notes

This intentionally breaks the unreleased MultiVectorSearch API and Spark TVF naming in favor of HybridSearch.
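
For readers unfamiliar with how a hybrid ranker can merge several vector and full-text routes into one ranked result, here is a minimal sketch. It is an illustration only: the PR does not say which fusion algorithm the HybridSearch ranker uses, so reciprocal rank fusion (RRF) is an assumption, and the function name, the `k=60` constant, and the sample row ids are all hypothetical rather than taken from Paimon.

```python
from collections import defaultdict


def reciprocal_rank_fusion(routes, limit, k=60):
    """Fuse per-route rankings into one list of (row_id, score), best first.

    `routes` is a list of rankings; each ranking lists row ids from best to
    worst. RRF is an assumed technique here, not Paimon's confirmed ranker.
    """
    scores = defaultdict(float)
    for ranked_ids in routes:
        for rank, row_id in enumerate(ranked_ids, start=1):
            # Each route contributes 1 / (k + rank) for every row it returns.
            scores[row_id] += 1.0 / (k + rank)
    # Sort by fused score descending; break ties by row id for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:limit]


# Hypothetical results from two vector routes and one full-text route.
vector_route_a = [3, 1, 2]
vector_route_b = [1, 3, 4]
full_text_route = [1, 5]

top = reciprocal_rank_fusion(
    [vector_route_a, vector_route_b, full_text_route], limit=3
)
for row_id, score in top:
    print(row_id, round(score, 6))
```

Rank-based fusion of this kind does not need the raw scores of different routes to be comparable, which is one reason it is a common choice when vector distances and full-text relevance scores are mixed. The fused score plays a role loosely analogous to the `__paimon_search_score` value mentioned in the summary.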
