JingsongLi opened a new pull request, #8271:
URL: https://github.com/apache/paimon/pull/8271

   ## Summary
   
   Replaces the unreleased multi-vector search API with a more general hybrid 
search API that can combine multiple vector and full-text routes. Spark exposes 
the new `hybrid_search(...)` table-valued function and returns ranked results 
with `__paimon_search_score`.
   
   ## Changes
   
   - Rename the multi-vector search model, table wrapper, builder, and ranker 
to HybridSearch equivalents.
   - Add vector and full-text route support to the hybrid search builder and 
route model.
   - Wire hybrid search through core table scans and Spark scan planning.
   - Add Spark SQL parsing for `hybrid_search(table_name, vector_routes, full_text_routes, limit[, ranker])`, including vector-only and vector plus full-text routes.
   - Update tests and multimodal global-index docs, including two-vector route 
examples under Hybrid Search.
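
The description does not say how the ranker merges results from the different routes. Purely as an illustration, the sketch below uses reciprocal rank fusion (RRF), a common way to combine several ranked lists into one score per row. The class `RrfSketch`, the method `fuse`, the row ids, and the constant `k = 60` are all hypothetical and are not taken from this PR; the actual ranker may work differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; NOT the ranker implemented in this PR. */
final class RrfSketch {

    /**
     * Fuses several ranked lists of row ids (best match first) with
     * reciprocal rank fusion: score(id) = sum over routes of 1 / (k + rank),
     * where rank is 1-based. Returns at most {@code limit} entries,
     * highest fused score first.
     */
    static List<Map.Entry<Long, Double>> fuse(List<List<Long>> routes, int k, int limit) {
        Map<Long, Double> scores = new HashMap<>();
        for (List<Long> route : routes) {
            for (int i = 0; i < route.size(); i++) {
                // i is 0-based, so the 1-based rank is i + 1.
                scores.merge(route.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        List<Map.Entry<Long, Double>> ranked = new ArrayList<>(scores.entrySet());
        // Sort by score descending; break ties by id ascending for determinism.
        ranked.sort((a, b) -> {
            int byScore = Double.compare(b.getValue(), a.getValue());
            return byScore != 0 ? byScore : Long.compare(a.getKey(), b.getKey());
        });
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }

    public static void main(String[] args) {
        // Hypothetical route outputs: row ids ordered by each route's own score.
        List<Long> vectorRoute = Arrays.asList(1L, 2L, 3L);
        List<Long> fullTextRoute = Arrays.asList(3L, 1L, 4L);
        for (Map.Entry<Long, Double> e : fuse(Arrays.asList(vectorRoute, fullTextRoute), 60, 3)) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Rank-based fusion like this avoids having to normalize vector distances and full-text scores onto a common scale, which is one reason it is often chosen for hybrid search.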
   
   ## Testing
   
   - [x] `git diff --cached --check`
   - [x] `mvn -pl paimon-common -Pfast-build -Dtest=HybridSearchRankerTest test`
   - [x] `mvn -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=VectorSearchBuilderTest#testHybridSearchBuilderExposesRouteBuilders,FullTextSearchBuilderTest#testHybridSearchBuilderWithFullTextRoute test`
   - [x] `mvn -pl paimon-spark/paimon-spark-common -am -Pspark3,fast-build -DfailIfNoTests=false -DwildcardSuites=org.apache.paimon.spark.catalyst.plans.logical.VectorSearchQueryTest -Dtest=none test`
   - [x] `mvn -pl paimon-spark/paimon-spark3-common,paimon-spark/paimon-spark-ut -am -Pspark3,fast-build -DfailIfNoTests=false -DwildcardSuites=org.apache.paimon.spark.sql.HybridSearchTest -Dtest=none test`
   - [x] `mvn -pl paimon-spark/paimon-spark-3.2,paimon-spark/paimon-spark-3.3 -am -Pspark3,fast-build -DskipTests -DfailIfNoTests=false compile`
   
   ## Notes
   
   This intentionally breaks the unreleased MultiVectorSearch API and Spark TVF 
naming in favor of HybridSearch.


