Re: [PR] [spark] support return score on vector search [paimon]

via GitHub Mon, 01 Jun 2026 23:55:45 -0700


JingsongLi commented on code in PR #8068:
URL: https://github.com/apache/paimon/pull/8068#discussion_r3339236735



##########
paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/PaimonSparkTableBase.scala:
##########
@@ -118,6 +118,9 @@ abstract class PaimonSparkTableBase(val table: Table)
       _metadataColumns.append(PaimonMetadataColumn.ROW_ID)
       _metadataColumns.append(PaimonMetadataColumn.SEQUENCE_NUMBER)
     }
+    if (!coreOptions.vectorField().isEmpty) {
+      _metadataColumns.append(PaimonMetadataColumn.VECTOR_SEARCH_SCORE)

Review Comment:
   This advertises `__paimon_vector_search_score` on every table with a vector 
field, but the read path can only populate it when a vector-search scan has 
produced `ScoreRecordIterator`s. A normal scan can still resolve this metadata 
column, for example:
   
   ```sql
   SELECT gid, __paimon_vector_search_score
   FROM my_db1.vector_test
   WHERE date = '20260420'
   ```
   
   That scan has no pushed-down vector search, so `PaimonRecordReaderIterator` 
still reaches the score branch and casts the normal file iterator to 
`ScoreRecordIterator`, which should fail at runtime with a `ClassCastException`. 
Can we expose this metadata column only for `VectorSearchTable` / active 
vector-search scans, or reject it explicitly when no vector search is present?
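
   A minimal sketch of both options, in plain Scala with no Paimon dependency. 
`RecordIterator`, `ScoreRecordIterator`, `ScoreColumnGuard`, and all of their 
members below are hypothetical stand-ins for illustration, not the real Paimon 
classes or signatures; only the column name comes from this PR.

```scala
// Hypothetical stand-ins; the real Paimon iterators have different signatures.
trait RecordIterator[T] {
  def next(): Option[T]
}

trait ScoreRecordIterator[T] extends RecordIterator[T] {
  def currentScore: Float
}

object ScoreColumnGuard {
  val ScoreColumn = "__paimon_vector_search_score"

  // Option 1: advertise the score column only when a vector search is active,
  // not merely when the table declares a vector field.
  def metadataColumns(
      base: Seq[String],
      hasVectorField: Boolean,
      vectorSearchActive: Boolean): Seq[String] = {
    if (hasVectorField && vectorSearchActive) base :+ ScoreColumn else base
  }

  // Option 2: reject explicitly. Pattern-match instead of casting blindly,
  // so a normal scan gets a descriptive error rather than ClassCastException.
  def scoreOf[T](iterator: RecordIterator[T]): Float = iterator match {
    case scored: ScoreRecordIterator[_] => scored.currentScore
    case _ =>
      throw new UnsupportedOperationException(
        s"$ScoreColumn is only available when a vector search is pushed down")
  }
}
```

   With this shape, the query above would either fail to resolve the column 
during analysis (option 1) or fail with a readable message (option 2).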



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
