alessandrobenedetti commented on code in PR #4532:
URL: https://github.com/apache/solr/pull/4532#discussion_r3504884871


##########
solr/solr-ref-guide/modules/query-guide/pages/dense-vector-search.adoc:
##########
@@ -814,7 +814,37 @@ Some use cases where `includeTags` and/or `excludeTags` may be more useful then
 
 
 
-=== Usage in Re-Ranking Query
+[[vector-reranking]]
+== Usage in Re-Ranking Query
+
+Dense vector similarity scores can be used to xref:query-guide:query-re-ranking.adoc[re-rank] first pass query results.
+Possible use cases include:
+
+* Re-ranking approximate results from a quantized vector field using full fidelity float vectors.
+* Re-ranking lexical search results with dense vector similarity scores.
+
+Details about using the ReRank Query Parser can be found in the xref:query-guide:query-re-ranking.adoc[Query Re-Ranking] section.
+
+=== Re-Ranking with vectorSimilarity Function Query
+
+The xref:query-guide:function-queries.adoc#vectorsimilarity-function[vectorSimilarity()] function can be used with the `{!func}` query parser to re-rank by vector similarity.
+When used as a function query, `vectorSimilarity()` computes the exact similarity for only the candidate documents selected for re-ranking, without traversing the index graph.
+
+Here is an example of re-ranking a lexical query using a `DenseVectorField` named `vector`:
+
+[source,text]
+?q=title:phone&rq={!rerank reRankQuery=$rqq reRankDocs=100 reRankWeight=1}&rqq={!func}vectorSimilarity(vector,[1.0,2.0,3.0,4.0])
+
+NOTE: The default `reRankOperator` is `add`, which sums the first-pass score and the vector similarity score.
+Since these scores may differ in magnitude, you can adjust `reRankWeight` to control the balance between them, or use `reRankOperator=replace` to score re-ranked documents by vector similarity alone.
+
+When using a quantized vector field type (such as `ScalarQuantizedDenseVectorField`), the KNN first pass scores are computed on the quantized vectors.
+Here is an example of re-ranking those results with exact float similarity scores, where `topK` matches `reRankDocs`:
+
+[source,text]
+?q={!knn f=vector topK=100}[1.0,2.0,3.0,4.0]&rq={!rerank reRankQuery=$rqq reRankDocs=100 reRankWeight=1 reRankOperator=replace}&rqq={!func}vectorSimilarity(vector,[1.0,2.0,3.0,4.0])

Review Comment:
   After reviewing the code again (with the help of Mr Claude), in short:
   quantised distances are used for both graph construction and graph
   traversal at query time. Full-precision distances are only available
   through an explicit opt-in (FullPrecisionFloatVectorSimilarityValuesSource)
   that is separate from the ANN search.
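
   To make the distinction concrete, here is a toy sketch in Python. It is
   not Lucene's quantization scheme: the uniform 7-bit quantizer, the value
   range, the sample vectors, and all function names are assumptions made up
   for illustration. It only shows the general pattern of a first pass scored
   on lossy quantized vectors followed by an exact re-score of the candidates.

```python
# Toy model only: a uniform 7-bit scalar quantizer, NOT Lucene's actual codec.
LO, HI, LEVELS = -1.0, 1.0, 127  # assumed value range and bucket count


def dot(a, b):
    """Exact dot product of two equal-length float lists."""
    return sum(x * y for x, y in zip(a, b))


def quantize(vec):
    """Clamp each component to [LO, HI] and map it to an integer bucket."""
    scale = LEVELS / (HI - LO)
    return [round((min(max(x, LO), HI) - LO) * scale) for x in vec]


def dequantize(qvec):
    """Map integer buckets back to (lossy) float values."""
    step = (HI - LO) / LEVELS
    return [LO + q * step for q in qvec]


def first_pass(query, docs, top_k):
    """Rank by similarity against the quantized copies of the vectors."""
    scored = [
        (dot(query, dequantize(quantize(vec))), doc_id)
        for doc_id, vec in docs.items()
    ]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:top_k]]


def rerank_exact(query, docs, candidates):
    """Re-score only the candidates using the stored full-precision vectors."""
    scored = [(dot(query, docs[doc_id]), doc_id) for doc_id in candidates]
    return sorted(scored, reverse=True)


# Hypothetical sample data.
docs = {
    "a": [0.90, 0.10, 0.00, 0.20],
    "b": [0.88, 0.12, 0.01, 0.21],
    "c": [-0.50, 0.40, 0.30, 0.00],
}
query = [1.0, 0.0, 0.0, 0.5]

candidates = first_pass(query, docs, top_k=2)
for score, doc_id in rerank_exact(query, docs, candidates):
    print(doc_id, round(score, 4))
```

   The first pass can mis-order near-ties because of quantization error; the
   exact re-score touches only the candidate documents, so it stays cheap.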
     
   In Apache Solr:
   VectorSimilaritySourceParser (VectorSimilaritySourceParser.java:108) builds
   v1 via DenseVectorField.getValueSource() → FloatKnnVectorFieldSource, which
   reads
       final FloatVectorValues vectorValues = reader.getFloatVectorValues(fieldName);
   and calls
       vectorValues.vectorValue(iterator.index())
   which returns the raw, full-precision stored vector, not anything from the
   HNSW graph or the quantized codec.
   FullPrecisionFloatVectorSimilarityValuesSource.getValues() does exactly the
   same thing (ctx.reader().getFloatVectorValues(fieldName) +
   vectorSimilarityFunction.compare(...)).
   Both bypass quantization entirely: they are the same data path, just
   wrapped in two different Lucene abstractions (ValueSource/FunctionValues
   vs. DoubleValuesSource).

   So {!func}vectorSimilarity(vector_field, [query]) is already an exact
   full-precision similarity computation, usable today for reranking.
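
   For anyone who wants to try this from a client, a minimal Python sketch of
   assembling the request parameters from the second example in the diff
   could look like the following. The helper name `build_rerank_params` is
   hypothetical, and the field name `vector` and the query vector are simply
   the ones used in the doc example; only the standard library is used, and
   no Solr client API is assumed.

```python
from urllib.parse import urlencode, parse_qs


def build_rerank_params(field, vector, top_k):
    """Build q/rq/rqq for a KNN first pass re-ranked by vectorSimilarity().

    Hypothetical helper: it mirrors the query string from the ref-guide
    example and sets reRankDocs equal to topK.
    """
    vec = "[" + ",".join(str(float(x)) for x in vector) + "]"
    return {
        "q": f"{{!knn f={field} topK={top_k}}}{vec}",
        "rq": (
            f"{{!rerank reRankQuery=$rqq reRankDocs={top_k} "
            "reRankWeight=1 reRankOperator=replace}"
        ),
        "rqq": f"{{!func}}vectorSimilarity({field},{vec})",
    }


params = build_rerank_params("vector", [1.0, 2.0, 3.0, 4.0], top_k=100)
query_string = urlencode(params)  # percent-encodes braces, spaces, brackets
print(query_string)
```

   Encoding the parameters this way avoids hand-escaping the local-params
   braces and the `$rqq` reference when the request is sent over HTTP.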



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

