Mirko Sertic created SOLR-17056:
-----------------------------------
Summary: KnnVectorQuery: Wrong explain data when running in
cloud-mode
Key: SOLR-17056
URL: https://issues.apache.org/jira/browse/SOLR-17056
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 9.4, 9.2.1, 9.3, 9.1.1, 9.2, 9.1
Reporter: Mirko Sertic
The explain information for KnnVectorQuery is wrong when running Solr in
cloud mode. More specifically, it appears to be wrong when querying a
collection with multiple shards.
Given the following documents spread over two shards:
{code:java}
{
id: 'Position1',
TESTEMBEDDING_EU_3: [0, 0, 0]
}
{
id: 'Position2',
TESTEMBEDDING_EU_3: [0.1, 0.1, 0.1]
}
{
id: 'Position3',
TESTEMBEDDING_EU_3: [0.2, 0.2, 0.2]
}
{
id: 'Position4',
TESTEMBEDDING_EU_3: [0.3, 0.3, 0.3]
}
{
id: 'Position5',
TESTEMBEDDING_EU_3: [0.4, 0.4, 0.4]
}
{
id: 'Position6',
TESTEMBEDDING_EU_3: [0.5, 0.5, 0.5]
}
{
id: 'Position7',
TESTEMBEDDING_EU_3: [0.6, 0.6, 0.6]
}
{
id: 'Position8',
TESTEMBEDDING_EU_3: [0.7, 0.7, 0.7]
}
{
id: 'Position9',
TESTEMBEDDING_EU_3: [0.8, 0.8, 0.8]
}
{
id: 'Position10',
TESTEMBEDDING_EU_3: [0.9, 0.9, 0.9]
}
{
id: 'Position11',
TESTEMBEDDING_EU_3: [1.0, 1.0, 1.0]
}
{code}
and the following query:
{noformat}
{!knn f=TESTEMBEDDING_EU_3 topK=3}[1.0,1.0,1.0]
{noformat}
results in the following explain information:
{noformat}
"explain": {
"Position11": {
"match": false,
"value": 0.0,
"description": "not in top 3"
},
"Position10": {
"match": false,
"value": 0.0,
"description": "not in top 3"
},
"Position9": {
"match": false,
"value": 0.0,
"description": "not in top 3"
},
"Position8": {
"match": false,
"value": 0.0,
"description": "not in top 3"
},
"Position7": {
"match": false,
"value": 0.0,
"description": "not in top 3"
},
"Position6": {
"match": false,
"value": 0.0,
"description": "not in top 3"
}
}
{noformat}
All six documents listed above are part of the search result, yet none of
them is marked with match = true, none of them has a value (score) reported
in explain, and none of them is explained as being in the top 3. This is
clearly wrong for the document ids Position11, Position10 and Position9,
which are the three nearest neighbours of the query vector.
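
For reference, the expected top 3 can be checked by brute force outside of Solr. The sketch below is only an illustration and rests on two assumptions that are not stated in this report: that the field TESTEMBEDDING_EU_3 is configured with Euclidean similarity (the "_EU_" in the name suggests it), and that the score is computed as 1 / (1 + squared distance). The helper names euclidean_score and top_k are made up for this example.

```python
# Brute-force reference ranking for the eleven example documents.
# Assumption: Euclidean similarity, scored as 1 / (1 + squared distance).

# Position1 -> [0.0, 0.0, 0.0], Position2 -> [0.1, 0.1, 0.1], ..., Position11 -> [1.0, 1.0, 1.0]
docs = {f"Position{i + 1}": [round(i * 0.1, 1)] * 3 for i in range(11)}
query = [1.0, 1.0, 1.0]


def euclidean_score(a, b):
    """Hypothetical helper: similarity score derived from the squared Euclidean distance."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1.0 / (1.0 + squared_distance)


def top_k(docs, query, k):
    """Hypothetical helper: return the k best (id, score) pairs, highest score first."""
    scored = [(doc_id, euclidean_score(vec, query)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


expected = top_k(docs, query, 3)
for doc_id, score in expected:
    print(doc_id, round(score, 4))
```

Under these assumptions the three best documents are Position11, Position10 and Position9, in that order, each with a non-zero score. These are the entries that should carry match = true in the explain output.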
The number of search results is also wrong, but this is already reported as
SOLR-17055.