Guillaume Jactat created SOLR-17487:
---------------------------------------
Summary: Wrong dense vector length after deserialization
Key: SOLR-17487
URL: https://issues.apache.org/jira/browse/SOLR-17487
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: UpdateRequestProcessors
Affects Versions: 9.6.1, 9.7
Reporter: Guillaume Jactat
Attachments: vector-768.json
Hello,
I'm using Solr 9.7 as a vector database. I've come across something I can't
explain: I POST my documents as JSON, and they include a vector field of
dimension {*}768{*}.
The JSON document I POST has a vector field, which is an array of length 768.
Each value is a float.
Solr complains that my array is only *767* long...
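A quick way to rule out a client-side mistake is to count the values in the JSON before posting it. The following is only a minimal sketch: it assumes the document is a JSON object (or an array of objects) whose vector field name ends with "_vector_768", matching the dynamic field declared in the steps to reproduce. The helper name vector_lengths is hypothetical, and the exact layout of the attached vector-768.json is an assumption.

```python
import json


def vector_lengths(json_text, suffix="_vector_768"):
    """Return (doc_index, field_name, length) for every array-valued field
    whose name ends with `suffix`.

    Assumption: the payload is one JSON object or a JSON array of objects.
    """
    data = json.loads(json_text)
    docs = data if isinstance(data, list) else [data]
    found = []
    for index, doc in enumerate(docs):
        for name, value in doc.items():
            if name.endswith(suffix) and isinstance(value, list):
                found.append((index, name, len(value)))
    return found


if __name__ == "__main__":
    # Hypothetical document; replace with the contents of the real file.
    sample = json.dumps({"id": "doc-1", "text_vector_768": [0.5] * 768})
    for index, name, length in vector_lengths(sample):
        print(f"doc {index}: {name} has {length} values")
```

If this reports 768 for the posted file, the value is being lost after the document leaves the client.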
I've compared the JSON I POST with the array parsed by Solr and written to the
logs, and indeed, one of the 768 values has simply disappeared in the
process.
I'm pretty sure it is related to some JSON array parsing issue on the Solr side,
but I don't know how to fix this :/
The problem can easily be reproduced. All you have to do is:
* In your "schema.xml", declare the following dense vector field type:
{code:xml}
<fieldType name="knn_vector_768" class="solr.DenseVectorField"
           vectorDimension="768" similarityFunction="cosine"/>
{code}
* In your "schema.xml", declare the following dense vector dynamic field:
{code:xml}
<dynamicField name="*_vector_768" type="knn_vector_768" indexed="true"
              stored="true"/>
{code}
* Use the Solr Admin UI to post the attached document to your Solr core.
* You should get the following error: {*}"incorrect vector dimension. The
vector value has size 767 while it is expected a vector with size 768"{*}
* Furthermore, while the POSTed vector has 768 values, the vector written to the
logs has only 767: one value is missing. You can easily spot the missing value
with a simple diff.
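The diff mentioned in the last step can also be done programmatically. The sketch below walks both arrays in order and reports the positions in the posted vector that have no counterpart in the logged one. It assumes the logged vector is the posted vector with some values removed and the order preserved. The function name find_dropped is hypothetical, and the tolerance is an assumption to absorb differences in how floats are printed in the logs.

```python
import math


def find_dropped(posted, parsed, rel_tol=1e-6, abs_tol=1e-9):
    """Return the indices of `posted` that are missing from `parsed`.

    Assumption: `parsed` is `posted` with some elements removed,
    in the same order. Values are compared with a small tolerance.
    """
    dropped = []
    j = 0  # position in the parsed (logged) vector
    for i, value in enumerate(posted):
        if j < len(parsed) and math.isclose(
            value, parsed[j], rel_tol=rel_tol, abs_tol=abs_tol
        ):
            j += 1  # this value survived parsing
        else:
            dropped.append(i)  # no match: this value was lost
    return dropped


if __name__ == "__main__":
    # Toy data standing in for the 768-value and 767-value arrays.
    posted = [0.1, 0.2, 0.3, 0.4]
    parsed = [0.1, 0.2, 0.4]
    print(find_dropped(posted, parsed))
```

With the real data, the returned index points at the value to inspect in the posted JSON (for example its exact textual form), which may help narrow down what the parser is tripping on.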
Thanks for reading