Chris M. Hostetter created SOLR-17045:
-----------------------------------------
Summary: DenseVectorField w/ vectorDimension > 1024 no longer
works by default
Key: SOLR-17045
URL: https://issues.apache.org/jira/browse/SOLR-17045
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 9.4
Reporter: Chris M. Hostetter
Starting with Solr 9.4, configuring a {{DenseVectorField}} w/
{{vectorDimension}} > 1024 no longer works by default. There is no error on
startup, but when indexing you'll get errors like...
{noformat}
2> => org.apache.solr.common.SolrException: Exception writing
document id 2 to the index; possible analysis error: Field [vector]vector's
dimensions must be <= [1024]; got 1600
2> at
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:329)
2> org.apache.solr.common.SolrException: Exception writing document id 2 to
the index; possible analysis error: Field [vector]vector's dimensions must be
<= [1024]; got 1600
2> at
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:329)
~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]
2> at
org.apache.solr.update.processor.RunUpdateProcessorFactory$RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:76)
~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]
...
2> Caused by: java.lang.IllegalArgumentException: Field [vector]vector's
dimensions must be <= [1024]; got 1600
2> at
org.apache.lucene.index.IndexingChain.validateMaxVectorDimension(IndexingChain.java:843)
~[lucene-core-9.8.0.jar:9.8.0 d914b3722bd5b8ef31ccf7e8ddc638a87fd648db -
2023-09-21 21:57:47]
...
{noformat}
This is because Lucene 9.8 moved the dimension size limitation to the codec –
and while Solr 9.4's {{SchemaCodecFactory}} was updated to implement a
per-field {{SolrDelegatingKnnVectorsFormat}} that respected the
{{vectorDimension}} configured for each {{DenseVectorField}} *the
{{SchemaCodecFactory}} _is not implicitly used by default in Solr_* – nor is it
explicitly configured in the {{_default}} configset.
{panel:title=Known Work Around}
Existing {{DenseVectorField}} users who encounter this error when upgrading to
Solr >= 9.4, or new users attempting to use {{DenseVectorField}} with
{{vectorDimension}} > 1024, need to explicitly configure
{{SchemaCodecFactory}} in the {{solrconfig.xml}} for each collection.
{panel}
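A minimal sketch of that workaround, assuming the collection's {{solrconfig.xml}} does not already declare a {{<codecFactory>}} element (if it does, that element would need to be changed rather than a second one added). The rest of the {{<config>}} content is omitted here:
{code:xml}
<!-- solrconfig.xml, inside the top-level <config> element.
     Explicitly selects the schema-aware codec factory so that the
     vectorDimension configured on each DenseVectorField is respected. -->
<codecFactory class="solr.SchemaCodecFactory"/>
{code}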