Alex created SOLR-16977:
---------------------------

             Summary: DenseVector queries fail with unclear error message when 
either query or document is an all 0’s vector
                 Key: SOLR-16977
                 URL: https://issues.apache.org/jira/browse/SOLR-16977
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: query
    Affects Versions: 9.2.1
         Environment: Solr 9.2.1
            Reporter: Alex
         Attachments: solr_vector_error_example-1.py

When running a dense vector search, if the query passes in an all-zero vector, 
Solr fails with an uninformative error message.

 

e.g.

q={!knn f=vector topK=1}[0.0, 0.0, 0.0, 0.0]

 

Error message from Solr:

```

{'error': {'msg': 'docID must be >= 0 and < maxDoc=2 (got docID=2147483647)', 
'trace': 'java.lang.IllegalArgumentException: docID must be >= 0 and < maxDoc=2 
(got docID=2147483647)\n\tat 
org.apache.lucene.index.BaseCompositeReader.readerIndex(BaseCompositeReader.java:225)\n\tat
 
org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:153)\n\tat
 
org.apache.solr.search.SolrDocumentFetcher.docNC(SolrDocumentFetcher.java:274)\n\tat
 
org.apache.solr.search.SolrDocumentFetcher.lambda$doc$0(SolrDocumentFetcher.java:258)\n\tat
 
org.apache.solr.search.CaffeineCache.computeAsync(CaffeineCache.java:234)\n\tat 
org.apache.solr.search.CaffeineCache.computeIfAbsent(CaffeineCache.java:250)\n\tat
 
org.apache.solr.search.SolrDocumentFetcher.doc(SolrDocumentFetcher.java:258)\n\tat
 
org.apache.solr.search.SolrDocumentFetcher$RetrieveFieldsOptimizer.getSolrDoc(SolrDocumentFetcher.java:855)\n\tat
 
org.apache.solr.search.SolrDocumentFetcher.solrDoc(SolrDocumentFetcher.java:307)\n\tat
 org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:94)\n\tat 
org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:56)\n\tat 
org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:257)\n\tat
 
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:196)\n\tat
 org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:47)\n\tat 
org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:403)\n\tat
 
org.apache.solr.common.util.JsonTextWriter.writeNamedList(JsonTextWriter.java:311)\n\tat
 org.apache.solr.response.JSONWriter.writeResponse(JSONWriter.java:77)\n\tat 
org.apache.solr.response.JSONResponseWriter.write(JSONResponseWriter.java:63)\n\tat
 
org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:71)\n\tat
 
org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:988)\n\tat 
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:593)\n\tat 
org.apache.solr.servlet.SolrDispatchFilter.dispatch(SolrDispatchFilter.java:252)\n\tat
 
org.apache.solr.servlet.SolrDispatchFilter.lambda$doFilter$0(SolrDispatchFilter.java:220)\n\tat
 
org.apache.solr.servlet.ServletUtils.traceHttpRequestExecution2(ServletUtils.java:257)\n\tat
 
org.apache.solr.servlet.ServletUtils.rateLimitRequest(ServletUtils.java:227)\n\tat
 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:215)\n\tat
 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:197)\n\tat
 org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:210)\n\tat 
org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)\n\tat
 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527)\n\tat
 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131)\n\tat
 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:578)\n\tat
 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)\n\tat
 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)\n\tat
 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1570)\n\tat
 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)\n\tat
 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1383)\n\tat
 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)\n\tat
 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484)\n\tat 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1543)\n\tat
 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174)\n\tat
 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1305)\n\tat
 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)\n\tat
 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:149)\n\tat
 
org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:228)\n\tat
 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:141)\n\tat
 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)\n\tat
 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:301)\n\tat
 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)\n\tat
 
org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:822)\n\tat
 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)\n\tat
 org.eclipse.jetty.server.Server.handle(Server.java:563)\n\tat 
org.eclipse.jetty.server.HttpChannel.lambda$handle$0(HttpChannel.java:505)\n\tat
 org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:762)\n\tat 
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:497)\n\tat 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:282)\n\tat
 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)\n\tat
 org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)\n\tat 
org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:558)\n\tat
 
org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:379)\n\tat 
org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:146)\n\tat
 org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)\n\tat 
org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)\n\tat
 
org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:416)\n\tat
 
org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:385)\n\tat
 
org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:272)\n\tat
 
org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.lambda$new$0(AdaptiveExecutionStrategy.java:140)\n\tat
 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)\n\tat
 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:934)\n\tat
 
org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1078)\n\tat
 java.base/java.lang.Thread.run(Thread.java:833)\n', 'code': 500}}

```

 

Similarly, even if the query contains a non-zero vector, a similar error occurs 
if a document contains an all-zero embedding vector.  None of the test cases in 
[https://github.com/apache/solr/blob/bee3accb8a38b7420da0919ebacf05ef6060fc94/solr/core/src/test/org/apache/solr/search/neural/KnnQParserTest.java#L420]
 include an all-zero vector.

 

This was tested on Solr 9.2.1 with a DenseVector field using cosine similarity.  
Cosine similarity divides by the vector norm, and for an all-zero vector that 
norm is zero, so the similarity is undefined.  Even so, an all-zero vector can 
arise in practice from underflow when generating the embedding vector.
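
To illustrate the arithmetic only, here is a minimal Python sketch of cosine similarity with a zero-norm check. It is not Lucene's or Solr's implementation, and it does not reproduce the docID error. The names `cosine_similarity` and `is_zero_vector` are hypothetical helpers, the latter being one possible client-side guard to run before indexing or querying.

```python
import math


def cosine_similarity(a, b):
    """Plain cosine similarity; returns NaN when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    denom = norm_a * norm_b
    if denom == 0.0:
        # 0/0 is undefined; return NaN instead of raising ZeroDivisionError.
        return float("nan")
    return dot / denom


def is_zero_vector(vec):
    """Hypothetical guard: True if every component is exactly 0.0."""
    return all(x == 0.0 for x in vec)


query = [0.0, 0.0, 0.0, 0.0]   # the all-zero query vector from the report
doc = [0.1, 0.2, 0.3, 0.4]     # made-up document vector for illustration

print(math.isnan(cosine_similarity(query, doc)))  # similarity is undefined
print(is_zero_vector(query))                      # guard flags the vector
```

A check like `is_zero_vector` could be applied to embeddings on the client side as a workaround until the server reports a clearer error.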

 

Minimal example in Python:

[^solr_vector_error_example-1.py]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
