Christoph Hack created SOLR-10993:
-------------------------------------
Summary: lots of empty highlight entries
Key: SOLR-10993
URL: https://issues.apache.org/jira/browse/SOLR-10993
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: highlighter
Affects Versions: 6.6
Reporter: Christoph Hack

I have indexed documents in Solr (version 6.6) with many different text fields,
each representing a different property. Those text fields are indexed with
storeOffsetsWithPositions=true and termVectors=true to speed up highlighting
with the UnifiedHighlighter.

During a search, I would like to highlight those properties, so I have set
hl.fl to a wildcard that matches all of them. Everything works fine, except that
the responses are huge.

Every document has only a small set of properties (say 10 in total, of which
1-2 match), but in the highlighting section Solr returns, for every document, a
dictionary with every possible property (about 10k). Nearly all of the entries
are empty, but decoding the keys of the map takes a considerable amount of
time.
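
To illustrate the shape of the response, here is a minimal Go sketch. It
assumes the highlighting section decodes as a map of document ID to field name
to snippet list, and that empty entries are encoded as empty arrays; the type
Highlighting, the function pruneEmpty, and the field names are hypothetical and
only serve the example. Pruning after decoding tidies the result but does not
avoid the decoding cost described here, since every key still has to be parsed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Highlighting mirrors the assumed layout of the "highlighting" section:
// document ID -> field name -> highlighted snippets.
type Highlighting map[string]map[string][]string

// pruneEmpty returns a copy that keeps only fields with at least one snippet.
// The input is left unmodified.
func pruneEmpty(h Highlighting) Highlighting {
	out := make(Highlighting, len(h))
	for docID, fields := range h {
		kept := make(map[string][]string)
		for name, snippets := range fields {
			if len(snippets) > 0 {
				kept[name] = snippets
			}
		}
		out[docID] = kept
	}
	return out
}

func main() {
	// Hypothetical fragment: one matching field and two empty ones.
	raw := []byte(`{"doc1":{"prop_color":["<em>red</em> paint"],"prop_size":[],"prop_weight":[]}}`)

	var h Highlighting
	if err := json.Unmarshal(raw, &h); err != nil {
		panic(err)
	}

	pruned := pruneEmpty(h)
	// Prints the number of fields before and after pruning.
	fmt.Println(len(h["doc1"]), len(pruned["doc1"]))
}
```

With the fragment above, three fields are decoded for doc1 and one remains
after pruning. In the real response the ratio is closer to 10k decoded fields
for 1-2 useful ones per document.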

In fact, the time spent decoding these unnecessary entries is enormous. Solr
takes about 174 ms for the search plus encoding (I expect this could be much
better), while decoding the response in Go (using the default JSON package
from the standard library) takes 695 ms.

I suspect the offending line is:
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/highlight/UnifiedSolrHighlighter.java#L175
Why does Solr generate map entries for missing values in the first place?

This question was previously posted on Stack Overflow:
https://stackoverflow.com/questions/44846220/solr-huge-and-slow-highlighting-response-with-mostly-empty-fields