[jira] [Updated] (SOLR-10993) lots of empty highlight entries

Christoph Hack (JIRA) Fri, 30 Jun 2017 20:50:49 -0700

     [ 
https://issues.apache.org/jira/browse/SOLR-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Christoph Hack updated SOLR-10993:
----------------------------------
    Description: 
I have indexed documents with lots of different text fields representing 
different properties in Solr (version 6.6). Those text fields are indexed with 
storeOffsetsWithPositions=true and termVectors=true to speed up highlighting 
using the UnifiedHighlighter.

During a search, I would like to highlight those properties, so I have set 
hl.fl to a wildcard that matches all properties. Everything works fine, except 
that the responses are huge.

Every document has only a small set of properties (let's say 10 in total, with 
1-2 matching ones), but in the highlighting section Solr returns a dictionary 
with every possible property (about 10k) for every item. Nearly all of the 
entries are empty, but decoding the keys of the map takes a considerable amount 
of time.
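
A minimal client-side sketch of what the response looks like and how the empty entries can be dropped after decoding. It assumes the highlighting section maps document ID to field name to a list of snippets, and that empty entries arrive as empty arrays or null; the type and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// solrResponse models only the highlighting section of the response.
type solrResponse struct {
	Highlighting map[string]map[string][]string `json:"highlighting"`
}

// nonEmptyHighlights decodes body and keeps only the fields that
// actually carry at least one snippet.
func nonEmptyHighlights(body []byte) (map[string]map[string][]string, error) {
	var resp solrResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	out := make(map[string]map[string][]string, len(resp.Highlighting))
	for docID, fields := range resp.Highlighting {
		kept := make(map[string][]string)
		for field, snippets := range fields {
			if len(snippets) > 0 { // skips both [] and null
				kept[field] = snippets
			}
		}
		out[docID] = kept
	}
	return out, nil
}

func main() {
	// Made-up sample with one matching and two empty fields.
	body := []byte(`{"highlighting":{"doc1":{"prop_color":["<em>red</em> paint"],"prop_size":[],"prop_weight":[]}}}`)
	hl, err := nonEmptyHighlights(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(hl)
}
```

This only keeps the empty entries out of the rest of the program. The decoder still has to parse every key, so the cost described here remains.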

In fact, the time spent decoding these unnecessary entries is enormous. Solr 
takes about 174ms for the search + encoding (I expect that the timing could be 
much better), and decoding the response in Go (using the default JSON package 
from the standard library) takes 695ms.

I guess the offending line is somewhere around:
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/highlight/UnifiedSolrHighlighter.java#L175

Why is Solr generating map entries for missing values in the first place?

The question was previously posted on Stack Overflow:
https://stackoverflow.com/questions/44846220/solr-huge-and-slow-highlighting-response-with-mostly-empty-fields


  was:
I have indexed documents with lots of different text fields representing 
different properties in Solr (version 6.6). Those text fields are indexed with 
storeOffsetsWithPositions=true and termVectors=true to speed up highlighting 
using the UnifiedHighlighter.

During a search, I would like to highlight those properties, so I have set 
hl.fl to a wildcard that matches all properties. Everything works fine, except 
that the responses are huge.

Every document has only a small set of properties (let's say 10 in total, with 
1-2 matching ones), but in the highlighting section Solr returns a dictionary 
with every possible property (about 10k) for every item. Nearly all of the 
entries are empty, but decoding the keys of the map takes a considerable amount 
of time.

In fact, the time spent decoding these unnecessary entries is enormous. Solr 
takes about 174ms for the search + encoding (I expect that the timing could be 
much better), and decoding the response in Go (using the default JSON package 
from the standard library) takes 695ms.

I guess the offending line is:
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/highlight/UnifiedSolrHighlighter.java#L175

Why is Solr generating map entries for missing values in the first place?

The question was previously posted on Stack Overflow:
https://stackoverflow.com/questions/44846220/solr-huge-and-slow-highlighting-response-with-mostly-empty-fields



> lots of empty highlight entries
> -------------------------------
>
>                 Key: SOLR-10993
>                 URL: https://issues.apache.org/jira/browse/SOLR-10993
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: highlighter
>    Affects Versions: 6.6
>            Reporter: Christoph Hack
>
> I have indexed documents with lots of different text fields representing 
> different properties in Solr (version 6.6). Those text fields are indexed 
> with storeOffsetsWithPositions=true and termVectors=true to speed up 
> highlighting using the UnifiedHighlighter.
> During a search, I would like to highlight those properties, so I have set 
> hl.fl to a wildcard that matches all properties. Everything works fine, 
> except that the responses are huge.
> Every document has only a small set of properties (let's say 10 in total, 
> with 1-2 matching ones), but in the highlighting section Solr returns a 
> dictionary with every possible property (about 10k) for every item. Nearly 
> all of the entries are empty, but decoding the keys of the map takes a 
> considerable amount of time.
> In fact, the time spent decoding these unnecessary entries is enormous. Solr 
> takes about 174ms for the search + encoding (I expect that the timing could 
> be much better), and decoding the response in Go (using the default JSON 
> package from the standard library) takes 695ms.
> I guess the offending line is somewhere around:
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/highlight/UnifiedSolrHighlighter.java#L175
> Why is Solr generating map entries for missing values in the first place?
> The question was previously posted on Stack Overflow:
> https://stackoverflow.com/questions/44846220/solr-huge-and-slow-highlighting-response-with-mostly-empty-fields



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
