[jira] [Commented] (SOLR-10321) Unified highlighter returns empty fields when using glob

2017-07-11 Thread Christoph Hack (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082299#comment-16082299
 ] 

Christoph Hack commented on SOLR-10321:
---

I think not sending empty entries at all (even if there is a field in the 
document) might be a good option, since transferring and decoding the keys can 
take a considerable amount of time. It's always possible to look at the 
retrieved document to see if the field is available or not. Unfortunately, 
changing the default might break some clients that are currently depending on 
this behavior and I am not sure if it's worth breaking them (and forcing them 
to fix a potential performance problem). The other option would be to introduce 
yet another highlighting option.
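
To illustrate the point about looking at the retrieved document, here is a minimal Go sketch of how a client could tell a highlighted field from one that is present without a match, or absent altogether, if empty highlight entries were no longer sent. The function name `fieldStatus`, the field names, and the sample data are made up for illustration, and the sketch assumes the fields in question are stored and returned with the document.

```go
package main

import "fmt"

// fieldStatus classifies a field for one search hit, assuming the server
// omits empty highlight entries (the option discussed above, not the
// current behavior). doc is the retrieved document, hl its highlight entries.
func fieldStatus(doc map[string]interface{}, hl map[string][]string, field string) string {
	if snippets, ok := hl[field]; ok && len(snippets) > 0 {
		return "highlighted"
	}
	if _, ok := doc[field]; ok {
		return "present, no match"
	}
	return "absent"
}

func main() {
	// Hypothetical hit: two stored fields, only one of them matched the query.
	doc := map[string]interface{}{"id": "D1", "title_txt": "foo", "body_txt": "bar"}
	hl := map[string][]string{"title_txt": {"<em>foo</em>"}}

	for _, f := range []string{"title_txt", "body_txt", "other_txt"} {
		fmt.Printf("%s: %s\n", f, fieldStatus(doc, hl, f))
	}
}
```

The only information lost by dropping empty entries is the "present, no match" case, and the document itself still carries it.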

> Unified highlighter returns empty fields when using glob
> 
>
> Key: SOLR-10321
> URL: https://issues.apache.org/jira/browse/SOLR-10321
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: highlighter
> Affects Versions: 6.4.2
> Reporter: Markus Jelsma
> Priority: Minor
> Fix For: 7.0
>
>
> {code}
> q=lama&hl.method=unified&hl.fl=content_*
> {code}
> returns:
> {code}
> name="http://www.nu.nl/weekend/3771311/dalai-lama-inspireert-westen.html">
> 
> 
>   Nobelprijs Voorafgaand aan zijn bezoek aan Nederland is de dalai 
> <em>lama</em> in Noorwegen om te vieren dat 25 jaar geleden de 
> Nobelprijs voor de Vrede aan hem werd toegekend. Anders dan in Nederland 
> wordt de dalai <em>lama</em> niet ontvangen in het Noorse 
> parlement. 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>   
> {code}
> FastVector and original do not emit: 
> {code}
> 
> 
> 
> 
> 
> 
> 
> 
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-10993) lots of empty highlight entries

2017-07-11 Thread Christoph Hack (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082268#comment-16082268
 ] 

Christoph Hack edited comment on SOLR-10993 at 7/11/17 2:25 PM:


Thanks for your reply, but I am not asking a question... I have already looked 
at the source and have confirmed that it is a bug, as I have written before.

Here is a simple example to reproduce the behavior:

1. Create a new core: "bin/solr create -c bug"

2. Index some documents:

{code|title=Example Data}
{"id": "D1", "prop01_txt": "foo", "prop03_txt": "foo"}
{"id": "D2", "prop02_txt": "foo", "prop04_txt": "foo"}
{"id": "D3", "prop02_txt": "foo", "prop05_txt": "foo"}
{"id": "D4", "prop03_txt": "foo", "prop06_txt": "foo"}
{"id": "D5", "prop03_txt": "foo", "prop07_txt": "foo"}
{code}


3. Query the database with the unified highlighter:

{code:title=Query}
http://localhost:8983/solr/bug/select?hl.fl=prop*_txt&hl.method=unified&hl=on&indent=on&q=foo&wt=json
{code}

{code:title=Response}
{
  "responseHeader":{
    "status":0,
    "QTime":20,
    "params":{
      "q":"foo",
      "hl":"on",
      "indent":"on",
      "hl.fl":"prop*_txt",
      "hl.method":"unified",
      "wt":"json"}},
  "response":{"numFound":5,"start":0,"docs":[
      {
        "id":"D1",
        "prop01_txt":["foo"],
        "prop03_txt":["foo"],
        "_version_":1572635524573691904},
      {
        "id":"D2",
        "prop02_txt":["foo"],
        "prop04_txt":["foo"],
        "_version_":1572635532961251328},
      {
        "id":"D3",
        "prop02_txt":["foo"],
        "prop05_txt":["foo"],
        "_version_":1572635545661603840},
      {
        "id":"D4",
        "prop03_txt":["foo"],
        "prop06_txt":["foo"],
        "_version_":1572635551479103488},
      {
        "id":"D5",
        "prop03_txt":["foo"],
        "prop07_txt":["foo"],
        "_version_":1572635557318623232}]
  },
  "highlighting":{
    "D1":{
      "prop01_txt":["foo"],
      "prop03_txt":["foo"],
      "prop02_txt":[],
      "prop04_txt":[],
      "prop05_txt":[],
      "prop06_txt":[],
      "prop07_txt":[]},
    "D2":{
      "prop01_txt":[],
      "prop03_txt":[],
      "prop02_txt":["foo"],
      "prop04_txt":["foo"],
      "prop05_txt":[],
      "prop06_txt":[],
      "prop07_txt":[]},
    "D3":{
      "prop01_txt":[],
      "prop03_txt":[],
      "prop02_txt":["foo"],
      "prop04_txt":[],
      "prop05_txt":["foo"],
      "prop06_txt":[],
      "prop07_txt":[]},
    "D4":{
      "prop01_txt":[],
      "prop03_txt":["foo"],
      "prop02_txt":[],
      "prop04_txt":[],
      "prop05_txt":[],
      "prop06_txt":["foo"],
      "prop07_txt":[]},
    "D5":{
      "prop01_txt":[],
      "prop03_txt":["foo"],
      "prop02_txt":[],
      "prop04_txt":[],
      "prop05_txt":[],
      "prop06_txt":[],
      "prop07_txt":["foo"]}}}
{code}

As you can see, the highlighting response contains far too many entries. In my 
actual use case, I get about 10k entries per result item, which is painfully slow.
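
To put a number on it, the following Go sketch counts how many highlight entries in a response are empty. It uses only the D1 entry from the response above as input; `countEntries` and `parseHighlighting` are illustrative helpers, not part of any Solr client.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// highlighting mirrors the "highlighting" section of the response:
// document ID -> field name -> snippets.
type highlighting map[string]map[string][]string

// sample is the highlighting entry for document D1 from the response above.
const sample = `{"highlighting":{"D1":{
  "prop01_txt":["foo"],"prop03_txt":["foo"],
  "prop02_txt":[],"prop04_txt":[],"prop05_txt":[],
  "prop06_txt":[],"prop07_txt":[]}}}`

// parseHighlighting extracts the highlighting section from a JSON response.
func parseHighlighting(data []byte) (highlighting, error) {
	var resp struct {
		Highlighting highlighting `json:"highlighting"`
	}
	if err := json.Unmarshal(data, &resp); err != nil {
		return nil, err
	}
	return resp.Highlighting, nil
}

// countEntries returns the total number of field entries and how many of
// them carry no snippets.
func countEntries(h highlighting) (total, empty int) {
	for _, fields := range h {
		for _, snippets := range fields {
			total++
			if len(snippets) == 0 {
				empty++
			}
		}
	}
	return total, empty
}

func main() {
	h, err := parseHighlighting([]byte(sample))
	if err != nil {
		panic(err)
	}
	total, empty := countEntries(h)
	fmt.Printf("%d of %d highlight entries are empty\n", empty, total)
}
```

For D1 this reports 5 empty entries out of 7. The other four documents in the response show the same ratio.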



[jira] [Resolved] (SOLR-10993) lots of empty highlight entries

2017-07-11 Thread Christoph Hack (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christoph Hack resolved SOLR-10993.
---
Resolution: Duplicate

Ah, many thanks David. I haven't seen that issue before, sorry.

> lots of empty highlight entries
> ---
>
> Key: SOLR-10993
> URL: https://issues.apache.org/jira/browse/SOLR-10993
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: highlighter
> Affects Versions: 6.6
> Reporter: Christoph Hack
>
> I have indexed documents with lots of different text fields representing 
> different properties in Solr (version 6.6). Those text fields are indexed 
> with storeOffsetsWithPositions=true and termVectors=true to speed up 
> highlighting using the UnifiedHighlighter.
> During a search, I would like to highlight those properties, so I have set 
> hl.fl to a wildcard that matches all properties. Everything is working fine, 
> except that the responses are huge.
> Every document has only a small set of properties (let's say 10 in total, 
> with 1-2 matching ones), but in the highlighting section Solr returns a 
> dictionary with every possible property (about 10k) for every item. Nearly 
> all of the entries are empty, but decoding the keys of the map takes a 
> considerable amount of time.
> In fact, the time spent decoding these unnecessary entries is enormous. Solr 
> takes about 174 ms for the search + encoding (I expect that the timing could 
> be much better), and decoding the response in Go (using the default JSON 
> package from the standard library) takes 695 ms.
> I guess the offending line is somewhere around:
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/highlight/UnifiedSolrHighlighter.java#L175
> Why is Solr generating map entries for missing values in the first place?
> The question had been posted on Stack Overflow before:
> https://stackoverflow.com/questions/44846220/solr-huge-and-slow-highlighting-response-with-mostly-empty-fields






[jira] [Reopened] (SOLR-10993) lots of empty highlight entries

2017-07-11 Thread Christoph Hack (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christoph Hack reopened SOLR-10993:
---




[jira] [Comment Edited] (SOLR-10993) lots of empty highlight entries

2017-07-11 Thread Christoph Hack (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082268#comment-16082268
 ] 

Christoph Hack edited comment on SOLR-10993 at 7/11/17 2:26 PM:


Thanks for your reply, but I am not asking a question... I have already looked 
at the source and have confirmed that it is a bug, as I have written before.

Here is a simple example to reproduce the behavior:

1. Create a new core: "bin/solr create -c bug"

2. Index some documents:

{code:title=Example Data}
{"id": "D1", "prop01_txt": "foo", "prop03_txt": "foo"}
{"id": "D2", "prop02_txt": "foo", "prop04_txt": "foo"}
{"id": "D3", "prop02_txt": "foo", "prop05_txt": "foo"}
{"id": "D4", "prop03_txt": "foo", "prop06_txt": "foo"}
{"id": "D5", "prop03_txt": "foo", "prop07_txt": "foo"}
{code}


3. Query the database with the unified highlighter:

{code:title=Query}
http://localhost:8983/solr/bug/select?hl.fl=prop*_txt&hl.method=unified&hl=on&indent=on&q=foo&wt=json
{code}

{code:title=Response}
{
  "responseHeader":{
    "status":0,
    "QTime":20,
    "params":{
      "q":"foo",
      "hl":"on",
      "indent":"on",
      "hl.fl":"prop*_txt",
      "hl.method":"unified",
      "wt":"json"}},
  "response":{"numFound":5,"start":0,"docs":[
      {
        "id":"D1",
        "prop01_txt":["foo"],
        "prop03_txt":["foo"],
        "_version_":1572635524573691904},
      {
        "id":"D2",
        "prop02_txt":["foo"],
        "prop04_txt":["foo"],
        "_version_":1572635532961251328},
      {
        "id":"D3",
        "prop02_txt":["foo"],
        "prop05_txt":["foo"],
        "_version_":1572635545661603840},
      {
        "id":"D4",
        "prop03_txt":["foo"],
        "prop06_txt":["foo"],
        "_version_":1572635551479103488},
      {
        "id":"D5",
        "prop03_txt":["foo"],
        "prop07_txt":["foo"],
        "_version_":1572635557318623232}]
  },
  "highlighting":{
    "D1":{
      "prop01_txt":["foo"],
      "prop03_txt":["foo"],
      "prop02_txt":[],
      "prop04_txt":[],
      "prop05_txt":[],
      "prop06_txt":[],
      "prop07_txt":[]},
    "D2":{
      "prop01_txt":[],
      "prop03_txt":[],
      "prop02_txt":["foo"],
      "prop04_txt":["foo"],
      "prop05_txt":[],
      "prop06_txt":[],
      "prop07_txt":[]},
    "D3":{
      "prop01_txt":[],
      "prop03_txt":[],
      "prop02_txt":["foo"],
      "prop04_txt":[],
      "prop05_txt":["foo"],
      "prop06_txt":[],
      "prop07_txt":[]},
    "D4":{
      "prop01_txt":[],
      "prop03_txt":["foo"],
      "prop02_txt":[],
      "prop04_txt":[],
      "prop05_txt":[],
      "prop06_txt":["foo"],
      "prop07_txt":[]},
    "D5":{
      "prop01_txt":[],
      "prop03_txt":["foo"],
      "prop02_txt":[],
      "prop04_txt":[],
      "prop05_txt":[],
      "prop06_txt":[],
      "prop07_txt":["foo"]}}}
{code}

As you can see, the highlighting response contains far too many entries. In my 
actual use case, I get about 10k entries per result item, which is painfully slow.



[jira] [Commented] (SOLR-10993) lots of empty highlight entries

2017-07-11 Thread Christoph Hack (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082268#comment-16082268
 ] 

Christoph Hack commented on SOLR-10993:
---

Thanks for your reply, but I am not asking a question... I have already looked 
at the source and have confirmed that it is a bug, as I have written before.

Here is a simple example to reproduce the behavior:

1. Create a new core: "bin/solr create -c bug"

2. Index some documents:

{code:json|title=Example Data}
{"id": "D1", "prop01_txt": "foo", "prop03_txt": "foo"}
{"id": "D2", "prop02_txt": "foo", "prop04_txt": "foo"}
{"id": "D3", "prop02_txt": "foo", "prop05_txt": "foo"}
{"id": "D4", "prop03_txt": "foo", "prop06_txt": "foo"}
{"id": "D5", "prop03_txt": "foo", "prop07_txt": "foo"}
{code}


3. Query the database with the unified highlighter:

{code|title=Query}
http://localhost:8983/solr/bug/select?hl.fl=prop*_txt&hl.method=unified&hl=on&indent=on&q=foo&wt=json
{code}

{code:json|title=Response}
{
  "responseHeader":{
    "status":0,
    "QTime":20,
    "params":{
      "q":"foo",
      "hl":"on",
      "indent":"on",
      "hl.fl":"prop*_txt",
      "hl.method":"unified",
      "wt":"json"}},
  "response":{"numFound":5,"start":0,"docs":[
      {
        "id":"D1",
        "prop01_txt":["foo"],
        "prop03_txt":["foo"],
        "_version_":1572635524573691904},
      {
        "id":"D2",
        "prop02_txt":["foo"],
        "prop04_txt":["foo"],
        "_version_":1572635532961251328},
      {
        "id":"D3",
        "prop02_txt":["foo"],
        "prop05_txt":["foo"],
        "_version_":1572635545661603840},
      {
        "id":"D4",
        "prop03_txt":["foo"],
        "prop06_txt":["foo"],
        "_version_":1572635551479103488},
      {
        "id":"D5",
        "prop03_txt":["foo"],
        "prop07_txt":["foo"],
        "_version_":1572635557318623232}]
  },
  "highlighting":{
    "D1":{
      "prop01_txt":["foo"],
      "prop03_txt":["foo"],
      "prop02_txt":[],
      "prop04_txt":[],
      "prop05_txt":[],
      "prop06_txt":[],
      "prop07_txt":[]},
    "D2":{
      "prop01_txt":[],
      "prop03_txt":[],
      "prop02_txt":["foo"],
      "prop04_txt":["foo"],
      "prop05_txt":[],
      "prop06_txt":[],
      "prop07_txt":[]},
    "D3":{
      "prop01_txt":[],
      "prop03_txt":[],
      "prop02_txt":["foo"],
      "prop04_txt":[],
      "prop05_txt":["foo"],
      "prop06_txt":[],
      "prop07_txt":[]},
    "D4":{
      "prop01_txt":[],
      "prop03_txt":["foo"],
      "prop02_txt":[],
      "prop04_txt":[],
      "prop05_txt":[],
      "prop06_txt":["foo"],
      "prop07_txt":[]},
    "D5":{
      "prop01_txt":[],
      "prop03_txt":["foo"],
      "prop02_txt":[],
      "prop04_txt":[],
      "prop05_txt":[],
      "prop06_txt":[],
      "prop07_txt":["foo"]}}}
{code}

As you can see, the highlighting response contains far too many entries. In my 
actual use case, I get about 10k entries per result item, which is painfully slow.



[jira] [Commented] (SOLR-10993) lots of empty highlight entries

2017-07-11 Thread Christoph Hack (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081991#comment-16081991
 ] 

Christoph Hack commented on SOLR-10993:
---

ping?




[jira] [Updated] (SOLR-10993) lots of empty highlight entries

2017-06-30 Thread Christoph Hack (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christoph Hack updated SOLR-10993:
--
Description: 
I have indexed documents with lots of different text fields representing 
different properties in Solr (version 6.6). Those text fields are indexed with 
storeOffsetsWithPositions=true and termVectors=true to speed up highlighting 
using the UnifiedHighlighter.

During a search, I would like to highlight those properties, so I have set 
hl.fl to a wildcard that matches all properties. Everything is working fine, 
except that the responses are huge.

Every document has only a small set of properties (let's say 10 in total, with 
1-2 matching ones), but in the highlighting section Solr returns a dictionary 
with every possible property (about 10k) for every item. Nearly all of the 
entries are empty, but decoding the keys of the map takes a considerable amount 
of time.

In fact, the time spent decoding these unnecessary entries is enormous. Solr 
takes about 174 ms for the search + encoding (I expect that the timing could be 
much better), and decoding the response in Go (using the default JSON package 
from the standard library) takes 695 ms.

I guess the offending line is somewhere around:
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/highlight/UnifiedSolrHighlighter.java#L175

Why is Solr generating map entries for missing values in the first place?

The question had been posted on Stack Overflow before:
https://stackoverflow.com/questions/44846220/solr-huge-and-slow-highlighting-response-with-mostly-empty-fields



[jira] [Created] (SOLR-10993) lots of empty highlight entries

2017-06-30 Thread Christoph Hack (JIRA)
Christoph Hack created SOLR-10993:
-

 Summary: lots of empty highlight entries
 Key: SOLR-10993
 URL: https://issues.apache.org/jira/browse/SOLR-10993
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: highlighter
Affects Versions: 6.6
Reporter: Christoph Hack


I have indexed documents with lots of different text fields representing 
different properties in Solr (version 6.6). Those text fields are indexed with 
storeOffsetsWithPositions=true and termVectors=true to speed up highlighting 
using the UnifiedHighlighter.

During a search, I would like to highlight those properties, so I have set 
hl.fl to a wildcard that matches all properties. Everything is working fine, 
except that the responses are huge.

Every document has only a small set of properties (let's say 10 in total, with 
1-2 matching ones), but in the highlighting section Solr returns a dictionary 
with every possible property (about 10k) for every item. Nearly all of the 
entries are empty, but decoding the keys of the map takes a considerable amount 
of time.

In fact, the time spent decoding these unnecessary entries is enormous. Solr 
takes about 174 ms for the search + encoding (I expect that the timing could be 
much better), and decoding the response in Go (using the default JSON package 
from the standard library) takes 695 ms.

I guess the offending line is:
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/highlight/UnifiedSolrHighlighter.java#L175

Why is Solr generating map entries for missing values in the first place?

The question had been posted on Stack Overflow before:
https://stackoverflow.com/questions/44846220/solr-huge-and-slow-highlighting-response-with-mostly-empty-fields
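
As a rough illustration of the size problem described above, this Go sketch builds the highlight entry of a single hit with 10,000 property fields, of which 2 match, and compares the encoded JSON size with and without the empty entries. The `prop%05d_txt` field naming, the snippet text, and all helper names are made up for the sketch. It measures payload size only and does not reproduce the timings quoted above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildEntries simulates the highlight entry of one hit: nFields property
// fields in total, of which only the first nMatch carry a snippet.
func buildEntries(nFields, nMatch int) map[string][]string {
	entries := make(map[string][]string, nFields)
	for i := 0; i < nFields; i++ {
		name := fmt.Sprintf("prop%05d_txt", i)
		if i < nMatch {
			entries[name] = []string{"<em>foo</em>"}
		} else {
			entries[name] = []string{} // encoded as [], like the empty entries Solr returns
		}
	}
	return entries
}

// prune returns a copy without the entries that have no snippets.
func prune(entries map[string][]string) map[string][]string {
	out := make(map[string][]string)
	for name, snippets := range entries {
		if len(snippets) > 0 {
			out[name] = snippets
		}
	}
	return out
}

// encodedSize returns the length in bytes of the JSON encoding of v.
func encodedSize(v interface{}) int {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return len(b)
}

func main() {
	full := buildEntries(10000, 2)
	fmt.Println("with empty entries:   ", encodedSize(full), "bytes")
	fmt.Println("without empty entries:", encodedSize(prune(full)), "bytes")
}
```

Pruning on the client only shows what a smaller response would look like. The keys still have to be transferred and decoded first, so it does not remove the cost reported here.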



