[
https://issues.apache.org/jira/browse/SOLR-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17214207#comment-17214207
]
Mitchell Kotler commented on SOLR-10321:
----------------------------------------
I just ran into this issue, and for me it is not just a nuisance but a complete
non-starter. I store text in one field per page (page_no_1, page_no_2, etc.)
and set hl.fl to page_no_*. This lets me reuse the same index both to find
which page a query appears on and to search across documents. This worked fine
with the original highlighter until we had a very large document, for which
the original highlighter was much too slow. The unified highlighter is much
quicker at highlighting snippets (with the proper indexes set up), but for
every document it returns a field for each page up to the highest page number
in the index, not in that document. So if I have one document with 60,000
pages, the unified highlighter will return up to 60,000 empty arrays per
document, which makes the response multiple orders of magnitude larger than it
needs to be. Fixing this bug or having a workaround would be very helpful for
us. I am not sure whether using the FastVector highlighter in the meantime
would be our best bet.
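As a stopgap on the client side, the empty arrays could be filtered out after
the response is parsed. The following is only a minimal sketch, assuming the
response is requested as JSON (wt=json) and that its highlighting section maps
each document id to a map of field name to snippet list. The function name and
the sample data are hypothetical. It does not reduce the size of the response
sent by Solr, only what the application has to handle afterwards.

```python
def drop_empty_highlights(highlighting):
    """Return a copy of the highlighting section without empty snippet lists.

    `highlighting` is assumed to look like:
        {doc_id: {field_name: [snippet, ...], ...}, ...}
    """
    return {
        doc_id: {
            field: snippets
            for field, snippets in fields.items()
            if snippets  # keep only fields that actually have snippets
        }
        for doc_id, fields in highlighting.items()
    }


# Hypothetical fragment shaped like the "highlighting" part of a JSON response.
sample = {
    "doc-1": {
        "page_no_1": [],
        "page_no_2": ["wordt de dalai <em>lama</em> niet ontvangen"],
        "page_no_3": [],
    }
}

cleaned = drop_empty_highlights(sample)
```

Here cleaned keeps only page_no_2 for doc-1, which is the page the query
matched on.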
> Unified highlighter returns empty fields when using glob
> --------------------------------------------------------
>
> Key: SOLR-10321
> URL: https://issues.apache.org/jira/browse/SOLR-10321
> Project: Solr
> Issue Type: Bug
> Components: highlighter
> Affects Versions: 6.4.2
> Reporter: Markus Jelsma
> Priority: Minor
> Fix For: 7.0
>
>
> {code}
> q=lama&hl.method=unified&hl.fl=content_*
> {code}
> returns:
> {code}
> <lst
> name="http://www.nu.nl/weekend/3771311/dalai-lama-inspireert-westen.html">
> <arr name="content_en"/>
> <arr name="content_nl">
> <str>Nobelprijs Voorafgaand aan zijn bezoek aan Nederland is de dalai
> <em>lama</em> in Noorwegen om te vieren dat 25 jaar geleden de
> Nobelprijs voor de Vrede aan hem werd toegekend. Anders dan in Nederland
> wordt de dalai <em>lama</em> niet ontvangen in het Noorse
> parlement. </str>
> </arr>
> <arr name="content_general"/>
> <arr name="content_de"/>
> <arr name="content_fr"/>
> <arr name="content_es"/>
> <arr name="content_pt"/>
> <arr name="content_ja"/>
> <arr name="content_zh-cn"/>
> <arr name="content_th"/>
> <arr name="content_ar"/>
> </lst>
> {code}
> FastVector and original do not emit:
> {code}
> <arr name="content_de"/>
> <arr name="content_fr"/>
> <arr name="content_es"/>
> <arr name="content_pt"/>
> <arr name="content_ja"/>
> <arr name="content_zh-cn"/>
> <arr name="content_th"/>
> <arr name="content_ar"/>
> {code}