Tommaso Teofili created OAK-4368:
------------------------------------
Summary: Excerpt extracted from the Lucene index should be more
selective
Key: OAK-4368
URL: https://issues.apache.org/jira/browse/OAK-4368
Project: Jackrabbit Oak
Issue Type: Improvement
Components: lucene
Affects Versions: 1.5.2, 1.4.2, 1.2.14, 1.0.30
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
Fix For: 1.5.3
The Lucene index can be used to extract _rep:excerpt_ using
{{Highlighter}}.
The current implementation may suffer from performance issues when the original
query returns many results, each of which may contain many (stored) properties.
All of those properties are passed to the highlighter to try to extract the
excerpt, and the process doesn't stop as soon as the first excerpt is found.
As a result, the excerpt is composed using text from all stored properties in
all results (wherever there's a match on the query).
While we can accept some cost for extracting the excerpt at query time (whereas
previously it was generated at excerpt retrieval time, e.g. via
_row.getValue("rep:excerpt")_), that cost should be bounded and mitigated as
much as possible.
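
To illustrate the kind of bounding meant here, the following is a minimal
sketch only, not the actual Oak code or the eventual fix. It assumes a
hypothetical {{FragmentExtractor}} interface standing in for the Lucene
{{Highlighter}}, and a hypothetical {{maxFields}} limit; both names are
invented for illustration. The idea is to stop at the first stored property
that yields an excerpt and to cap how many properties are inspected per result.

```java
import java.util.Map;
import java.util.Optional;

// Illustrative sketch: all names here are hypothetical, not Oak or Lucene APIs.
final class BoundedExcerpt {

    /** Hypothetical stand-in for a highlighter; returns null when nothing matches. */
    public interface FragmentExtractor {
        String bestFragment(String fieldName, String text);
    }

    /**
     * Returns the first excerpt found among the stored fields of one result,
     * inspecting at most maxFields fields and stopping at the first match.
     */
    public static Optional<String> firstExcerpt(Map<String, String> storedFields,
                                                FragmentExtractor extractor,
                                                int maxFields) {
        int inspected = 0;
        for (Map.Entry<String, String> field : storedFields.entrySet()) {
            if (inspected >= maxFields) {
                break; // bound the per-result cost
            }
            inspected++;
            String fragment = extractor.bestFragment(field.getKey(), field.getValue());
            if (fragment != null) {
                return Optional.of(fragment); // early exit: do not scan remaining fields
            }
        }
        return Optional.empty();
    }
}
```

With such a shape, the cost per result is proportional to the number of
properties inspected before the first match (at most {{maxFields}}), rather
than to the total number of stored properties.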