Dirk Rudolph created OAK-7071:
---------------------------------
Summary: PostingsHighlighter, Highlighter and
SimpleExcerptProvider return all different formats for excerpts
Key: OAK-7071
URL: https://issues.apache.org/jira/browse/OAK-7071
Project: Jackrabbit Oak
Issue Type: Bug
Components: lucene
Affects Versions: 1.6.7, 1.8
Reporter: Dirk Rudolph
*PostingsHighlighter* returns, for example,
{quote}
[my text with any <b>highlighting</b> followed by more text]
{quote}
because the PostingsHighlighter itself returns, for each field, a {{String[]}} of
phrases limited to the maximum number of phrases given beforehand. This String[] is
then converted to a String using {{Arrays.toString()}} at
[LucenePropertyIndex.java#L688|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LucenePropertyIndex.java#L688]
causing the value to be wrapped in square brackets.
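
To illustrate where the brackets come from, here is a minimal standalone sketch. The class and method names are hypothetical and not taken from Oak, and the " ... " separator is only an assumption for the example. It compares {{Arrays.toString()}} with a plain {{String.join()}} on the same array of phrases:

```java
import java.util.Arrays;

// Minimal sketch; class and method names are hypothetical, not Oak code.
class ExcerptJoinSketch {

    // Mirrors the reported behaviour: Arrays.toString() wraps the
    // elements in "[...]" and separates them with ", ".
    static String viaArraysToString(String[] phrases) {
        return Arrays.toString(phrases);
    }

    // One possible alternative: join the phrases with an explicit
    // separator, so that no brackets are added.
    static String viaJoin(String[] phrases, String separator) {
        return String.join(separator, phrases);
    }

    public static void main(String[] args) {
        String[] phrases = {
            "my text with any <b>highlighting</b> followed by more text"
        };
        System.out.println(viaArraysToString(phrases)); // bracketed form
        System.out.println(viaJoin(phrases, " ... "));  // plain form
    }
}
```

With a single phrase the first call yields the bracketed value shown in the quote above, while the second returns the phrase unchanged.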
*Highlighter* returns
{quote}
my text with any <strong>highlighting</strong> followed by more text
{quote}
*SimpleExcerptProvider* returns
{quote}
my text with any <div><span>highlighting</span></div> followed by more text
{quote}
As the PostingsHighlighter cannot be given a custom prefix or suffix, I would
suggest setting <b></b> as the default for the others as well, to avoid any further
text transformation after extracting the excerpts.
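
For illustration, this is roughly the kind of post-processing a client would otherwise need in order to bring the three formats above into one shape. It is only a sketch under the assumption that the markup is exactly as quoted in this issue. The class and method names are hypothetical, and stripping the outer brackets is a heuristic that would also affect text that legitimately starts with "[" and ends with "]":

```java
// Minimal sketch; names are hypothetical and not part of Oak.
class ExcerptNormalizerSketch {

    // Rewrites the three excerpt formats quoted in this issue to <b>...</b>.
    static String normalize(String excerpt) {
        String s = excerpt;
        // Heuristic: drop the brackets added by Arrays.toString().
        if (s.startsWith("[") && s.endsWith("]")) {
            s = s.substring(1, s.length() - 1);
        }
        return s
            // SimpleExcerptProvider style markup
            .replace("<div><span>", "<b>")
            .replace("</span></div>", "</b>")
            // Highlighter style markup
            .replace("<strong>", "<b>")
            .replace("</strong>", "</b>");
    }

    public static void main(String[] args) {
        System.out.println(normalize(
            "[my text with any <b>highlighting</b> followed by more text]"));
        System.out.println(normalize(
            "my text with any <strong>highlighting</strong> followed by more text"));
        System.out.println(normalize(
            "my text with any <div><span>highlighting</span></div> followed by more text"));
    }
}
```

With a common default on the Oak side, none of this client-side rewriting would be needed.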
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)