Dirk Rudolph created OAK-7071:
---------------------------------

             Summary: PostingsHighlighter, Highlighter and 
SimpleExcerptProvider return all different formats for excerpts
                 Key: OAK-7071
                 URL: https://issues.apache.org/jira/browse/OAK-7071
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: lucene
    Affects Versions: 1.6.7, 1.8
            Reporter: Dirk Rudolph


*PostingsHighlighter* returns, for example,
{quote} 
[my text with any <b>highlighting</b> followed by more text]
{quote}
because the PostingsHighlighter itself returns, for each field, a {{String[]}} of 
phrases, limited by the maximum number of phrases given beforehand. This 
{{String[]}} is then converted to a String using {{Arrays.toString()}} at 
[LucenePropertyIndex.java#L688|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LucenePropertyIndex.java#L688]
 causing the value to be wrapped in square brackets.
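
The bracket wrapping can be reproduced outside Oak with plain JDK calls. The sketch below is only an illustration: the class name {{ExcerptJoinSketch}}, both helper methods and the {{" ... "}} separator are assumptions made for this example and are not taken from the Oak code base.

```java
import java.util.Arrays;

// Hypothetical illustration class, not part of Oak.
class ExcerptJoinSketch {

    // Mirrors the reported behaviour: Arrays.toString() surrounds the
    // elements with "[" and "]" and separates them with ", ".
    static String viaArraysToString(String[] passages) {
        return Arrays.toString(passages);
    }

    // One possible alternative (an assumption, not the actual Oak fix):
    // join the passages with an explicit separator and no brackets.
    static String viaJoin(String[] passages) {
        return String.join(" ... ", passages);
    }

    public static void main(String[] args) {
        String[] passages = {
            "my text with any <b>highlighting</b> followed by more text"
        };
        // prints "[my text with any <b>highlighting</b> followed by more text]"
        System.out.println(viaArraysToString(passages));
        // prints "my text with any <b>highlighting</b> followed by more text"
        System.out.println(viaJoin(passages));
    }
}
```

With a single passage the joined form matches what the other two providers return, apart from the highlighting tags.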

*Highlighter* returns 
{quote}
my text with any <strong>highlighting</strong> followed by more text 
{quote}

*SimpleExcerptProvider* returns
{quote}
my text with any <div><span>highlighting</span></div> followed by more text 
{quote}

As the PostingsHighlighter cannot be given a custom prefix or suffix, I would 
suggest setting <b></b> as the default for the others as well, to avoid any 
further text transformation after extracting the excerpts.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)