[
https://issues.apache.org/jira/browse/OAK-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146602#comment-16146602
]
Chetan Mehrotra commented on OAK-6597:
--------------------------------------
This is indeed an oversight. Looking at the current flow, there are a few
inconsistencies around stored fields (which are required for excerpt support):
# ":fulltext" fields created by Binary text extraction are always stored
(BinaryTextExtractor#newBinary)
# ":fulltext" fields created by nodeScopeIndex marked fields are not stored
# ":fulltext" fields created by aggregated fields are also not stored
One way would be to expose an index config "excerptEnabled" which, if enabled,
would store the ":fulltext" field created in any of the above ways.
It would have the following behaviour:
# If not set, the status quo remains: #1 is stored while #2 and #3 are not
# If set, then true stores the field in all three modes and false in none
This would ensure that the config value keeps backward compatibility
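
To make the proposed tri-state behaviour concrete, here is a minimal sketch in
plain Java. It is not Oak code: the class, enum and method names are
hypothetical and only model the decision described above, with a null
Boolean standing in for "excerptEnabled not set".

```java
public class ExcerptStoragePolicy {

    /** Hypothetical labels for the three ways a ":fulltext" field is created. */
    public enum FulltextSource {
        BINARY_EXTRACTION,   // #1: binary text extraction
        NODE_SCOPE_INDEX,    // #2: nodeScopeIndex marked fields
        AGGREGATION          // #3: aggregated fields
    }

    /**
     * Decides whether the ":fulltext" field should be stored.
     *
     * @param excerptEnabled null if the config is not set, otherwise its value
     * @param source         how the field was created
     */
    public static boolean storeFulltext(Boolean excerptEnabled, FulltextSource source) {
        if (excerptEnabled == null) {
            // Not set: keep the existing behaviour, only #1 is stored.
            return source == FulltextSource.BINARY_EXTRACTION;
        }
        // Explicitly set: the same answer applies to all three modes.
        return excerptEnabled;
    }

    public static void main(String[] args) {
        Boolean[] settings = {null, Boolean.TRUE, Boolean.FALSE};
        for (Boolean setting : settings) {
            for (FulltextSource source : FulltextSource.values()) {
                System.out.println("excerptEnabled=" + setting + " source=" + source
                        + " stored=" + storeFulltext(setting, source));
            }
        }
    }
}
```

Note that under this reading an explicit "false" also turns off storage for
binary extraction, which differs from the unset default.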
[~catholicon] [~teofili] Thoughts?
> rep:excerpt not working for content indexed by aggregation in lucene
> --------------------------------------------------------------------
>
> Key: OAK-6597
> URL: https://issues.apache.org/jira/browse/OAK-6597
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: lucene
> Affects Versions: 1.6.1, 1.7.6
> Reporter: Dirk Rudolph
> Fix For: 1.8
>
> Attachments: excerpt-with-aggregation-test.patch
>
>
> I noticed that properties that got indexed due to an aggregation are not
> considered for excerpts (highlighting) as they are not indexed as stored
> fields.
> See the attached patch that implements a test for excerpts in
> {{LuceneIndexAggregationTest2}}.
> It creates the following structure:
> {code}
> /content/foo [test:Page]
> + bar (String)
> - jcr:content [test:PageContent]
> + bar (String)
> {code}
> where both strings (the _bar_ property at _foo_ and the _bar_ property at
> _jcr:content_) contain different text.
> Afterwards it queries for 2 terms ("tinc*" and "aliq*") that either exist in
> _/content/foo/bar_ or _/content/foo/jcr:content/bar_ but not in both. For the
> former the excerpt is properly provided; for the latter it isn't.