[ https://issues.apache.org/jira/browse/OAK-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146835#comment-16146835 ]
Dirk Rudolph edited comment on OAK-6597 at 8/30/17 7:59 AM:
------------------------------------------------------------
{quote}
which if enabled would enable storage of ":fulltext" field created in any of of
the above way
{quote}
That would mean the excerpt is created from a stored field containing all
indexed properties of all nested nodes, right? If so, there is a corner case
where the excerpt contains odd text that runs across the boundary between two
property values, no?
Example:
{code}
/content/foo
+ jcr:content
- text1 = "My fancy text"
- text2 = "This isn't so fancy"
{code}
If I'm right, that would cause an excerpt like "My fancy <b>text</b> This isn't
so fancy" or, even worse, without the space: "My fancy <b>text</b>This isn't so
fancy". Wouldn't it make sense to store each nested property in its own
analyzed field (full:_jcr_content/text1 or similar)?
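To make the concern concrete, here is a minimal sketch in plain Java. It does not use Oak or the Lucene highlighter; the class name, the method names and the naive substring highlighting are assumptions made purely for illustration. It compares an excerpt built from one aggregated stored value with an excerpt built from one stored value per property.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only: this is NOT how Oak or Lucene build excerpts.
public class ExcerptBoundarySketch {

    // Wraps the first case-insensitive occurrence of term in <b>...</b>.
    // Returns null if the term does not occur in the text.
    static String highlight(String text, String term) {
        int start = text.toLowerCase(Locale.ROOT).indexOf(term.toLowerCase(Locale.ROOT));
        if (start < 0) {
            return null;
        }
        int end = start + term.length();
        return text.substring(0, start) + "<b>" + text.substring(start, end) + "</b>"
                + text.substring(end);
    }

    // Variant A: a single stored field holding all aggregated values,
    // joined with the given separator (assumed behaviour of a stored ":fulltext").
    static String excerptFromAggregate(Map<String, String> props, String separator, String term) {
        return highlight(String.join(separator, props.values()), term);
    }

    // Variant B: one stored field per nested property; the excerpt is taken
    // from the first property that contains the term and never crosses into another.
    static String excerptPerProperty(Map<String, String> props, String term) {
        for (String value : props.values()) {
            String excerpt = highlight(value, term);
            if (excerpt != null) {
                return excerpt;
            }
        }
        return null;
    }

    // The two properties from the example above, in insertion order.
    static Map<String, String> sampleProps() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("jcr:content/text1", "My fancy text");
        props.put("jcr:content/text2", "This isn't so fancy");
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props = sampleProps();
        System.out.println(excerptFromAggregate(props, " ", "text"));
        System.out.println(excerptFromAggregate(props, "", "text"));
        System.out.println(excerptPerProperty(props, "text"));
    }
}
```

With the sample data, the two aggregated variants reproduce the two excerpts quoted above (with and without the space), while the per-property variant yields only "My fancy <b>text</b>".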
Do we have any insight into the impact on index size, and with that on query
performance, for an index that has this feature enabled?
> rep:excerpt not working for content indexed by aggregation in lucene
> --------------------------------------------------------------------
>
> Key: OAK-6597
> URL: https://issues.apache.org/jira/browse/OAK-6597
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: lucene
> Affects Versions: 1.6.1, 1.7.6
> Reporter: Dirk Rudolph
> Fix For: 1.8
>
> Attachments: excerpt-with-aggregation-test.patch
>
>
> I noticed that properties indexed through aggregation are not considered
> for excerpts (highlighting) because they are not indexed as stored
> fields.
> See the attached patch that implements a test for excerpts in
> {{LuceneIndexAggregationTest2}}.
> It creates the following structure:
> {code}
> /content/foo [test:Page]
> + bar (String)
> - jcr:content [test:PageContent]
> + bar (String)
> {code}
> where both strings (the _bar_ property at _foo_ and the _bar_ property at
> _jcr:content_) contain different text.
> Afterwards it queries for 2 terms ("tinc*" and "aliq*"), each of which exists
> in either _/content/foo/bar_ or _/content/foo/jcr:content/bar_ but not in
> both. For the former the excerpt is provided properly; for the latter it isn't.