[
https://issues.apache.org/jira/browse/OAK-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521400#comment-14521400
]
Thomas Mueller commented on OAK-2599:
-------------------------------------
Chetan and I discussed this on the phone. I think the rules (1. - 4.) are
quite complicated, and don't match how I thought it should work; but if we
change the rules, then the flag forceIndexOnPathRestrictionMismatch would need
to be set in too many cases. Neither would be good.
An alternative to the forceIndexOnPathRestrictionMismatch option would be to
add a multi-value property "queryPaths", to make it explicit for which queries
the index is used. There would be 3 properties:
* {{includedPaths}}: set of paths to be indexed
* {{excludedPaths}}: set of paths to _not_ be indexed
* {{queryPaths}}: set of paths in a query where this index is used
In an ideal world, {{excludedPaths}} would not be set. {{includedPaths}} and
{{queryPaths}} would both be set to (for example) [ "/content/a", "/content/b"
]. The index would be used for queries below "/content/a" as well as for
queries below "/content/b". It would not be used for queries without a path
restriction, or for queries below "/content/c".
In a less ideal (but more realistic) world, {{excludedPaths}} would be set to,
for example, [ "/jcr:system", "/var" ], and {{queryPaths}} to [ "/" ]. That way,
the index would not contain "/jcr:system" and "/var", but would be used for all
kinds of queries (even those without a path restriction). I guess if
{{queryPaths}} is not set, the index should always be used (same as [ "/" ]).
That would mean the index is also used for queries with
isdescendantnode('/var'), in which case it would never return anything,
because those paths are excluded.
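
To make the two scenarios above concrete, here is a minimal sketch of how the three properties could interact. It is not Oak code: the function names ({{is_under}}, {{is_indexed}}, {{index_is_used}}) are hypothetical, and it assumes that "below" includes the path itself, that an unset {{includedPaths}} means "index everything", and that an unset {{queryPaths}} behaves like [ "/" ].

```python
from typing import Optional, Sequence


def is_under(path: str, root: str) -> bool:
    """True if path equals root or is a descendant of root (assumption)."""
    if root == "/":
        return path.startswith("/")
    return path == root or path.startswith(root + "/")


def is_indexed(path: str,
               included_paths: Optional[Sequence[str]],
               excluded_paths: Optional[Sequence[str]]) -> bool:
    """Would a node at 'path' end up in the index?"""
    # Assumption: an unset includedPaths means the whole tree is indexed.
    included = included_paths or ["/"]
    if any(is_under(path, e) for e in (excluded_paths or [])):
        return False
    return any(is_under(path, i) for i in included)


def index_is_used(query_path: Optional[str],
                  query_paths: Optional[Sequence[str]]) -> bool:
    """Would the index be considered for a query?

    query_path is the path restriction of the query, or None if the
    query has no path restriction.
    """
    # As suggested in the comment: an unset property is the same as ["/"].
    effective = query_paths or ["/"]
    if query_path is None:
        # A query without path restriction spans the whole repository,
        # so only a queryPaths entry of "/" covers it.
        return "/" in effective
    return any(is_under(query_path, qp) for qp in effective)


# Ideal world: includedPaths == queryPaths, no excludedPaths.
paths = ["/content/a", "/content/b"]
print(index_is_used("/content/a", paths))   # query below /content/a
print(index_is_used("/content/c", paths))   # query below /content/c
print(index_is_used(None, paths))           # no path restriction

# Less ideal world: excludedPaths set, queryPaths = ["/"].
excluded = ["/jcr:system", "/var"]
print(index_is_used("/var", ["/"]))         # index is chosen ...
print(is_indexed("/var/foo", None, excluded))  # ... but has no such nodes
```

The last two calls illustrate the caveat mentioned above: with {{queryPaths}} at [ "/" ], the index is picked for a query below "/var" even though nothing under "/var" was indexed.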
> Allow excluding certain paths from getting indexed for particular index
> -----------------------------------------------------------------------
>
> Key: OAK-2599
> URL: https://issues.apache.org/jira/browse/OAK-2599
> Project: Jackrabbit Oak
> Issue Type: New Feature
> Components: core, query
> Reporter: Chetan Mehrotra
> Labels: performance
> Fix For: 1.3.0
>
> Attachments: OAK-2599-1.patch, OAK-2599-v2.patch
>
>
> Currently an {{IndexEditor}} gets to index all nodes under the tree where it
> is defined (post OAK-1980). Because of this, an IndexEditor has to traverse
> the whole repo (or the subtree, if configured at a non-root path) to perform
> a reindex. Depending on the repo size this process can take quite a bit of
> time. It would be faster if an IndexEditor could exclude certain paths from
> traversal.
> Consider an application like Adobe AEM and an index which only indexes
> dam:Asset, or the default full text index. For a fulltext index it might make
> sense to avoid indexing the versionStore. So if the index editor skips such
> paths, a lot of redundant traversal can be avoided.
> Also see http://markmail.org/thread/4cuuicakagi6av4v