[jira] [Updated] (OAK-5499) IndexUpdate can do multiple traversal of a content tree during initial index when there are sub-root indices

2017-11-06 Thread Alex Deparvu (JIRA)

[ https://issues.apache.org/jira/browse/OAK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Deparvu updated OAK-5499:
------------------------------
    Fix Version/s:     (was: 1.8)

> IndexUpdate can do multiple traversal of a content tree during initial index 
> when there are sub-root indices
>
>
>                 Key: OAK-5499
>                 URL: https://issues.apache.org/jira/browse/OAK-5499
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: indexing
>            Reporter: Vikas Saurabh
>            Priority: Minor
>         Attachments: OAK-5499-v2-demo.patch, OAK-5499-v2-fix.patch, 
> OAK-5499.patch
>
>
> If we have index definitions such as:
> {noformat}
> /oak:index/foo1Index
> /content
>     /oak:index/foo2Index
> {noformat}
> then the initial indexing process \[0] traverses the tree under {{/content}} 
> twice: once while indexing for the top-level indices, and again when it starts 
> indexing the newly discovered {{foo2Index}} while traversing {{/content/oak:index}}.
> Instead, when the first diff processes {{/content}} and discovers a node named 
> {{oak:index}}, it can proactively descend into that subtree, peek at the index 
> definitions under it, and register them as required. The diff can then proceed 
> under {{/content}} while the new indices also receive the diffs, avoiding 
> another traversal.
> \[0] First-time indexing, or when {{/:async}} gets deleted or the checkpoint 
> for the async index couldn't be retrieved.
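
The cost difference described in the issue can be illustrated with a toy model. This is only a sketch: the nested-dict tree, `walk`, `naive_visits` and `single_pass_visits` are hypothetical names invented for illustration and are not part of Oak's API. It counts node visits when a sub-root {{oak:index}} triggers a second walk of its parent subtree, versus when index definitions are registered on first sight.

```python
# Toy content tree as nested dicts. This is an illustration only and
# does NOT use Oak's NodeState or IndexUpdate classes.
TREE = {
    "oak:index": {"foo1Index": {}},
    "content": {
        "oak:index": {"foo2Index": {}},
        "childContent": {"c0": {"c1": {}}},
    },
}


def walk(node, path="/"):
    """Yield (path, node) for every node, depth first."""
    yield path, node
    for name, child in node.items():
        yield from walk(child, path.rstrip("/") + "/" + name)


def naive_visits(tree):
    """One full walk, plus a fresh walk of every subtree that has its own oak:index."""
    visits = 0
    for path, node in walk(tree):
        visits += 1
        if path != "/" and "oak:index" in node:
            # The sub-root index is discovered late, so its subtree is walked again.
            visits += sum(1 for _ in walk(node, path))
    return visits


def single_pass_visits(tree):
    """Peek into oak:index on arrival and register its definitions; walk only once."""
    visits = 0
    registered = []
    for path, node in walk(tree):
        visits += 1
        for index_name in node.get("oak:index", {}):
            registered.append(path.rstrip("/") + "/oak:index/" + index_name)
    return visits, registered
```

On this nine-node tree, `naive_visits(TREE)` counts 15 visits (nine for the full walk plus six for the repeated walk of `/content`), while `single_pass_visits(TREE)` counts nine and registers both index definitions along the way.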



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-5499) IndexUpdate can do multiple traversal of a content tree during initial index when there are sub-root indices

2017-03-26 Thread Chetan Mehrotra (JIRA)

[ https://issues.apache.org/jira/browse/OAK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Mehrotra updated OAK-5499:
---------------------------------
    Component/s:     (was: core)
                 indexing

> IndexUpdate can do multiple traversal of a content tree during initial index 
> when there are sub-root indices
>
>
>                 Key: OAK-5499
>                 URL: https://issues.apache.org/jira/browse/OAK-5499
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: indexing
>            Reporter: Vikas Saurabh
>            Assignee: Alex Parvulescu
>            Priority: Minor
>             Fix For: 1.8
>
>         Attachments: OAK-5499.patch, OAK-5499-v2-demo.patch, 
> OAK-5499-v2-fix.patch
>
>





[jira] [Updated] (OAK-5499) IndexUpdate can do multiple traversal of a content tree during initial index when there are sub-root indices

2017-01-25 Thread Alex Parvulescu (JIRA)

[ https://issues.apache.org/jira/browse/OAK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated OAK-5499:
---------------------------------
    Attachment: OAK-5499-v2-fix.patch
                OAK-5499-v2-demo.patch

Attaching a possible fix. I decided to investigate a different approach: skip 
the out-of-band indexing when the base state is {{MISSING_NODE}}, since this is 
the only case where the extra traversal is very expensive (see 
OAK-5499-v2-fix.patch).

With this change, the first index's reindex becomes part of the full traversal 
(no longer a dedicated reindex traversal). The traversal also picks up any other 
index definitions that need a reindex and includes them in the current 
traversal (no longer spawning out-of-band reindex traversals).
Unfortunately this is a pain to test, so I don't have anything better than some 
logs. I also attached a version of the patch that lets anyone see the logs 
locally (see OAK-5499-v2-demo.patch).
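
As a rough sketch of the decision described above (this is one reading of the comment, not code from either patch): when the base state is missing, every node is reported as added anyway, so indexes that need a reindex can be served by the current traversal. Otherwise the diff only covers changes, so a dedicated reindex traversal is presumably still needed. The names `MISSING_NODE` (here just `None`) and `plan_reindex` are hypothetical stand-ins.

```python
# Stand-in for an absent base state. This is NOT Oak's MISSING_NODE constant.
MISSING_NODE = None


def plan_reindex(base_state, indexes_needing_reindex):
    """Split indexes into those handled inline and those reindexed out of band."""
    if base_state is MISSING_NODE:
        # The whole tree is traversed as "added", so piggyback on that traversal.
        return {"inline": list(indexes_needing_reindex), "out_of_band": []}
    # Assumption: with an existing base state the diff is partial,
    # so each reindex keeps its own traversal.
    return {"inline": [], "out_of_band": list(indexes_needing_reindex)}
```

For example, `plan_reindex(MISSING_NODE, ["/oak:index/foo1Index", "/content/oak:index/foo2Index"])` puts both indexes in the `inline` list.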

To simplify feedback, here's the output without the patch (_IU_ is the 
IndexUpdate class, _CNA_ is the childNodeAdded call, and _E0_ and _E1_ are the 
two indexers):
{noformat}
[IU] Reindexing [/oak:index/foo1Index]
[E0] /
[E0] /content
[E0] /content/childContent
[E0] /content/childContent/c0
[E0] /content/childContent/c0/c1
[E0] /content/oak:index
[E0] /content/oak:index/foo2Index
[E0] /oak:index
[E0] /oak:index/foo1Index
Reindexing done for [/oak:index/foo1Index]
[IU] CNA /
[IU] Reindexing [/content/oak:index/foo2Index]
[E1] /
[E1] /childContent
[E1] /childContent/c0
[E1] /childContent/c0/c1
[E1] /oak:index
[E1] /oak:index/foo2Index
Reindexing done for [/content/oak:index/foo2Index]
[IU] CNA /content
[IU] CNA /content/childContent
[IU] CNA /content/childContent/c0
[IU] CNA /content
[IU] CNA /content/oak:index
[IU] CNA /
[IU] CNA /oak:index
{noformat}
We can see the extra traversals happening, whereas with the patch the 
reindexing of both indexes is collapsed into the main traversal:
{noformat}
[IU] Reindexing [/oak:index/foo1Index]
[E0] /
[IU] CNA /
[IU] Reindexing [/content/oak:index/foo2Index]
[E1] /
[E0] /content
[IU] CNA /content
[E1] /childContent
[E0] /content/childContent
[IU] CNA /content/childContent
[E1] /childContent/c0
[E0] /content/childContent/c0
[IU] CNA /content/childContent/c0
[E1] /childContent/c0/c1
[E0] /content/childContent/c0/c1
[IU] CNA /content
[E1] /oak:index
[E0] /content/oak:index
[IU] CNA /content/oak:index
[E1] /oak:index/foo2Index
[E0] /content/oak:index/foo2Index
[IU] CNA /
[E0] /oak:index
[IU] CNA /oak:index
[E0] /oak:index/foo1Index
{noformat}
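
In both logs, _E1_ reports paths relative to {{/content}} (for example {{/childContent}}) while _E0_ reports them from the root. A minimal sketch of feeding one walk to several indexers this way is shown here. The helper names `relative_path` and `dispatch` are hypothetical, and the sketch does not reproduce the exact interleaving order of the real log.

```python
def relative_path(root, path):
    """Return path as seen from root, or None if path lies outside root."""
    if root == "/":
        return path
    if path == root:
        return "/"
    if path.startswith(root + "/"):
        return path[len(root):]
    return None


def dispatch(paths, indexer_roots):
    """Feed each visited path to every indexer whose root contains it."""
    log = []
    for path in paths:
        for label, root in indexer_roots.items():
            rel = relative_path(root, path)
            if rel is not None:
                log.append(f"[{label}] {rel}")
    return log
```

Calling `dispatch(["/", "/content", "/content/childContent", "/oak:index"], {"E0": "/", "E1": "/content"})` produces a single interleaved list in which `E1` only ever sees paths under `/content`.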

I took special care to preserve the current logging style on reindex, and 
I believe I managed to do that, but there may be aspects I missed. 
Feedback is very much appreciated!


> IndexUpdate can do multiple traversal of a content tree during initial index 
> when there are sub-root indices
>
>
>                 Key: OAK-5499
>                 URL: https://issues.apache.org/jira/browse/OAK-5499
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core
>            Reporter: Vikas Saurabh
>            Assignee: Vikas Saurabh
>            Priority: Minor
>             Fix For: 1.8
>
>         Attachments: OAK-5499.patch, OAK-5499-v2-demo.patch, 
> OAK-5499-v2-fix.patch
>
>


