[ 
https://issues.apache.org/jira/browse/OAK-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970754#comment-15970754
 ] 

Chetan Mehrotra edited comment on OAK-3680 at 4/17/17 7:07 AM:
---------------------------------------------------------------

*Potential issue with using time-based revision lookup instead of 
checkpoint-based lookup*

If RevisionGC has not run for DocumentNodeStore, it would be possible to 
view the repository at any arbitrary time in the past without relying on a 
checkpoint. However, if RevisionGC is running, a change in a node might 
remain undetected.

Consider a node like /a with the following timeline:

# T1 - foo=bar and indexed
# T2 - foo=bar2
# T3 - RevisionGC collects foo=bar
# T4 - Partial reindexing is attempted and the indexer tries to look up the 
repository state at time T1 (via some means). Because RevisionGC has run, the 
repository view would be that of T2, so the before and after NodeState would 
turn out to be the same and /a would not be reported for indexing.

Hence we cannot rely on any means other than a checkpoint to look up the 
repository state at an older time, even with DocumentNodeStore.
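
The timeline above can be replayed in a small toy model. This is only a sketch 
under stated assumptions, not the Oak API: ToyRevisionStore, read_at, 
checkpoint, read_checkpoint and revision_gc are hypothetical names, the GC 
simply keeps the latest revision of each path, and a time-based read whose 
target revision was collected falls back to the oldest surviving revision 
(mirroring the "view would be that of T2" behaviour described above).

```python
from bisect import bisect_right


class ToyRevisionStore:
    """Toy multi-version store; NOT the Oak DocumentNodeStore API."""

    def __init__(self):
        self._history = {}      # path -> list of (time, value), in write order
        self._checkpoints = {}  # checkpoint id -> frozen {path: value} snapshot

    def write(self, time, path, value):
        # Assumes writes arrive in increasing time order.
        self._history.setdefault(path, []).append((time, value))

    def read_at(self, time, path):
        """Time-based lookup: newest surviving revision at or before `time`.

        Assumption: if GC removed every such revision, the oldest surviving
        revision is returned instead.
        """
        revs = self._history.get(path, [])
        if not revs:
            return None
        idx = bisect_right([t for t, _ in revs], time)
        return revs[idx - 1][1] if idx else revs[0][1]

    def checkpoint(self, time):
        """Pin the state visible at `time` so that GC cannot affect it."""
        cp_id = f"cp-{len(self._checkpoints) + 1}"
        self._checkpoints[cp_id] = {
            path: self.read_at(time, path) for path in self._history
        }
        return cp_id

    def read_checkpoint(self, cp_id, path):
        return self._checkpoints[cp_id].get(path)

    def revision_gc(self):
        """Toy GC: keep only the latest revision of every path."""
        for path, revs in self._history.items():
            self._history[path] = revs[-1:]


T1, T2, T3, T4 = 1, 2, 3, 4

store = ToyRevisionStore()
store.write(T1, "/a", "bar")   # T1: foo=bar, indexed
cp = store.checkpoint(T1)      # checkpoint taken when indexing completed
store.write(T2, "/a", "bar2")  # T2: foo=bar2
store.revision_gc()            # T3: RevisionGC collects foo=bar

# T4: partial reindexing compares a "before" state with the current state.
after = store.read_at(T4, "/a")
before_by_time = store.read_at(T1, "/a")        # old revision is gone
before_by_cp = store.read_checkpoint(cp, "/a")  # pinned by the checkpoint

time_based_detects_change = before_by_time != after
checkpoint_detects_change = before_by_cp != after
```

In this toy run the time-based "before" state equals the "after" state, so /a 
would be skipped, while the checkpoint-based comparison still sees the change.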

[~mreutegg] [~catholicon] Is that understanding correct?


> Partial re-index from last known good state
> -------------------------------------------
>
>                 Key: OAK-3680
>                 URL: https://issues.apache.org/jira/browse/OAK-3680
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: indexing, lucene
>    Affects Versions: 1.0, 1.2
>            Reporter: Michael Marth
>            Assignee: Chetan Mehrotra
>              Labels: resilience
>             Fix For: 1.8
>
>
> ATM, if indexes break (by whatever circumstances), users need to perform a 
> full re-index. Depending on the size of the repository this can take a long 
> time. If the user knows that the indexes were in a good state at a certain 
> revision in the past, then it would be very useful if the user could trigger 
> a "partial" re-index where only the content added after that revision is 
> updated in the index.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
