[ 
https://issues.apache.org/jira/browse/OAK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908103#comment-14908103
 ] 

Alex Parvulescu commented on OAK-3436:
--------------------------------------

Let me see if I got this right:

{noformat}
  (N1) | ** ( E) ------------ (S ) *** ( E) -----------
  (N2) | ------------ (S ) *********** ( E) --- (S ) **
(time) | -- (T0) ---- (T1) -- (T2) --- (T3) --- (T4) --
{noformat}

* T0:
  N1 ended
  ref checkpoint = CP1
  tmp = []

* T1:
  N1 idle
  N2 starts, creates CP2
  ref checkpoint = CP1
  tmp = [CP2]

* T2:
  N2 indexing
  *lease expires*
  N1 starts again, cleans tmp (*removes CP2*), creates CP3
  ref checkpoint = CP1
  tmp = [CP3]

* T3:
  N2 ended: releases CP1, updates lease
  N1 will fail commit shortly on account of lease
  ref checkpoint = CP2 (*which doesn't exist*)
  tmp = [CP3]

* T4:
  N2 has lease, starts again
  ref checkpoint = CP2
  tmp = [CP3]
  CP2 doesn't exist, full reindexing starts
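
The timeline above can be replayed as a small state machine. This is only a sketch of the bookkeeping described in T0..T4, not Oak code: the class and method names are hypothetical, the checkpoint store is modelled as a plain set, and the `cleanupAtStart = false` branch is an assumed way of deferring tmp cleanup to the end of the cycle.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the T0..T4 timeline; none of these names exist in Oak.
final class CheckpointTimeline {

    /** Replays T0..T4 and returns what N2 would do when it starts again at T4. */
    public static String replay(boolean cleanupAtStart) {
        Set<String> store = new HashSet<>();   // checkpoints that still exist
        List<String> tmp = new ArrayList<>();  // temp checkpoints recorded for cleanup

        // T0: N1 ended, reference checkpoint is CP1, tmp is empty.
        store.add("CP1");
        String ref = "CP1";

        // T1: N2 starts and creates CP2.
        store.add("CP2");
        tmp.add("CP2");

        // T2: lease expired, N1 starts again and creates CP3.
        if (cleanupAtStart) {
            store.removeAll(tmp);              // removes CP2, which N2 still needs
            tmp.clear();
        }
        store.add("CP3");
        tmp.add("CP3");

        // T3: N2 ends, releases CP1 and points the reference at CP2.
        store.remove("CP1");
        ref = "CP2";
        if (!cleanupAtStart) {
            // Assumed alternative: clean up at the end of the cycle,
            // keeping the checkpoint that just became the reference.
            for (String cp : tmp) {
                if (!cp.equals(ref)) {
                    store.remove(cp);
                }
            }
            tmp.clear();
        }

        // T4: N2 starts again and looks up the reference checkpoint.
        return store.contains(ref)
                ? "incremental from " + ref
                : "full reindex (" + ref + " missing)";
    }

    public static void main(String[] args) {
        System.out.println("cleanup at start: " + replay(true));
        System.out.println("cleanup at end:   " + replay(false));
    }
}
```

With cleanup at the start of the cycle the replay ends with CP2 missing, as in T4 above. In the deferred variant CP2 survives until N2 has committed, under the assumptions stated in the comments.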

So I see 2 ideas standing out:

* first, the mechanism we built to prevent concurrent indexing by different 
cluster nodes is not working properly for some reason.
I'd invest time in finding out what is happening. My guess is the lease is 
expiring because in some cases the traversal takes too long and the index being 
rebuilt simply doesn't have any content to index, therefore the callback 
doesn't get called, so the lease doesn't get updated (take this with a grain of 
salt).

One thing I don't understand is why N2 doesn't fail to acquire the lease once 
N1 starts indexing. There's a moment when N2 times out and N1 starts indexing, 
but at T3 when N2 comes back it should fail the commit, no?

Also, if there are blackout intervals where the lease expires and the other node 
takes over, then this means the 2 nodes will always be competing for 
indexing, unless we fix the root cause that makes the lease expire (would 
increasing the lease timeout help mitigate this issue?).

* second, moving checkpoint cleanup to the end of the cycle would fix this.
I'm not against this option, but I'd like to clarify the lease stuff first, if 
possible.
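
To make the lease guess above concrete, here is a minimal sketch of a lease that is only refreshed from the indexing callback. It is an illustration of the hypothesis, not Oak's actual lease handling: the class, its methods, and the tick-based clock are all assumptions made for the example.

```java
// Hypothetical lease model; names and timings are illustrative only.
final class LeaseSketch {
    private final long timeoutMillis;
    private long leaseEnd;

    LeaseSketch(long now, long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
        this.leaseEnd = now + timeoutMillis;
    }

    /** Assumed to be invoked only when the traversal actually indexes something. */
    void onIndexUpdate(long now) {
        leaseEnd = now + timeoutMillis;
    }

    boolean isExpired(long now) {
        return now > leaseEnd;
    }

    /**
     * Walks a traversal in fixed ticks. indexed[i] tells whether tick i hit
     * content for this index, i.e. whether the callback fired on that tick.
     * Returns true if the lease ran out before the traversal finished.
     */
    public static boolean expiresDuringTraversal(long timeoutMillis,
                                                 long tickMillis,
                                                 boolean[] indexed) {
        long now = 0;
        LeaseSketch lease = new LeaseSketch(now, timeoutMillis);
        for (boolean hit : indexed) {
            now += tickMillis;
            if (lease.isExpired(now)) {
                return true;                 // another node may take over here
            }
            if (hit) {
                lease.onIndexUpdate(now);    // the callback refreshes the lease
            }
        }
        return false;
    }

    public static void main(String[] args) {
        boolean[] nothingToIndex = {false, false, false, false, false};
        boolean[] steadyContent = {true, true, true, true, true};
        System.out.println(expiresDuringTraversal(3000, 1000, nothingToIndex));
        System.out.println(expiresDuringTraversal(3000, 1000, steadyContent));
    }
}
```

In this model a longer timeout only moves the point at which a content-free traversal loses the lease. Refreshing the lease from the traversal itself, rather than from the indexing callback, would remove the dependency on content being found.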



 

> Prevent missing checkpoint due to unstable topology from causing complete 
> reindexing
> ------------------------------------------------------------------------------------
>
>                 Key: OAK-3436
>                 URL: https://issues.apache.org/jira/browse/OAK-3436
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.3.7, 1.2.7, 1.0.22
>
>         Attachments: AsyncIndexUpdateClusterTest.java, OAK-3436-0.patch
>
>
> Async indexing logic relies on the embedding application to ensure that the 
> async indexing job runs as a singleton in a cluster. For Sling based apps it 
> depends on Sling Discovery support. At times it is seen that if the 
> topology is not stable, different cluster nodes can each consider themselves 
> leader and execute the async indexing job concurrently.
> This can cause problems, as the cluster nodes might not see the same repository 
> state (due to write skew and eventual consistency) and one might remove the 
> checkpoint which the other cluster node is still relying upon. For e.g. consider 
> a 2 node cluster N1 and N2 where both are performing async indexing.
> # Base state - CP1 is the checkpoint for "async" job
> # N2 starts indexing, removes CP1 and changes the checkpoint to CP2. For 
> Mongo the checkpoints are saved in {{settings}} collection
> # N1 also decides to execute indexing but has not yet seen the latest 
> repository state, so it still thinks that CP1 is the base checkpoint and tries 
> to read it. However CP1 is already removed from {{settings}} and this makes N1 
> think that the checkpoint is missing and it decides to reindex everything!
> To avoid this topology must be stable but at Oak level we should still handle 
> such a case and avoid doing a full reindexing. So we would need to have a 
> {{MissingCheckpointStrategy}} similar to {{MissingIndexEditorStrategy}} as 
> done in OAK-2203 
> Possible approaches
> # A1 - Fail the indexing run if the checkpoint is missing - A checkpoint can 
> be missing for valid and for invalid reasons. Need to see what the valid 
> scenarios are where a checkpoint can go missing
> # A2 - When a checkpoint is created also store the creation time. When a 
> checkpoint is found to be missing and it's a *recent* checkpoint then fail the 
> run. For e.g. we would fail the run as long as the missing checkpoint is 
> less than an hour old (for just started take startup time into account)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
