[ 
https://issues.apache.org/jira/browse/OAK-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230039#comment-15230039
 ] 

Davide Giannella commented on OAK-4043:
---------------------------------------

[~alex.parvulescu]

I finally got around to looking into this. I have some doubts, though. Tell me
what you think and whether I'm wrong about anything.

The trick would be to return, for example, a {{Set<String>}} from the
[Checkpoints.getReferenceCheckpoint()|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/checkpoint/Checkpoints.java#L213]
and then use {{Set.contains()}} (or a similar approach) when
checking whether the checkpoint currently being analysed is referenced or
not
([Checkpoints.removeUnreferenced()|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/checkpoint/Checkpoints.java#L129]).
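
A minimal sketch of that lookup, to make the idea concrete. It is not the actual {{Checkpoints}} code: the class and method names are hypothetical, and checkpoints are modelled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of oak-run.
class UnreferencedFilterSketch {

    // Returns the checkpoints that no lane references, i.e. the removal candidates.
    static List<String> unreferenced(List<String> allCheckpoints, Set<String> referenced) {
        List<String> result = new ArrayList<>();
        for (String cp : allCheckpoints) {
            // Set.contains() replaces the equality check against a single reference
            if (!referenced.contains(cp)) {
                result.add(cp);
            }
        }
        return result;
    }
}
```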

Unfortunately, the name of the property for each configured lane is
free-form. You could put in anything, and as far as I can see we keep no
record of which {{name}}s are configured. Those {{name}}s should
then be used instead of the hardcoded {{async}} in the
[getReferencedCheckpoints()|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/checkpoint/Checkpoints.java#L215].

A possible solution would be to namespace the property that holds the
checkpoint reference under {{/:async}}. So we would end up with
something like: {{/:async@namespace-lane1, /:async@namespace-lane2, ...}}.

In this way we would be able to loop through all the namespaced
properties.
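
As a rough illustration of that loop, assuming the properties of {{/:async}} are available as a simple name-to-value map (a stand-in for the real node state) and that the namespace is a plain string prefix. The prefix value and all names here are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of oak-run.
class NamespacedLaneSketch {

    // Assumed prefix for properties that hold a checkpoint reference.
    static final String CP_PROP_NAMESPACE = "cpreference-";

    // asyncProps stands in for the string properties stored on /:async.
    static Set<String> referencedCheckpoints(Map<String, String> asyncProps) {
        Set<String> referenced = new LinkedHashSet<>();
        for (Map.Entry<String, String> e : asyncProps.entrySet()) {
            // Only namespaced properties are treated as checkpoint references,
            // so the lane names themselves do not need to be known in advance.
            if (e.getKey().startsWith(CP_PROP_NAMESPACE) && e.getValue() != null) {
                referenced.add(e.getValue());
            }
        }
        return referenced;
    }
}
```

The resulting set can then be used for the {{Set.contains()}} check when deciding which checkpoints are unreferenced.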

I have serious concerns about existing deployments, though. Wouldn't this
trigger a full reindex if we go in this direction? We could mitigate
this by adding some sort of "upgrade logic" when retrieving the new
checkpoint reference, something like the following in
AsyncIndexUpdate. We should generalise the code in the {{else}} branch
rather than copy-pasting it, but this is just to show the idea.

{noformat}
diff --git a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncIndexUpdate.java b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncIndexUpdate.java
index b6eac68..b40e571 100644
--- a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncIndexUpdate.java
+++ b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncIndexUpdate.java
@@ -102,6 +102,8 @@ public class AsyncIndexUpdate implements Runnable, Closeable {
      * taking over a running job
      */
     private static final long DEFAULT_ASYNC_TIMEOUT;
+    
+    private static final String CP_PROP_NAMESPACE = "cpreference-";
 
     static {
         int value = 15;
@@ -387,7 +389,7 @@ public class AsyncIndexUpdate implements Runnable, Closeable {
 
         // find the last indexed state, and check if there are recent changes
         NodeState before;
-        String beforeCheckpoint = async.getString(name);
+        String beforeCheckpoint = async.getString(CP_PROP_NAMESPACE + name);
         if (beforeCheckpoint != null) {
             NodeState state = store.retrieve(beforeCheckpoint);
             if (state == null) {
@@ -406,8 +408,28 @@ public class AsyncIndexUpdate implements Runnable, Closeable {
                 before = state;
             }
         } else {
-            log.info("[{}] Initial index update", name);
-            before = MISSING_NODE;
+            beforeCheckpoint = async.getString(name);
+            if (beforeCheckpoint == null) {
+                log.info("[{}] Initial index update", name);
+                before = MISSING_NODE;                
+            } else {
+                NodeState state = store.retrieve(beforeCheckpoint);
+                if (state == null) {
+                    log.warn(
+                            "[{}] Failed to retrieve previously indexed checkpoint {}; re-running the initial index update",
+                            name, beforeCheckpoint);
+                    beforeCheckpoint = null;
+                    before = MISSING_NODE;
+                } else if (noVisibleChanges(state, root)) {
+                    log.debug(
+                            "[{}] No changes since last checkpoint; skipping the index update",
+                            name);
+                    postAsyncRunStatsStatus(indexStats);
+                    return;
+                } else {
+                    before = state;
+                }
+            }
         }
 
         // there are some recent changes, so let's create a new checkpoint
@@ -511,7 +533,7 @@ public class AsyncIndexUpdate implements Runnable, Closeable {
                 throw exception;
             }
 
-            builder.child(ASYNC).setProperty(name, afterCheckpoint);
+            builder.child(ASYNC).setProperty(CP_PROP_NAMESPACE + name, afterCheckpoint);
             builder.child(ASYNC).setProperty(PropertyStates.createProperty(lastIndexedTo, afterTime, Type.DATE));
             if (callback.isDirty() || before == MISSING_NODE) {
                 if (switchOnSync) {
{noformat}
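
One way the duplicated {{else}} branch could be avoided is to resolve the checkpoint reference once, before the existing retrieve logic runs. The sketch below only illustrates that idea: it models {{/:async}} as a plain map, and the class and method names are hypothetical rather than anything in AsyncIndexUpdate.

```java
import java.util.Map;

// Hypothetical helper, not part of AsyncIndexUpdate.
class CheckpointLookupSketch {

    // Same prefix as in the patch above.
    static final String CP_PROP_NAMESPACE = "cpreference-";

    // Prefers the namespaced property and falls back to the legacy,
    // un-namespaced one so that existing deployments keep their checkpoint.
    static String lookupCheckpoint(Map<String, String> asyncProps, String name) {
        String checkpoint = asyncProps.get(CP_PROP_NAMESPACE + name);
        if (checkpoint == null) {
            checkpoint = asyncProps.get(name); // written by an older version
        }
        return checkpoint; // null means this is an initial index update
    }
}
```

With a lookup like this, the existing block that retrieves the state and checks for visible changes could probably stay as it is, with only the source of {{beforeCheckpoint}} changing.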


> Oak run checkpoints needs to account for multiple index lanes
> -------------------------------------------------------------
>
>                 Key: OAK-4043
>                 URL: https://issues.apache.org/jira/browse/OAK-4043
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core, run
>            Reporter: Alex Parvulescu
>            Assignee: Davide Giannella
>            Priority: Critical
>             Fix For: 1.6
>
>
> Oak run {{checkpoints rm-unreferenced}} [0] is currently hardcoded to a
> single checkpoint reference (the default one). It is now possible to add
> multiple lanes, which we already did in AEM, but the checkpoint tool is
> blissfully unaware of this and it might trigger a full reindex following
> offline compaction.
> This needs fixing before the big 1.4 release, so I'm marking it as a blocker.
> fyi [~edivad], [~chetanm]
> [0] https://github.com/apache/jackrabbit-oak/tree/trunk/oak-run#checkpoints



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
