[
https://issues.apache.org/jira/browse/OAK-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230039#comment-15230039
]
Davide Giannella commented on OAK-4043:
---------------------------------------
[~alex.parvulescu]
I finally got around to looking into this. I have some doubts, though. Tell me
what you think and whether I've got anything wrong.
The trick would be to return, for example, a {{Set<String>}} from the
[Checkpoints.getReferenceCheckpoint()|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/checkpoint/Checkpoints.java#L213]
and then use {{Set.contains()}} (or a similar approach) when
checking whether the checkpoint currently being analysed is referenced
or not
([Checkpoints.removeUnreferenced()|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/checkpoint/Checkpoints.java#L129]).
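To make the {{Set.contains()}} idea concrete, here is a minimal sketch using plain Java collections instead of the real Oak classes. The class {{UnreferencedCheckpoints}} and the method {{unreferenced()}} are hypothetical names for illustration only; the actual signatures in {{Checkpoints}} may well differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in: not the real oak-run Checkpoints class.
public class UnreferencedCheckpoints {

    /**
     * Returns the checkpoints that no indexing lane references,
     * i.e. the candidates for removal.
     */
    static List<String> unreferenced(List<String> allCheckpoints, Set<String> referenced) {
        List<String> result = new ArrayList<>();
        for (String cp : allCheckpoints) {
            // A checkpoint survives if ANY lane references it, not just "async".
            if (!referenced.contains(cp)) {
                result.add(cp);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up checkpoint ids; two lanes reference cp-1 and cp-3.
        Set<String> referenced = new HashSet<>(Arrays.asList("cp-1", "cp-3"));
        List<String> all = Arrays.asList("cp-1", "cp-2", "cp-3", "cp-4");
        System.out.println(unreferenced(all, referenced)); // prints [cp-2, cp-4]
    }
}
```

With a single {{String}} the comparison only protects one lane; with a set, every referenced checkpoint is kept regardless of how many lanes exist.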
Unfortunately, the name of the property for a configured lane is
free-form. You could put in anything, and as far as I could see we keep
no record of which {{name}}s are configured. Those {{name}}s should
then be used instead of the hardcoded {{async}} in
[getReferencedCheckpoints()|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/checkpoint/Checkpoints.java#L215].
A possible solution would be to namespace the property that holds the
checkpoint reference under {{/:async}}. So we would end up with
something like: {{/:async@namespace-lane1, /:async@namespace-lane2, ...}}.
In this way we would be able to loop through all the namespaced
properties.
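A rough sketch of that loop, assuming the properties of {{/:async}} are available as a simple name-to-value map. The map, the class {{NamespacedCheckpointRefs}} and the prefix passed in are all assumptions for illustration; the real code would iterate over the node's properties through the Oak API instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in: a Map plays the role of the /:async node's properties.
public class NamespacedCheckpointRefs {

    /**
     * Collects the values of every property whose name starts with the
     * given namespace prefix; all other properties are ignored.
     */
    static Set<String> referencedCheckpoints(Map<String, String> asyncProps, String namespace) {
        Set<String> refs = new TreeSet<>();
        for (Map.Entry<String, String> e : asyncProps.entrySet()) {
            if (e.getKey().startsWith(namespace) && e.getValue() != null) {
                refs.add(e.getValue());
            }
        }
        return refs;
    }

    public static void main(String[] args) {
        Map<String, String> asyncProps = new LinkedHashMap<>();
        asyncProps.put("namespace-lane1", "cp-1");     // made-up lane names and ids
        asyncProps.put("namespace-lane2", "cp-3");
        asyncProps.put("some-other-property", "2016-04-07");
        System.out.println(referencedCheckpoints(asyncProps, "namespace-")); // prints [cp-1, cp-3]
    }
}
```

The point of the prefix is that the tool no longer needs to know the lane names up front; it only needs to know the namespace.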
I have serious concerns about existing deployments, though. Wouldn't this
trigger a full reindex if we go in this direction? We could mitigate
this by adding some sort of "upgrade logic" when retrieving the
checkpoint reference: look up the namespaced property first and fall
back to the old one. Something like the following in
{{AsyncIndexUpdate}}. The code in the {{else}} should be generalised
rather than copy-pasted; it's only there to show the idea.
{noformat}
diff --git a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncIndexUpdate.java b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncIndexUpdate.java
index b6eac68..b40e571 100644
--- a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncIndexUpdate.java
+++ b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncIndexUpdate.java
@@ -102,6 +102,8 @@ public class AsyncIndexUpdate implements Runnable, Closeable {
      * taking over a running job
      */
     private static final long DEFAULT_ASYNC_TIMEOUT;
+
+    private static final String CP_PROP_NAMESPACE = "cpreference-";
     static {
         int value = 15;
@@ -387,7 +389,7 @@ public class AsyncIndexUpdate implements Runnable, Closeable {
         // find the last indexed state, and check if there are recent changes
         NodeState before;
-        String beforeCheckpoint = async.getString(name);
+        String beforeCheckpoint = async.getString(CP_PROP_NAMESPACE + name);
         if (beforeCheckpoint != null) {
             NodeState state = store.retrieve(beforeCheckpoint);
             if (state == null) {
@@ -406,8 +408,28 @@ public class AsyncIndexUpdate implements Runnable, Closeable {
                 before = state;
             }
         } else {
-            log.info("[{}] Initial index update", name);
-            before = MISSING_NODE;
+            beforeCheckpoint = async.getString(name);
+            if (beforeCheckpoint == null) {
+                log.info("[{}] Initial index update", name);
+                before = MISSING_NODE;
+            } else {
+                NodeState state = store.retrieve(beforeCheckpoint);
+                if (state == null) {
+                    log.warn(
+                            "[{}] Failed to retrieve previously indexed checkpoint {}; re-running the initial index update",
+                            name, beforeCheckpoint);
+                    beforeCheckpoint = null;
+                    before = MISSING_NODE;
+                } else if (noVisibleChanges(state, root)) {
+                    log.debug(
+                            "[{}] No changes since last checkpoint; skipping the index update",
+                            name);
+                    postAsyncRunStatsStatus(indexStats);
+                    return;
+                } else {
+                    before = state;
+                }
+            }
         }
         // there are some recent changes, so let's create a new checkpoint
@@ -511,7 +533,7 @@ public class AsyncIndexUpdate implements Runnable, Closeable {
                 throw exception;
             }
-            builder.child(ASYNC).setProperty(name, afterCheckpoint);
+            builder.child(ASYNC).setProperty(CP_PROP_NAMESPACE + name, afterCheckpoint);
             builder.child(ASYNC).setProperty(PropertyStates.createProperty(lastIndexedTo, afterTime, Type.DATE));
             if (callback.isDirty() || before == MISSING_NODE) {
                 if (switchOnSync) {
{noformat}
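One way the {{else}} branch might be generalised instead of copy-pasted: resolve the checkpoint reference once, trying the namespaced property first and the legacy one second, so the existing handling runs a single time on whichever value was found. This is only a sketch under assumptions: a plain {{Map}} stands in for the {{/:async}} node, and {{CheckpointLookup}} and {{lookupCheckpoint()}} are hypothetical names, not existing Oak code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: a Map plays the role of the /:async node's properties.
public class CheckpointLookup {

    // Same prefix as CP_PROP_NAMESPACE in the patch above.
    static final String CP_PROP_NAMESPACE = "cpreference-";

    /**
     * Returns the checkpoint reference for the given lane, preferring the
     * namespaced property and falling back to the legacy un-namespaced one.
     * Returns null when neither exists (the initial index update case).
     */
    static String lookupCheckpoint(Map<String, String> asyncProps, String name) {
        String checkpoint = asyncProps.get(CP_PROP_NAMESPACE + name);
        if (checkpoint == null) {
            // Pre-upgrade repository: the reference is still stored under the bare lane name.
            checkpoint = asyncProps.get(name);
        }
        return checkpoint;
    }

    public static void main(String[] args) {
        Map<String, String> legacy = new HashMap<>();
        legacy.put("async", "cp-old");               // made-up ids
        System.out.println(lookupCheckpoint(legacy, "async"));   // prints cp-old

        Map<String, String> upgraded = new HashMap<>(legacy);
        upgraded.put(CP_PROP_NAMESPACE + "async", "cp-new");
        System.out.println(lookupCheckpoint(upgraded, "async")); // prints cp-new
    }
}
```

Since the write side always stores under the namespaced name, the legacy property would only be consulted until the first successful run after the upgrade.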
> Oak run checkpoints needs to account for multiple index lanes
> -------------------------------------------------------------
>
> Key: OAK-4043
> URL: https://issues.apache.org/jira/browse/OAK-4043
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: core, run
> Reporter: Alex Parvulescu
> Assignee: Davide Giannella
> Priority: Critical
> Fix For: 1.6
>
>
> Oak run {{checkpoints rm-unreferenced}} [0] currently is hardcoded on a
> single checkpoint reference (the default one). It is now possible to add
> multiple lanes, which we already did in AEM, but the checkpoint tool is
> blissfully unaware of this and it might trigger a full reindex following
> offline compaction.
> This needs fixing before the big 1.4 release, so I'm marking it as a blocker.
> fyi [~edivad], [~chetanm]
> [0] https://github.com/apache/jackrabbit-oak/tree/trunk/oak-run#checkpoints