mreutegg commented on a change in pull request #346:
URL: https://github.com/apache/jackrabbit-oak/pull/346#discussion_r698350304



##########
File path: 
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/LastRevRecoveryAgent.java
##########
@@ -309,6 +317,8 @@ public void sweepUpdate(Map<Path, UpdateOp> updates)
         long startOfScan = clock.getTime();
         long lastLog = startOfScan;
 
+        final List<Revision> pseudoBcRevs = new ArrayList<Revision>();

Review comment:
       Can use diamond operator.
   ```suggestion
           final List<Revision> pseudoBcRevs = new ArrayList<>();
   ```

##########
File path: 
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/LastRevRecoveryAgent.java
##########
@@ -364,7 +374,41 @@ public void sweepUpdate(Map<Path, UpdateOp> updates)
                     unsavedParents.put(path, lastRevForParents);
                 }
             }
+            // avoid recalculating the size of the updateOp upon every single 
path
+            // but also avoid doing it only after we hit the 16MB limit
+            if (changes.getNumChangedNodes() >= nextFlushCheckCount) {
+                final Revision pseudoBcRev = 
Revision.newRevision(clusterId).asBranchRevision();
+                final UpdateOp pseudoBcUpdateOp = 
changes.asUpdateOp(pseudoBcRev);
+                final int approxPseudoBcUpdateOpSize = 
pseudoBcUpdateOp.toString().length();
+                if (approxPseudoBcUpdateOpSize >= 
PSEUDO_BRANCH_COMMIT_UPDATE_OP_THRESHOLD_BYTES) {
+                    // flush the (pseudo) journal entry
+                    // regarding 'pseudo' : this journal entry, while being a 
branch commit,
+                    // does not correspond to an actual branch commit that 
happened before the crash.
+                    // we might be able to in theory reconstruct the very 
original branch commits,
+                    // but that's a tedious job, and we were not doing that 
prior to OAK-9535 neither.
+                    // hence the optimization built-in here is that we create 
a journal entry
+                    // of type 'branch commit', but with a revision that is 
different from
+                    // what originally happened. Thx to the fact that the 
JournalEntry just
+                    // contains a list of branch commit journal ids, that 
should work fine.
+                    if (store.create(JOURNAL, 
singletonList(pseudoBcUpdateOp))) {
+                        log.info("recover : created intermediate pseudo-bc 
journal entry with rev {} and approx size {} bytes.",
+                                pseudoBcRev, approxPseudoBcUpdateOpSize);
+                        pseudoBcRevs.add(pseudoBcRev);
+                        changes = JOURNAL.newDocument(store);
+                        nextFlushCheckCount = 
PSEUDO_BRANCH_COMMIT_FLUSH_CHECK_COUNT;
+                    } else {
+                        log.warn("recover : could not create intermediate 
pseudo-bc journal entry with rev {}",
+                                pseudoBcRev);
+                        // retry a little later then, hence reduce the next 
counter by half an interval
+                        nextFlushCheckCount += changes.getNumChangedNodes() + 
(PSEUDO_BRANCH_COMMIT_FLUSH_CHECK_COUNT / 2);

Review comment:
       I would rather fail recovery in this case.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to