Hi Alexander Thanks for the detailed report and analysis. May I suggest that you create a ticket at https://issues.apache.org/jira/projects/OAK/issues with component documentmk providing all the details and attaching the patch? I think that's the best way to move forward.
Unfortunately, I am not sufficiently familiar with that area of Oak to review the proposed modification, but Marcel Reutegger and Julian Reschke will for sure be able to provide you with feedback and suggestions and I would suggest you ping them directly in the Jira ticket. Kind regards Angela ________________________________ From: Alexander Lüders <a...@entimo.de> Sent: Friday, July 24, 2020 3:19 PM To: oak-dev@jackrabbit.apache.org <oak-dev@jackrabbit.apache.org> Subject: Re: Slow performance in AbstractNodeState.equals Hi again, digging deeper I derived following patch which solved the performance issue: Index: oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/AbstractDocumentNodeState.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== --- oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/AbstractDocumentNodeState.java (revision 17ea60b0bfb0ca3670c99208170a774a6c99fdfc) +++ oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/AbstractDocumentNodeState.java (date 1595575140063) @@ -83,7 +83,11 @@ // revision does not match: might still be equals } else if (that instanceof ModifiedNodeState) { ModifiedNodeState modified = (ModifiedNodeState) that; - if (modified.getBaseState() == this) { + NodeState baseState = modified.getBaseState(); + if (baseState instanceof ModifiedDocumentNodeState) { + baseState = ((ModifiedDocumentNodeState) baseState).getBaseState(); + } + if (baseState == this) { return EqualsDiff.equals(this, modified); } } I am not entirely sure that this is correct but comparing with the ModifiedDocumentNodeState#equals method, I assume it to be. The latter method compares the passed object with its baseState not the ModifiedDocumentNodeState instance itself. This is the prime difference when comparing the differences of ModifiedDocumentNodeState#equals and AbstractNodeDocumentState#equals. The patch above transfers that logic to AbstractNodeDocumentState#equals. What do you think? Many thanks for your help. Best regards, Alexander Lüders Am 23.07.2020 um 17:26 schrieb Alexander Lüders: Dear Jackrabbit Oak developers, we are currently trying to hunt down a performance issue we are facing at a customer site. We strongly need your help on this so any comments would be greatly appreciated. We have a MongoDB as a backend of a Oak repository where we see hundreds of expensive queries being executed during a rather simple operation. It happens when we are saving a session in which we added a child node to a node already containing a very large number of children. This has not been the case with Jackrabbit Oak 1.4.8 and 1.4.26 but it is an issue with 1.10.8 and 1.22.3. Newer versions have not been tested yet but looking at the source we believe that they will show that performance issue too. What we know so far: * The queries against the MongoDocumentStore are triggered via AbstractNodeState.equals (see stacktrace below) * A TODO "inefficient unless there are very few child nodes" in that method indicates poor performance but this has been there since a long time and cannot explain the issue * OAK-7401 introduced the class ModifiedDocumentNodeState class which is not inheriting ModifiedNodeState or AbstractDocumentNodeState (see class inheritance diagram) * AbstractDocumentNodeState#equals is checking against instances of ModifiedNodeState and AbstractDocumentNodeState but not ModifiedDocumentNodeState. * AbstractDocumentNodeState#equals triggers the "slow" AbstractNodeState#equals as a last resort We definitely do not see the full picture yet but guessing from a high-level perspective it looks like the introduction of ModifiedDocumentNodeState broke the equals logic in class AbstractDocumentNodeState. Would you agree with that conclusion? Is there anything further I can provide? Stacktrace: MongoDocumentStore.query(Collection<T>, String, String, String, long, int) line: 609 MongoDocumentStore.query(Collection<T>, String, String, int) line: 598 DocumentNodeStore.readChildDocs(String, String, int) line: 1293 DocumentNodeStore.readChildren(AbstractDocumentNodeState, String, int) line: 1233 DocumentNodeStore$7.call() line: 1185 DocumentNodeStore$7.call() line: 1182 LocalCache$LocalManualCache$1.load(Object) line: 4724 LocalCache$LoadingValueReference<K,V>.loadFuture(K, CacheLoader<? super K,V>) line: 3522 LocalCache$Segment<K,V>.loadSync(K, int, LoadingValueReference<K,V>, CacheLoader<? super K,V>) line: 2315 LocalCache$Segment<K,V>.lockedGetOrLoad(K, int, CacheLoader<? super K,V>) line: 2278 LocalCache$Segment<K,V>.get(K, int, CacheLoader<? super K,V>) line: 2193 LocalCache<K,V>.get(K, CacheLoader<? super K,V>) line: 3932 LocalCache$LocalManualCache<K,V>.get(K, Callable<? extends V>) line: 4721 DocumentNodeStore.getChildren(AbstractDocumentNodeState, String, int) line: 1182 DocumentNodeState.getChildNodeCount(long) line: 293 AbstractNodeState.equals(NodeState, NodeState) line: 375 DocumentNodeState(AbstractDocumentNodeState).equals(Object) line: 91 AbstractNodeState.equals(NodeState, NodeState) line: 384 DocumentNodeState(AbstractDocumentNodeState).equals(Object) line: 91 DocumentNodeStoreBranch$InMemory.setRoot(NodeState) line: 481 DocumentNodeStoreBranch.setRoot(NodeState) line: 111 DocumentRootBuilder.purge() line: 185 DocumentRootBuilder.merge(CommitHook, CommitInfo) line: 163 DocumentNodeStore.merge(NodeBuilder, CommitHook, CommitInfo) line: 1869 MutableRoot.commit(Map<String,Object>) line: 250 RepositoryImpl$1(SessionDelegate).commit(Root, String) line: 346 RepositoryImpl$1(SessionDelegate).save(String) line: 493 SessionImpl$8.performVoid() line: 424 RepositoryImpl$1(SessionDelegate).performVoid(SessionOperation<Void>) line: 273 SessionImpl.save() line: 421 Class inheritance diagram: [cid:part1.4706D425.06E5791C@entimo.de] Many thanks in advance for your help. Best regards, Alexander Lüders -- Alexander Lüders, Entimo AG Senior Software Engineer Stralauer Platz 33-34 | 10243 Berlin | Germany Tel +49(0)30 520 024 131 Fax +49(0)30 520 024 101 mailto:a...@entimo.com | http://www.entimo.com Vorstand: Marc Jantke (Vors.), Sven Prasse Aufsichtratsvorsitzende: Erika Tannenbaum Sitz der Gesellschaft: Berlin, Germany | Handelsregister: HRB Berlin-Charlottenburg 85073