[ 
https://issues.apache.org/jira/browse/OAK-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053373#comment-14053373
 ] 

Marcel Reutegger commented on OAK-1650:
---------------------------------------

I tried to reproduce the exceptions with the current trunk and so far have not
seen either of the two exceptions. I think there are two reasons:

1) MongoDB 2.4.x was used back then. Oak 1.0.x requires MongoDB 2.6.x.
In the past we identified some weird behaviour with MongoDB 2.4.x, e.g.
an insert would succeed even though the document couldn't be written to
MongoDB (OAK-1589). I'm pretty sure the NPE seen during the initial test
is another instance of this. The NPE is caused by a document returned from
MongoDB without an _id. This should actually be impossible, because each
document in MongoDB *must* have an _id field.

2) Since the time this issue was created we fixed a number of issues
in MongoMK that affected consistency of reads. The MicroKernelException
seen during the initial test is likely a symptom of those issues.
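
To illustrate point 1), here is a minimal sketch of how a document without
an _id could be reported with a descriptive error instead of surfacing as a
bare NPE. This is only an illustration: the class MissingIdSketch and the
method requireId are made-up names, documents are modelled as plain maps, and
this is not how MongoDocumentStore handles the case.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Oak or the MongoDB driver.
final class MissingIdSketch {

    /** Returns the document's _id as a string, or fails with a message naming the problem. */
    static String requireId(Map<String, Object> doc) {
        Object id = doc.get("_id");
        if (id == null) {
            // Fail early and include the document, so the broken read is easy to spot in logs.
            throw new IllegalStateException("document without _id: " + doc);
        }
        return id.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> ok = new HashMap<>();
        ok.put("_id", "1:/test");
        System.out.println(requireId(ok));

        // A document as it was apparently returned in the failing case: no _id field.
        Map<String, Object> broken = new HashMap<>();
        broken.put("_modCount", 2L);
        try {
            requireId(broken);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```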

To further analyze the failover behaviour, I will set up a test that
continuously kills the current primary and restarts it after a while.
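
A rough sketch of what such a kill/restart loop could look like. Everything
in it is an assumption for illustration: ReplicaMember, FakeMember and
PrimaryKiller are hypothetical names, not Oak or MongoDB APIs, and the fake
member only records calls so that the sketch runs without a real replica set.
A real test would kill and restart the mongod processes and look up the
current primary from the replica set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

final class FailoverSketch {

    /** Hypothetical handle for one replica set member. */
    interface ReplicaMember {
        void kill();     // e.g. destroy the mongod process
        void restart();  // e.g. start mongod again on the same data directory
    }

    /** In-memory stand-in that only records what was done to it. */
    static final class FakeMember implements ReplicaMember {
        private final String name;
        private final List<String> log;

        FakeMember(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override public void kill() { log.add("kill " + name); }
        @Override public void restart() { log.add("restart " + name); }
    }

    /** Repeatedly kills whichever member is currently primary and restarts it after a pause. */
    static final class PrimaryKiller {
        private final Supplier<ReplicaMember> currentPrimary;
        private final long downMillis;
        private final long upMillis;

        PrimaryKiller(Supplier<ReplicaMember> currentPrimary, long downMillis, long upMillis) {
            this.currentPrimary = currentPrimary;
            this.downMillis = downMillis;
            this.upMillis = upMillis;
        }

        void run(int rounds) {
            try {
                for (int i = 0; i < rounds; i++) {
                    ReplicaMember primary = currentPrimary.get();
                    primary.kill();
                    Thread.sleep(downMillis);  // time for another member to take over
                    primary.restart();
                    Thread.sleep(upMillis);    // time for the restarted member to rejoin
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // stop early, keep the interrupt flag
            }
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<ReplicaMember> members =
                Arrays.asList(new FakeMember("m1", log), new FakeMember("m2", log));
        int[] turn = {0};
        // Pretend the primary role alternates between the two members.
        Supplier<ReplicaMember> locator = () -> members.get(turn[0]++ % members.size());
        new PrimaryKiller(locator, 10, 10).run(4);
        log.forEach(System.out::println);
    }
}
```

The writer threads from the attached test would run concurrently with this
loop, so that saves overlap with primary switches.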

> NPE and MicroKernelException: The node .. does not exist, on replica primary 
> crash during save
> ----------------------------------------------------------------------------------------------
>
>                 Key: OAK-1650
>                 URL: https://issues.apache.org/jira/browse/OAK-1650
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core, mongomk
>    Affects Versions: 0.19
>         Environment: 0.20-SNAPSHOT as of March 31
>            Reporter: Stefan Egli
>            Assignee: Marcel Reutegger
>             Fix For: 1.1
>
>         Attachments: ReplicaCrashResilienceLargeTxTest.java
>
>
> When crashing the replica-primary while saving a large transaction, the 
> following two exceptions occur. Had this twice in a row, thus 'sort of' 
> reproducible. I'll attach the test case in a minute.
> {code}Mar 31, 2014 11:49:04 AM com.mongodb.DBTCPConnector setMasterAddress
> WARNING: Primary switching from localhost/127.0.0.1:12321 to 
> localhost/127.0.0.1:12322
> Writer: Created level1 node: 
> Node[NodeDelegate{tree=/replicaCrashLargeTxTest-1396259321921/2: { 
> jcr:primaryType = nt:unstructured}}]
> org.apache.jackrabbit.mk.api.MicroKernelException: 
> java.lang.NullPointerException
>       at 
> org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore.findAndModify(MongoDocumentStore.java:483)
>       at 
> org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore.createOrUpdate(MongoDocumentStore.java:495)
>       at 
> org.apache.jackrabbit.oak.plugins.document.Commit.createOrUpdateNode(Commit.java:449)
>       at 
> org.apache.jackrabbit.oak.plugins.document.Commit.applyToDocumentStore(Commit.java:335)
>       at 
> org.apache.jackrabbit.oak.plugins.document.Commit.prepare(Commit.java:212)
>       at 
> org.apache.jackrabbit.oak.plugins.document.Commit.apply(Commit.java:181)
>       at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:172)
>       at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:85)
>       at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:1)
>       at 
> org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch$Persisted.persistTransientHead(AbstractNodeStoreBranch.java:598)
>       at 
> org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch$Persisted.setRoot(AbstractNodeStoreBranch.java:547)
>       at 
> org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch.setRoot(AbstractNodeStoreBranch.java:208)
>       at 
> org.apache.jackrabbit.oak.plugins.document.DocumentRootBuilder.purge(DocumentRootBuilder.java:188)
>       at 
> org.apache.jackrabbit.oak.plugins.document.DocumentRootBuilder.updated(DocumentRootBuilder.java:99)
>       at 
> org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.updated(MemoryNodeBuilder.java:205)
>       at 
> org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setProperty(MemoryNodeBuilder.java:489)
>       at 
> org.apache.jackrabbit.oak.core.SecureNodeBuilder.setProperty(SecureNodeBuilder.java:260)
>       at 
> org.apache.jackrabbit.oak.core.MutableTree.updateChildOrder(MutableTree.java:337)
>       at 
> org.apache.jackrabbit.oak.core.MutableTree.setOrderableChildren(MutableTree.java:220)
>       at org.apache.jackrabbit.oak.util.TreeUtil.addChild(TreeUtil.java:207)
>       at 
> org.apache.jackrabbit.oak.jcr.delegate.NodeDelegate.addChild(NodeDelegate.java:692)
>       at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl$5.perform(NodeImpl.java:286)
>       at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl$5.perform(NodeImpl.java:1)
>       at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:308)
>       at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:113)
>       at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.addNode(NodeImpl.java:253)
>       at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.addNode(NodeImpl.java:238)
>       at 
> org.apache.jackrabbit.oak.run.ReplicaCrashResilienceLargeTxTest$1.run(ReplicaCrashResilienceLargeTxTest.java:115)
>       at java.lang.Thread.run(Thread.java:695)
> Caused by: java.lang.NullPointerException
>       at 
> com.google.common.base.Preconditions.checkNotNull(Preconditions.java:192)
>       at 
> org.apache.jackrabbit.oak.plugins.document.util.StringValue.<init>(StringValue.java:35)
>       at 
> org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore.addToCache(MongoDocumentStore.java:810)
>       at 
> org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore.applyToCache(MongoDocumentStore.java:765)
>       at 
> org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore.findAndModify(MongoDocumentStore.java:477)
>       ... 28 more
> {code}
> and:
> {code}Exception in thread "Thread-5" 
> org.apache.jackrabbit.mk.api.MicroKernelException: The node 
> 1:/replicaCrashLargeTxTest-1396259321921 does not exist or is already 
> deleted, before
> r145178ad3bd-0-1; document:
> {_id=1:/replicaCrashLargeTxTest-1396259321921,
> _modified=1396259345, :childOrder={},
> _modCount=2,
> _commitRoot={}},
> revision order:
> 1:
>  r145178a7b35-0-1:r145178a7b1b-0-0 r145178a7b35-1-1:r145178a7b36-0-0
> 2:
>  r14517846da4-68-2:r145178a7b1b-1-0
>       at 
> org.apache.jackrabbit.oak.plugins.document.Commit.checkConflicts(Commit.java:532)
>       at 
> org.apache.jackrabbit.oak.plugins.document.Commit.createOrUpdateNode(Commit.java:450)
>       at 
> org.apache.jackrabbit.oak.plugins.document.Commit.applyToDocumentStore(Commit.java:335)
>       at 
> org.apache.jackrabbit.oak.plugins.document.Commit.prepare(Commit.java:212)
>       at 
> org.apache.jackrabbit.oak.plugins.document.Commit.apply(Commit.java:181)
>       at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:172)
>       at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:85)
>       at 
> org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:1)
>       at 
> org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch$Persisted.persistTransientHead(AbstractNodeStoreBranch.java:598)
>       at 
> org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch$Persisted.setRoot(AbstractNodeStoreBranch.java:547)
>       at 
> org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch.setRoot(AbstractNodeStoreBranch.java:208)
>       at 
> org.apache.jackrabbit.oak.plugins.document.DocumentRootBuilder.purge(DocumentRootBuilder.java:188)
>       at 
> org.apache.jackrabbit.oak.plugins.document.DocumentRootBuilder.updated(DocumentRootBuilder.java:99)
>       at 
> org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.updated(MemoryNodeBuilder.java:205)
>       at 
> org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:329)
>       at 
> org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:321)
>       at 
> org.apache.jackrabbit.oak.core.SecureNodeBuilder.setChildNode(SecureNodeBuilder.java:317)
>       at 
> org.apache.jackrabbit.oak.core.MutableTree.addChild(MutableTree.java:199)
>       at org.apache.jackrabbit.oak.util.TreeUtil.addChild(TreeUtil.java:204)
>       at 
> org.apache.jackrabbit.oak.jcr.delegate.NodeDelegate.addChild(NodeDelegate.java:692)
>       at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl$5.perform(NodeImpl.java:286)
>       at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl$5.perform(NodeImpl.java:1)
>       at 
> org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:308)
>       at 
> org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:113)
>       at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.addNode(NodeImpl.java:253)
>       at 
> org.apache.jackrabbit.oak.jcr.session.NodeImpl.addNode(NodeImpl.java:238)
>       at 
> org.apache.jackrabbit.oak.run.ReplicaCrashResilienceLargeTxTest$1.run(ReplicaCrashResilienceLargeTxTest.java:95)
>       at java.lang.Thread.run(Thread.java:695)
> {code}



