[ 
https://issues.apache.org/jira/browse/QPID-6997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Wall resolved QPID-6997.
------------------------------
    Resolution: Fixed

> [Java Broker, BDBStore] HA JE Transaction commit might end up in 
> NullPointerException when commit is invoked when majority is lost
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: QPID-6997
>                 URL: https://issues.apache.org/jira/browse/QPID-6997
>             Project: Qpid
>          Issue Type: Bug
>          Components: Java Broker
>    Affects Versions: qpid-java-6.0
>            Reporter: Alex Rudyy
>             Fix For: qpid-java-6.0.1, qpid-java-6.1
>
>         Attachments: 
> TEST-org.apache.qpid.server.store.berkeleydb.replication.MultiNodeTest.testTransferMasterWhilstTransactionsInFlight.txt.NPE
>
>
> On running some tests for BDB HA  with default settings 
> (ReplicaAckPolicy=SIMPLE_MAJORITY) involving publishing messages in parallel 
> from multiple connections and disconnecting replica nodes in the middle of 
> publishing transactions, in some rare circumstances the publishing 
> transaction JE commit failed with NPE thrown from JE code as below:
> {noformat}
> ########################################################################
> #
> # Unhandled Exception java.lang.NullPointerException in Thread 
> IO-/127.0.0.1:35408
> #
> # Exiting
> #
> ########################################################################
> java.lang.NullPointerException
>  at 
> com.sleepycat.je.rep.stream.FeederTxns.awaitReplicaAcks(FeederTxns.java:130)
>  at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHook(RepImpl.java:1130)
>  at com.sleepycat.je.rep.txn.MasterTxn.postLogCommitHook(MasterTxn.java:191)
>  at com.sleepycat.je.txn.Txn.commit(Txn.java:756)
>  at com.sleepycat.je.Transaction.doCommit(Transaction.java:609)
>  at com.sleepycat.je.Transaction.commit(Transaction.java:410)
>  at 
> org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.commitAsync(ReplicatedEnvironmentFacade.java:302)
>  at 
> org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore.commitTranAsyncImpl(AbstractBDBMessageStore.java:794)
>  at 
> org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore.access$1200(AbstractBDBMessageStore.java:74)
>  at 
> org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore$BDBTransaction.commitTranAsync(AbstractBDBMessageStore.java:1364)
>  at 
> org.apache.qpid.server.txn.LocalTransaction.commitAsync(LocalTransaction.java:399)
>  at 
> org.apache.qpid.server.protocol.v0_8.AMQChannel.commit(AMQChannel.java:1220)
>  at 
> org.apache.qpid.server.protocol.v0_8.AMQChannel.receiveTxCommit(AMQChannel.java:3622)
>  at org.apache.qpid.codec.ServerDecoder.processMethod(ServerDecoder.java:228)
>  at org.apache.qpid.codec.AMQDecoder.processFrame(AMQDecoder.java:191)
>  at 
> org.apache.qpid.server.protocol.v0_8.BrokerDecoder.doProcessFrame(BrokerDecoder.java:114)
>  at 
> org.apache.qpid.server.protocol.v0_8.BrokerDecoder.access$000(BrokerDecoder.java:36)
>  at 
> org.apache.qpid.server.protocol.v0_8.BrokerDecoder$1.run(BrokerDecoder.java:78)
>  at 
> org.apache.qpid.server.protocol.v0_8.BrokerDecoder$1.run(BrokerDecoder.java:74)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at 
> org.apache.qpid.server.protocol.v0_8.BrokerDecoder.processFrame(BrokerDecoder.java:73)
>  at org.apache.qpid.codec.AMQDecoder.processInput(AMQDecoder.java:173)
>  at org.apache.qpid.codec.AMQDecoder.decode(AMQDecoder.java:114)
>  at org.apache.qpid.codec.ServerDecoder.decodeBuffer(ServerDecoder.java:43)
>  at 
> org.apache.qpid.server.protocol.v0_8.AMQPConnection_0_8$1.run(AMQPConnection_0_8.java:266)
>  at 
> org.apache.qpid.server.protocol.v0_8.AMQPConnection_0_8$1.run(AMQPConnection_0_8.java:258)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at 
> org.apache.qpid.server.protocol.v0_8.AMQPConnection_0_8.received(AMQPConnection_0_8.java:257)
>  at 
> org.apache.qpid.server.transport.MultiVersionProtocolEngine.received(MultiVersionProtocolEngine.java:142)
>  at 
> org.apache.qpid.server.transport.NonBlockingConnection.processAmqpData(NonBlockingConnection.java:547)
>  at 
> org.apache.qpid.server.transport.NonBlockingConnectionPlainDelegate.processData(NonBlockingConnectionPlainDelegate.java:58)
>  at 
> org.apache.qpid.server.transport.NonBlockingConnection.doRead(NonBlockingConnection.java:446)
>  at 
> org.apache.qpid.server.transport.NonBlockingConnection.doWork(NonBlockingConnection.java:253)
>  at 
> org.apache.qpid.server.transport.NetworkConnectionScheduler.processConnection(NetworkConnectionScheduler.java:108)
>  at 
> org.apache.qpid.server.transport.SelectorThread$ConnectionProcessor.processConnection(SelectorThread.java:499)
>  at 
> org.apache.qpid.server.transport.SelectorThread$SelectionTask.performSelect(SelectorThread.java:337)
>  at 
> org.apache.qpid.server.transport.SelectorThread$SelectionTask.run(SelectorThread.java:86)
>  at 
> org.apache.qpid.server.transport.SelectorThread.run(SelectorThread.java:457)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  at java.lang.Thread.run(Thread.java:745)
> {noformat}
> It seems that majority was lost in the middle of transaction. Qpid HA logic 
> detected that and restarted the environment, however, the transaction was not 
> aborted.
> On closing of the environment the following has been reported:
> {noformat}
> Problem: 3 locks left
> ---- LSN: 0x0/0x2d80----
>  ThinLockAddr:1794979774 Owner:1919034342 
> -50_Selector-virtualhost-test-iopool_MasterTxn Waiters: (none)
> ---- LSN: 0x0/0x1ed8----
>  ThinLockAddr:374788562 Owner:1919034342 
> -50_Selector-virtualhost-test-iopool_MasterTxn Waiters: (none)
> ---- LSN: 0x0/0x20db----
>  ThinLockAddr:501824883 Owner:1919034342 
> -50_Selector-virtualhost-test-iopool_MasterTxn Waiters: (none)
> [BROKER-1] o.a.q.t.u.SpawnedBrokerHolder
> Problem: 1 txns left
> txnId = -50 txnName = 1919034342 
> -50_Selector-virtualhost-test-iopool_MasterTxn
> Local Cache Usage = 240
> Cache Layout: Allocation of resources in the cache.
>  adminBytes=0
>  cacheTotalBytes=1,622,051
>  dataAdminBytes=0
>  dataBytes=0
>  lockBytes=240
>  sharedCacheTotalBytes=0
> WARN  
> [Environment-test:nodetestTransferMasterWhilstTransactionsInFlight10001] 
> o.a.q.s.s.b.r.ReplicatedEnvironmentFacade Ignoring an exception whilst 
> closing environment
> com.sleepycat.je.EnvironmentFailureException: (JE 5.0.104) 
> nodetestTransferMasterWhilstTransactionsInFlight10001(1):/tmp/qpid-work-org.apache.qpid.server.store.berkeleydb.replication.MultiNodeTest.testTransferMasterWhilstTransactionsInFlight-1-8967653656689695718/test/config
>  Problem closing handle 
> nodetestTransferMasterWhilstTransactionsInFlight10001(1) 
> UNEXPECTED_EXCEPTION: Unexpected internal Exception, may have side effects. 
> Environment is invalid and must be closed.
>         at 
> com.sleepycat.je.rep.ReplicatedEnvironment.close(ReplicatedEnvironment.java:709)
>  ~[je-5.0.104.jar:5.0.104]
>         at 
> org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.closeEnvironmentOnRestart(ReplicatedEnvironmentFacade.java:1277)
>  [qpid-bdbstore-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
>         at 
> org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.restartEnvironment(ReplicatedEnvironmentFacade.java:1253)
>  [qpid-bdbstore-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
>         at 
> org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.access$500(ReplicatedEnvironmentFacade.java:92)
>  [qpid-bdbstore-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
>         at 
> org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade$3.run(ReplicatedEnvironmentFacade.java:481)
>  [qpid-bdbstore-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_80]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> Caused by: java.lang.IllegalStateException: Unclosed Database: MESSAGE_CONTENT
> Unclosed Database: MESSAGE_CONTENT
> Databases left open: 2
> There is 1 existing transaction opened against the Environment.
> Aborting open transactions ...
> aborting <Transaction id="-46">
>         at com.sleepycat.je.Environment.close(Environment.java:390) 
> ~[je-5.0.104.jar:5.0.104]
>         at 
> com.sleepycat.je.rep.ReplicatedEnvironment.close(ReplicatedEnvironment.java:702)
>  ~[je-5.0.104.jar:5.0.104]
>         ... 7 common frames omitted
> {noformat}
> It appears we need to abort all transactions whilst restarting the environment



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to