[ 
https://issues.apache.org/jira/browse/QPID-6997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15115378#comment-15115378
 ] 

ASF subversion and git services commented on QPID-6997:
-------------------------------------------------------

Commit 1726645 from [email protected] in branch 'java/branches/6.0.x'
[ https://svn.apache.org/r1726645 ]

QPID-6997: [Java Broker] BDB JE HA: Ensure that the VH is closed before closing 
the replicated environment
    
    Ensure that the VH is closed *before* starting to restart the environment. 
Closing the virtualhost will not
    return until connections are closed, and closing connections is guaranteed 
to rollback in flight transactions.
    This will ensure that we comply with JE's business rule that 
transactions/cursors must be closed before
    closing the environment.

    merged from trunk
    svn merge -c 1726358 https://svn.apache.org/repos/asf/qpid/java/trunk
    svn merge -c 1726436 https://svn.apache.org/repos/asf/qpid/java/trunk

> [Java Broker, BDBStore] HA JE Transaction commit might end up in 
> NullPointerException when commit is invoked when majority is lost
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: QPID-6997
>                 URL: https://issues.apache.org/jira/browse/QPID-6997
>             Project: Qpid
>          Issue Type: Bug
>          Components: Java Broker
>    Affects Versions: qpid-java-6.0
>            Reporter: Alex Rudyy
>         Attachments: 
> TEST-org.apache.qpid.server.store.berkeleydb.replication.MultiNodeTest.testTransferMasterWhilstTransactionsInFlight.txt.NPE
>
>
> On running some tests for BDB HA  with default settings 
> (ReplicaAckPolicy=SIMPLE_MAJORITY) involving publishing messages in parallel 
> from multiple connections and disconnecting replica nodes in the middle of 
> publishing transactions, in some rare circumstances the publishing 
> transaction JE commit failed with NPE thrown from JE code as below:
> {noformat}
> ########################################################################
> #
> # Unhandled Exception java.lang.NullPointerException in Thread 
> IO-/127.0.0.1:35408
> #
> # Exiting
> #
> ########################################################################
> java.lang.NullPointerException
>  at 
> com.sleepycat.je.rep.stream.FeederTxns.awaitReplicaAcks(FeederTxns.java:130)
>  at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHook(RepImpl.java:1130)
>  at com.sleepycat.je.rep.txn.MasterTxn.postLogCommitHook(MasterTxn.java:191)
>  at com.sleepycat.je.txn.Txn.commit(Txn.java:756)
>  at com.sleepycat.je.Transaction.doCommit(Transaction.java:609)
>  at com.sleepycat.je.Transaction.commit(Transaction.java:410)
>  at 
> org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.commitAsync(ReplicatedEnvironmentFacade.java:302)
>  at 
> org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore.commitTranAsyncImpl(AbstractBDBMessageStore.java:794)
>  at 
> org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore.access$1200(AbstractBDBMessageStore.java:74)
>  at 
> org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore$BDBTransaction.commitTranAsync(AbstractBDBMessageStore.java:1364)
>  at 
> org.apache.qpid.server.txn.LocalTransaction.commitAsync(LocalTransaction.java:399)
>  at 
> org.apache.qpid.server.protocol.v0_8.AMQChannel.commit(AMQChannel.java:1220)
>  at 
> org.apache.qpid.server.protocol.v0_8.AMQChannel.receiveTxCommit(AMQChannel.java:3622)
>  at org.apache.qpid.codec.ServerDecoder.processMethod(ServerDecoder.java:228)
>  at org.apache.qpid.codec.AMQDecoder.processFrame(AMQDecoder.java:191)
>  at 
> org.apache.qpid.server.protocol.v0_8.BrokerDecoder.doProcessFrame(BrokerDecoder.java:114)
>  at 
> org.apache.qpid.server.protocol.v0_8.BrokerDecoder.access$000(BrokerDecoder.java:36)
>  at 
> org.apache.qpid.server.protocol.v0_8.BrokerDecoder$1.run(BrokerDecoder.java:78)
>  at 
> org.apache.qpid.server.protocol.v0_8.BrokerDecoder$1.run(BrokerDecoder.java:74)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at 
> org.apache.qpid.server.protocol.v0_8.BrokerDecoder.processFrame(BrokerDecoder.java:73)
>  at org.apache.qpid.codec.AMQDecoder.processInput(AMQDecoder.java:173)
>  at org.apache.qpid.codec.AMQDecoder.decode(AMQDecoder.java:114)
>  at org.apache.qpid.codec.ServerDecoder.decodeBuffer(ServerDecoder.java:43)
>  at 
> org.apache.qpid.server.protocol.v0_8.AMQPConnection_0_8$1.run(AMQPConnection_0_8.java:266)
>  at 
> org.apache.qpid.server.protocol.v0_8.AMQPConnection_0_8$1.run(AMQPConnection_0_8.java:258)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at 
> org.apache.qpid.server.protocol.v0_8.AMQPConnection_0_8.received(AMQPConnection_0_8.java:257)
>  at 
> org.apache.qpid.server.transport.MultiVersionProtocolEngine.received(MultiVersionProtocolEngine.java:142)
>  at 
> org.apache.qpid.server.transport.NonBlockingConnection.processAmqpData(NonBlockingConnection.java:547)
>  at 
> org.apache.qpid.server.transport.NonBlockingConnectionPlainDelegate.processData(NonBlockingConnectionPlainDelegate.java:58)
>  at 
> org.apache.qpid.server.transport.NonBlockingConnection.doRead(NonBlockingConnection.java:446)
>  at 
> org.apache.qpid.server.transport.NonBlockingConnection.doWork(NonBlockingConnection.java:253)
>  at 
> org.apache.qpid.server.transport.NetworkConnectionScheduler.processConnection(NetworkConnectionScheduler.java:108)
>  at 
> org.apache.qpid.server.transport.SelectorThread$ConnectionProcessor.processConnection(SelectorThread.java:499)
>  at 
> org.apache.qpid.server.transport.SelectorThread$SelectionTask.performSelect(SelectorThread.java:337)
>  at 
> org.apache.qpid.server.transport.SelectorThread$SelectionTask.run(SelectorThread.java:86)
>  at 
> org.apache.qpid.server.transport.SelectorThread.run(SelectorThread.java:457)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  at java.lang.Thread.run(Thread.java:745)
> {noformat}
> It seems that majority was lost in the middle of transaction. Qpid HA logic 
> detected that and restarted the environment, however, the transaction was not 
> aborted.
> On closing of the environment the following has been reported:
> {noformat}
> Problem: 3 locks left
> ---- LSN: 0x0/0x2d80----
>  ThinLockAddr:1794979774 Owner:1919034342 
> -50_Selector-virtualhost-test-iopool_MasterTxn Waiters: (none)
> ---- LSN: 0x0/0x1ed8----
>  ThinLockAddr:374788562 Owner:1919034342 
> -50_Selector-virtualhost-test-iopool_MasterTxn Waiters: (none)
> ---- LSN: 0x0/0x20db----
>  ThinLockAddr:501824883 Owner:1919034342 
> -50_Selector-virtualhost-test-iopool_MasterTxn Waiters: (none)
> [BROKER-1] o.a.q.t.u.SpawnedBrokerHolder
> Problem: 1 txns left
> txnId = -50 txnName = 1919034342 
> -50_Selector-virtualhost-test-iopool_MasterTxn
> Local Cache Usage = 240
> Cache Layout: Allocation of resources in the cache.
>  adminBytes=0
>  cacheTotalBytes=1,622,051
>  dataAdminBytes=0
>  dataBytes=0
>  lockBytes=240
>  sharedCacheTotalBytes=0
> WARN  
> [Environment-test:nodetestTransferMasterWhilstTransactionsInFlight10001] 
> o.a.q.s.s.b.r.ReplicatedEnvironmentFacade Ignoring an exception whilst 
> closing environment
> com.sleepycat.je.EnvironmentFailureException: (JE 5.0.104) 
> nodetestTransferMasterWhilstTransactionsInFlight10001(1):/tmp/qpid-work-org.apache.qpid.server.store.berkeleydb.replication.MultiNodeTest.testTransferMasterWhilstTransactionsInFlight-1-8967653656689695718/test/config
>  Problem closing handle 
> nodetestTransferMasterWhilstTransactionsInFlight10001(1) 
> UNEXPECTED_EXCEPTION: Unexpected internal Exception, may have side effects. 
> Environment is invalid and must be closed.
>         at 
> com.sleepycat.je.rep.ReplicatedEnvironment.close(ReplicatedEnvironment.java:709)
>  ~[je-5.0.104.jar:5.0.104]
>         at 
> org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.closeEnvironmentOnRestart(ReplicatedEnvironmentFacade.java:1277)
>  [qpid-bdbstore-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
>         at 
> org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.restartEnvironment(ReplicatedEnvironmentFacade.java:1253)
>  [qpid-bdbstore-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
>         at 
> org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.access$500(ReplicatedEnvironmentFacade.java:92)
>  [qpid-bdbstore-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
>         at 
> org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade$3.run(ReplicatedEnvironmentFacade.java:481)
>  [qpid-bdbstore-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_80]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_80]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> Caused by: java.lang.IllegalStateException: Unclosed Database: MESSAGE_CONTENT
> Unclosed Database: MESSAGE_CONTENT
> Databases left open: 2
> There is 1 existing transaction opened against the Environment.
> Aborting open transactions ...
> aborting <Transaction id="-46">
>         at com.sleepycat.je.Environment.close(Environment.java:390) 
> ~[je-5.0.104.jar:5.0.104]
>         at 
> com.sleepycat.je.rep.ReplicatedEnvironment.close(ReplicatedEnvironment.java:702)
>  ~[je-5.0.104.jar:5.0.104]
>         ... 7 common frames omitted
> {noformat}
> It appears we need to abort all transactions whilst restarting the environment



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to