[
https://issues.apache.org/jira/browse/QPID-7695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903081#comment-15903081
]
Keith Wall commented on QPID-7695:
----------------------------------
Qpid is configuring {{DbPing}} correctly with both a connectTimeout and
sockTimeout. The defect is actually in BDB JE (5.0.104).
{{com.sleepycat.je.rep.utilint.RepUtils#openBlockingChannel}} configures
{{setSoTimeout}}, but then goes on to use
{{java.nio.channels.SocketChannel#read(java.nio.ByteBuffer)}} to read the
protocol messages, and that read then hangs. The defect is that the JE code
assumes that SocketChannel#read() is subject to the socket timeout. This is not
the case.
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4614802
I have not checked the newer releases to see if they too have the same issue.
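
To illustrate the behaviour described above, here is a minimal, self-contained sketch. It is not the JE code; the class and method names ({{SoTimeoutDemo}}, {{adapterReadTimesOut}}, {{directReadStillBlocked}}) are made up for this example. It connects to a local peer that accepts the connection but never writes, sets SO_TIMEOUT on the client socket, and then compares a read through the socket's stream adapter with a direct SocketChannel#read().

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class; none of these names come from Qpid or BDB JE.
class SoTimeoutDemo {

    /** True if a read via socket().getInputStream() times out against a silent peer. */
    public static boolean adapterReadTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel silentPeer = server.accept()) {
                client.socket().setSoTimeout(timeoutMillis);
                // Reading through the stream adapter honours SO_TIMEOUT.
                ReadableByteChannel in =
                        Channels.newChannel(client.socket().getInputStream());
                try {
                    in.read(ByteBuffer.allocate(16));
                    return false; // unexpected: the peer never writes
                } catch (SocketTimeoutException expected) {
                    return true;
                }
            }
        }
    }

    /** True if a direct SocketChannel#read() is still blocked well past SO_TIMEOUT. */
    public static boolean directReadStillBlocked(int timeoutMillis)
            throws IOException, InterruptedException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel silentPeer = server.accept()) {
                client.socket().setSoTimeout(timeoutMillis);
                Thread reader = new Thread(() -> {
                    try {
                        // Direct channel read: SO_TIMEOUT is not applied here.
                        client.read(ByteBuffer.allocate(16));
                    } catch (IOException ignored) {
                        // Raised when the channel is closed during cleanup.
                    }
                });
                reader.setDaemon(true);
                reader.start();
                // Wait several times longer than the configured timeout.
                reader.join(timeoutMillis * 5L);
                return reader.isAlive();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("adapter read timed out: " + adapterReadTimesOut(200));
        System.out.println("direct read still blocked: " + directReadStillBlocked(200));
    }
}
```

Wrapping {{socket().getInputStream()}} with {{Channels.newChannel}} is one possible way to get a timed read on a blocking channel; whether that suits the JE handshake code is an assumption I have not verified.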
> [HA] Indefinite hang when new node joins existing group but existing node is
> unresponsive
> -----------------------------------------------------------------------------------------
>
> Key: QPID-7695
> URL: https://issues.apache.org/jira/browse/QPID-7695
> Project: Qpid
> Issue Type: Bug
> Components: Java Broker
> Affects Versions: qpid-java-6.0.6, qpid-java-6.1.1, qpid-java-broker-7.0.0
> Reporter: Keith Wall
>
> When adding a new node to an existing group, internally Qpid uses
> com.sleepycat.je.rep.util.DbPing#DbPing() to establish initial contact with
> the node and perform some preliminary checks. If this node is somehow
> unresponsive, Qpid (the Broker's Config Thread) hangs indefinitely and is
> unrecoverable. BDB JE 5.0.104 is in use.
> The Broker Config thread stack trace looks like this:
> {noformat}
> java.lang.Thread.State: RUNNABLE
> at sun.nio.ch.FileDispatcherImpl.read0(FileDispatcherImpl.java:-1)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> at sun.nio.ch.IOUtil.read(IOUtil.java:197)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
> - locked <0x168d> (a java.lang.Object)
> at com.sleepycat.je.rep.utilint.ServiceDispatcher.doServiceHandshake(ServiceDispatcher.java:325)
> at com.sleepycat.je.rep.util.DbPing.getNodeState(DbPing.java:194)
> at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.getRemoteNodeState(ReplicatedEnvironmentFacade.java:1807)
> at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.connectToHelperNodeAndCheckPermittedHosts(ReplicatedEnvironmentFacade.java:1846)
> at org.apache.qpid.server.virtualhostnode.berkeleydb.BDBHAVirtualHostNodeImpl.getPermittedNodesFromHelper(BDBHAVirtualHostNodeImpl.java:566)
> at org.apache.qpid.server.virtualhostnode.berkeleydb.BDBHAVirtualHostNodeImpl.validateOnCreate(BDBHAVirtualHostNodeImpl.java:546)
> at org.apache.qpid.server.model.AbstractConfiguredObject$6.execute(AbstractConfiguredObject.java:878)
> at org.apache.qpid.server.model.AbstractConfiguredObject$6.execute(AbstractConfiguredObject.java:865)
> at org.apache.qpid.server.model.AbstractConfiguredObject$2.execute(AbstractConfiguredObject.java:636)
> at org.apache.qpid.server.model.AbstractConfiguredObject$2.execute(AbstractConfiguredObject.java:629)
> at org.apache.qpid.server.configuration.updater.TaskExecutorImpl$TaskLoggingWrapper.execute(TaskExecutorImpl.java:240)
> at org.apache.qpid.server.configuration.updater.TaskExecutorImpl.submitWrappedTask(TaskExecutorImpl.java:157)
> at org.apache.qpid.server.configuration.updater.TaskExecutorImpl.submit(TaskExecutorImpl.java:145)
> at org.apache.qpid.server.model.AbstractConfiguredObject.doOnConfigThread(AbstractConfiguredObject.java:628)
> at org.apache.qpid.server.model.AbstractConfiguredObject.createAsync(AbstractConfiguredObject.java:864)
> at org.apache.qpid.server.model.AbstractConfiguredObjectTypeFactory.createAsync(AbstractConfiguredObjectTypeFactory.java:75)
> at org.apache.qpid.server.model.ConfiguredObjectFactoryImpl.createAsync(ConfiguredObjectFactoryImpl.java:145)
> at org.apache.qpid.server.model.BrokerImpl.createVirtualHostNodeAsync(BrokerImpl.java:605)
> at org.apache.qpid.server.model.BrokerImpl.addChildAsync(BrokerImpl.java:660)
> at org.apache.qpid.server.model.AbstractConfiguredObject$17.execute(AbstractConfiguredObject.java:2094)
> at org.apache.qpid.server.model.AbstractConfiguredObject$17.execute(AbstractConfiguredObject.java:2089)
> at org.apache.qpid.server.model.AbstractConfiguredObject$2.execute(AbstractConfiguredObject.java:636)
> at org.apache.qpid.server.model.AbstractConfiguredObject$2.execute(AbstractConfiguredObject.java:629)
> at org.apache.qpid.server.configuration.updater.TaskExecutorImpl$TaskLoggingWrapper.execute(TaskExecutorImpl.java:240)
> at org.apache.qpid.server.configuration.updater.TaskExecutorImpl$CallableWrapper$1.run(TaskExecutorImpl.java:312)
> at java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:360)
> at org.apache.qpid.server.configuration.updater.TaskExecutorImpl$CallableWrapper.call(TaskExecutorImpl.java:305)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}