Jessica Cheng Mallet created SOLR-6631:
------------------------------------------
Summary: DistributedQueue spinning on calling zookeeper getChildren()
Key: SOLR-6631
URL: https://issues.apache.org/jira/browse/SOLR-6631
Project: Solr
Issue Type: Bug
Components: SolrCloud
Reporter: Jessica Cheng Mallet
The change from SOLR-6336 introduced a bug: I'm now stuck in a loop making
getChildren() requests to ZooKeeper, as shown in this thread dump:
{quote}
Thread-51 [WAITING] CPU time: 1d 15h 0m 57s
java.lang.Object.wait()
org.apache.zookeeper.ClientCnxn.submitRequest(RequestHeader, Record, Record,
ZooKeeper$WatchRegistration)
org.apache.zookeeper.ZooKeeper.getChildren(String, Watcher)
org.apache.solr.common.cloud.SolrZkClient$6.execute()<2 recursive calls>
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkOperation)
org.apache.solr.common.cloud.SolrZkClient.getChildren(String, Watcher, boolean)
org.apache.solr.cloud.DistributedQueue.orderedChildren(Watcher)
org.apache.solr.cloud.DistributedQueue.getChildren(long)
org.apache.solr.cloud.DistributedQueue.peek(long)
org.apache.solr.cloud.DistributedQueue.peek(boolean)
org.apache.solr.cloud.Overseer$ClusterStateUpdater.run()
java.lang.Thread.run()
{quote}
Looking at the code, I think the issue is that LatchChildWatcher#process always
stores the incoming event in its member variable event, regardless of the
event's type. Once that member is set, await no longer waits. In this state,
the while loop in getChildren(long), when called with wait set to
Long.MAX_VALUE, loops back and does NOT block at await because event != null,
yet it still finds no children, so it keeps calling getChildren() on ZooKeeper.
{quote}
while (true) {
  if (!children.isEmpty()) break;
  watcher.await(wait == Long.MAX_VALUE ? DEFAULT_TIMEOUT : wait);
  if (watcher.getWatchedEvent() != null) {
    children = orderedChildren(null);
  }
  if (wait != Long.MAX_VALUE) break;
}
{quote}
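To make the spin concrete, here is a minimal, dependency-free sketch of the behaviour described above. It is not the Solr code: StickyLatch and countAwaits are hypothetical names, and the latch only models the assumption that await returns immediately once any event has been stored.

```java
class LatchSpinSketch {
    /** Hypothetical stand-in for the watcher described above; not the Solr class. */
    static class StickyLatch {
        private final Object lock = new Object();
        private Object event; // never cleared once set

        void process(Object e) {
            synchronized (lock) {
                event = e;        // stored regardless of what kind of event it is
                lock.notifyAll();
            }
        }

        void await(long timeoutMs) {
            synchronized (lock) {
                if (event != null) return; // already set: returns without waiting
                try {
                    lock.wait(timeoutMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }

    /** Counts how many await() calls complete within the given time budget. */
    static int countAwaits(StickyLatch latch, long budgetMs) {
        int calls = 0;
        long deadline = System.nanoTime() + budgetMs * 1_000_000L;
        while (System.nanoTime() < deadline) {
            latch.await(50); // mirrors the await-with-timeout call in the loop
            calls++;
        }
        return calls;
    }
}
```

Calling countAwaits before any event arrives should complete only a handful of iterations, since each await blocks for its timeout. After process() has stored any event, the same call runs as a tight loop, which corresponds to the repeated getChildren() requests in the thread dump.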
I think the fix would be to set the event in the watcher only if its type is
NodeChildrenChanged.
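A rough sketch of that idea, under stated assumptions: the EventType enum below is a local stand-in that mirrors the names in ZooKeeper's Watcher.Event.EventType so the example runs without the ZooKeeper jar, and FilteringLatch is a hypothetical class, not the actual LatchChildWatcher or the patch that was eventually committed.

```java
class FilteringLatchSketch {
    /** Local stand-in for ZooKeeper's Watcher.Event.EventType (assumption for this sketch). */
    enum EventType { None, NodeCreated, NodeDeleted, NodeDataChanged, NodeChildrenChanged }

    /** Hypothetical watcher that records only child-change events. */
    static class FilteringLatch {
        private final Object lock = new Object();
        private EventType event; // stays null until a NodeChildrenChanged arrives

        void process(EventType type) {
            synchronized (lock) {
                if (type == EventType.NodeChildrenChanged) {
                    event = type;     // only this type releases waiters
                    lock.notifyAll();
                }
                // Other event types (e.g. session state changes) are ignored here.
            }
        }

        EventType getWatchedEvent() {
            synchronized (lock) {
                return event;
            }
        }
    }
}
```

With this filter, an unrelated event such as None leaves getWatchedEvent() returning null, so a caller like the loop quoted above would keep blocking in await rather than re-reading the children on every pass.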