[
https://issues.apache.org/jira/browse/SOLR-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated SOLR-8914:
---------------------------
Description:
Jenkins encountered a failure in TestTolerantUpdateProcessorCloud over the
weekend...
{noformat}
http://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Solaris/32/consoleText
Checking out Revision c46d7686643e7503304cb35dfe546bce9c6684e7
(refs/remotes/origin/branch_6x)
Using Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseG1GC
{noformat}
The failure happened during the static setup of the test, when a
MiniSolrCloudCluster & several clients are initialized -- before any code
related to TolerantUpdateProcessor is ever used.
I can't reproduce this, or really make sense of what I'm (not) seeing here in
the logs, so I'm filing this Jira with my analysis in the hopes that someone
else can help make sense of it.
Summary: ZkStateReader's refreshLiveNodes(Watcher) is not thread safe
(was: inexplicable "no servers hosting shard: shard2" using
MiniSolrCloudCluster)
Updating summary.
I suspect we either need to move the {{zkClient.getChildren(...)}} call inside
the existing {{synchronized (getUpdateLock())}} block, or the entire
{{refreshLiveNodes(Watcher watcher)}} method needs to synchronize on some new
"liveNodesLock".
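
To make the suspected race concrete, here is a minimal sketch of the first
option (fetching inside the lock). It is not the actual ZkStateReader code:
the class LiveNodesSketch, the Supplier field standing in for
{{zkClient.getChildren(...)}}, and both method names are hypothetical and
exist only for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical stand-in for the live-nodes handling; NOT the real Solr class. */
class LiveNodesSketch {
    private final Object updateLock = new Object();
    // Assumption: stands in for zkClient.getChildren(...) on the live_nodes path.
    private final Supplier<List<String>> fetchChildren;
    private volatile Set<String> liveNodes = Collections.emptySet();

    LiveNodesSketch(Supplier<List<String>> fetchChildren) {
        this.fetchChildren = fetchChildren;
    }

    /**
     * Racy shape: the fetch happens outside the lock, so thread A can fetch an
     * older list, thread B can fetch and store a newer one, and then A can
     * overwrite B's result with stale data.
     */
    void refreshLiveNodesRacy() {
        List<String> children = fetchChildren.get();
        synchronized (updateLock) {
            liveNodes = new HashSet<>(children);
        }
    }

    /** Fetch and store under the same lock, so refreshes cannot interleave. */
    void refreshLiveNodesSafe() {
        synchronized (updateLock) {
            liveNodes = new HashSet<>(fetchChildren.get());
        }
    }

    Set<String> getLiveNodes() {
        return liveNodes;
    }
}
```

The second option (a dedicated "liveNodesLock" around the whole method) would
have the same shape as refreshLiveNodesSafe, just with a different monitor
object.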
> ZkStateReader's refreshLiveNodes(Watcher) is not thread safe
> ------------------------------------------------------------
>
> Key: SOLR-8914
> URL: https://issues.apache.org/jira/browse/SOLR-8914
> Project: Solr
> Issue Type: Bug
> Reporter: Hoss Man
> Attachments: jenkins.thetaphi.de_Lucene-Solr-6.x-Solaris_32.log.txt,
> live_node_mentions_port56361_with_threadIds.log.txt,
> live_nodes_mentions.log.txt
>
>
> Jenkins encountered a failure in TestTolerantUpdateProcessorCloud over the
> weekend...
> {noformat}
> http://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Solaris/32/consoleText
> Checking out Revision c46d7686643e7503304cb35dfe546bce9c6684e7
> (refs/remotes/origin/branch_6x)
> Using Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseG1GC
> {noformat}
> The failure happened during the static setup of the test, when a
> MiniSolrCloudCluster & several clients are initialized -- before any code
> related to TolerantUpdateProcessor is ever used.
> I can't reproduce this, or really make sense of what I'm (not) seeing here in
> the logs, so I'm filing this Jira with my analysis in the hopes that someone
> else can help make sense of it.