[ 
https://issues.apache.org/jira/browse/SOLR-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-8914:
---------------------------
    Attachment: SOLR-8914.patch

i've updated the patch to clean up the test a bit -- besides some cosmetic stuff 
it now does more iterations of smaller "bursts" with more variability in the 
number of threads used in each burst (which should increase the odds of it 
failing, eventually, on different machines regardless of CPU count).
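
To illustrate the "burst" idea only (this is not the code from the patch): a minimal sketch, assuming a thread-safe set as a stand-in for the shared live nodes state. The names BurstStressSketch, runBursts and liveNodes are hypothetical and nothing here uses Solr or ZooKeeper APIs. Each iteration picks a random thread count, releases all threads at once through a latch, and then checks the shared state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class BurstStressSketch {

  /**
   * Runs numIters bursts. Each burst uses a random number of threads
   * (1..maxThreads), releases them at the same moment, and then checks the
   * shared state. Returns how many bursts ended in an unexpected state.
   */
  public static int runBursts(int numIters, int maxThreads, long seed) throws Exception {
    Random random = new Random(seed);
    int failures = 0;
    for (int iter = 0; iter < numIters; iter++) {
      // vary the thread count per burst so contention differs between iterations
      final int numThreads = 1 + random.nextInt(maxThreads);
      // hypothetical stand-in for the shared state under test
      final Set<String> liveNodes = ConcurrentHashMap.newKeySet();
      final CountDownLatch startGate = new CountDownLatch(1);
      ExecutorService executor = Executors.newFixedThreadPool(numThreads);
      try {
        List<Future<?>> futures = new ArrayList<>();
        for (int t = 0; t < numThreads; t++) {
          final String nodeName = "iter" + iter + "_thread" + t;
          futures.add(executor.submit(() -> {
            startGate.await();       // block until every thread in the burst is queued
            liveNodes.add(nodeName); // the concurrent action being stressed
            return null;
          }));
        }
        startGate.countDown();       // release the whole burst at once
        for (Future<?> f : futures) {
          f.get(30, TimeUnit.SECONDS); // surfaces any exception thrown in a worker
        }
      } finally {
        executor.shutdownNow();
      }
      // every thread added a distinct name, so the set size must match the thread count
      if (liveNodes.size() != numThreads) {
        failures++;
      }
    }
    return failures;
  }

  public static void main(String[] args) throws Exception {
    System.out.println("bursts with unexpected state: " + runBursts(50, 16, 42L));
  }
}
```

Because the stand-in set is thread safe, this sketch should never count a failure; swapping in a non-thread-safe structure is what would let the bursts expose a race.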

bq. I'm beasting your latest patch too, I'll report anything that comes up. 
Just to make sure, I should be beasting StressTestLiveNodes, right?

TestStressLiveNodes, but otherwise yes.

It would also be helpful to know if (and how quickly) you can get 
TestStressLiveNodes to fail on your machine when beasting w/o the rest of the 
patch (so far i'm the only one that's been able to confirm the bug in practice 
w/o Scott's patch -- hopefully these changes increase those odds).

> ZkStateReader's refreshLiveNodes(Watcher) is not thread safe
> ------------------------------------------------------------
>
>                 Key: SOLR-8914
>                 URL: https://issues.apache.org/jira/browse/SOLR-8914
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>         Attachments: SOLR-8914.patch, SOLR-8914.patch, SOLR-8914.patch, 
> SOLR-8914.patch, jenkins.thetaphi.de_Lucene-Solr-6.x-Solaris_32.log.txt, 
> live_node_mentions_port56361_with_threadIds.log.txt, 
> live_nodes_mentions.log.txt
>
>
> Jenkins encountered a failure in TestTolerantUpdateProcessorCloud over the 
> weekend...
> {noformat}
> http://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Solaris/32/consoleText
> Checking out Revision c46d7686643e7503304cb35dfe546bce9c6684e7 
> (refs/remotes/origin/branch_6x)
> Using Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseG1GC
> {noformat}
> The failure happened during the static setup of the test, when a 
> MiniSolrCloudCluster & several clients are initialized -- before any code 
> related to TolerantUpdateProcessor is ever used.
> I can't reproduce this, or really make sense of what i'm (not) seeing here in 
> the logs, so i'm filing this jira with my analysis in the hopes that someone 
> else can help make sense of it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

