[
https://issues.apache.org/jira/browse/SOLR-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15317085#comment-15317085
]
Hoss Man commented on SOLR-9189:
--------------------------------
sarowe reminded me offline about the "buildTimeTrend" feature of jenkins --
the ASF jenkins machines have only been running tests about once a day, so
it's hard to spot an obvious pattern there, but uwe & sarowe's jenkins machines
have been hammering on tests a lot more often, and you can really spot a trend...
http://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/buildTimeTrend
http://jenkins.thetaphi.de/view/All/job/Lucene-Solr-6.x-Linux/buildTimeTrend
http://jenkins.sarowe.net/job/Lucene-Solr-tests-master/buildTimeTrend
http://jenkins.sarowe.net/job/Lucene-Solr-tests-6.x/buildTimeTrend
...from sarowe's master job, build #7028 was the first build in a while to go
over 20 minutes, and from that point on builds were reliably over 40 minutes
until build #7035, which dropped down to 10 minutes....
* http://jenkins.sarowe.net/job/Lucene-Solr-tests-master/7028/
** 1e2ba9fe9be84f0b5defe4965735eae892fabf7b
** "Jun 4, 2016 7:14:24 AM"
** changes:
*** Revert "SOLR-9181: Fix test bug in ZkStateReaderTest" (detail)
* http://jenkins.sarowe.net/job/Lucene-Solr-tests-master/7035/
** c8570ed821654cdce5f92ae17d06a21f242524e2
** "Jun 6, 2016 1:08:05 PM"
** changes:
*** Revert "SOLR-9140: Replace some zk state polling with (detail)
*** LUCENE-7132: BooleanQuery sometimes assigned the wrong score when ranges (detail)
...that means the slowdown didn't hit jenkins master until 3 days *after* i
committed SOLR-9107 to that branch -- but it did start right when a
SOLR-9181 commit happened. Likewise the build #7035 speedup was *before* my
SOLR-9189 commit to disable randomized ssl testing on master completely --
and again, coincided with a SOLR-9140 commit.
[~romseygeek] - definitely want to draw your attention to this issue -- your
recent commits may have resolved the slowdowns (at least on master), but i
want to make sure you're aware of the situation.
> explosion of timeout related failures in jenkins the past few days
> ------------------------------------------------------------------
>
> Key: SOLR-9189
> URL: https://issues.apache.org/jira/browse/SOLR-9189
> Project: Solr
> Issue Type: Bug
> Reporter: Hoss Man
> Assignee: Hoss Man
> Priority: Critical
>
> In the past few days, something has gone seriously wonky with our jenkins
> tests -- causing a serious explosion in the number of test failures --
> notably due to various sorts of timeouts...
> * "Unable to create core ... Timed out getting coreNodeName for ..."
> * "msg=SolrCore is loading,code=503"
> * "Timeout occured while waiting response from server"
> * "No registered leader was found after waiting for 30000ms"
> * "Unable to create core ... Caused by: Timed out getting shard id for core:
> ..."