[ https://issues.apache.org/jira/browse/SOLR-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705126#comment-16705126 ]
Hoss Man commented on SOLR-12932:
---------------------------------

Although it doesn't seem like there has been an Apache Jenkins build of master since Mark's commit 75b1831967982, there have been 3 jenkins.thetaphi.de builds of master, and 1 jenkins.sarowe.net build of master...

* jenkins.sarowe.net
** Prior to Mark's commit, the most recent master build had 2 test failures
*** [http://fucit.org/solr-jenkins-reports/job-data/sarowe/Lucene-Solr-tests-master/19410/]
** The single build since Mark's commit had no failures...
*** [http://fucit.org/solr-jenkins-reports/job-data/sarowe/Lucene-Solr-tests-master/19411/]
* jenkins.thetaphi.de
** Prior to Mark's commit, the most recent master build coincidentally succeeded w/o any test failures.
*** [http://fucit.org/solr-jenkins-reports/job-data/thetaphi/Lucene-Solr-master-Linux/23276/]
** All 3 builds of master since that commit have had 100+ suite level failures (although it should be noted they all used different JVMs)
*** [http://fucit.org/solr-jenkins-reports/job-data/thetaphi/Lucene-Solr-master-Linux/23277/]
*** [http://fucit.org/solr-jenkins-reports/job-data/thetaphi/Lucene-Solr-master-Linux/23278/]
*** [http://fucit.org/solr-jenkins-reports/job-data/thetaphi/Lucene-Solr-master-Linux/23279/]

----

Skimming the logs from jenkins.thetaphi.de #23277, the suite failures mostly seem to fall into 2 types...

# Thread Leaks
** SolrRrdBackendFactory-* and MetricsHistoryHandler-* threads seem to frequently leak in "pairs", i.e. it's very common to see 2 threads leaked (one of each), but also 4 threads leaked (2 of each), etc...
** There were other threads leaked in some tests to various degrees, not enough for a pattern to be obvious
# Object Leaks
** Notably instances of [ZkStateReader, SolrZkClient], also typically in pairs (if 2 objects leaked, one of each; if 24 objects leaked, 12 of each)
** There were a handful of object leak failures that sometimes included other objects: in particular some large lists w/multiple SolrCore objects being leaked, and other objects you'd expect to see hanging off of a SolrCore

Hypothesis: I know Mark has mentioned cleaning up a lot of "sleep" and "wait" type logic in tests. I'm guessing that doing this has exposed some "shutdown" logic bugs that in the past weren't as obvious, because "slow" jenkins machines were waiting longer for other things due to hardcoded "waits" and were getting lucky that the threads/objects were being cleaned up in a timely manner.

> ant test (without badapples=false) should pass easily for developers.
> ---------------------------------------------------------------------
>
>                 Key: SOLR-12932
>                 URL: https://issues.apache.org/jira/browse/SOLR-12932
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: Tests
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>            Priority: Major
>
> If we fix the tests we will end up here anyway, but we can shortcut this.
> Once I get my first patch in, anyone who mentions a test that fails locally
> for them at any time (not jenkins), I will fix it.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)