On Thu, 3 Oct 2024, Gus Heck wrote:

: The failures I saw when I downloaded a couple logs centered on threads not
: terminated. Perhaps Uwe's box is so overloaded that the shutdown process
: for those tests takes too long and the test fails instead?

They are specifically coming from Uwe's box when using the openJ9 JVM.

Cross posting from another thread last week...

Date: Mon, 14 Oct 2024 12:54:10 -0700 (MST)
From: Chris Hostetter <hossman_luc...@fucit.org>
To: dev@solr.apache.org
Subject: Re: [JENKINS] Solr-main-Linux (64bit/openj9/jdk-17.0.8) - Build # 20654 - Still Unstable!
Message-ID: <alpine.DEB.2.21.2410141247570.20696@slate>

Uwe: We've been seeing an epic number of these types of TimerThread "leak"
failures from your jenkins box in the past few weeks -- all that i've seen
have a thread name "file lock watchdog" and seem to be run on "openj9"

Are these possibly related to an openJ9 upgrade you made to your boxes ?
... maybe back in september?

Some random googling suggests the sysprop below might be useful to disable
this watchdog -- can you try setting this in your jenkins gradle command
options?

-Dcom.ibm.tools.attach.useFileLockWatchdog=false

https://github.com/eclipse-openj9/openj9/commit/f40f665db811f7686dd61d32c1e7c140ab35d78a


:
: On Thu, Oct 3, 2024 at 3:47 PM David Smiley <dsmi...@apache.org> wrote:
:
: > Relying on people to go look at CI out of the goodness of our hearts is a
: > losing strategy. Our contributors don't even know where that is! There
: > needs to be a trigger to do so ideally something personalized -- a build
: > failure with recent changes that *you* included. Or instead a post/comment
: > on linked JIRA or PR -- gets contributor involvement even if the Git
: > metadata lacks a real email address.
: >
: > On Thu, Oct 3, 2024 at 2:46 PM Houston Putman <hous...@apache.org> wrote:
: >
: > > The failures generally seem to be coming from Uwe's boxes, and I cannot
: > > reproduce them locally. The crossDc ones do seem to be failing a lot, but
: > > when they fail, it looks like they aren't failing alone. I will continue to
: > > do research on it though.
: > >
: > > Our tests are extremely flakey right now, so it's definitely something we
: > > need to clean up quickly. Thanks for pointing it out.
: > >
: > > - Houston
: > >
: > > On Thu, Oct 3, 2024 at 12:22 PM Gus Heck <gus.h...@gmail.com> wrote:
: > >
: > > > I went to the fucit jenkins reports site to check on the state of the build
: > > > after my recent commit to make sure all was well, but when I got there I
: > > > was greeted with several weeks of extremely frequent test failures and in
: > > > the last 2 weeks we seem to have gained several 100% failures (that clearly
: > > > preceded my commit).
: > > >
: > > > http://fucit.org/solr-jenkins-reports/failure-report.html
: > > >
: > > > This appears to be on fire.
: > > >
: > > > Clear culprits include the addition of the crossdc module and some problems
: > > > with lucene back compatibility indexes. There also seems to be a big uptick
: > > > in recovery related failures.
: > > >
: > > > It would be nice if one could filter fucit somehow to see only lucene or
: > > > only solr, though I imagine that's not a minor undertaking
: > > >
: > > > -Gus
: > > >
: > > > --
: > > > http://www.needhamsoftware.com (work)
: > > > https://a.co/d/b2sZLD9 (my fantasy fiction book)
: > > >
: > >
: >
:
:
: --
: http://www.needhamsoftware.com (work)
: https://a.co/d/b2sZLD9 (my fantasy fiction book)
:

-Hoss
http://www.lucidworks.com/