Note this is similar to the flakey test mechanism, with the primary difference being that the re-run is done in a minimal CPU load environment rather than wherever the failure first occurred. The existing flakey test rerun logic is not helpful for the high-load-induced failures that I'm looking to handle.
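For illustration, here is a minimal Python sketch of the two-phase idea: a concurrent first pass, then a serialized re-run of only the first-pass failures. It is not the actual test runner code. The names `run_with_low_load_rerun`, `run_test`, and `workers` are hypothetical, and a thread pool stands in for whatever worker mechanism the real runner uses.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_low_load_rerun(tests, run_test, workers):
    """Run tests concurrently, then re-run first-pass failures one at a time.

    tests:    list of test names.
    run_test: hypothetical callable taking a test name, returning True on pass.
    workers:  number of concurrent workers for the first (high-load) phase.
    Returns (passed_first, passed_on_rerun, failed_both) as lists of names.
    """
    # Phase 1: concurrent run, i.e. the high-load environment.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        first_results = list(pool.map(run_test, tests))

    passed_first = [t for t, ok in zip(tests, first_results) if ok]
    first_failures = [t for t, ok in zip(tests, first_results) if not ok]

    # Phase 2: the pool has shut down, so tests now run one at a time.
    passed_on_rerun, failed_both = [], []
    for test in first_failures:
        if run_test(test):
            passed_on_rerun.append(test)
        else:
            failed_both.append(test)

    return passed_first, passed_on_rerun, failed_both
```

In this sketch only the tests in `failed_both` would be reported as real failures. Tests in `passed_on_rerun` failed under load but passed in the low-load phase, so they could be reported separately.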
On Fri, Nov 27, 2015 at 10:56 AM, Todd Fiala <todd.fi...@gmail.com> wrote:

> Hi all,
>
> On OS X (and frankly on Linux sometimes as well, but predominantly OS X),
> we have tests that will sometimes fail when under significant load (e.g.
> running the concurrent test suite, exacerbated if we crank up the number of
> threads, but bad enough if we run at "number of concurrent workers ==
> number of logical cores").
>
> I'm planning on adding a serialized, one-worker-only phase to the end of
> the concurrent test run, where the load is much lighter since only one
> worker will be processing at that phase. Then, for tests that fail in the
> first run, I'd re-run them in the serialized, single worker test run
> phase. On the OS X side, this would eliminate a significant number of test
> failures that are both hard to diagnose and hard to justify spending
> significant amounts of time on in the short run. (There's a whole other
> conversation to have about fixing them for real, i.e. working through all
> the race and/or faulty test logic assumptions that are stressed to the max
> under heavier load, but practically speaking, there are so many of them
> that this is going to be impractical to address in the short/mid term.)
>
> My question to all of you is if we'd want this functionality in top of
> tree llvm.org lldb. If not, I'll do it in one of our branches. If so,
> we can talk about possibly having a category or some other mechanism if we
> want to mark those tests that are eligible to be run in the follow-up
> serialized, low-load pass. Up front I was just going to allow any test to
> fall into that bucket. The one benefit to having it in top of tree
> llvm.org is that, once I enable test reporting on the green dragon public
> llvm.org OS X LLDB builder, that builder will be able to take advantage
> of this, and will most certainly tag fewer changes as breaking a test (in
> the case where the test is just one of the many that fail under high load).
>
> Let me know your thoughts either way.
>
> Thanks!
> --
> -Todd
>

--
-Todd
_______________________________________________
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev