I haven't tried yet here, but on other occasions I've found the slow or hanging tests I was after, yeah.
I haven't spent any time testing the master branch with the whole suite recently.

On Thu, Jun 26, 2014 at 4:59 PM, Mikhail Antonov <[email protected]> wrote:

> And if you disable forking completely, do the tests pass for you always, or
> do they also fail intermittently?
>
>
> 2014-06-26 15:59 GMT-07:00 Andrew Purtell <[email protected]>:
>
> > Additionally we run unit tests in parallel to reduce the total time
> > required for test suite execution. Surefire will fork multiple JVMs,
> > dynamically generate test jars containing a subset of tests, and run them.
> > That can make isolating hanging tests difficult but this behavior can be
> > influenced by defines on the Maven command line. For example, to fork a
> > process for every single unit test:
> >
> > mvn test -Dsurefire.firstPartForkMode=always -Dsurefire.secondPartForkMode=always
> >
> > And then if you find a hanging surefire runner, you can dump thread stacks
> > of that JVM and know that only the unit test whose methods you find in the
> > stacks contributed to the current wedged state.
> >
> >
> > On Thu, Jun 26, 2014 at 3:48 PM, Andrew Purtell <[email protected]>
> > wrote:
> >
> > > Java 7u60 64-bit on an EC2 m3.4xlarge. Just running the unit test suite in
> > > a loop. I don't set any special Maven options in MVN_OPTS or anything like
> > > that.
> > >
> > > Historically, failures that occur when the suite executes but not when
> > > individual tests run happen because one test does not shut down in a
> > > timely manner, or at all, and a subsequent test might use the same
> > > hardcoded path or port. When that happens we have a sporadic and sometimes
> > > load sensitive failure. Complicating matters, each time one clones a
> > > repository on a different host or filesystem JUnit may pick up a different
> > > test order, influenced by whatever readdir hands back for each package.
> > >
> > >
> > > On Thu, Jun 26, 2014 at 3:25 PM, Mikhail Antonov <[email protected]>
> > > wrote:
> > >
> > >> Andrew,
> > >>
> > >> Could you share some details - on what env. you're running the tests, and
> > >> at which point do they fail? I'm curious because lately I'm seeing weird
> > >> failures on current master too, which do not happen on hadoop-qa -
> > >> individual tests always pass, but when running the suite tests either get
> > >> stuck and time out (at roughly the same point), or fail with NPE or PermGen
> > >> exception. I've been blaming my environment first, but maybe it's
> > >> something related.
> > >>
> > >> -Mikhail
> > >>
> > >>
> > >> 2014-06-26 13:39 GMT-07:00 Andrew Purtell <[email protected]>:
> > >>
> > >> > I'm finding that repeated runs of the unit test suite at the head of branch
> > >> > 0.98 intermittently fail. Individual tests do not, so this is likely a
> > >> > lagging shutdown, port/resource conflict, and/or zombie test issue. I am
> > >> > currently bisecting commits on 0.98 branch since the last release in the
> > >> > hope of pinning this down to a single change. Depending on how quickly that
> > >> > can happen, the RC might happen on Monday or not. As things stand at the
> > >> > head of the branch, I'd not +1 the RC given the release criteria I've been
> > >> > using up to now.
> > >> >
> > >
> > --
> > Best regards,
> >
> > - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
>
> --
> Thanks,
> Michael Antonov

--
Best regards,

- Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
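
Editorial sketch, not part of the original messages: the thread-stack check described in the quoted 15:59 message could be partly automated along the following lines. It assumes the dump is ordinary JVM thread-dump text (for example, saved from jstack), and that test classes follow a "Test" class-name prefix convention. The function name, the sample dump, and the org.example package are hypothetical and only for illustration.

```python
import re

# Matches the class part of a Java stack frame such as
#   "at org.example.TestFoo.testBar(TestFoo.java:42)"
FRAME = re.compile(r"^\s*at\s+([\w.$]+)\.[\w$<>]+\(", re.MULTILINE)


def suspect_test_classes(dump_text):
    """Return the sorted test class names that appear in any stack frame.

    Assumption: test classes are named with a "Test" prefix.
    """
    found = set()
    for qualified in FRAME.findall(dump_text):
        simple = qualified.rsplit(".", 1)[-1]
        # Fold inner/anonymous classes (TestFoo$1) into the outer class.
        outer = simple.split("$", 1)[0]
        if outer.startswith("Test"):
            found.add(qualified.split("$", 1)[0])
    return sorted(found)


# Hypothetical dump fragment, made up for this example.
SAMPLE = '''"main" prio=10 tid=0x1 nid=0x2 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (sleeping)
\tat java.lang.Thread.sleep(Native Method)
\tat org.example.TestSlowShutdown.testStop(TestSlowShutdown.java:42)
\tat org.example.TestSlowShutdown$1.run(TestSlowShutdown.java:77)
\tat org.junit.runners.ParentRunner.run(ParentRunner.java:309)
'''

if __name__ == "__main__":
    for name in suspect_test_classes(SAMPLE):
        print(name)
```

With one forked JVM per test, as in the mvn command quoted above, a wedged runner's dump should name a single test class; with shared forks, the list may contain several candidates.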
