Hey guys,

thanks for your input. Kristian, I'd like to start with your concerns about
the threading in the forked process. With the code as it is now, you are
totally right, that threading in this LazyTestsToRun class is not really
required. It's a left-over from a previous attempt where the forked servers
did not /ask/ to receive more work, but were that was pushed to them
implicitly, with a special RunListener reacting on finished tests. The only
advantage now left is that with the decoupled threads we can react on a not
responding (hanging) main process, as I can easily observe a timeout with
different threads. Being stuck in an InputStream#read() method might get us
into trouble. But perhaps it's just paranoia and everything will unravel
once the parent process gets killed.

I guess another thing needs to be sorted out before deciding on whether to
drop the extra threading: and that is the TestNG provider. I don't yet have
an idea which way to go there. The current implementation requires all test
classes to be known upfront, in order for the provider to decide where to
store the test results: in case both JUnit and TestNG tests are present,
distinct directories will be created. Otherwise, they just land directly
test result directory. So it might have to be necessary to further
pre-process the test set in the parent process (which would require some
more API changes for the providers); to always hand over all the tests
known and to let each forked process somehow select a subset for the actual
execution; or maybe to decide that TestNG always uses special
sub-directories for JUnit results and TestNG result, regardless of whether
both types of test classes are present or not. (That currently unused
attribute multipleTestsForFork in ProviderConfiguration was meant to
somehow address the TestNG problem, but there's still some stuff to think
about first.)

And, one more thing, the current code does not support the concurrent
execution of test classes within one process (the parallel option). But I
guess that threaded handling of the input in LazyTestsToRun would not make
a difference there, either. ;) However, it might go in a direction similar
to the TestNG thing. But I'd prefer to leave out combining the parallel
option with the *forkperthread options for now... :)

1) Lots of hard to fix bugs will occur for users once you split
> execution across forked JVMs if there are dependencies between suite
> classes (pollution of static fields, left-over threads etc.). These
> are hard to debug enough if you have a predictable suite assignment
> (which can be done, see the runner's static assignment policy) but if
> you have dynamic job stealing the order is pretty much a race
> condition and the information which suites ran on which JVM (and in
> which order) is crucial.
>

If you have tests like that you end up having to use a new process for each
test class. That's like having to decide between forkMode once and always
in the single-threaded configuration. But yes, I've seen such tests as well
:).


>
> 2) Didn't look into Andreas's code and don't know how he solved the
> problem of communicating between JVMs. This is quite tricky if you
> want to detect hung/crashed JVMs and capture the output they dump to
> stdout/stderr descriptors directly (bypassing any System.*
> replacements). We tail event files manually, this turned simple and
> robust.
>

Gladly, I didn't have to do anything there - it pretty much was there
already. The current code uses pipes (stdin/stdout/stderr) to communicate
between the processes, with some specific escaping and op-code mechanism
that allows to sort out test results and events, and stuff written to
stdout / stderr within the tests.


> 4) Many test suites implicitly rely on the fact that they're executed
> in isolation. This may result in unexpected failures as in two test
> suites try to write to the same file in a current working directory,
> for example. We opted for running each forked JVM in a separate CWD so
> that they kind of get their own scratch space to work with.
>

I know exactly what you mean - I have to deal with tests that require
database access (they set up test data, do some JPA, and perform a
rollback). For such scenarios, I have added a feature that replaces the
string ${surefire.threadNumber} in the system properties and in the argLine
with the number of the executing thread, ranging [1..threadCount]. For me,
that did the trick. It's actually something separate from this forking
feature (works with all the other forkModes as well), but I hacked it down
together with this one in an offline, non-git copy of the source and I
didn't want to scratch it out before pushing. :)

Andreas

Reply via email to