Re: QA speed improvement

Gregg Wonderly Wed, 01 Dec 2010 08:41:38 -0800

On 11/26/2010 10:32 PM, Patricia Shanahan wrote:

Patricia Shanahan wrote:

Sim IJskes - QCG wrote:

On 26-11-10 15:21, Patricia Shanahan wrote:

What worries me a little is the state the operating system keeps on
terminated processes. We know of the SO_REUSEADDR issue, but you seem
to experience the non-release of resources on process termination.
Formally that is an OS bug, but if so it is something we need to work
around. How definite are you about your reports of problems rerunning
a test caused by the OS not releasing resources on VM exit?


I see two issues here:

1. Terminating every VM.


A working termination is a firm requirement before bypassing orderly teardown
can be attempted.

2. Deleting any temporary files.


I think this comes down to making sure that if we create any files they
are marked deleteOnExit, reducing the problem to terminating every VM.


Empirically, a Ctrl-C termination often leaves the system in a state in
which subsequent tests fail.


Any idea which VM receives this Ctrl-C? The ant VM, the harness master, or
any random process? There is no such concept as a process group leader under
Windows, is there?


I'm sure the shell initially delivers the signal to the ant job that I
told it to run. I have no idea how and to what extent it gets passed on
to the various processes that are created e.g. to run services. It's a
good question, and I'll think a bit about how to find out.


I've thought of two schemes, and some reasons why we may want to implement both.

Scheme 1: Modify each class that calls Runtime.getRuntime().exec to register a
shutdown hook to destroy the Process that exec returned.

This has a window between the exec call and the addShutdownHook call during
which termination of the VM doing the process creation would leave the process
as an orphan, with no arrangement to destroy it.
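
A minimal sketch of Scheme 1, to make the window concrete. The class and
method names (ExecWithHook, execWithHook) are hypothetical and not taken from
the River code base; only Runtime.exec, Runtime.addShutdownHook and
Process.destroy are real JDK calls.

```java
import java.io.IOException;

public class ExecWithHook {
    /**
     * Starts a child process and registers a shutdown hook that destroys it
     * when this VM exits normally or is interrupted (for example by Ctrl-C).
     */
    public static Process execWithHook(String... command) throws IOException {
        final Process child = Runtime.getRuntime().exec(command);
        // Window: if this VM terminates after exec() has returned but before
        // addShutdownHook() completes, nothing will destroy the child.
        Runtime.getRuntime().addShutdownHook(new Thread(child::destroy));
        return child;
    }
}
```

Note that this registers one hook per child, and that shutdown hooks do not
run at all if the VM is killed abruptly (for example with kill -9).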

Scheme 2: Require each public static void main(String[]) method in River to log
the pid of its process in a log message with a specific format. Write a program
that scans the log file for those messages, and kills each process.

This is reliable, in the sense that a thread can be required to write to the log
before doing anything that reserves a resource. It is not so good for normal
shutdown, because it requires log reading.
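
A rough sketch of Scheme 2 follows. The "RIVER-PID <pid>" message format and
the PidLog class are assumptions made up for illustration; the thread does not
fix a format. ProcessHandle requires Java 9 or later, which postdates this
discussion, so treat it as one possible way to obtain and use the pid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PidLog {
    // Hypothetical marker format; any unambiguous format would do.
    static final Pattern PID_LINE = Pattern.compile("^RIVER-PID (\\d+)$");

    /** The line a main() method would log before reserving any resource. */
    public static String pidLine() {
        return "RIVER-PID " + ProcessHandle.current().pid();
    }

    /** Extracts every pid recorded in the given log lines. */
    public static List<Long> scan(List<String> logLines) {
        List<Long> pids = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = PID_LINE.matcher(line.trim());
            if (m.matches()) {
                pids.add(Long.parseLong(m.group(1)));
            }
        }
        return pids;
    }

    /**
     * Requests termination of each recorded process that still exists.
     * Must be run from a separate cleanup program: a process cannot
     * destroy itself through ProcessHandle.
     */
    public static void killAll(List<Long> pids) {
        for (long pid : pids) {
            ProcessHandle.of(pid).ifPresent(ProcessHandle::destroy);
        }
    }
}
```

One caveat with any log-based scheme: the OS can reuse a pid after the
original process has exited, so a stale log could name an unrelated process.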

I think Scheme 1 would work well for normal termination. By the time the test
finishes, all the windows between creating a process and creating the shutdown
hook to kill it will have closed. Scheme 2 would be useful as a backup in the
event of a crash or Ctrl-C while processes are being created. Knowing the PID
of each River process in a configuration could be useful to admins running
production systems, as well as in our QA environment.

Any other ideas?

I believe that in UNIX, #1 is already taken care of for interactive sessions
through the propagation of SIGINT to all child processes by the kernel. But, as
I said before, Runtime.addShutdownHook() would be a much more reliable way to
manage the shutdown process. Adding one shutdown thread per background task
would not be advisable. Instead, we'd need to track the "background threads" in
a Collection, and have one shutdown hook that inspected that collection and did
the appropriate thing.
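
As a sketch of the single-hook idea, assuming the tracked items are child
Process objects (the ProcessRegistry name and its methods are hypothetical):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ProcessRegistry {
    // Live child processes; a concurrent set so any thread may register.
    private static final Set<Process> CHILDREN = ConcurrentHashMap.newKeySet();

    static {
        // One hook for the whole VM, installed when the class is first used.
        Runtime.getRuntime().addShutdownHook(
                new Thread(ProcessRegistry::destroyAll, "process-registry-hook"));
    }

    /** Records a child so the shutdown hook will destroy it. */
    public static Process register(Process child) {
        CHILDREN.add(child);
        return child;
    }

    /** Forgets a child that has already been shut down in an orderly way. */
    public static void unregister(Process child) {
        CHILDREN.remove(child);
    }

    public static int size() {
        return CHILDREN.size();
    }

    /** Runs inside the shutdown hook: destroys whatever is still registered. */
    static void destroyAll() {
        for (Process child : CHILDREN) {
            child.destroy();
        }
    }
}
```

The window between exec and register is still there, but only one hook thread
is ever created no matter how many processes are started.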

Creating the "test" processes from a single "process" that used such a mechanism
would allow for a more dependable startup and shutdown. In that case, one
thread per background process would be appropriate because we could then use
parallel threads (2x the number of processes) to Thread.join() the shutdown hook
threads, and report when they were shut down etc.
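
One way this could look, simplified to a single worker thread per process
rather than the 2x arrangement described above. ParallelShutdown and
shutdownAll are hypothetical names, and this is only a sketch of the
join-and-report step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelShutdown {
    /**
     * Destroys each child on its own thread, waits for all of those threads
     * with Thread.join(), and returns how many children were seen to exit.
     */
    public static int shutdownAll(List<Process> children) throws InterruptedException {
        final AtomicInteger stopped = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();
        for (final Process child : children) {
            Thread worker = new Thread(() -> {
                child.destroy();
                try {
                    child.waitFor();           // block until the child is gone
                    stopped.incrementAndGet(); // report point for this child
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return stopped.get();
    }
}
```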


Gregg
