Alex,

>Please, find the new version (0.2) and all previous versions here:
>http://issues.apache.org/jira/secure/ManageAttachments.jspa?id=12340105

Good. I like it (everyone likes it when their suggestions are
implemented :-). So let me try to outline where we are in Harmony
stress testing.

=== TEST DESIGN ===

   * Stress tests are built from simple building blocks according to
configuration strings.

   * Tests have a junit interface.
       [Case study] Imagine someone puts tests into SVN which implement
a different test interface. To reuse them we can add another generator
that converts these tests to the junit interface.

   * The configuration string list is maintained manually. If we plan
to use a junit runner to launch a sequence of the stress tests, then
the most straightforward model is to wrap configuration strings into
junit test cases and put the documentation into the javadoc for these
test cases (see the sketch below).
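
To make the last point concrete, here is a minimal sketch of what such
a wrapping test case could look like. The configuration string format
and the StressRunner helper are my assumptions for illustration, not
the actual generator API:

    import junit.framework.TestCase;

    /**
     * Wraps one entry from the manually maintained configuration
     * string list so that a plain junit runner can launch it.
     */
    public class AllocationStressTest extends TestCase {

        /**
         * Hypothetical configuration: an allocation building block
         * executed by 16 threads, 1000 iterations each.
         */
        public void testParallelAllocation() throws Exception {
            String config = "allocate.bytes threads=16 iterations=1000";
            // StressRunner is a hypothetical facade over the generator
            assertTrue("stress scenario failed: " + config,
                       StressRunner.run(config));
        }
    }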

=== FURTHER STEPS ===

   * You wrote, "stress test suite should generate relevant bugs".
Since stress behavior is usually unspecified, we need to introduce
something measurable instead of a pass/fail result for the stress
tests. See my thoughts about a comparative approach below.

   * I will continue code reviews.

   * Everyone should create tests and run them against the Harmony VM
and the RI. This would be real-life testing of our approach.
 
=== COMPARATIVE APPROACH ===

The simplest example of the comparative approach is the following.
        Tester: My test fails on the Harmony VM and passes on the RI.
Please fix the Harmony VM.
   
This usually does not work for stress tests.
        Developer: Who told you that OutOfMemoryError should be thrown
in your thread? My finalizer thread is just a normal Java thread, like
yours, and it can fail as well. You have a bug in your test.

There are multiple reasons why we will always have such bugs in the
tests.
   * These bugs keep showing up, and the cost of fixing all of them
regularly is too high.
   * Stress testing reuses tests which are usually not designed for
stress execution, for example, multithreaded execution.
   * These bugs depend on the internal structure of the VM. Test
authors do not possess sufficient knowledge of the problem and the
structure.
   * Sometimes Java is not rich enough.

How can we have a maintainable test product taking all these
limitations into account? We need to learn how to live with occasional
failures of the stress tests. This means that, instead of failing, the
test should rather report how well it performs on the Harmony VM
compared to the RI:
   * Failures with the worst relative metric can be evaluated first.
   * We can detect that a relative metric for a test has worsened in a
recent build.

Developers are more easily convinced to fix "the worst issue" or "a
degradation" than "some issue".

Now let me list several metrics for each test.
   * Pass rate: assuming the test is 100% reliable on the RI, we can
calculate the percentage of failures.
   * Number of times the test can be executed sequentially before a
failure.
   * Memory consumption: a generator can preallocate more and more
memory before launching the test in a loop.
   * Max threads supported: a generator can exponentially increase the
number of threads launching the test in parallel (sketched after this
list).
   * Here is your metric.
   * Execution time: have you noticed that all this apparatus is quite
close to performance testing methodology? There is no need to compete
with performance testers in their field, though. :-)
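
As promised in the "max threads supported" item, here is a sketch of
how a generator could measure that metric. The Runnable-based building
block is an assumption; the point is that the result is a number to
compare between the Harmony VM and the RI, not a pass/fail verdict:

    /** Doubles the thread count until the building block stops
        completing cleanly, then reports the last good value. */
    public class MaxThreadsMetric {

        public static int measure(Runnable buildingBlock) {
            int lastGood = 0;
            for (int threads = 1; threads <= (1 << 16); threads *= 2) {
                if (!runInParallel(buildingBlock, threads)) {
                    break;
                }
                lastGood = threads;
            }
            return lastGood; // compare this number on Harmony VM and RI
        }

        private static boolean runInParallel(Runnable block, int count) {
            Thread[] workers = new Thread[count];
            try {
                for (int i = 0; i < count; i++) {
                    workers[i] = new Thread(block);
                    workers[i].start();
                }
                for (int i = 0; i < count; i++) {
                    workers[i].join();
                }
                return true;
            } catch (Throwable t) {
                // could not create, start, or join that many threads
                return false;
            }
        }
    }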

The thing I like most about this approach is that it can be introduced
at the stress test generator level.

With best regards,
Alexei Fedotov,
Intel Middleware Products Division
