Rick Hillegas wrote:
-0
I am tempted to vote -1 based on DERBY-5430. The 10.8.2 release
candidates produce a deadlock in NsTest. That deadlock was not seen in
10.8.1 or earlier releases.
If we had a reproducible case for DERBY-5430 I would agree; then, in the
very worst case, we could binary search for the change in 10.8 that
caused the issue. I've tried this but failed, and I see very inconsistent
results using nstest. On exactly the same codeline/machine/environment it
will pop after 1 hour and then not after days. I have also reviewed all
the changes in 10.8 since the previous release and cannot come up with
anything that looks likely to cause this kind of problem.
However, I do not have any confidence in NsTest as a release barrier.
This test suffers from a number of defects which severely cripple its
usefulness:
1) No-one seems to understand this test.
2) The test is not being run in its preferred configuration. The "Ns" in
NsTest means "Network Server" I think, but as far as I can see the test
is only being run embedded.
I was around when this test was being developed. Originally I believe
we were looking for a network-specific test to add to the embedded stress
tests we had. But when we looked at what resulted, there was nothing
network-specific about it, and in fact it was found to be more stressful
when run in embedded mode. I agree that if we had the resources we should
run it in both modes (and maybe even alter its various parameters to change
what it stresses). For instance, I think it currently also only runs
on encrypted databases and thus does not stress other more "normal" paths.
3) The test produces reams of errors. I don't think we know how to
strain signal out of this noise. The sheer volume of errors suggests
that the test is badly written and that it does not model a sensible
workload.
I go back and forth on this. As a developer I believe that if I wrote this
test I would not have it act this way. But one original objective of
the stress test was to stress unexpected paths not being tested by others.
4) The person who runs this test (Myrna) has lost confidence in its
ability to disclose regressions, as evidenced by the downgrading of the
urgency of DERBY-5430.
I do not think that we should use NsTest as a release barrier again
until we address its defects.
I think release managers should look at the result of this test and make
their own determination. If many ASSERTS or other system errors (like
DERBY-5422) or server crashes start coming from this test then it is
giving good feedback. We would not have seen DERBY-5423 without this
test, and I believe that would have been a severe problem for existing
user applications.
So I agree that nstest failing should not necessarily mean a release
should be blocked. Unfortunately its results need to be interpreted and
a decision made by the community/release manager on whether it should
block or not. It has shown up real bugs in the past that all other
tests have missed, so I don't want to throw it out. It is too bad that its
signal-to-noise ratio is so poor.
Thanks,
-Rick