Re: Regression Test Failure! - TinderBox_Derby 375885 - Sun DBTG

Ole Solberg Wed, 08 Feb 2006 13:08:54 -0800

Bryan Pendleton wrote:

Duration % is compared to some baseline which is given in the last row of (in this case) testSummary-375885.html
Thanks!

But I'm afraid I'm still confused. Take, for example, this part of
the larger table:

  derbyall   630 6 624 0
    Duration   45.6%

This specific run of derbyall used 45.6% of the wall clock execution time derbyall used when the baseline was run. (When there is a large number of errors, the % won't be very meaningful ....)


I see that in this particular example, there were 630 tests, and
6 of them passed, and 624 of them failed, and 0 were skipped.

So this is a very uncommon case....


And I see at the bottom that it says:

  Baseline for duration is rev.: 226900 2005-08-02 00:33:20 CEST

The checkout and build on which the baseline was run was svn rev. 226900. The checkout was done at 2005-08-02 00:33:20 CEST.

  (SunOS-5.10 i86pc-i386)

(The architecture on which the baseline run was executed. Should always be identical to the architecture where *this* test was run.)


But what does "Duration 45.6%" mean? Is it telling me something
about the wall clock execution time of this test suite? Or is it
telling me something about the pass/fail rate? Or is it telling me
something else entirely?

To me duration has to do with time.... So I thought a duration % would be interpreted as "relative duration time" with respect to something (a baseline) which I tried to indicate at the bottom row of the table.


Each suite seems to have a Duration value, but the numbers don't
seem to sum up to 100%, and they don't seem related to the success
or failure of the individual tests in that suite.

Where the subsuites (derbylang, derbytools, ...) of derbyall are listed, these are run "standalone" in addition to running derbyall itself.


For each entry the
  duration % is
  % of wall clock execution time for *this* run

vs. wall clock execution time for the suite when the baseline run was done. So ideally we should have 100% for each suite, every time. But there are several sources of noise: Errors are one. There might be other load on the machines used - we try to avoid this. Time measurement granularity (seconds) vs. execution time is a third. I am sure there are more.
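
To make the arithmetic concrete, here is a minimal sketch of the calculation as described above. It is not the actual report script: the function name `duration_percent` and the timings are made up for illustration, chosen only so that the result matches the 45.6% in the example.

```python
def duration_percent(run_seconds, baseline_seconds):
    """Wall clock time of this run as a percentage of the baseline run.

    Hypothetical helper: the real report generator may compute this
    differently; this only illustrates the ratio described in the text.
    """
    if baseline_seconds <= 0:
        raise ValueError("baseline duration must be positive")
    return 100.0 * run_seconds / baseline_seconds


# Illustrative timings in whole seconds (assumed, not from the real report).
baseline_seconds = 10000  # duration of the suite in the baseline run
run_seconds = 4560        # duration of the suite in *this* run

print(f"Duration {duration_percent(run_seconds, baseline_seconds):.1f}%")
```

A value near 100% means the suite took about as long as in the baseline run. A value far below 100%, as here, most likely means the suite stopped doing real work early, which fits a run where nearly every test fails.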


I'm sorry I'm being so dense here, but you seem to be telling me
that some test suite ("derbyall") is 45.6% of a revision, and I
don't understand what it means for a suite to be a percentage of
a revision.

Sorry for creating confusing information :-(
but again I thought entries like
   "Duration:  102.59%"
and then
   "Baseline for duration is rev.: 356319 2005-12-12 19:45:33 CET"
would be sufficient.


thanks,

bryan


Hope this helps a little,
Ole
