Dave Abergel wrote:
In a message on lfs-support today, Dan Nicholson wrote:
FAIL: gcc.c-torture/compile/20001226-1.c -Os (test for excess errors)
FAIL: gcc.c-torture/execute/va-arg-25.c execution, -Os
Two failures isn't that bad. What hardware are you running on? It's
possible the failures are related to VMware. You'll also notice that
they are both related to -Os optimization, so maybe you just want to
avoid that case to be safe.
I have some questions about the philosophy of running these tests. What
is the point of testing software if the results of the test depend on so
many irreproducible variables (like whether or not your system is under
heavy use at the time)?
Unfortunately, testing computer software is Just Hard :( There can be
various failure modes, such as an incorrect exit status, incorrect (or
entirely missing) error message, etc. In the case of some tests, the
test case might not complete in a reasonable time. This latter case is
one which is most likely to be affected by heavy load, as CPU time is
taken away from the application/process being tested.
Maybe it's just the scientist in me, but this seems to make
interpretation of the results a bit tricky - especially for someone like
me who is not an expert in these things.
It does indeed :(
Now, perhaps I should just not bother to run the tests if I don't think
there's any point (it's my distro, so it's my rules 8o) ) but this seems
rather unsatisfactory, given that there seems to be this opportunity to
find out if the software is working as expected.
Well, in the case of gcc and glibc when so many tests are being run,
having just 1 or 2 fail is an indication that the critical steps of the
toolchain adjustment did, in actual fact, work. Comparing those test
results with others (there's a 'gcc-testresults' list which receives
automatically submitted results) will likely indicate whether you have a
commonly encountered problem or not.
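
To make that comparison a little less manual, here is a rough sketch in Python. It assumes the summary lines look like the "FAIL: ..." lines quoted above; the function names and the reference log are made up for illustration, so you would paste in the FAIL lines from a real gcc-testresults posting for a similar setup instead.

```python
import re

# Assumption: unexpected failures appear as "FAIL: <test name and options>",
# as in the two lines quoted earlier in this thread.
FAIL_RE = re.compile(r"^FAIL: (.+)$")


def failed_tests(summary_text):
    """Return the set of test descriptions reported as FAIL."""
    found = set()
    for line in summary_text.splitlines():
        match = FAIL_RE.match(line.strip())
        if match:
            found.add(match.group(1))
    return found


def compare(mine, reference):
    """Split my failures into those also seen elsewhere and those unique to me."""
    return sorted(mine & reference), sorted(mine - reference)


my_log = """\
PASS: gcc.c-torture/compile/20001226-1.c -O2 (test for excess errors)
FAIL: gcc.c-torture/compile/20001226-1.c -Os (test for excess errors)
FAIL: gcc.c-torture/execute/va-arg-25.c execution, -Os
"""

# Hypothetical reference results, invented purely for this example.
reference_log = """\
FAIL: gcc.c-torture/execute/va-arg-25.c execution, -Os
"""

shared, unique = compare(failed_tests(my_log), failed_tests(reference_log))
print("also seen elsewhere:", shared)
print("only on this build:", unique)
```

Failures that land in the first list are the ones other people have hit too; anything in the second list is worth a closer look before moving on.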
If 'a few' failures are acceptable, how do you define 'a few'?
And if there are certain tests that are more critical than others, how
do you know what these are?
Largely by experience. If the majority of tests pass and those that
fail are at least recorded/reproducible by others, then successfully
building and running LFS and BLFS packages is a very good indication that the test
failure is not critical. In some cases it may even be the test case
rather than the result that is bogus!
I hope this puts your mind slightly more at ease.
Regards,
Matt.
--
http://linuxfromscratch.org/mailman/listinfo/lfs-chat
FAQ: http://www.linuxfromscratch.org/faq/
Unsubscribe: See the above information page