Dave Abergel wrote:

In a message on lfs-support today, Dan Nicholson wrote:

FAIL: gcc.c-torture/compile/20001226-1.c  -Os  (test for excess errors)
FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -Os

Two failures isn't that bad. What hardware are you running on? It's
possible the failures are related to VMware. You'll also notice that
they are both related to -Os optimization, so maybe you just want to
avoid that case to be safe.
>
I have some questions about the philosophy of running these tests. What is the point of testing software if the results of the test depend on so many irreproducible variables (like whether or not your system is under heavy use at the time)?

Unfortunately, testing computer software is Just Hard :( There can be various failure modes, such as an incorrect exit status, incorrect (or entirely missing) error message, etc. In the case of some tests, the test case might not complete in a reasonable time. This latter case is one which is most likely to be affected by heavy load, as CPU time is taken away from the application/process being tested.
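To make those failure modes concrete, here is a rough sketch of how a test harness might tell them apart. It is not how the gcc or glibc test suites are actually implemented; the function name run_case, its parameters and the three sample commands are all made up for illustration.

```python
import subprocess
import sys


def run_case(argv, expected_status=0, expected_stderr="", timeout_s=5.0):
    """Run one command and classify the outcome (hypothetical helper)."""
    try:
        done = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # The failure mode most sensitive to system load: the test
        # simply did not finish within the time it was allowed.
        return "timeout"
    if done.returncode != expected_status:
        return "wrong exit status"
    if done.stderr.strip() != expected_stderr:
        return "wrong error message"
    return "pass"


# Three illustrative commands, each run in a fresh Python interpreter.
ok = run_case([sys.executable, "-c", "pass"])
bad_status = run_case([sys.executable, "-c", "import sys; sys.exit(3)"])
slow = run_case(
    [sys.executable, "-c", "import time; time.sleep(5)"], timeout_s=0.5
)

print(ok, bad_status, slow, sep="\n")
```

Only the timeout branch depends on how busy the machine is. The exit status and error message checks should give the same answer however heavily loaded the system is.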

Maybe it's just the scientist in me, but this seems to make interpretation of the results a bit tricky - especially for someone like me who is not an expert in these things.

It does indeed :(

Now, perhaps I should just not bother to run the tests if I don't think there's any point (it's my distro, so it's my rules 8o) ) but this seems rather unsatisfactory, given that there seems to be this opportunity to find out if the software is working as expected.

Well, in the case of gcc and glibc when so many tests are being run, having just 1 or 2 fail is an indication that the critical steps of the toolchain adjustment did, in actual fact, work. Comparing those test results with others (there's a 'gcc-testresults' list which receives automatically submitted results) will likely indicate whether you have a commonly encountered problem or not.
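As a rough sketch of that comparison, the snippet below pulls the FAIL lines out of a DejaGnu-style summary and checks them against somebody else's results. The helper names are made up, the PASS line is illustrative, and the reference summary is a placeholder rather than a real posting from the list; in practice you would paste in the text of a report for a comparable gcc version and target.

```python
import re

# Matches DejaGnu summary result lines such as "FAIL: name (details)".
RESULT_LINE = re.compile(
    r"^(PASS|FAIL|XPASS|XFAIL|UNRESOLVED|UNSUPPORTED|UNTESTED): (.+)$"
)


def unexpected_failures(summary_text):
    """Return the set of test descriptions reported as FAIL."""
    failures = set()
    for line in summary_text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match and match.group(1) == "FAIL":
            failures.add(match.group(2))
    return failures


def compare_with_reference(mine, reference):
    """Split my failures into those seen elsewhere and those unique to me."""
    return {
        "also_seen_elsewhere": sorted(mine & reference),
        "only_here": sorted(mine - reference),
    }


# The two FAIL lines are the ones from this thread; the PASS line is
# illustrative only.
MY_SUMMARY = """\
PASS: gcc.c-torture/compile/20001226-1.c  -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/20001226-1.c  -Os  (test for excess errors)
FAIL: gcc.c-torture/execute/va-arg-25.c execution,  -Os
"""

# Hypothetical reference results, NOT a real gcc-testresults posting.
REFERENCE_SUMMARY = """\
FAIL: gcc.c-torture/compile/20001226-1.c  -Os  (test for excess errors)
"""

report = compare_with_reference(
    unexpected_failures(MY_SUMMARY),
    unexpected_failures(REFERENCE_SUMMARY),
)
print(report)
```

Anything that lands in "only_here" is what deserves a closer look. Failures that other people also report on similar setups are much less likely to mean your own toolchain is broken.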

If 'a few' failures are acceptable, how do you define 'a few'?
And if there are certain tests that are more critical than others, how do you know what these are?

Largely by experience. If the majority of tests pass and those that fail are at least recorded/reproducible by others, then successfully building and running LFS and BLFS packages is a very good indication that the test failure is not critical. In some cases it may even be the test case rather than the result that is bogus!

I hope this puts your mind slightly more at ease.

Regards,

Matt.

--
http://linuxfromscratch.org/mailman/listinfo/lfs-chat
FAQ: http://www.linuxfromscratch.org/faq/
Unsubscribe: See the above information page
