On Sep 14, 2006, at 11:04 PM, Heikki Toivonen wrote:

Bryan Stearns wrote:
I'd like to see the native Python logging mechanism used, since we
already use it for other debug output in Chandler; output that needs to
be found by Tinderbox could be logged by its own logger object, which
would make parsing it much less prone to false-positive errors (the
lines are in a well-defined format, so finding the lines produced by one logger is easy); this would also make diagnosis of failures easier (the
big problem with the current output) because all the output (including
regular debug output) would be combined in correct chronological order
(so we wouldn't have to correlate multiple logs).

There might be something here that we could pursue. But I also realized
I did not explain everything we do and one thing I said was confusing...

chandler.log (and twisted.log, although we probably don't include it
currently on Tbox) are produced by the python logger.

Currently the tests are run from a combination of shell scripts and python scripts and depending on which test you look at it could be a combination of both. The python logger is already used by Chandler but the things that need to be captured by the test running code is error output that sometimes cannot be logged by Python.

One other kind of output in the full log is produce by our Tinderbox
client script, and we obviously have control over how that is formatted.

Unit test output is currently in a transitional mode, and at least
currently not very much under our control. It might be a lot of work,
but somebody would need to take a look.

Unit test output hasn't changed in almost a year so I would classify it as transitional at all - what has changed is how the unit tests are run. They used to be run by hardhat and now they are run from a bash script where all stdout and stderr output is captured and dumped to a log file and also to the screen. The stdout dump at the end of the test run is what is captured by the tbox code.

Functional test output is totally under our control, and switched to the
new cats format pretty recently. Unfortunately it outputs many lines
that get recognized by the Tbox log parser as error lines even though
they are not errors.

Performance tests can fail as well; these are transitioning to new cats
at some point as well.

Finally there is a lot of output we don't really have much control over,
which comes from make, compiler, etc.

All of the secondary output for quick builds (make install and other Chandler environment setup steps) is captured by the tbox code.

The parser we use is the generic 'unix error parser' that comes with
Tinderbox. I don't know how difficult it would be to change that. In
fact, I'd rather not change it unless absolutely necessary, because
maintaining a hacked tbox is not fun.

The "unix" parser is actually the best parser to use because of the varied output sources we have - it's the only one that will recognize the variety of messages that can be generated by the build system. It's also the easiest to update for new patterns.

The confusing thing I said about concatenated logs... Each tbox client
concatenates the logs from a single run for the email, but the Tinderbox
server maintains each email (or build report, rather) separately.

This sounds like an easy thing to scan for (a line starting with "exit
code=" immediately after the 'starting functional tests' line).

And we could actually make that say:

error exit code=1

which would be recognized as failure by the Tbox parser and you'd get a
link to that location.

That is definitely something that could be changed - the different output sources can bracket their runs with markers.

Alternatively, the thing that concatenates everything could notice when
a log that's supposed to be non-empty is empty or nonexistant, and
substitute an easily found error message instead.

Yeah, I think we should output an error message in place of a log if we
expected the log to be there.

The tbox client already does this - if it expects a log to be present it will complain when it's not found. The problem is that most of the time the tbox client is two levels up from where the issue is so by the time it gets to the tbox client the log has data but it's minimal.

This case also sounds readily identifiable, even though "all tests
passed" followed by "tests failed" looks confusing. If the tinderbox
doesn't panic when this case is identified, I won't either. :-)

It doesn't panic, and it correctly reports test failure. It is just
confusing to read the log. It would be clearer if we could say 'error
tests failed because python exit code=XXX'.


I'll see if I can make the changes for the last three things.

We should avoid making any changes immediately - we really need to get all of the developer feedback and I do mean all - even if it's just to say "yep, looks ok to me" and then Dan and I can make the changes required as it will need to be coordinated.

---
Bear

Build and Release Engineer
Open Source Applications Foundation (OSAF)
[EMAIL PROTECTED]
http://www.osafoundation.org

[EMAIL PROTECTED]
http://code-bear.com

PGP Fingerprint = 9996 719F 973D B11B E111  D770 9331 E822 40B3 CD29


Attachment: PGP.sig
Description: This is a digitally signed message part

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Open Source Applications Foundation "chandler-dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/chandler-dev

Reply via email to