On Thu, 2009-08-27 at 23:46 -0400, Greg Watson wrote:
> I didn't realize it would be such a problem. Unfortunately there is
> simply no way to reliably parse this kind of output, because it is
> impossible to know what the error messages are going to be, and
> presumably they could include XML-like formatting as well. The whole
> point of the XML was to try and simplify the parsing of the mpirun
> output, but it now looks like it's actually more difficult.

I thought this might be difficult when I saw you were attempting it.

Let me tell you about what Valgrind does, because they have a similar
problem.  Initially they just added a --xml=yes option, which put most
of the valgrind (as distinct from application) output in xml tags.  This
works for simple cases, and if you combine it with --log-file=<filename>
it keeps the valgrind output separate from the application output.

Unfortunately there are lots of places throughout the code where
developers have inserted print statements (in the valgrind case these
all go to the logfile) which means the xml is interspersed with non-xml
output and hence impossible to parse reliably.

What they have now done in the current release is to add an extra
--xml-file=<file> option alongside the --log-file=<file> option.  Now,
in the simple case, all output from a normal run goes to the xml file
and the log file remains empty.  Any tool that wraps around valgrind can
parse the xml, which is guaranteed to be well formed, and it can detect
the presence of other messages by looking for output in the standard
log file.  The onus is then on tool writers to look at the remaining
cases, decide whether they are common or important enough to wrap in
xml, and propose a patch that either does so or removes the
non-formatted message entirely.

The above seems to work well.  Having a separate file for the xml is a
huge step forward: whilst the xml isn't necessarily complete, you can
both parse it and tell when it's missing something.
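
To make that concrete, here is a rough Python sketch of what a wrapping
tool could do with the two files.  The --xml=yes, --xml-file and
--log-file options are the valgrind ones described above; the function
names, the default file names and the assumption that each problem
appears as an <error> element with a <kind> child are mine, so treat it
as an illustration rather than a description of any real tool.

```python
import subprocess
import xml.etree.ElementTree as ET
from pathlib import Path


def valgrind_command(app_argv, xml_path, log_path):
    """Build a command line that sends XML and free-form output to separate files."""
    return [
        "valgrind",
        "--xml=yes",
        f"--xml-file={xml_path}",
        f"--log-file={log_path}",
        *app_argv,
    ]


def inspect_outputs(xml_path, log_path):
    """Return (error_kinds, stray_text) for a finished run.

    error_kinds: text of the <kind> child of every <error> element
                 (assumed layout of the XML file).
    stray_text:  whatever ended up in the log file; non-empty means some
                 message bypassed the XML path.
    """
    root = ET.parse(xml_path).getroot()
    kinds = [err.findtext("kind", default="") for err in root.iter("error")]
    stray = Path(log_path).read_text(errors="replace").strip()
    return kinds, stray


def run_wrapped(app_argv, xml_path="vg.xml", log_path="vg.log"):
    """Run the application under valgrind, then inspect both output files."""
    cmd = valgrind_command(app_argv, xml_path, log_path)
    subprocess.run(cmd, check=False)
    return inspect_outputs(xml_path, log_path)
```

The point of returning the stray text separately is that the tool never
has to guess: the xml either parses or it doesn't, and anything in the
log file is a message nobody has wrapped yet.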

Of course, at this level of tool integration it's better to use
sockets than files (e.g. --xml-socket=localhost:1234 rather than
--xml-file=/tmp/app_XXXX.xml), but I'll leave that up to you.
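
For the socket variant, the wrapping tool would listen on the port and
read the xml as it arrives.  A minimal sketch of the listening side,
assuming the launched program connects to the given host:port, streams
a single xml document and then closes the connection (the function
names here are hypothetical):

```python
import socket
import xml.etree.ElementTree as ET


def listen(host="localhost", port=1234):
    """Open a TCP listening socket for the tool's XML stream."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    return server


def receive_xml(server):
    """Accept one connection, read until the peer closes, and parse the result."""
    conn, _addr = server.accept()
    chunks = []
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: the sender closed the connection
                break
            chunks.append(data)
    return ET.fromstring(b"".join(chunks))
```

This version waits for the whole document before parsing; a real tool
would more likely feed the data to an incremental parser so it can react
while the job is still running.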

I hope this gives you something to think over.

Ashley,

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
