On Apr 30, 2009, at 3:51 PM, Knut Urdalen wrote:
I'm currently looking into how to log events to HTML, XML and RSS
files.
There is one main issue here:
The layout's header and footer are added to each log message, which
leads to malformed output.
For example, using the HTML layout together with the file appender
produces an HTML file containing a complete HTML document for each
message, appended one after another. The output is a malformed
document, and it could probably look better too ;)
So how do we deal with this?
1. I think we can agree that log4php should always produce valid
documents; otherwise the output is worth next to nothing.
2. If so, we can't use the file appender together with the "complex
layouts" (layouts that need a fixed header and footer, with the
logging event messages placed somewhere in between) and need to create
a separate LoggerAppenderHtml and LoggerAppenderXml to hold the logic
needed for creating and updating valid output.
So my final question would be:
Does it make sense to create a LoggerAppenderHtml in this case and
modify the DOM upon appending messages? Any other suggestions on how
to resolve this?
Knut
Writing an XML document doesn't mesh well with logging. An essential
requirement is that there is only one top-level element. Satisfying it
would require either rewriting the entire document for each event, or
truncating the document and writing the new log element before the
close tag of that single element. Java doesn't have support for
positioning, and rewriting would be unacceptable.
However, a parsed entity can have any number of elements and meshes
well with logging: you can simply keep appending to the file. What
it does require to make a valid XML document is that the parsed
entity is referenced within a skeletal document, something like:
<!DOCTYPE log [
<!ENTITY entity.log SYSTEM 'entity.log'>
]>
<log>
&entity.log;
</log>
where entity.log might be an open-ended sequence of elements.
So this document should be processable by anything that consumes an
XML document, and the skeletal document never needs to be updated.
If you want to write a valid XML document, you could just output the
skeleton to the specified file name and use a temp file name in the
same directory for the open-ended content. However, log4j and log4cxx
both just write the open-ended file to the specified name.
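
The skeleton-plus-entity layout described above could be sketched roughly as follows. This is only an illustration in Python, not how log4j, log4cxx or log4php do it; the function names, the <event> element and the file names are assumptions made up for the example.

```python
from xml.sax.saxutils import escape

# Fixed wrapper document. It references the open-ended file as an
# external parsed entity and never needs to be rewritten.
SKELETON = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log [
<!ENTITY entity.log SYSTEM '{entity_file}'>
]>
<log>
&entity.log;
</log>
"""


def write_skeleton(skeleton_path, entity_file):
    """Write the skeletal document once (hypothetical helper).

    entity_file is the system identifier, resolved relative to the
    skeleton, so a bare file name in the same directory is enough.
    """
    with open(skeleton_path, "w", encoding="utf-8") as f:
        f.write(SKELETON.format(entity_file=entity_file))


def append_event(entity_path, message):
    """Append one element to the open-ended entity file (hypothetical helper)."""
    with open(entity_path, "a", encoding="utf-8") as f:
        f.write("<event>{}</event>\n".format(escape(message)))
```

Each call to append_event only ever appends, so the entity file by itself is never a complete XML document; it becomes one only when a parser expands the entity reference inside the skeleton.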
For HTML, I'd suggest avoiding it altogether: output an
<?xml-stylesheet?>
processing instruction and let the browser handle the conversion to
HTML. That might limit which browsers can be supported, but it can be
done cleanly.
Encoding and escaping are other pitfalls that log4j and log4cxx have
fallen into. If you can hardwire the encoding to "UTF-8" rather than
following the platform default, that will help a lot. For escaping,
make sure that you replace special characters with the corresponding
character entities. The log4j and log4cxx test suites include test
cases that output problematic content and then reparse the output to
make sure that things were properly escaped.
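
A reparse check of that kind might look something like the sketch below. It is written in Python purely for illustration and is not taken from the log4j or log4cxx test suites; format_event, round_trips and the <event> element are hypothetical names.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def format_event(message):
    """Escape &, < and > so the message is safe as element content."""
    return "<event>{}</event>".format(escape(message))


def round_trips(message):
    """Write the message as UTF-8 XML, reparse it, and compare.

    The encoding is hardwired to UTF-8 instead of the platform default.
    """
    document = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<log>" + format_event(message) + "</log>"
    )
    root = ET.fromstring(document.encode("utf-8"))
    # An empty element parses to None, so normalise it to "".
    return (root[0].text or "") == message
```

Feeding round_trips messages that contain markup characters, "]]>" or non-ASCII text is a cheap way to catch escaping and encoding regressions. Characters that XML 1.0 cannot represent at all (most control characters) would need separate handling that this sketch does not attempt.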