Daniel, et al.,

On 8/7/07, Daniel Krech <[EMAIL PROTECTED]> wrote:
> Thank you for the TriX serializer. I was just taking a look at it in my
> sandbox and notice it has a dependency on Ft.Xml. Any chance you could
> rewrite it so it only depends on the Python standard library?

It has been my experience that the Python standard library has
problems processing XML, but I decided to follow your suggestion and
see how things are coming along.  I attempted to use
`xml.sax.saxutils.XMLGenerator` (see the `xml.sax.saxutils` module
documentation[0]) to write the XML, as it seems to be the closest
thing to an actual streaming XML writer in the standard library.  It
turns out that `XMLGenerator` does not support a default namespace
(one with no prefix) or the built-in `xml` namespace prefix.
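
For reference, here is a minimal sketch of the kind of `XMLGenerator`
calls involved, exercising both a default namespace and an `xml:lang`
attribute on a single TriX triple.  The function and helper names and
the example.org URIs are placeholders for illustration, not part of
RDFLib.  The problem described above was seen with the standard
library of the time; as far as I know, current Python 3 releases
accept these calls, so treat this as an illustration of the API rather
than a reproduction of the failure.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesNSImpl

# The TriX namespace and the reserved namespace bound to the `xml` prefix.
TRIX_NS = "http://www.w3.org/2004/03/trix/trix-1/"
XML_NS = "http://www.w3.org/XML/1998/namespace"

NO_ATTRS = AttributesNSImpl({}, {})


def write_trix(subject, predicate, text, lang):
    """Serialize one triple with a language-tagged literal as TriX."""
    out = io.StringIO()
    gen = XMLGenerator(out, encoding="utf-8")

    def element(local, content, attrs=NO_ATTRS):
        # Hypothetical helper: write a leaf element in the TriX namespace.
        gen.startElementNS((TRIX_NS, local), None, attrs)
        gen.characters(content)
        gen.endElementNS((TRIX_NS, local), None)

    gen.startDocument()
    # A prefix of None requests a default namespace declaration.
    gen.startPrefixMapping(None, TRIX_NS)
    gen.startElementNS((TRIX_NS, "TriX"), None, NO_ATTRS)
    gen.startElementNS((TRIX_NS, "graph"), None, NO_ATTRS)
    gen.startElementNS((TRIX_NS, "triple"), None, NO_ATTRS)
    element("uri", subject)
    element("uri", predicate)
    # xml:lang uses the reserved `xml` namespace, which is never declared.
    lang_attr = AttributesNSImpl({(XML_NS, "lang"): lang},
                                 {(XML_NS, "lang"): "xml:lang"})
    element("plainLiteral", text, lang_attr)
    gen.endElementNS((TRIX_NS, "triple"), None)
    gen.endElementNS((TRIX_NS, "graph"), None)
    gen.endElementNS((TRIX_NS, "TriX"), None)
    gen.endPrefixMapping(None)
    gen.endDocument()
    return out.getvalue()


# Placeholder URIs; any subject and predicate would do.
doc = write_trix("http://example.org/s", "http://example.org/p", "hello", "en")
```

The two calls to watch are `startPrefixMapping(None, ...)` and the
attribute keyed on the `xml` namespace URI; those are the spots where
a writer has to special-case its prefix handling.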

I think that correctly processing XML is important; that is the first
reason I want to propose that the RDFLib developers accept a
dependency on 4Suite.  4Suite (which provides the `Ft` package)
offers the highest-quality XML processing capabilities available in
Python.  Writing XML has a number of hairy spots, and 4Suite has a
history of being highly compliant, with a meaty test suite backing up
this claim.  It has key processing components implemented in C, which
gives it a significant performance advantage.  It also provides a
large set of additional features that would be of value to RDFLib.

Starting with writing XML, 4Suite offers some advanced features
modeled after the kind of output processing that XSLT offers.
Speaking of XSLT, a full TriX parser requires an XSLT processor to
support its syntax extension mechanism, and 4Suite boasts an excellent
XSLT processor.  4Suite could boost the performance and likely the
correctness of our XML reading, as well.  Beyond dealing with XML,
4Suite has a powerful library for syntactically manipulating URIs,
which would help deal with some of the outstanding URI problems in
RDFLib.

I would be more than happy to help with the legwork of integrating
4Suite support where it makes sense to do so.

> We also have an XMLWriter class in
> rdflib/syntax/serializers/XMLWriter.py that should work, I
> think.

The RDFLib XMLWriter takes URIs and uses them as XML element names.
As such, it is well suited to writing RDF/XML, but not for
general-purpose XML writing.

> I don't think anyone minds if you clean up the TriX parser. I think it was
> gromgull that did the current version. Looks like we have a few test cases
> for the parser. If anyone else has a test case they'd like to contribute -
> it'd be helpful to make sure we don't introduce any breakage while cleaning
> up the parser.

Good idea; I've been meaning to get the test suite up and running.
Last time I had a go at the test suite, though, it fell over, so I've
been a bit gun-shy.  I'll give it another shot soon.

Take care,

    John L. Clark

[0] http://docs.python.org/lib/module-xml.sax.saxutils.html

-- 
PLEASE NOTE that this message is not digitally signed.  As a result,
you have no strong evidence that this message was actually sent by me.
Upon request I can provide a digitally signed receipt for this
message or other evidence validating its contents if you need such
evidence.
