Daniel, et al.,

On 8/7/07, Daniel Krech <[EMAIL PROTECTED]> wrote:
> Thank you for the TriX serializer. I was just taking a look at it in my
> sandbox and notice it has a dependency on Ft.Xml. Any chance you could
> rewrite it so it only depends on the Python standard library?

It has been my experience that the Python standard library has problems
processing XML, but I decided to follow your suggestion and see how
things are coming along. I attempted to use
`xml.sax.saxutils.XMLGenerator` (see the `xml.sax.saxutils` module
documentation[0]) to write the XML, as it seems to be the closest thing
to an actual streaming XML writer in the standard library. It turns out
that `XMLGenerator` does not support default namespace prefixes or the
built-in `xml` namespace prefix.

I think that correctly processing XML is important; that is the first
reason that I want to propose that the RDFLib developers acknowledge a
dependency on 4Suite. 4Suite (which provides the `Ft` namespace) offers
the highest-quality XML processing capabilities available in Python.
Writing XML has a number of hairy spots, and 4Suite has a history of
being highly compliant, with a meaty test suite backing up this claim.
It has key processing components implemented in C, which gives it a
significant performance advantage. It also provides a large set of
additional features that would be of value to RDFLib.

Starting with writing XML, 4Suite offers some advanced features modeled
after the kind of output processing that XSLT offers. Speaking of XSLT,
a full TriX parser requires an XSLT processor to support its syntax
extension mechanism, and 4Suite boasts an excellent XSLT processor.
4Suite could boost the performance and likely the correctness of our XML
reading, as well. Beyond dealing with XML, 4Suite has a powerful library
for syntactically manipulating URIs, which would help deal with some of
the outstanding URI problems in RDFLib.

I would be more than happy to help with the legwork of integrating
4Suite support where it makes sense to do so.

> We also have an XMLWriter class in
> rdflib/syntax/serializers/XMLWriter.py that should work, I
> think.

The RDFLib XMLWriter takes URIs and uses them as XML element names.
As such, it is well suited to writing RDF/XML, but not to
general-purpose XML writing.

> I don't think anyone minds if you clean up the TriX parser. I think it was
> gromgull that did the current version. Looks like we have a few test cases
> for the parser. If anyone else has a test case they'd like to contribute -
> it'd be helpful to make sure we don't introduce any breakage while cleaning
> up the parser.

Good idea; I've been meaning to get the test suite up and running. Last
time I had a go at the test suite, though, it fell over, so I've been a
bit gun-shy. I'll give it another shot soon.

Take care,

    John L. Clark

[0] http://docs.python.org/lib/module-xml.sax.saxutils.html

--
PLEASE NOTE that this message is not digitally signed. As a result,
you have no strong evidence that this message was actually sent by me.
Upon request I can provide a digitally signed receipt for this message
or other evidence validating its contents if you need such evidence.
_______________________________________________
Dev mailing list
Dev@rdflib.net
http://rdflib.net/mailman/listinfo/dev