A pure-Python SAX parser would be a good start. Other interesting tools could
be based on that. Development grants, indeed. ;)
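
As a rough illustration of building on SAX events: the sketch below forwards
them to ElementTree's ``TreeBuilder``, so any parser that drives a standard
``ContentHandler`` could produce ElementTree-compatible elements. The names
``TreeBuildingHandler`` and ``parse_to_tree`` are made up for this example, and
the events here still come from the standard library's ``xml.sax`` driver
(expat-backed on CPython); a pure-Python parser would only need to make the
same handler calls.

```python
import xml.sax
from xml.etree.ElementTree import TreeBuilder


class TreeBuildingHandler(xml.sax.ContentHandler):
    """Hypothetical handler: turns SAX events into an ElementTree tree."""

    def __init__(self):
        super().__init__()
        self._builder = TreeBuilder()
        self.root = None

    def startElement(self, name, attrs):
        # SAX attributes are a mapping-like object; TreeBuilder wants a dict.
        self._builder.start(name, dict(attrs.items()))

    def endElement(self, name):
        self._builder.end(name)

    def characters(self, content):
        # May be called several times per text node; TreeBuilder joins them.
        self._builder.data(content)

    def endDocument(self):
        # close() returns the top-level element that was built.
        self.root = self._builder.close()


def parse_to_tree(source: bytes):
    """Parse an XML byte string and return the root Element."""
    handler = TreeBuildingHandler()
    xml.sax.parseString(source, handler)
    return handler.root
```

The handler never touches the parser itself, so swapping the event source
should not require changes on the tree-building side.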
On Tue, Feb 2, 2010 at 4:21 AM, Malthe Borch <mbo...@gmail.com> wrote:
> We're currently using an obscure feature of the expat parser library,
> ``CurrentByteIndex``, for at least two purposes: tricking expat into
> giving us the raw text without resolving entities, and to parse
> fragmented documents (e.g. with more than one top-level node).
> This does not work on Jython.
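
For readers unfamiliar with the trick, here is a minimal sketch of the general
idea, not the project's actual code: inside an expat callback,
``CurrentByteIndex`` reports the byte offset at which the current event
starts, so the original bytes can be sliced out of the input with entities
left unresolved. The helper name ``tag_offsets`` and the sample input are
invented for illustration.

```python
from xml.parsers import expat


def tag_offsets(source: bytes):
    """Return (event, tag name, byte offset) for every start and end tag."""
    parser = expat.ParserCreate()
    events = []

    def on_start(name, attrs):
        # Offset of the "<" that opens this start tag.
        events.append(("start", name, parser.CurrentByteIndex))

    def on_end(name):
        # Offset of the "<" that opens this end tag.
        events.append(("end", name, parser.CurrentByteIndex))

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(source, True)  # True marks this as the final chunk
    return events


source = b"<p>fish &amp; chips</p>"
events = tag_offsets(source)

# Slice from the start tag up to (not including) the end tag. The bytes come
# straight from the input, so "&amp;" is still an unresolved entity.
raw = source[events[0][2]:events[-1][2]]
```

This relies on the ``pyexpat`` binding exposing byte offsets, which is the
part that has no counterpart on Jython.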
> An alternative is to incorporate the ``pxdom`` parser; it's pure
> Python and licensed under "New BSD". I believe it can be adapted
> fairly easily to our needs. However, the optimal solution would be:
>
> 1) A pure-Python library that parses straight into
>    ElementTree-compatible elements;
> 2) A pure-Python parser that yields SAX events.
> Note that the SAX-parser provided with CPython relies on ``expat``. On
> Jython there's a different underlying implementation; it's probably
> best to stay away from libraries which merely bind to native
> code; after all, compatibility, not speed, is the goal here.
> Feedback on how to go forward is appreciated, as are development grants.
> Repoze-dev mailing list