On Thu, 2006-12-28 at 12:26, TRANS wrote:
> On 12/27/06, Andrew S. Townley <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > I'm looking for a good SAX implementation for Ruby to do some XML
> > parsing.  I see from the mailing list archives you've just added the
> > pull-based XML::Reader based on the Microsoft C# API, but I want "real"
> > SAX-based event-driven parsing.
> 
> What are you planning to do with it? I'm curious, is it because you
> prefer that way of doing it, or is there a technical reason?

A bit of both :)  I want to be able to efficiently process some
arbitrarily large documents, but I also want to be able to provide some
processing filters to perform some normalization somewhat equivalent to
the C14N/XCL-C14N process as well as some other, non-XSLT types of
transformations.  I've also done quite a bit of this with Java, so it is
a model which is familiar to me, and, once you get used to it, I find
that it's easier to deal with than pull parsing for these types of
tasks.

There's also a desire to not tie myself too tightly to a particular XML
parser implementation because certain ones are better for certain
tasks.  Ideally, I'd like to see a Ruby XML::Parser::SAX module which
would either be implemented by or at least support a number of different
parsers, based on what you needed at the time.
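To make the idea concrete, here is a rough sketch of what a parser-neutral front end could look like.  Everything in it is hypothetical: the SaxBackends module, the CannedBackend class and the handler method names are invented for illustration and are not part of libxml-ruby or any existing library.  The "backend" simply replays a list of events, standing in for a real parser binding.

```ruby
# Hypothetical sketch: a registry of interchangeable SAX backends.
# None of these names exist in libxml-ruby; they are assumptions.
module SaxBackends
  @registry = {}

  # Register a backend under a symbolic name; the block builds an instance.
  def self.register(name, &factory)
    @registry[name] = factory
  end

  # Build the named backend, or fail loudly if it was never registered.
  def self.create(name)
    factory = @registry.fetch(name) do
      raise ArgumentError, "unknown SAX backend: #{name}"
    end
    factory.call
  end

  def self.available
    @registry.keys
  end
end

# Stand-in backend: replays pre-built events instead of parsing XML.
# A real backend would wrap libxml2, REXML, etc. behind the same #parse.
class CannedBackend
  def parse(events, handler)
    events.each { |method, *args| handler.public_send(method, *args) }
  end
end

SaxBackends.register(:canned) { CannedBackend.new }

# Handler that only cares about element names.
class NameCollector
  attr_reader :names

  def initialize
    @names = []
  end

  def start_element(name, _attrs)
    @names << name
  end
end

parser = SaxBackends.create(:canned)
collector = NameCollector.new
parser.parse([[:start_element, "a", {}], [:start_element, "b", {}]], collector)
```

The point is only that the handler code never names a concrete parser, so the backend could be swapped per task.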

> 
> > I also noticed that the current support for this in the library is based
> > on the now deprecated SAXv1 interface rather than the newer SAX2
> > interface.  Are there any plans to migrate this or change it?  Also, I
> > think it would be worthwhile implementing a filter mechanism similar to
> > the Java SAX API's XMLReader, XMLFilter and friends (originally, I saw
> > the XML::Reader subject in the archives and thought it may have been
> > already done).
> 
> I don't know enough about this to say, but since this is a binding to
> libxml, what is libxml's support of this?

libxml2's support of SAX2 seems to be reasonably complete.  I haven't
looked at it under the magnifying glass yet, though.  As long as the
base events are there, implementing stuff like the SAX
XMLReader/XMLFilter Java API and friends would be done on top of the C
code using Ruby.  There is some support for filtering, but I'd need to
look at how hard it would be to integrate what's there with Ruby.  If I
had to do it myself, I could do it more quickly in Ruby than in C,
because I'm out of practice with C and have never written a Ruby C
extension.  As for the base library, though, I couldn't find any code
in this project that uses the new SAX2 API in libxml2
(http://xmlsoft.org/html/libxml-SAX2.html).
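As a rough illustration of the kind of filter layer I mean, here is a minimal sketch in plain Ruby.  All of the class and method names (PassThroughFilter, start_element, characters and so on) are assumptions loosely modelled on the Java XMLFilter idea, not an existing Ruby or libxml2 API, and the events are fed in by hand where a real parser would generate them.

```ruby
# Hypothetical sketch of SAX-style filter chaining in pure Ruby.
# None of these names come from libxml-ruby; they are assumptions.

# Base filter: forwards every event unchanged to the next handler.
class PassThroughFilter
  def initialize(downstream)
    @downstream = downstream
  end

  def start_element(name, attrs)
    @downstream.start_element(name, attrs)
  end

  def end_element(name)
    @downstream.end_element(name)
  end

  def characters(text)
    @downstream.characters(text)
  end
end

# Example normalization step: lower-case all element names.
class DowncaseNames < PassThroughFilter
  def start_element(name, attrs)
    super(name.downcase, attrs)
  end

  def end_element(name)
    super(name.downcase)
  end
end

# Example normalization step: drop whitespace-only text events.
class DropBlankText < PassThroughFilter
  def characters(text)
    super unless text.strip.empty?
  end
end

# Terminal handler: records whatever reaches the end of the chain.
class EventRecorder
  attr_reader :events

  def initialize
    @events = []
  end

  def start_element(name, attrs)
    @events << [:start, name, attrs]
  end

  def end_element(name)
    @events << [:end, name]
  end

  def characters(text)
    @events << [:text, text]
  end
end

recorder = EventRecorder.new
chain = DowncaseNames.new(DropBlankText.new(recorder))

# Hand-fed events; a parser binding would call these methods instead.
chain.start_element("Doc", { "id" => "1" })
chain.characters("\n  ")
chain.characters("hello")
chain.end_element("Doc")
```

Each filter only overrides the events it cares about, so filters can be stacked in any order without knowing about each other, and nothing is buffered beyond the current event.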

I keep coming back to a personal Ruby vs. Python debate for some
projects that I want to work on.  Python has pretty rich XML support,
but I don't like it as much as Ruby.  However, I don't want to have to
start building the project from the ground up, including all of the
supporting libraries.  There's also the Unicode thing in Ruby, but I can
probably live with the double-byte hacks.  I'm also more current in Ruby
than Python since it's been 4 years since I did much with Python, but
it'll probably boil down to how much I'm really going to need to write
before I can focus on the problem I want to try and solve.

Cheers,

ast
-- 
Andrew S. Townley <[EMAIL PROTECTED]>
http://atownley.org

_______________________________________________
libxml-devel mailing list
libxml-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/libxml-devel
