On Thu, 2006-12-28 at 12:26, TRANS wrote:
> On 12/27/06, Andrew S. Townley <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > I'm looking for a good SAX implementation for Ruby to do some XML
> > parsing. I see from the mailing list archives you've just added the
> > pull-based XML::Reader based on the Microsoft C# API, but I want "real"
> > SAX-based event-driven parsing.
>
> What are you planning to do with it? I'm curious, is it because you
> prefer that way of doing it, or is there a technical reason?

A bit of both :) I want to be able to efficiently process some arbitrarily large documents, but I also want to provide some processing filters that perform normalization roughly equivalent to the C14N/XCL-C14N process, as well as some other, non-XSLT types of transformations. I've also done quite a bit of this with Java, so it is a model that is familiar to me, and, once you get used to it, I find it easier to deal with than pull parsing for these types of tasks.

There's also a desire not to tie myself too tightly to a particular XML parser implementation, because certain ones are better for certain tasks. Ideally, I'd like to see a Ruby XML::Parser::SAX module which would either be implemented by, or at least support, a number of different parsers, based on what you needed at the time.

> > I also noticed that the current support for this in the library is based
> > on the now deprecated SAXv1 interface rather than the newer SAX2
> > interface. Are there any plans to migrate this or change it? Also, I
> > think it would be worthwhile implementing a filter mechanism similar to
> > the Java SAX API's XMLReader, XMLFilter and friends (originally, I saw
> > the XML::Reader subject in the archives and thought it may have been
> > already done).
>
> I don't know enough about this to say, but since this is a binding to
> libxml, what is libxml's support of this?

libxml2's support of SAX2 seems to be reasonably complete. I haven't looked at it under the magnifying glass yet, though. As long as the base events are there, things like the Java SAX XMLReader/XMLFilter API and friends could be implemented in Ruby on top of the C code. There is some support for filtering, but I'd need to look at how hard it would be to integrate what's there with Ruby. If I had to do it myself, I'd be able to do it quicker in Ruby than in C, because I'm out of practice with C and I've never worked on extending Ruby in C before.
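
To make the filter idea concrete, here is a minimal sketch in plain Ruby of an XMLFilter-style chain layered over base SAX events. Everything in it is an assumption for illustration: the class names (`PassThroughFilter`, `WhitespaceFilter`, `SortAttributesFilter`, `Serializer`), the three-callback handler interface, and the `feed_events` helper are hypothetical and are not part of libxml-ruby or libxml2. The events are fed by hand where a real parser binding would produce them.

```ruby
# Hypothetical sketch: none of these classes exist in libxml-ruby.

# Base filter: forwards every event unchanged to the next handler,
# in the spirit of Java's XMLFilterImpl.
class PassThroughFilter
  def initialize(downstream)
    @downstream = downstream
  end

  def start_element(name, attrs)
    @downstream.start_element(name, attrs)
  end

  def characters(text)
    @downstream.characters(text)
  end

  def end_element(name)
    @downstream.end_element(name)
  end
end

# Drops text events that contain only whitespace.
class WhitespaceFilter < PassThroughFilter
  def characters(text)
    super unless text.strip.empty?
  end
end

# Passes attributes on in sorted order, one small normalization step.
class SortAttributesFilter < PassThroughFilter
  def start_element(name, attrs)
    super(name, attrs.sort.to_h)
  end
end

# Terminal handler: writes the events it receives back out as a string.
# No escaping is done; this is a sketch, not a real serializer.
class Serializer
  attr_reader :output

  def initialize
    @output = +""
  end

  def start_element(name, attrs)
    @output << "<#{name}"
    attrs.each { |key, value| @output << %( #{key}="#{value}") }
    @output << ">"
  end

  def characters(text)
    @output << text
  end

  def end_element(name)
    @output << "</#{name}>"
  end
end

# Stand-in for a parser: a real binding would fire these callbacks
# while reading the document.
def feed_events(handler)
  handler.start_element("doc", { "b" => "2", "a" => "1" })
  handler.characters("\n  ")
  handler.start_element("item", {})
  handler.characters("hello")
  handler.end_element("item")
  handler.characters("\n")
  handler.end_element("doc")
end

serializer = Serializer.new
chain = WhitespaceFilter.new(SortAttributesFilter.new(serializer))
feed_events(chain)
puts serializer.output # prints: <doc a="1" b="2"><item>hello</item></doc>
```

Because each filter only knows the handler interface, the same chain could in principle sit behind any parser that can emit these three events, which is the parser independence described above.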

However, despite the support in the base library, I couldn't find any code in this project that currently uses the new SAX2 API in libxml2 (http://xmlsoft.org/html/libxml-SAX2.html).

I keep coming back to a personal Ruby vs. Python debate for some projects that I want to work on. Python has pretty rich XML support, but I don't like it as much as Ruby. However, I don't want to have to start building the project from the ground up, including all of the supporting libraries. There's also the Unicode thing in Ruby, but I can probably live with the double-byte hacks. I'm also more current in Ruby than Python, since it's been 4 years since I did much with Python, but it'll probably boil down to how much I really need to write before I can focus on the problem I want to solve.

Cheers,

ast
-- 
Andrew S. Townley <[EMAIL PROTECTED]>
http://atownley.org

_______________________________________________
libxml-devel mailing list
libxml-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/libxml-devel