Nokogiri! First, it's less buggy; second, it's supported; and finally, its author didn't vanish into thin air. Also, please never, ever use REXML unless you don't care about memory and performance. You might also want to look at libxml directly if you need low-level power. I would also consider using a SAX parser ( http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/SAX/Parser.html ) to save some memory if you are parsing 800MB files.
- Matt

On Wed, Apr 20, 2011 at 11:58 AM, David Sainte-Claire <[email protected]> wrote:
> Hello all. I'm starting a new project where I'm going to need to
> start parsing some pretty massive XML files using Ruby (~100MB on
> average but could be up to 800MB). In the past I had been using REXML
> for a lot of parsing, because the documents were small enough, and not
> very complicated, but this one seems like it might take some more
> massaging.
>
> I was doing some research into benchmarks between Nokogiri and Hpricot
> and the most recent performance numbers I can find were from over
> a year ago. Does anyone have a strong preference for either of the
> libraries, or know of strengths / weaknesses that I should consider
> before selecting one of them?
>
> Thanks!
> David
>
> --
> SD Ruby mailing list
> [email protected]
> http://groups.google.com/group/sdruby
