I'm hoping to use Hpricot for general XML processing instead of REXML or libxml in the SDS and DIY systems, so I wanted to find out the speeds of different XML parsers in MRI and JRuby.
So far I've only got one test parsing a 100k SDS bundle file.

Do this 50 times:

- parse a 100k SDS bundle file and count the 466 sockEntries

The results shown below are the times after a "rehearsal". The times for JRuby are faster when the JVM has been "warmed-up". The rehearsal has no effect on the MRI timings.

Platform and method             total time
-------------------------------------------
JRuby: rexml                    6.93
JRuby: rexml+jrexml             6.95
MRI: rexml                      4.08
JRuby: hpricot                  3.25  * see note below
MRI: hpricot                    1.17
JRuby: jdom_document_builder    0.77
MRI: libxml                     0.21

The code and complete results: http://pastie.caboo.se/170141

The code and data file used for the benchmark is also in svn:

https://svn.concord.org/svn/projects/trunk/common/ruby/xml_benchmarks

* Hpricot uses code created by Ragel, a state machine compiler that can produce C or Java code, for the initial parsing. The Ragel => Java compiler can only produce one style of code generation and it is not the fastest. The style chosen by Hpricot for generating the C code produces a larger executable and is faster.
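
For anyone who wants the shape of the test without opening the pastie, here is a rough sketch of the "parse 50 times with a rehearsal" loop. It is not the benchmark code linked above: the XML is a made-up stand-in that borrows only the sockEntries element name and the count of 466, the names SAMPLE_XML, ENTRY_COUNT, ITERATIONS and count_sock_entries are hypothetical, and only the REXML case is shown. The rehearsal comes from the standard library's Benchmark.bmbm, which I am assuming is comparable to what the real script does.

```ruby
require 'benchmark'
require 'rexml/document'

# Hypothetical stand-in for the SDS bundle: the real file's layout is not
# shown in this post, so only the element name and count are borrowed.
ENTRY_COUNT = 466
SAMPLE_XML = "<bundle>" +
  Array.new(ENTRY_COUNT) { |i| %(<sockEntries value="#{i}"/>) }.join +
  "</bundle>"

# Parse the document once with REXML and count the sockEntries elements.
def count_sock_entries(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//sockEntries').size
end

ITERATIONS = 50

# bmbm runs every report twice: a rehearsal pass first, then the measured
# pass. It returns one Benchmark::Tms object per report.
results = Benchmark.bmbm do |bm|
  bm.report('rexml') do
    ITERATIONS.times { count_sock_entries(SAMPLE_XML) }
  end
end
```

Adding another parser would mean one more bm.report block with its own counting method. The times printed for this toy document will not match the table above, since the input is much smaller than the real 100k bundle.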
