I'm hoping to use Hpricot for general XML processing in the SDS and 
DIY systems instead of REXML or libxml, so I wanted to compare the 
speeds of different XML parsers in MRI and JRuby.

So far I've only got one test:

Do this 50 times:
   - parse a 100k SDS bundle file and count the 466 sockEntries
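For readers without the bundle file, here is a minimal sketch of that 
test using REXML. The sample_bundle helper and the <bundle> wrapper 
element are made up for illustration; the real SDS bundle has its own 
structure, and the actual benchmark code is in the links further down.

```ruby
require 'rexml/document'

# Hypothetical stand-in for the 100k SDS bundle file: a generated
# document with a known number of sockEntries elements.
def sample_bundle(entries)
  body = (1..entries).map { |i| %(<sockEntries value="#{i}"/>) }.join
  "<bundle>#{body}</bundle>"
end

# Parse the XML text and count every sockEntries element at any depth.
def count_sock_entries(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//sockEntries').size
end

xml = sample_bundle(466)

# Do the parse-and-count step 50 times, as in the test description.
counts = Array.new(50) { count_sock_entries(xml) }
puts "parsed #{counts.size} times, #{counts.first} sockEntries each"
```

Swapping in another parser only means replacing the body of 
count_sock_entries; the loop stays the same.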

The results shown below are the times after a "rehearsal". JRuby runs 
faster once the JVM has been warmed up. The rehearsal has no effect 
on the MRI timings.
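A rehearsal pass like this is what Ruby's standard Benchmark.bmbm 
provides: it runs every report once as a warm-up and then again for 
the measured result. The sketch below assumes that method; the 
measure_with_rehearsal helper and the tiny sample document are 
hypothetical, not taken from the benchmark code.

```ruby
require 'benchmark'
require 'rexml/document'

# Hypothetical small document; the real benchmark used a 100k bundle file.
SAMPLE_XML = "<bundle>#{'<sockEntries/>' * 20}</bundle>"
RUNS = 50

# Run the labelled workload through Benchmark.bmbm, which executes a
# rehearsal pass first and then the measured pass. Returns the measured
# Benchmark::Tms objects (one per report), not the rehearsal ones.
def measure_with_rehearsal(label, runs)
  Benchmark.bmbm do |bm|
    bm.report(label) { runs.times { yield } }
  end
end

results = measure_with_rehearsal('rexml', RUNS) do
  REXML::Document.new(SAMPLE_XML).root.elements.size
end
puts format('total: %.2f s', results.first.total)
```

Under JRuby the rehearsal gives the JVM a chance to compile the hot 
code paths before the measured pass starts.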

  Platform and method          total time (s)
-----------------------------------------------
  JRuby: rexml                   6.93
  JRuby: rexml+jrexml            6.95
  MRI:   rexml                   4.08
  JRuby: hpricot                 3.25     * see note below
  MRI:   hpricot                 1.17
  JRuby: jdom_document_builder   0.77
  MRI:   libxml                  0.21

The code and complete results:

   http://pastie.caboo.se/170141

The code and data file used for the benchmark are also in svn:

   https://svn.concord.org/svn/projects/trunk/common/ruby/xml_benchmarks

* Hpricot does its initial parsing with code generated by Ragel, a 
state machine compiler that can produce C or Java code. Ragel's Java 
output supports only one code generation style, and it is not the 
fastest. The style Hpricot uses for the generated C code produces a 
larger executable but is faster.
