real world test shows JRuby 3x slower than MRI on large dataset regex matching
------------------------------------------------------------------------------

                 Key: JRUBY-2436
                 URL: http://jira.codehaus.org/browse/JRUBY-2436
             Project: JRuby
          Issue Type: Bug
          Components: Performance
    Affects Versions: JRuby 1.1b1
         Environment: Ubuntu Gutsy 2.6.22-14-generic #1 SMP, x86_64, sun-jdk 1.6.0_03
            Reporter: Martin Matusiak
         Attachments: harvest.rb

The original benchmark was born out of a careless real world test that produced 
surprising results, described here:
http://www.matusiak.eu/numerodix/blog/index.php/2008/04/21/clocking-jruby11/

A second test on a smaller amount of raw disk data (1gb instead of 25gb) showed 
JRuby trailing MRI by a significant margin. I then reran the tests with more 
rigor. First, the test itself.

Executable: A Ruby script I wrote to find urls in any sort of data (text files, 
markup, binary etc).
Dataset: 5gb of my ext3 root partition, generated like this:
sudo dd count=$((3*1024*1024*1024/512)) if=/dev/sda5 of=data
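
As a sanity check on the size of the dump, the count arithmetic can be reproduced in Ruby. This is only a sketch: the constant and method names are made up for illustration, and 512 bytes is assumed because it is dd's default block size when no bs= is given. Note that the expression as written works out to 3 GiB, while the text mentions 5gb; the sketch simply evaluates the expression.

```ruby
# Reproduce the shell arithmetic from the dd command.
# BLOCK_SIZE is dd's default block size (no bs= option is passed).
BLOCK_SIZE = 512
GIB = 1024 * 1024 * 1024

# Number of 512-byte blocks needed to copy `gib` GiB.
def dd_block_count(gib)
  gib * GIB / BLOCK_SIZE
end

# Total bytes copied for a given block count.
def dump_size_bytes(count)
  count * BLOCK_SIZE
end

count = dd_block_count(3)
puts "count=#{count}"
puts "size=#{dump_size_bytes(count) / GIB.to_f} GiB"
```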

The same script was run in 3 different modes: JRuby w/ -J-server, JRuby, and 
MRI. The command lines were, respectively:
time ( cat data | bin/jruby -J-server harvest.rb --url > urls ) 2>> logfile
time ( cat data | bin/jruby harvest.rb --url > urls ) 2>> logfile
time ( cat data | harvest.rb --url > urls ) 2>> logfile
This sequence was repeated 3 times. The output of the processing (the file 
urls) was (correctly) identical in each case, so that is not a concern.
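
The run sequence could be scripted roughly as follows. This is a sketch, not what was actually used for the numbers in this report: the interpreter strings come from the command lines above, while MODES, pipeline and run_rounds are hypothetical names. It records wall-clock time only, so it does not replace the user/sys figures that `time` reports.

```ruby
# The three modes, in the order the commands were listed.
MODES = {
  "jruby -J-server" => "bin/jruby -J-server harvest.rb",
  "jruby"           => "bin/jruby harvest.rb",
  "ruby"            => "harvest.rb",
}.freeze

# Build the shell pipeline for one mode.
def pipeline(interpreter, data: "data", out: "urls")
  "cat #{data} | #{interpreter} --url > #{out}"
end

# Run every mode once per round and collect wall-clock seconds per mode.
# `runner` is injectable so the loop can be exercised without real data.
def run_rounds(rounds: 3, runner: ->(cmd) { system(cmd) })
  timings = Hash.new { |hash, key| hash[key] = [] }
  rounds.times do
    MODES.each do |label, interpreter|
      cmd = pipeline(interpreter)
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      runner.call(cmd)
      timings[label] << Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    end
  end
  timings
end
```

Calling run_rounds with no arguments would execute all nine runs through the shell; passing a different runner (for example one that only prints the command) gives a dry run.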

The test ran on a live dual core laptop running a few desktop applications, 
but nothing heavy. The test had the lion's share of cpu time throughout.

The timing results follow:

jruby -J-server

real    15m7.881s
user    13m41.895s
sys     1m11.024s

jruby

real    15m10.361s
user    13m38.691s
sys     1m11.948s

ruby

real    6m7.761s
user    4m52.322s
sys     0m12.829s

jruby -J-server

real    15m26.364s
user    13m55.592s
sys     1m12.957s

jruby

real    15m41.676s
user    14m2.181s
sys     1m12.721s

ruby

real    6m6.427s
user    4m44.338s
sys     0m12.589s

jruby -J-server

real    15m11.562s
user    13m50.360s
sys     1m9.768s

jruby

real    14m22.427s
user    12m57.313s
sys     1m11.548s

ruby

real    5m54.793s
user    4m37.929s
sys     0m12.493s
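
To compare the modes at a glance, the "real" times can be averaged per mode. The sketch below copies the nine wall-clock figures from the runs above; the helper names are hypothetical. On these figures both JRuby modes come out at roughly 2.5x MRI in wall-clock time.

```ruby
# "real" times copied from the three rounds above.
REAL_TIMES = {
  "jruby -J-server" => %w[15m7.881s 15m26.364s 15m11.562s],
  "jruby"           => %w[15m10.361s 15m41.676s 14m22.427s],
  "ruby"            => %w[6m7.761s 6m6.427s 5m54.793s],
}.freeze

# Convert the shell's "XmY.ZZZs" format to seconds.
def to_seconds(stamp)
  m = stamp.match(/\A(\d+)m(\d+(?:\.\d+)?)s\z/)
  raise ArgumentError, "bad time: #{stamp}" unless m
  m[1].to_i * 60 + m[2].to_f
end

# Arithmetic mean of a list of time stamps, in seconds.
def mean_seconds(stamps)
  stamps.sum { |s| to_seconds(s) } / stamps.size
end

means = REAL_TIMES.transform_values { |stamps| mean_seconds(stamps) }
baseline = means.fetch("ruby")
means.each do |label, secs|
  puts format("%-16s mean real %7.1fs  (%.2fx MRI)", label, secs, secs / baseline)
end
```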



Memory usage during the test, as seen in "top":

JRuby
Virtual: 650mb, Resident: 250mb
MRI
Virtual: 21mb, Resident: 15mb


I'm attaching the script harvest.rb, which was the executable file. I'm *not* 
attaching the 5gb of my harddrive, but perhaps you can improvise a similar 
dataset yourselves :)
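
One way to improvise such a dataset is to interleave random bytes with known URLs, which also makes the output easy to verify. This is a sketch under stated assumptions: URL_PATTERN is a hypothetical stand-in (it is not the pattern from harvest.rb), the example.com URLs are placeholders, and random bytes are only a rough substitute for real disk contents, so timings on such data may differ from those reported here.

```ruby
require "stringio"

# Hypothetical stand-in pattern; NOT taken from harvest.rb.
# The /n flag treats the input as raw bytes.
URL_PATTERN = %r{https?://[A-Za-z0-9\-._~/?%&=:+]+}n

# Write `records` chunks of random bytes to `io`, each followed by one URL
# and a NUL byte so the URL has a clean end. Returns the URLs written.
def write_dataset(io, records:, junk_bytes: 4096, seed: 42)
  rng = Random.new(seed)
  Array.new(records) do |i|
    url = "http://example.com/page/#{i}"
    io << rng.bytes(junk_bytes) << url << "\0"
    url
  end
end

# Scan a binary stream line by line and collect every match.
def extract_urls(io)
  io.binmode
  io.each_line.flat_map { |line| line.scan(URL_PATTERN) }
end
```

For example, File.open("data", "wb") { |f| write_dataset(f, records: 250_000) } would produce a file of roughly 1 GB, and extract_urls($stdin) could then be used to check that a scanner finds the URLs that were written.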
