Iterative String concatenation operations in JRuby degrade faster than in MRI.
------------------------------------------------------------------------------

                 Key: JRUBY-3252
                 URL: http://jira.codehaus.org/browse/JRUBY-3252
             Project: JRuby
          Issue Type: Improvement
          Components: Core Classes/Modules
    Affects Versions: JRuby 1.1.5
         Environment: I see this on Solaris x64, but expect that the symptoms 
would be the same across different platforms.
            Reporter: Prashant Srinivasan
         Attachments: string-cat-perf.tar.gz

Iterative String concatenation in JRuby seems to degrade less gracefully than 
in MRI.  Both use the same algorithm to concatenate, i.e. allocate a new string 
s3, copy str1 into s3, copy str2 into s3 at the appropriate offset, and return 
s3[2].  MRI uses memcpy[3] to copy the input Strings into the final one; memcpy 
is pretty fast, and I'm reasonably certain that the compiler generates inline 
assembly for it in any case.  JRuby uses System#arraycopy instead.  But 
arraycopy is implemented natively too (probably in assembly?), and some 
profiling[4] showed that the array copy itself is not the time hog; rather, 
something in org.jruby.util.ByteList (or below) seems to be causing the time 
bloat in the algorithm.  A concatenation program[5] shows how string 
concatenation scales on the two platforms ([6],[7]).  JRuby execution time 
climbs rather steeply after an inflection point, while MRI continues to grow 
linearly (at least across this interval).
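
For readers without the attachment, the kind of measurement described above can be sketched roughly as follows. This is not the attached string-scalability.rb; the helper names concat_iteratively and measure_concat, the chunk size, and the iteration counts are all assumptions chosen for illustration.

```ruby
require 'benchmark'

# Hypothetical helper (not the attached script): builds a string by
# repeated String#+, so every step allocates a new string and copies
# both operands into it.
def concat_iteratively(iterations, chunk = "x" * 10)
  result = ""
  iterations.times { result = result + chunk }
  result
end

# Time the loop at several iteration counts and return [count, seconds]
# pairs, which can then be plotted for each Ruby implementation.
def measure_concat(sizes)
  sizes.map do |n|
    seconds = Benchmark.realtime { concat_iteratively(n) }
    [n, seconds]
  end
end

if __FILE__ == $0
  measure_concat([1_000, 2_000, 4_000, 8_000]).each do |n, seconds|
    printf("%6d iterations: %.4f s\n", n, seconds)
  end
end
```

Running the same file under both jruby and ruby and comparing the printed timings should give a rough picture of how each implementation scales.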

I haven't looked into ByteList.java to see what causes the time bloat, and the 
real source of the problem might be time spent on memory allocation in the VM 
(but probably not GC, since the HPROF profiles should certainly have caught 
that?).
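
One rough way to check the GC hypothesis from inside Ruby is to count collector runs around the concatenation loop. This is only a sketch: gc_runs_during is a hypothetical helper, and it assumes a Ruby implementation that provides GC.count (which older releases may not).

```ruby
# Hypothetical helper: returns how many GC runs happened while the
# given block executed. Assumes GC.count is available.
def gc_runs_during
  before = GC.count
  yield
  GC.count - before
end

runs = gc_runs_during do
  s = ""
  # Same pattern as in the report: each + allocates a fresh string.
  20_000.times { s = s + "0123456789" }
end
puts "GC runs during concatenation: #{runs}"
```

A count that stays small while the elapsed time grows would point away from GC and towards allocation or copying costs.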

References:
[2],[3]
MRI Version:
VALUE
rb_str_plus(str1, str2)
   VALUE str1, str2;
{
   VALUE str3;

   StringValue(str2);
   str3 = rb_str_new(0, RSTRING(str1)->len+RSTRING(str2)->len);
   memcpy(RSTRING(str3)->ptr, RSTRING(str1)->ptr, RSTRING(str1)->len);
   memcpy(RSTRING(str3)->ptr + RSTRING(str1)->len,
          RSTRING(str2)->ptr, RSTRING(str2)->len);
   RSTRING(str3)->ptr[RSTRING(str3)->len] = '\0';

   if (OBJ_TAINTED(str1) || OBJ_TAINTED(str2))
       OBJ_TAINT(str3);
   return str3;
}


JRuby's version:
   @JRubyMethod(name = "+", required = 1)
   public IRubyObject op_plus(ThreadContext context, IRubyObject other) {
       RubyString str = other.convertToString();

       ByteList result = new ByteList(value.realSize + str.value.realSize);
       result.realSize = value.realSize + str.value.realSize;
       System.arraycopy(value.bytes, value.begin, result.bytes, 0,
                        value.realSize);
       System.arraycopy(str.value.bytes, str.value.begin, result.bytes,
                        value.realSize, str.value.realSize);

       RubyString resultStr = newString(context.getRuntime(), result);
       if (isTaint() || str.isTaint()) resultStr.setTaint(true);
       return resultStr;
   }

[4]
hprof.file-test.longer-run.txt
and
hprof.file-test.txt

in the attachment.

[5]
string-scalability.rb

in the attachment.

[6]
comparison.PNG

in the attachment.

[7]
i.e., more details on [6] above are in jruby_vs_mri_string_scalability.ods 

in the attachment.
