Hi Yasu,

I ran your tests in Shark. Tests 1 and 3 are significantly slower because #scan 
and #gsub are called with a block, which means MacRuby has to create a new 
Match object for every yield, to conform to the Ruby specs. Each Match object 
contains a copy of the original string.

MacRuby has a slow memory allocator (much slower than the original Ruby's), so 
one must be careful not to allocate too many objects. This is something we are 
working on. Unfortunately, MacRuby doesn't fully control the object allocator, 
as it resides in the libauto library (the Objective-C garbage collector).

In your case, I recommend using the method in Test 2, which is to not pass a 
block. 
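
As a rough sketch of that recommendation, the two variants can be put side by 
side with the standard Benchmark library. The sample text and the method names 
count_with_block and count_without_block are made up for illustration; only 
the two #scan call styles come from your tests.

```ruby
require 'benchmark'

# Hypothetical sample input; substitute the real text being analyzed.
text = "the quick brown fox jumps over the lazy dog " * 1000

# Test 1 style: #scan yields each match to the block.
def count_with_block(text)
  freq = Hash.new(0)
  text.scan(/\w+/) { |word| freq[word] += 1 }
  freq
end

# Test 2 style: #scan returns an array of matches, which is then iterated.
def count_without_block(text)
  freq = Hash.new(0)
  text.scan(/\w+/).each { |word| freq[word] += 1 }
  freq
end

# Repeat each variant 10 times, as in the original measurements.
Benchmark.bm(18) do |bm|
  bm.report('scan with block:') { 10.times { count_with_block(text) } }
  bm.report('scan, then each:') { 10.times { count_without_block(text) } }
end
```

Both methods should return the same hash, so the second can stand in for the 
first wherever the whole result array fits in memory.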

It is possible that we can reduce memory usage when doing regexps in MacRuby. 
However, after a quick look at the source code, I am not sure anything can be 
done for 0.8 :(

Laurent

On Dec 1, 2010, at 9:46 AM, Yasu Imao wrote:

> Hello,
> 
> I'm rewriting an app for text analysis in MacRuby, which I originally wrote 
> in RubyCocoa.  But I encountered a serious performance issue in MacRuby, 
> which is related to processing text using regular expressions.  
> 
> I'm wondering if this will be taken care of in the near future (or already 
> done in 0.8?).
> 
> Below are my simple tests.  The first two are essentially the same with a 
> slightly different approach.  Both are simply counting frequency of each 
> word.  I want to use the first approach not to count word frequencies, but in 
> other processes.  The third one is to test the speed of String#gsub with 
> regular expression.  I felt String#gsub was slow in my app, so I just wanted 
> to test how slow it is compared to RubyCocoa.
> 
> 
> Test 1 - scan-block
> 
> freq = Hash.new(0)
> text.scan(/\w+/) do |word|
>  freq[word] += 1
> end
> 
> 
> Test 2 - scan array.each
> 
> freq = Hash.new(0)
> text.scan(/\w+/).each do |word|
>  freq[word] += 1
> end
> 
> 
> Test 3 - gsub upcase
> 
> text.gsub!(/\w+/){|x| x.upcase}  
> 
> 
> The results are in seconds.  The original text is in English with 8154 words. 
>  Each process was repeated 10 times to calculate processing times.  Each test 
> was run 3 times.
> 
>                Test                      Run 1    Run 2    Run 3
> Ruby 1.8.7     Test1 - scan-block:       0.542    0.502    0.518
> Ruby 1.8.7     Test2 - scan array.each:  0.399    0.392    0.399
> Ruby 1.8.7     Test3 - gsub upcase:      0.384    0.349    0.390
> 
> MacRuby 0.7.1  Test1 - scan-block:      27.612   27.707   27.453
> MacRuby 0.7.1  Test2 - scan array.each:  3.556    3.616    3.554
> MacRuby 0.7.1  Test3 - gsub upcase:     27.613   26.826   27.327
> 
> 
> Thanks,
> Yasu
> _______________________________________________
> MacRuby-devel mailing list
> MacRuby-devel@lists.macosforge.org
> http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel

