Hi Laurent,

Thank you for your prompt work.  I tried the latest nightly build and it's much 
faster than 0.7.1.  Tests 1 and 2 are now only 2 to 2.5 times slower than on 
Ruby 1.8.7, and Test 3 is about 5 times slower.  I also tried my app on this 
nightly build, and I can now say the MacRuby version of my app is quite usable.  
From now on, I'll be more serious about rewriting my RubyCocoa apps in MacRuby.

But I was curious about the difference between String#scan and String#gsub, so 
I also tested String#gsub without a block.  

text.gsub!(/\w+/,"test")

This was also about 5 times slower on MacRuby than on Ruby 1.8.7.  Could this 
be made a bit faster?  String#gsub is not in the main process of my apps (it is 
used for pre-processing of text), so its performance doesn't matter as much, 
though.
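
For reference, here is a minimal sketch of how the two forms of gsub! could be timed side by side with Ruby's standard Benchmark module.  The sample text, the repetition count, and the helper name time_gsub_forms are made up for illustration; they are not the data or the script behind the numbers in this thread.

```ruby
require 'benchmark'

# Hypothetical sample input; the original test text is not reproduced here.
SAMPLE = ("the quick brown fox jumps over the lazy dog " * 1000).freeze

# Times both forms of gsub! on fresh copies of the text.
# Returns a hash of elapsed seconds keyed by form name.
def time_gsub_forms(text, repeats = 10)
  {
    :string_replacement => Benchmark.realtime {
      repeats.times { text.dup.gsub!(/\w+/, "test") }
    },
    :block_replacement => Benchmark.realtime {
      repeats.times { text.dup.gsub!(/\w+/) { |w| w.upcase } }
    }
  }
end

time_gsub_forms(SAMPLE).each do |form, seconds|
  puts format("%-20s %.3f s", form, seconds)
end
```

Each repetition works on a copy of the text so that gsub! always sees the unmodified input.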

And thanks for looking into the regexp bug.  I guess I'll have to wait and see 
if Apple updates ICU on OS X.

Best,
Yasu

On 2010/12/02, at 9:39, Laurent Sansonetti wrote:

> Hi Yasu,
> 
> It's committed to trunk and should be available in tonight's nightly build, 
> so feel free to grab it :) http://www.macruby.org/files/nightlies
> It will also be in the upcoming 0.8 release.
> 
> I see your ticket about the look-ahead regexp bug, I will have a look later 
> today. Thanks for reporting the problem. Hopefully it can also be fixed for 
> 0.8.
> 
> Laurent
> 
> On Dec 1, 2010, at 4:29 PM, Yasu Imao wrote:
> 
>> Hi Laurent,
>> 
>> This is great!  I think I read about object allocation in the discussion of 
>> StringScanner performance (though I didn't understand exactly what was 
>> happening behind the scenes), so I guessed it was about using a block with 
>> regular expression match data.  
>> 
>> For the word frequency count feature, I could use the Test 2 script, but for 
>> other parts of the app I need match information ($` and $', to be exact), so 
>> this performance improvement means a lot to my app.
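
To illustrate why the block form matters here: without a block, String#scan returns only the matched strings, while inside a block the last match ($~) is set for each iteration, so the text before and after the match ($` and $') can be read.  A minimal sketch; the contexts helper and the sample string are hypothetical and not taken from the app discussed in this thread.

```ruby
# Collects [pre_match, match, post_match] for every word in the text.
# Uses the block form of String#scan so that $~ (the MatchData of the
# current match) is available on each iteration.
def contexts(text)
  result = []
  text.scan(/\w+/) do |word|
    m = $~  # m.pre_match is the same as $`, m.post_match the same as $'
    result << [m.pre_match, word, m.post_match]
  end
  result
end

contexts("a bc d").each { |pre, word, post| p [pre, word, post] }
```

The blockless call "a bc d".scan(/\w+/) would return just the words, with no access to the surrounding text of each match.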
>> 
>> Is this going to be in 0.8?  If so, I'll test it with my app.
>> 
>> By the way, the regular expression engine itself seems to have a bug (not 
>> related to this, but to negative look-ahead), and I filed a ticket (though 
>> I'm not sure I did it properly).
>> 
>> Best,
>> Yasu
>> 
>> On 2010/12/02, at 8:50, Laurent Sansonetti wrote:
>> 
>>> I spoke too fast: having a second look, I found that it was possible to 
>>> make the Match strings point to a unique object. I committed this 
>>> optimization in r4964 and verified that no regression is introduced.
>>> 
>>> Before:
>>> 
>>> $ time /usr/local/bin/macruby -e "text=File.read('/tmp/foo.txt'); freq=Hash.new(0); text.scan(/\w+/) {}"
>>> 
>>> real    0m2.430s
>>> user    0m1.628s
>>> sys     0m1.030s
>>> 
>>> After :)
>>> 
>>> $ time ./miniruby -e "text=File.read('/tmp/foo.txt'); freq=Hash.new(0); text.scan(/\w+/) {}"
>>> 
>>> real    0m0.121s
>>> user    0m0.100s
>>> sys     0m0.015s
>>> 
>>> Laurent
>>> 
>>> On Dec 1, 2010, at 2:46 PM, Laurent Sansonetti wrote:
>>> 
>>>> Hi Yasu,
>>>> 
>>>> I ran your tests in Shark. Tests 1 and 3 are significantly slower because 
>>>> #scan and #gsub are called with a block, which means MacRuby has to create 
>>>> a new Match object for every yield, to conform to the Ruby specs. Each 
>>>> Match object contains a copy of the original string.
>>>> 
>>>> MacRuby has a slow memory allocator (much slower than the original Ruby's), 
>>>> so one must be careful not to allocate too many objects. This is something 
>>>> we are working on. Unfortunately, MacRuby doesn't fully control the object 
>>>> allocator, as it resides in the libauto library (the Objective-C garbage 
>>>> collector).
>>>> 
>>>> In your case, I recommend using the method in Test 2, which is to not pass 
>>>> a block. 
>>>> 
>>>> It may be possible to reduce memory usage when doing regexps in 
>>>> MacRuby. However, after a quick look at the source code, I am not 
>>>> sure anything can be done for 0.8 :(
>>>> 
>>>> Laurent
>>>> 
>>>> On Dec 1, 2010, at 9:46 AM, Yasu Imao wrote:
>>>> 
>>>>> Hello,
>>>>> 
>>>>> I'm rewriting an app for text analysis in MacRuby, which I originally 
>>>>> wrote in RubyCocoa.  But I encountered a serious performance issue in 
>>>>> MacRuby, which is related to processing text using regular expressions.  
>>>>> 
>>>>> I'm wondering if this will be taken care of in the near future (or 
>>>>> already done in 0.8?).
>>>>> 
>>>>> Below are my simple tests.  The first two are essentially the same, with 
>>>>> slightly different approaches.  Both simply count the frequency of each 
>>>>> word.  I want to use the first approach in other processes, not for 
>>>>> counting word frequencies.  The third one tests the speed of 
>>>>> String#gsub with a regular expression.  String#gsub felt slow in my 
>>>>> app, so I wanted to test how slow it is compared to RubyCocoa.
>>>>> 
>>>>> 
>>>>> Test 1 - scan-block
>>>>> 
>>>>> freq = Hash.new(0)
>>>>> text.scan(/\w+/) do |word|
>>>>>   freq[word] += 1
>>>>> end
>>>>> 
>>>>> 
>>>>> Test 2 - scan array.each
>>>>> 
>>>>> freq = Hash.new(0)
>>>>> text.scan(/\w+/).each do |word|
>>>>>   freq[word] += 1
>>>>> end
>>>>> 
>>>>> 
>>>>> Test 3 - gsub upcase
>>>>> 
>>>>> text.gsub!(/\w+/) { |x| x.upcase }
>>>>> 
>>>>> 
>>>>> The results are in seconds.  The original text is in English and contains 
>>>>> 8154 words.  Each process was repeated 10 times to calculate the 
>>>>> processing time, and each test was run 3 times.
>>>>> 
>>>>> Implementation  Test                       Run 1    Run 2    Run 3
>>>>> Ruby 1.8.7      Test 1 - scan-block        0.542    0.502    0.518
>>>>> Ruby 1.8.7      Test 2 - scan array.each   0.399    0.392    0.399
>>>>> Ruby 1.8.7      Test 3 - gsub upcase       0.384    0.349    0.390
>>>>> 
>>>>> MacRuby 0.7.1   Test 1 - scan-block       27.612   27.707   27.453
>>>>> MacRuby 0.7.1   Test 2 - scan array.each   3.556    3.616    3.554
>>>>> MacRuby 0.7.1   Test 3 - gsub upcase      27.613   26.826   27.327
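
The timing procedure described above (10 repetitions per measurement, 3 measurements per test) could be reproduced with something like the sketch below.  The script that produced the numbers in this thread was not posted, so the helpers run_test and report, the constants, and the placeholder text are all assumptions.

```ruby
require 'benchmark'

REPEATS = 10  # each process is repeated 10 times per measurement
RUNS    = 3   # each test is measured 3 times

# Yields a fresh copy of the text REPEATS times per measurement and
# returns an array of RUNS elapsed times in seconds.
def run_test(text)
  Array.new(RUNS) do
    Benchmark.realtime do
      REPEATS.times { yield text.dup }
    end
  end
end

# Prints one test label followed by its measurements in seconds.
def report(label, times)
  puts "#{label}: " + times.map { |s| format("%.3f", s) }.join(", ")
end

# Placeholder input (assumption); substitute the real text under test.
text = "some sample words to count and some words repeated " * 200

report("Test 1 - scan-block", run_test(text) { |t|
  freq = Hash.new(0)
  t.scan(/\w+/) { |word| freq[word] += 1 }
})

report("Test 2 - scan array.each", run_test(text) { |t|
  freq = Hash.new(0)
  t.scan(/\w+/).each { |word| freq[word] += 1 }
})

report("Test 3 - gsub upcase", run_test(text) { |t|
  t.gsub!(/\w+/) { |x| x.upcase }
})
```

Running the same file under each Ruby implementation gives directly comparable figures, since every measurement starts from an unmodified copy of the text.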
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> Yasu
>>>>> _______________________________________________
>>>>> MacRuby-devel mailing list
>>>>> MacRuby-devel@lists.macosforge.org
>>>>> http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel