Hi Laurent,

This is great! I think I read something about object allocation in the discussion of StringScanner performance (though I didn't understand exactly what was happening behind the scenes), so I guessed the slowdown was related to using a block with regular expression match data.
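To make the distinction concrete, here is a minimal sketch in plain Ruby. The sample string and the `contexts` and `words` variables are made up for illustration and are not taken from the app. It assumes standard Ruby behaviour, where the block form of String#scan updates the match data on every yield, so the text before and after each match can be read inside the block, while the array form only returns the matched strings.

```ruby
# Hypothetical sample input, used only for this illustration.
text = "the quick brown fox"

# Block form: the match data ($~) is refreshed on every yield,
# so the text before and after each match is available.
contexts = []
text.scan(/\w+/) do |word|
  contexts << [$~.pre_match, word, $~.post_match]
end

# Array form: only the matched strings are returned,
# with no per-match context.
words = text.scan(/\w+/)

p contexts[1]
p words
```

The block form is the one that needs fresh match data for each word, which is presumably where the extra allocation cost comes from.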
For the word frequency count feature, I could use the Test 2 script, but for other parts of the app I need match information ($` and $', to be exact), so this performance improvement means a lot to my app.

Is this going to be in 0.8? If so, I'll test it with my app.

By the way, the regular expression implementation itself seems to have a bug (not related to this, but to negative look-ahead), and I filed a ticket (though I'm not sure I did it properly).

Best,
Yasu

On 2010/12/02, at 8:50, Laurent Sansonetti wrote:

> I spoke too fast, having a second look I found that it was possible to make
> the Match strings point to a unique object. I committed this optimization in
> r4964 and verified that no regression is introduced.
>
> Before:
>
> $ time /usr/local/bin/macruby -e "text=File.read('/tmp/foo.txt');
> freq=Hash.new(0); text.scan(/\w+/) {}"
>
> real 0m2.430s
> user 0m1.628s
> sys 0m1.030s
>
> After :)
>
> $ time ./miniruby -e "text=File.read('/tmp/foo.txt'); freq=Hash.new(0);
> text.scan(/\w+/) {}"
>
> real 0m0.121s
> user 0m0.100s
> sys 0m0.015s
>
> Laurent
>
> On Dec 1, 2010, at 2:46 PM, Laurent Sansonetti wrote:
>
>> Hi Yasu,
>>
>> I ran your tests in Shark. Tests 1 and 3 are significantly slower because
>> #scan and #gsub are called with a block, which means MacRuby has to create a
>> new Match object for every yield, to conform to the Ruby specs. Each Match
>> object contains a copy of the original string.
>>
>> MacRuby has a slow memory allocator (much slower than the original Ruby), so
>> one must be careful to not allocate too many objects. This is something we
>> are working on, unfortunately MacRuby doesn't fully control the object
>> allocator, as it resides in the libauto library (the Objective-C garbage
>> collector).
>>
>> In your case, I recommend using the method in Test 2, which is to not pass a
>> block.
>>
>> It is possible that we can reduce memory usage when doing regexps in
>> MacRuby, however after having a quick look at the source code I am not sure
>> something can be done for 0.8 :(
>>
>> Laurent
>>
>> On Dec 1, 2010, at 9:46 AM, Yasu Imao wrote:
>>
>>> Hello,
>>>
>>> I'm rewriting an app for text analysis in MacRuby, which I originally wrote
>>> in RubyCocoa. But I encountered a serious performance issue in MacRuby,
>>> which is related to processing text using regular expressions.
>>>
>>> I'm wondering if this will be taken care of in the near future (or already
>>> done in 0.8?).
>>>
>>> Below are my simple tests. The first two are essentially the same with a
>>> slightly different approach. Both are simply counting frequency of each
>>> word. I want to use the first approach not to count word frequencies, but
>>> in other processes. The third one is to test the speed of String#gsub with
>>> regular expression. I felt String#gsub was slow in my app, so I just
>>> wanted to test how slow it is compared to RubyCocoa.
>>>
>>>
>>> Test 1 - scan-block
>>>
>>> freq = Hash.new(0)
>>> text.scan(/\w+/) do |word|
>>>   freq[word] += 1
>>> end
>>>
>>>
>>> Test 2 - scan array.each
>>>
>>> freq = Hash.new(0)
>>> text.scan(/\w+/).each do |word|
>>>   freq[word] += 1
>>> end
>>>
>>>
>>> Test 3 - gsub upcase
>>>
>>> text.gsub!(/\w+/){|x| x.upcase}
>>>
>>>
>>> The results are in seconds. The original text is in English with 8154
>>> words. Each process was repeated 10 times to calculate processing times.
>>> Each test were done 3 times.
>>>
>>> Ruby 1.8.7 Test1 - scan-block:            0.542, 0.502, 0.518
>>> Ruby 1.8.7 Test2 - scan array.each:       0.399, 0.392, 0.399
>>> Ruby 1.8.7 Test3 - gsub upcase:           0.384, 0.349, 0.390
>>>
>>> MacRuby 0.7.1 Test1 - scan-block:         27.612, 27.707, 27.453
>>> MacRuby 0.7.1 Test2 - scan array.each:    3.556, 3.616, 3.554
>>> MacRuby 0.7.1 Test3 - gsub upcase:        27.613, 26.826, 27.327
>>>
>>>
>>> Thanks,
>>> Yasu
>>> _______________________________________________
>>> MacRuby-devel mailing list
>>> MacRuby-devel@lists.macosforge.org
>>> http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel