Hi Laurent,

Thank you for your prompt work. I tried the latest nightly build and it's much faster than 0.7.1. Tests 1 and 2 are only 2 to 2.5 times slower than on Ruby 1.8.7, and Test 3 is about 5 times slower. I also tried my app on this nightly build, and I can now say the MacRuby version of my app is quite usable. From now on, I'll be more serious about rewriting my RubyCocoa apps in MacRuby.

But I was curious about the difference between String#scan and String#gsub, so I also tested String#gsub without a block.

    text.gsub!(/\w+/,"test")

This was also about 5 times slower on MacRuby than on Ruby 1.8.7. Could this be made a bit faster? This is not in the main process of my apps (it's for pre-processing of text), so the performance of String#gsub doesn't matter as much, though.

And thanks for looking into the regexp bug. I guess I'll have to wait and see if Apple updates ICU on OS X.

Best,
Yasu

On 2010/12/02, at 9:39, Laurent Sansonetti wrote:

> Hi Yasu,
>
> It's committed to trunk, it should be available in tonight's nightly build,
> so feel free to grab it :) http://www.macruby.org/files/nightlies. It will
> also be in the upcoming 0.8 release.
>
> I see your ticket about the look-ahead regexp bug, I will have a look later
> today. Thanks for reporting the problem. Hopefully it can also be fixed for
> 0.8.
>
> Laurent
>
> On Dec 1, 2010, at 4:29 PM, Yasu Imao wrote:
>
>> Hi Laurent,
>>
>> This is great! I think I read in the discussion of StringScanner
>> performance about object allocation (though I didn't understand what exactly
>> was happening behind the scenes), so I guessed it was about 'using block'
>> with regular expression match data.
>>
>> For a word frequency count feature, I could use the Test 2 script, but for
>> other parts of the app, I needed match information ($`, $' to be exact), so
>> this performance improvement means a lot to my app.
>>
>> Is this going to be in 0.8? Then, I'll test this with my app.
>>
>> By the way, the regular expression itself seems to have a bug (not related
>> to this, but to negative look-ahead) and I issued(?) a ticket (though I'm
>> not sure I did it properly).
>>
>> Best,
>> Yasu
>>
>> On 2010/12/02, at 8:50, Laurent Sansonetti wrote:
>>
>>> I spoke too fast, having a second look I found that it was possible to make
>>> the Match strings point to a unique object.
>>> I committed this optimization in r4964 and verified that no regression
>>> is introduced.
>>>
>>> Before:
>>>
>>> $ time /usr/local/bin/macruby -e "text=File.read('/tmp/foo.txt'); freq=Hash.new(0); text.scan(/\w+/) {}"
>>>
>>> real 0m2.430s
>>> user 0m1.628s
>>> sys 0m1.030s
>>>
>>> After :)
>>>
>>> $ time ./miniruby -e "text=File.read('/tmp/foo.txt'); freq=Hash.new(0); text.scan(/\w+/) {}"
>>>
>>> real 0m0.121s
>>> user 0m0.100s
>>> sys 0m0.015s
>>>
>>> Laurent
>>>
>>> On Dec 1, 2010, at 2:46 PM, Laurent Sansonetti wrote:
>>>
>>>> Hi Yasu,
>>>>
>>>> I ran your tests in Shark. Tests 1 and 3 are significantly slower because
>>>> #scan and #gsub are called with a block, which means MacRuby has to create
>>>> a new Match object for every yield, to conform to the Ruby specs. Each
>>>> Match object contains a copy of the original string.
>>>>
>>>> MacRuby has a slow memory allocator (much slower than the original Ruby),
>>>> so one must be careful to not allocate too many objects. This is something
>>>> we are working on, unfortunately MacRuby doesn't fully control the object
>>>> allocator, as it resides in the libauto library (the Objective-C garbage
>>>> collector).
>>>>
>>>> In your case, I recommend using the method in Test 2, which is to not pass
>>>> a block.
>>>>
>>>> It is possible that we can reduce memory usage when doing regexps in
>>>> MacRuby, however after having a quick look at the source code I am not
>>>> sure something can be done for 0.8 :(
>>>>
>>>> Laurent
>>>>
>>>> On Dec 1, 2010, at 9:46 AM, Yasu Imao wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I'm rewriting an app for text analysis in MacRuby, which I originally
>>>>> wrote in RubyCocoa. But I encountered a serious performance issue in
>>>>> MacRuby, which is related to processing text using regular expressions.
>>>>>
>>>>> I'm wondering if this will be taken care of in the near future (or
>>>>> already done in 0.8?).
>>>>>
>>>>> Below are my simple tests.
>>>>> The first two are essentially the same with a slightly different
>>>>> approach. Both simply count the frequency of each word. I want to use the
>>>>> first approach not to count word frequencies, but in other processes. The
>>>>> third one is to test the speed of String#gsub with a regular expression.
>>>>> I felt String#gsub was slow in my app, so I just wanted to test how slow
>>>>> it is compared to RubyCocoa.
>>>>>
>>>>>
>>>>> Test 1 - scan-block
>>>>>
>>>>> freq = Hash.new(0)
>>>>> text.scan(/\w+/) do |word|
>>>>>   freq[word] += 1
>>>>> end
>>>>>
>>>>>
>>>>> Test 2 - scan array.each
>>>>>
>>>>> freq = Hash.new(0)
>>>>> text.scan(/\w+/).each do |word|
>>>>>   freq[word] += 1
>>>>> end
>>>>>
>>>>>
>>>>> Test 3 - gsub upcase
>>>>>
>>>>> text.gsub!(/\w+/){|x| x.upcase}
>>>>>
>>>>>
>>>>> The results are in seconds. The original text is in English with 8154
>>>>> words. Each process was repeated 10 times to calculate processing times.
>>>>> Each test was run 3 times.
>>>>>
>>>>> Ruby 1.8.7    Test1 - scan-block:        0.542,  0.502,  0.518
>>>>> Ruby 1.8.7    Test2 - scan array.each:   0.399,  0.392,  0.399
>>>>> Ruby 1.8.7    Test3 - gsub upcase:       0.384,  0.349,  0.390
>>>>>
>>>>> MacRuby 0.7.1 Test1 - scan-block:       27.612, 27.707, 27.453
>>>>> MacRuby 0.7.1 Test2 - scan array.each:   3.556,  3.616,  3.554
>>>>> MacRuby 0.7.1 Test3 - gsub upcase:      27.613, 26.826, 27.327
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Yasu
>>>>> _______________________________________________
>>>>> MacRuby-devel mailing list
>>>>> MacRuby-devel@lists.macosforge.org
>>>>> http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel
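
For anyone who wants to repeat the comparison discussed in this thread, here is a minimal sketch of a timing harness using Ruby's standard Benchmark library. It is not the script used for the numbers above, which was not posted. The method names, SAMPLE_TEXT and REPEAT are made up for illustration, and the sample text is only a stand-in for the 8154-word English text. Test 3 and the follow-up test use the non-mutating String#gsub instead of gsub!, so that every repetition sees the same input.

```ruby
require 'benchmark'

# Test 1: String#scan with a block (one yield per match).
def count_with_scan_block(text)
  freq = Hash.new(0)
  text.scan(/\w+/) { |word| freq[word] += 1 }
  freq
end

# Test 2: String#scan without a block, then iterate over the returned array.
def count_with_scan_array(text)
  freq = Hash.new(0)
  text.scan(/\w+/).each { |word| freq[word] += 1 }
  freq
end

# Test 3: String#gsub with a block. Uses gsub rather than gsub! so the
# input is left unchanged between repetitions (a deviation from the thread).
def upcase_words(text)
  text.gsub(/\w+/) { |word| word.upcase }
end

# Follow-up test: String#gsub with a plain replacement string, no block.
def replace_words(text)
  text.gsub(/\w+/, "test")
end

# Hypothetical stand-in input; the text used in the thread is not available.
SAMPLE_TEXT = ("the quick brown fox jumps over the lazy dog " * 1000).freeze

# The thread repeats each process 10 times per measurement.
REPEAT = 10

def run_benchmarks(text = SAMPLE_TEXT, repeat = REPEAT)
  Benchmark.bm(24) do |bm|
    bm.report("Test 1 scan-block")       { repeat.times { count_with_scan_block(text) } }
    bm.report("Test 2 scan array.each")  { repeat.times { count_with_scan_array(text) } }
    bm.report("Test 3 gsub upcase")      { repeat.times { upcase_words(text) } }
    bm.report("gsub without block")      { repeat.times { replace_words(text) } }
  end
end

run_benchmarks if __FILE__ == $PROGRAM_NAME
```

Running the same file under each interpreter (for example with ruby and with macruby) should give timings that can be compared in the same way as the table above, though absolute numbers will depend on the input text and the machine.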