Hi Yasu,

It's committed to trunk and should be available in tonight's nightly build, so feel free to grab it :) http://www.macruby.org/files/nightlies

It will also be in the upcoming 0.8 release.
I saw your ticket about the look-ahead regexp bug; I will have a look later today. Thanks for reporting the problem. Hopefully it can also be fixed for 0.8.

Laurent

On Dec 1, 2010, at 4:29 PM, Yasu Imao wrote:

> Hi Laurent,
>
> This is great! I think I read in the discussion of StringScanner performance
> about object allocation (though I didn't understand what exactly was
> happening behind the scenes), so I guessed it was about 'using block' with
> regular expression match data.
>
> For a word frequency count feature, I could use the Test 2 script, but for
> other parts of the app, I needed match information ($`, $' to be exact), so
> this performance improvement means a lot to my app.
>
> Is this going to be in 0.8? Then, I'll test this with my app.
>
> By the way, the regular expression itself seems to have a bug (not related to
> this, but to negative look-ahead) and I issued(?) a ticket (though I'm not
> sure I did it properly).
>
> Best,
> Yasu
>
> On 2010/12/02, at 8:50, Laurent Sansonetti wrote:
>
>> I spoke too fast: having a second look, I found that it was possible to make
>> the Match strings point to a unique object. I committed this optimization in
>> r4964 and verified that no regression is introduced.
>>
>> Before:
>>
>> $ time /usr/local/bin/macruby -e "text=File.read('/tmp/foo.txt'); freq=Hash.new(0); text.scan(/\w+/) {}"
>>
>> real 0m2.430s
>> user 0m1.628s
>> sys  0m1.030s
>>
>> After :)
>>
>> $ time ./miniruby -e "text=File.read('/tmp/foo.txt'); freq=Hash.new(0); text.scan(/\w+/) {}"
>>
>> real 0m0.121s
>> user 0m0.100s
>> sys  0m0.015s
>>
>> Laurent
>>
>> On Dec 1, 2010, at 2:46 PM, Laurent Sansonetti wrote:
>>
>>> Hi Yasu,
>>>
>>> I ran your tests in Shark. Tests 1 and 3 are significantly slower because
>>> #scan and #gsub are called with a block, which means MacRuby has to create
>>> a new Match object for every yield, to conform to the Ruby specs. Each
>>> Match object contains a copy of the original string.
>>>
>>> MacRuby has a slow memory allocator (much slower than the original Ruby),
>>> so one must be careful not to allocate too many objects. This is something
>>> we are working on; unfortunately, MacRuby doesn't fully control the object
>>> allocator, as it resides in the libauto library (the Objective-C garbage
>>> collector).
>>>
>>> In your case, I recommend using the method in Test 2, which is to not pass
>>> a block.
>>>
>>> It is possible that we can reduce memory usage when doing regexps in
>>> MacRuby; however, after having a quick look at the source code, I am not
>>> sure anything can be done for 0.8 :(
>>>
>>> Laurent
>>>
>>> On Dec 1, 2010, at 9:46 AM, Yasu Imao wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm rewriting an app for text analysis in MacRuby, which I originally
>>>> wrote in RubyCocoa. But I encountered a serious performance issue in
>>>> MacRuby, which is related to processing text using regular expressions.
>>>>
>>>> I'm wondering if this will be taken care of in the near future (or already
>>>> done in 0.8?).
>>>>
>>>> Below are my simple tests. The first two are essentially the same with a
>>>> slightly different approach. Both simply count the frequency of each
>>>> word. I want to use the first approach not to count word frequencies, but
>>>> in other processes. The third one tests the speed of String#gsub
>>>> with a regular expression. I felt String#gsub was slow in my app, so I just
>>>> wanted to test how slow it is compared to RubyCocoa.
>>>>
>>>>
>>>> Test 1 - scan-block
>>>>
>>>> freq = Hash.new(0)
>>>> text.scan(/\w+/) do |word|
>>>>   freq[word] += 1
>>>> end
>>>>
>>>>
>>>> Test 2 - scan array.each
>>>>
>>>> freq = Hash.new(0)
>>>> text.scan(/\w+/).each do |word|
>>>>   freq[word] += 1
>>>> end
>>>>
>>>>
>>>> Test 3 - gsub upcase
>>>>
>>>> text.gsub!(/\w+/){|x| x.upcase}
>>>>
>>>>
>>>> The results are in seconds. The original text is in English with 8154
>>>> words. Each process was repeated 10 times to calculate processing times.
>>>> Each test was run 3 times.
>>>>
>>>> Ruby 1.8.7     Test 1 - scan-block:        0.542,  0.502,  0.518
>>>> Ruby 1.8.7     Test 2 - scan array.each:   0.399,  0.392,  0.399
>>>> Ruby 1.8.7     Test 3 - gsub upcase:       0.384,  0.349,  0.390
>>>>
>>>> MacRuby 0.7.1  Test 1 - scan-block:       27.612, 27.707, 27.453
>>>> MacRuby 0.7.1  Test 2 - scan array.each:   3.556,  3.616,  3.554
>>>> MacRuby 0.7.1  Test 3 - gsub upcase:      27.613, 26.826, 27.327
>>>>
>>>>
>>>> Thanks,
>>>> Yasu
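
For anyone who wants to repeat the comparison discussed in this thread, here is a minimal sketch of the three tests wrapped in a small timing harness. The sample text, the method names, and the `contexts` helper are assumptions made up for illustration (the original 8154-word file is not available), so the absolute numbers will not match the ones quoted above. The last helper only illustrates why the block form is still needed when match context such as $` and $' is required.

```ruby
require 'benchmark'

# Hypothetical sample text; the thread used an 8154-word English file instead.
text = "the quick brown fox jumps over the lazy dog the end " * 1000

# Test 1: scan with a block. The block is yielded to once per match.
def count_with_block(text)
  freq = Hash.new(0)
  text.scan(/\w+/) { |word| freq[word] += 1 }
  freq
end

# Test 2: scan without a block. All matches are collected into an array
# first, then iterated over.
def count_with_array(text)
  freq = Hash.new(0)
  text.scan(/\w+/).each { |word| freq[word] += 1 }
  freq
end

# Test 3: gsub with a block (non-destructive here so the input can be reused).
def upcase_words(text)
  text.gsub(/\w+/) { |w| w.upcase }
end

# Hypothetical helper: collects [text before, match, text after] for each
# match. pre_match and post_match correspond to $` and $'.
def contexts(text, pattern)
  out = []
  text.scan(pattern) do
    m = Regexp.last_match
    out << [m.pre_match, m[0], m.post_match]
  end
  out
end

# Each test is repeated 10 times, as in the original measurements.
Benchmark.bm(18) do |bm|
  bm.report('scan block')      { 10.times { count_with_block(text) } }
  bm.report('scan array.each') { 10.times { count_with_array(text) } }
  bm.report('gsub upcase')     { 10.times { upcase_words(text) } }
end
```

Running the same file under both ruby and macruby should give a rough side-by-side comparison; Test 2 remains the safer choice on builds that do not include r4964.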
_______________________________________________
MacRuby-devel mailing list
MacRuby-devel@lists.macosforge.org
http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel