Re: [MacRuby-devel] Regular expression related performance

2010-12-03 Thread Laurent Sansonetti
Hi Yasu,
On Dec 2, 2010, at 9:37 PM, Yasu Imao wrote:
> Hi Laurent,
>
> I filed a ticket on String#gsub performance.
Thanks :)
> While I was playing with regex, I noticed another difference(?) between Ruby 1.8.7 and MacRuby.
>
> In Ruby Regexp class, fixnums assigned to Regexp.new option

Re: [MacRuby-devel] Regular expression related performance

2010-12-02 Thread Yasu Imao
Hi Laurent,
I filed a ticket on String#gsub performance. While I was playing with regex, I noticed another difference(?) between Ruby 1.8.7 and MacRuby.
In the Ruby Regexp class, the fixnums assigned to the Regexp.new options differ between Ruby 1.8.7 and MacRuby.
Ruby 1.8.7: p Regexp::IGNORECA
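A minimal sketch of one way to stay independent of the numeric option values mentioned above: build the options from the named Regexp constants instead of passing raw integers. The pattern and sample string are made up for illustration and do not come from the thread.

```ruby
# Build the options from the named constants rather than raw integers,
# so the code does not depend on the numeric values used by any one
# Ruby implementation.
options = Regexp::IGNORECASE | Regexp::MULTILINE
pattern = Regexp.new("hello.world", options)

# IGNORECASE ignores letter case; MULTILINE lets "." match a newline.
puts pattern.match("HELLO\nWORLD") ? "matched" : "no match"
```

Whether the integer values differ between implementations only matters to code that hard-codes them; code written against the constant names should behave the same either way.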

Re: [MacRuby-devel] Regular expression related performance

2010-12-02 Thread Laurent Sansonetti
Hi Yasu,
On Dec 2, 2010, at 5:20 AM, Yasu Imao wrote:
> Hi Laurent,
>
> Thank you for your prompt work. I tried the latest nightly build and it's much faster than 0.7.1. The Test 1 and Test 2 are only 2 - 2.5 times slower than those on Ruby 1.8.7 and Test 3 is about 5 times slower. And

Re: [MacRuby-devel] Regular expression related performance

2010-12-02 Thread Laurent Sansonetti
On Dec 2, 2010, at 8:45 AM, Caio Chassot wrote:
> On 2010-12-01, at 20:15, Laurent Sansonetti wrote:
>>
>> MacRuby is using ICU. I'm not aware of ICU's internals, but I suspect the performance issue is probably elsewhere here, given the huge difference against 1.8.
>>
>
> I know we've

Re: [MacRuby-devel] Regular expression related performance

2010-12-02 Thread Matt Aimonetti
I know I'm not responding to your question but it's important to note that Oniguruma isn't thread safe and if you were to do that you would have to deal with a lot of weird bugs. On a different note, no I don't think it would be remotely trivial but Vincent or Laurent can confirm.
- Matt
On Thu,

Re: [MacRuby-devel] Regular expression related performance

2010-12-02 Thread Caio Chassot
On 2010-12-01, at 20:15, Laurent Sansonetti wrote:
>
> MacRuby is using ICU. I'm not aware of ICU's internals, but I suspect the performance issue is probably elsewhere here, given the huge difference against 1.8.
>
I know we've talked about this before and you found ICU and Oniguruma to

Re: [MacRuby-devel] Regular expression related performance

2010-12-02 Thread Yasu Imao
Hi Laurent, Thank you for your prompt work. I tried the latest nightly build and it's much faster than 0.7.1. The Test 1 and Test 2 are only 2 - 2.5 times slower than those on Ruby 1.8.7 and Test 3 is about 5 times slower. And I tried my app on this nightly build. Now I can say MacRuby vers

Re: [MacRuby-devel] Regular expression related performance

2010-12-01 Thread Jordan K. Hubbard
On Dec 1, 2010, at 3:50 PM, Laurent Sansonetti wrote:
> Before:
>
> $ time /usr/local/bin/macruby -e "text=File.read('/tmp/foo.txt'); freq=Hash.new(0); text.scan(/\w+/) {}"
>
> real 0m2.430s
> user 0m1.628s
> sys 0m1.030s
>
> After :)
>
> $ time ./miniruby -e "text=File.read('/tmp/foo.

Re: [MacRuby-devel] Regular expression related performance

2010-12-01 Thread Laurent Sansonetti
Hi Yasu, It's committed to trunk, it should be available in tonight's nightly build, so feel free to grab it :) http://www.macruby.org/files/nightlies. It will also be in the upcoming 0.8 release. I see your ticket about the look-ahead regexp bug, I will have a look later today. Thanks for rep

Re: [MacRuby-devel] Regular expression related performance

2010-12-01 Thread Yasu Imao
Hi Laurent,
This is great! I think I read in the discussion of StringScanner performance about object allocation (though I didn't understand what exactly was happening behind the scenes), so I guessed it was about 'using block' with regular expression match data. For a word frequency count f

Re: [MacRuby-devel] Regular expression related performance

2010-12-01 Thread Laurent Sansonetti
I spoke too fast. Having a second look, I found that it was possible to make the Match strings point to a unique object. I committed this optimization in r4964 and verified that no regression is introduced.
Before:
$ time /usr/local/bin/macruby -e "text=File.read('/tmp/foo.txt'); freq=Hash.new(

Re: [MacRuby-devel] Regular expression related performance

2010-12-01 Thread Laurent Sansonetti
Hi Yasu, I ran your tests in Shark. Tests 1 and 3 are significantly slower because #scan and #gsub are called with a block, which means MacRuby has to create a new Match object for every yield, to conform to the Ruby specs. Each Match object contains a copy of the original string. MacRuby has
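To illustrate the two call forms being contrasted, here is a rough sketch that counts word frequencies once with #scan taking a block and once with #scan returning an array that is iterated afterwards. The sample text, method names, and repetition count are assumptions made for illustration; they are not the Test 1 to Test 3 scripts from the thread, and the timings will vary by implementation and version.

```ruby
require 'benchmark'

# Hypothetical sample text; the thread's actual test files are not shown.
sample = "the quick brown fox jumps over the lazy dog " * 10_000

# Block form: #scan yields each match to the block. This is the form the
# message above describes as needing a new match object per yield.
def count_with_block(text)
  freq = Hash.new(0)
  text.scan(/\w+/) { |word| freq[word] += 1 }
  freq
end

# Array form: #scan collects all matches first, then we iterate over them.
def count_with_array(text)
  freq = Hash.new(0)
  text.scan(/\w+/).each { |word| freq[word] += 1 }
  freq
end

block_time = Benchmark.realtime { count_with_block(sample) }
array_time = Benchmark.realtime { count_with_array(sample) }
puts format("block form: %.3fs, array form: %.3fs", block_time, array_time)
```

Both methods return the same counts, so the comparison isolates the calling convention rather than the work done per word. The array form trades the per-yield match data for one intermediate array holding every match.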

Re: [MacRuby-devel] Regular expression related performance

2010-12-01 Thread Laurent Sansonetti
On Dec 1, 2010, at 2:20 PM, Jordan K. Hubbard wrote:
>
> On Dec 1, 2010, at 2:15 PM, Laurent Sansonetti wrote:
>
>>> http://swtch.com/~rsc/regexp/regexp1.html
>>
>> MacRuby is using ICU. I'm not aware of ICU's internals, but I suspect the performance issue is probably elsewhere here, given

Re: [MacRuby-devel] Regular expression related performance

2010-12-01 Thread Jordan K. Hubbard
On Dec 1, 2010, at 2:15 PM, Laurent Sansonetti wrote:
>> http://swtch.com/~rsc/regexp/regexp1.html
>
> MacRuby is using ICU. I'm not aware of ICU's internals, but I suspect the performance issue is probably elsewhere here, given the huge difference against 1.8.
It would be instructive to

Re: [MacRuby-devel] Regular expression related performance

2010-12-01 Thread Laurent Sansonetti
Hi,
On Dec 1, 2010, at 12:59 PM, Perry E. Metzger wrote:
> On Thu, 2 Dec 2010 02:46:07 +0900 Yasu Imao wrote:
>> Hello,
>>
>> I'm rewriting an app for text analysis in MacRuby, which I originally wrote in RubyCocoa. But I encountered a serious performance issue in MacRuby, which is rel

Re: [MacRuby-devel] Regular expression related performance

2010-12-01 Thread Perry E. Metzger
On Thu, 2 Dec 2010 02:46:07 +0900 Yasu Imao wrote:
> Hello,
>
> I'm rewriting an app for text analysis in MacRuby, which I originally wrote in RubyCocoa. But I encountered a serious performance issue in MacRuby, which is related to processing text using regular expressions.
Is MacRuby usi

[MacRuby-devel] Regular expression related performance

2010-12-01 Thread Yasu Imao
Hello, I'm rewriting an app for text analysis in MacRuby, which I originally wrote in RubyCocoa. But I encountered a serious performance issue in MacRuby, which is related to processing text using regular expressions. I'm wondering if this will be taken care of in the near future (or already