Hi Hannes,

thanks a lot for your reply :)

> I'm not sure what you have tried. But I tried your hardcoded version.

I tried to make my testing more transparent and uploaded my code on a GitHub repo:

 https://github.com/jviereck/regexp.js-octane

> Though I would suggest to try to run the numbers again, since the numbers differ so much from mine.

Looking at the numbers, I think they are consistent if we assume you have a more powerful PC that yields a score roughly 2x mine by default. Your scores before and after differ by ~200 points, while mine differ by ~100, which matches the 2x speed difference.

> we see 2 signatures in "Exec". So it is less specialized (not much, just an extra if to distinguish the paths at the "exec" call). I'm sure if all regexps would be transformed to "RegExpJS" we would get that back. It would only see 1 signature again.

Thanks a lot for this hint! Based on this input, I have created a new "Exec2" function, which is an exact copy of the "Exec" function, but the "Exec2" function is only used for executing the re0 regular expression [1]. Using the hard coded RegExpJS function for re0 [2] resulted in these numbers:

before: 1582.7 (https://github.com/jviereck/regexp.js-octane/tree/e925606d0850b5c94d1622f7cfdcd2ab2c08e767)
after: 1632.7 (https://github.com/jviereck/regexp.js-octane/tree/0630eec8e656f3df5effc27114ba80ffe970d53e)

These numbers are the average of 10 runs. There seems to be a speedup using the hardcoded JS version.
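
To make the "one signature per wrapper" idea concrete, here is a minimal sketch of the split. It is not the code from the repo: the wrapper bodies are reduced to a bare call (the real Octane "Exec" does extra result checking), and the RegExpJS shown is only the hardcoded /^ba/ stand-in. The sample regexp /o+/ and the input strings are made up for illustration.

```javascript
// Hypothetical stand-in for the hardcoded matcher; it only handles /^ba/.
function RegExpJS(reg) {
  this.source = reg.source;
}
RegExpJS.prototype.exec = function (str) {
  return str.startsWith('ba') ? ['ba'] : null;
};

// Simplified wrapper: in this sketch it is only ever called with native
// RegExp objects, so the engine sees a single receiver type at re.exec().
function Exec(re, str) {
  return re.exec(str);
}

// Exact copy of the wrapper, reserved for RegExpJS objects, so that
// neither wrapper has to distinguish between two kinds of receivers.
function Exec2(re, str) {
  return re.exec(str);
}

var re0 = new RegExpJS(/^ba/);
var nativeRe = /o+/; // made-up example of a regexp left untouched

console.log(Exec2(re0, 'banana'));      // goes through the RegExpJS path
console.log(Exec(nativeRe, 'foo bar')); // goes through the native path
```

Whether the JIT actually specializes the two wrappers the way I hope is an assumption on my side; the sketch only shows how the call sites are separated.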

These results look more promising. However, they should be treated with care: getting /^ba/ to work is quite simple and the implementation maps very well onto JS functions (e.g. String.prototype.startsWith), while a more complicated example involving backtracking might yield different results.

Do you think it is worth implementing a hardcoded version of the second regular expression tested in Octane:

var re1 = /(((\w+):\/\/)([^\/:]*)(:(\d+))?)?([^#?]*)(\?([^#]*))?(#(.*))?/;

to see how good the performance can get?

Best,

- Julian


[1]: https://github.com/jviereck/regexp.js-octane/commit/0d6e01d36a7d5dc24c385e3437e6b740dbd9da78#diff-0

[2]: https://github.com/jviereck/regexp.js-octane/commit/0630eec8e656f3df5effc27114ba80ffe970d53e

On 05/01/14 12:13, hv1989 wrote:
Hi Julian,

I'm not sure what you have tried. But I tried your hardcoded version
(i.e. defining RegExpJS ourselves, with the ^ba hack).

- octane1.0-regexp:
before: 4510
after: 4658

- octane2.0-regexp:
before: 2585
after: 2390

So in octane1.0 that is indeed an improvement. For octane2.0 it is not, and
that has a reason. In octane2.0 all calls to "exec()" have a wrapper:
"Exec()" that does some extra testing to make sure the result is
correct. Using TypeInformation we can find out this is only called
with "RegExp" as first parameter. So we can optimize that. Now with
"new RegExpJS(/^ba/);" we see 2 signatures in "Exec". So it is less
specialized (not much, just an extra if to distinguish the paths at
the "exec" call). I'm sure if all regexps would be transformed to
"RegExpJS" we would get that back. It would only see 1 signature
again.

Now about RegExp.JS bringing such a big loss: that is possible. Yarr
isn't bad, and in octane-regexp we are only stuck in the interpreter
for 3%, and even in that case the interpreter isn't that slow. We
wouldn't win much on octane-regexp if we could JIT everything (which
is the problem for the other benchmarks like jQuery and Peacekeeper).
It would bring at most a 4% gain for octane-regexp. Though I would
suggest running the numbers again, since they differ so much from
mine.

Best Hannes

On Sun, Jan 5, 2014 at 11:31 AM,  <[email protected]> wrote:
On Thursday, January 2, 2014 6:47:58 PM UTC+1, Nicolas Pierron wrote:
On 01/02/2014 07:31 AM, Nicolas B. Pierron wrote:

I should have written that in the past tense …

https://github.com/jviereck/regexp.js
So far I hadn't done any performance numbers for RegExp.JS. I looked into this 
and thanks to the help of Till I got the Octane benchmark running in the JS 
shell [1].

Before converting the entire Octane RegExp benchmark to run using RegExp.JS, I 
thought I would just try the first RegExp tested in the benchmark. In terms of 
code changes, this means:

   diff --git a/regexp.js b/regexp.js
   - var re0 = /^ba/;
   + var re0 = new RegExpJS(/^ba/);

Just changing this one RegExp caused the score on my machine to drop from ~1480 
to 77 (!!!) using the RegExp.JS library (& my.mood = :( ).

Okay, so maybe RegExp.JS is doing something completely wrong, which is why I 
tried another dumb approach and defined:

   function RegExpJS(reg) { }

   RegExpJS.prototype.exec = function(str) {
     if (str.startsWith('ba')) {
       return ['ba'];
     } else {
       return null;
     }
   };

This RegExpJS object ONLY works HARDCODED with the first regexp of the octane 
benchmark (/^ba/) - cheating, I know, but let's see where this gets us in terms 
of performance. Running the regexp.js benchmark with this RegExpJS definition 
and the modification |var re0 = new RegExpJS(/^ba/);| resulted in a score of 
~1340. Better than 77, but still a huge drop compared to 1480 by only changing 
one RegExp in the benchmark!

(If you wonder if replacing the |if(str.startsWith('ba'))| call with |if (str[0] == 'b' 
&& str[1] == 'a') {| --- no, that doesn't make any difference in terms of 
performance :/).
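
As a sanity check that the two variants are interchangeable, a small sketch like the following compares both against the native /^ba/ on a few inputs. The helper names (execStartsWith, execIndexed, sameMatch) and the sample strings are made up for illustration, and only the matched text is compared, not the index/input properties a native result carries.

```javascript
// Variant 1: uses String.prototype.startsWith.
function execStartsWith(str) {
  return str.startsWith('ba') ? ['ba'] : null;
}

// Variant 2: compares the first two characters by index.
function execIndexed(str) {
  return (str[0] == 'b' && str[1] == 'a') ? ['ba'] : null;
}

// Two results count as equal if both are null or both matched the same text.
function sameMatch(a, b) {
  if (a === null || b === null) return a === b;
  return a[0] === b[0];
}

var samples = ['ba', 'banana', 'ab', 'b', '', 'xba'];

var allAgree = samples.every(function (s) {
  var expected = /^ba/.exec(s);
  return sameMatch(expected, execStartsWith(s)) &&
         sameMatch(expected, execIndexed(s));
});

console.log('variants agree with /^ba/:', allAgree);
```

This says nothing about speed; it only checks that swapping one condition for the other does not change what the benchmark computes.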

---

Without knowing anything about the SpiderMonkey JS internals, this very small 
benchmark raises the following questions for me:

1) Is the YARR implementation so much faster than anything written in plain JS 
(even if the JS is highly optimized for the RegExp and matches the string in 
the most optimal way)?
2) Is there a performance bug in SpiderMonkey that makes even the plain 
RegExpJS running only /^ba/ so slow?



Cheers,

- Julian




[1] Using the js shell provided at 
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-trunk/ dated on 
the 04-Jan-2014 11:50.



_______________________________________________
dev-tech-js-engine-internals mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
