On 01/02/2014 05:01 PM, Andreas Gal wrote:
I am not confident that regexp performance is enough of a key investment area
for us to justify (3). (2) sounds like a viable option to me, though
we will have to investigate the platform bindings as you said. I remember Lars
bragging that their regexp engine compiles all possible cases of
regexps, so at least this will be the last time we have to do this. Plus, we
will automatically always be performance competitive with Chrome.
That's an important strategic approach here.
With (2) it becomes hard to beat v8. Also, if (2) requires significant work to
make it integrate well with SM, doesn't
it pretty much become (3)? (Though I'm not a SpiderMonkey hacker.)
But anyhow, I agree something needs to be done. Regexp slowness shows up in too
many profiles these days.
-Olli
Thanks,
Andreas
On Jan 2, 2014, at 6:46 AM, Jan de Mooij <[email protected]> wrote:
Back in 2010, we imported the YARR regular expression engine from JSC [0]. It
has served us well over the years, but with all the optimizations
to the rest of the engine, regular expression performance is becoming a
bottleneck again. When YARR is able to JIT a regular expression,
performance is mostly on par with V8. However, when we can't compile a regexp,
we're stuck in the interpreter and become very slow.
Unfortunately, YARR is unable to JIT some regular expressions used in popular
JS libraries like jQuery [1]. The main problem is that YARR can't
compile regexps with nested parenthesized groups. As I understand it, this is a
pretty fundamental issue that requires a major refactoring. The
upstream WebKit bug has had no activity for over 3 years [2].
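To make "nested parenthesized groups" concrete, here is a rough sketch of how one might scan a regexp source string for its deepest group nesting. This is not YARR's actual check; the helper name max_group_depth is hypothetical, and it only approximates which patterns the paragraph above is describing (those with a depth of 2 or more).

```python
def max_group_depth(pattern: str) -> int:
    """Return the deepest nesting level of parenthesized groups in a regexp source.

    Hypothetical helper for illustration only: it counts parentheses, skipping
    escaped characters and anything inside a [...] character class.
    """
    depth = 0
    deepest = 0
    in_class = False  # inside a character class, parentheses are literals
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth = max(0, depth - 1)  # tolerate unbalanced input
        i += 1
    return deepest


# A flat pattern versus one with a group inside a group.
flat = max_group_depth(r"(a)(b)")
nested = max_group_depth(r"(a(b))+")
```

Running something like this over the regexps in a library's source would give a rough list of candidates to test against the JIT.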
There's also a problem with "quantity 1 subpatterns that are copies" that
affects a Peacekeeper email validation regular expression [3] and is
the only reason for us being slower than Chrome on the Peacekeeper
stringValidateForm test [4].
To address these issues, we have the following options:
(1) Fix YARR ourselves, either upstream or locally.
(2) Switch from YARR to V8's irregexp engine.
(3) Write something ourselves, probably based on V8's irregexp.
(1) will be hard; I don't think we have anybody familiar enough with YARR to
do a refactoring of this size. It could be an option though.
For (2), we'd have to write a layer mapping V8's macro assembler calls to our
own macro assembler. Unfortunately, compared to SM and JSC, V8 has more
platform-specific code, and we'd have to do this work for each platform.
I'm not sure what other dependencies there are on other parts of
the V8 engine.
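The mapping layer described for (2) could look roughly like the adapter below. This is a minimal sketch in Python rather than C++, purely to show the shape of the idea. Every class and method name here is hypothetical: the adapter's calls are only loosely modelled on the kind of operations a regexp code generator issues, and RecordingAssembler is a stand-in that records operations instead of emitting machine code.

```python
class RecordingAssembler:
    """Stand-in for the host engine's macro assembler (hypothetical API).

    It records each operation as a tuple so the lowering can be inspected.
    """

    def __init__(self):
        self.ops = []

    def bind(self, label):
        self.ops.append(("bind", label))

    def jump(self, label):
        self.ops.append(("jump", label))

    def add32(self, reg, imm):
        self.ops.append(("add32", reg, imm))

    def compare32(self, reg, imm):
        self.ops.append(("cmp32", reg, imm))

    def branch_not_equal(self, label):
        self.ops.append(("branch_ne", label))


class RegExpAssemblerAdapter:
    """Exposes regexp-level calls and lowers each one to host assembler ops."""

    def __init__(self, masm):
        self.masm = masm

    def bind(self, label):
        self.masm.bind(label)

    def go_to(self, label):
        self.masm.jump(label)

    def advance_current_position(self, by):
        # Move the input cursor forward by a fixed number of characters.
        self.masm.add32("current_pos", by)

    def check_not_character(self, char, on_not_equal):
        # Compare the current character and branch away on a mismatch.
        self.masm.compare32("current_char", ord(char))
        self.masm.branch_not_equal(on_not_equal)


# Usage: lower a tiny loop that consumes consecutive 'a' characters.
masm = RecordingAssembler()
adapter = RegExpAssemblerAdapter(masm)
adapter.bind("loop")
adapter.check_not_character("a", "fail")
adapter.advance_current_position(1)
adapter.go_to("loop")
```

The point of the indirection is that the regexp compiler only ever talks to the adapter, so the per-platform work is confined to whatever sits behind it.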
Personally, I like (3): it's not a small task, but it'd finally give us a
regexp engine that integrates well with the rest of the engine. This
also means we can dump JSC's macro-assembler (JM used it as well, but is also
gone) and use the one we wrote for the baseline/Ion JITs. It'd
also integrate much better than YARR in terms of code style and data
structures. If we base it on irregexp, we should be able to avoid most
pitfalls or design problems.
What do people think?
Jan
[0] https://bugzilla.mozilla.org/show_bug.cgi?id=564953
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=929507
[2] https://bugs.webkit.org/show_bug.cgi?id=42264
[3] https://bugs.webkit.org/show_bug.cgi?id=122891
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=692009
_______________________________________________
dev-tech-js-engine-internals mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals