Yes, but I will let Jan summarize.

/be

Chris Peterson <mailto:[email protected]>
January 10, 2014 5:17 PM


Any news from Apple about Yarr?


chris
_______________________________________________
dev-tech-js-engine-internals mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Chris Peterson <mailto:[email protected]>
January 3, 2014 7:52 PM


Does anyone here have a JSC/Yarr contact at Apple?


On 1/2/14, 12:28 PM, Luke Wagner wrote:
> One thing, though, is we'd really need an owner for this code who
> took the time to fully understand irregexp so they could fix what may
> come as it came and review patches.

If we don't get an affirmative response from Apple, who would be a good owner for porting irregexp to SM?


chris
Andreas Gal <mailto:[email protected]>
January 2, 2014 12:35 PM
Sounds like a solid plan. It combines the best of both worlds (we don't have to reinvent the wheel but we minimize how much code we import). The fact that the code is pretty stable definitely supports this approach.

Andreas


Luke Wagner <mailto:[email protected]>
January 2, 2014 12:28 PM
I don't think a pure (2) approach is our cheapest option. Even with Yarr, it took Chris a whole bunch of work to import and it also took Dave/Dave a long time each time they pulled a new version. It sounds like irregexp would be much worse. Furthermore, having a whole hunk of code you can't just change means everybody goes to lengths to avoid touching it and it becomes a big sad sinkhole.

Perhaps we could use a modified (2) approach: fork irregexp. In particular, we'd:
- significantly refactor the code to use SM rooting, assembler, Vector, LifoAlloc, etc. APIs
- declare open season on stylistic refactorings to make irregexp match SM

The obvious concern is that we'd miss updates/fixes in V8. However, looking at the V8 svn repo, the irregexp files change infrequently (almost nothing in the last 6 months) so we could just as well, every month or so, just look at all the changes to the 9 *regexp* files and manually apply the diffs.

One thing, though, is we'd really need an owner for this code who took the time to fully understand irregexp so they could fix what may come as it came and review patches.

Cheers,
Luke

Jan de Mooij <mailto:[email protected]>
January 2, 2014 6:46 AM
Back in 2010, we imported the YARR regular expression engine from JSC [0].
It has served us well over the years, but with all the optimizations to the
rest of the engine, regular expression performance is becoming a bottleneck
again. When YARR is able to JIT a regular expression, performance is mostly
on par with V8. However, when we can't compile a regexp, we're stuck in the
interpreter and become very slow.

Unfortunately, YARR is unable to JIT some regular expressions used in
popular JS libraries like jQuery [1]. The main problem is that YARR can't
compile regexps with nested parenthesized groups. As I understand it, this
is a pretty fundamental issue that requires a major refactoring. The
upstream WebKit bug has had no activity for over 3 years [2].
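
To make "nested parenthesized groups" concrete, here is a minimal sketch using Python's `re` module. It is not YARR and not the jQuery pattern, which this thread does not quote; the pattern and the `parse_pair` helper are hypothetical, chosen only to show one capture group containing others.

```python
import re

# Hypothetical pattern: group 1 is the outer group, and groups 2 and 3
# are nested inside it. This is the shape of regexp described above.
NESTED = re.compile(r"((\w+)=(\d+))")


def parse_pair(text):
    """Return (whole, key, value) for the first key=value pair, or None."""
    match = NESTED.search(text)
    return match.groups() if match else None
```

Calling `parse_pair("x width=42;")` yields the outer match plus the two inner captures; an engine has to track all three group boundaries at once.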

There's also a problem with "quantity 1 subpatterns that are copies" that
affects a Peacekeeper email validation regular expression [3] and is the
only reason for us being slower than Chrome on the Peacekeeper
stringValidateForm test [4].

To address these issues, we have the following options:
(1) Fix YARR ourselves, either upstream or locally.
(2) Switch from YARR to V8's irregexp engine.
(3) Write something ourselves, probably based on V8's irregexp.

(1) will be hard; I don't think we have somebody familiar enough with YARR
to do a refactoring of this size. It could be an option though.

For (2), we'd have to write a layer mapping V8's macro assembler calls to
our own macro assembler. Unfortunately, unlike SM and JSC, V8 has more
platform-specific code and we'd have to do this work for different
platforms. I'm not sure what other dependencies there are on other parts of
the V8 engine.
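
The mapping layer described for (2) is essentially an adapter. The sketch below shows the idea in Python for brevity; the real work would be in C++, and every class and method name here is an illustrative assumption, not taken from V8 or SpiderMonkey.

```python
class HostMacroAssembler:
    """Stand-in for the host engine's own assembler (hypothetical API).

    It only records the operations it is asked to emit.
    """

    def __init__(self):
        self.ops = []

    def load8(self, dest, offset):
        self.ops.append(("load8", dest, offset))

    def branch_not_equal(self, reg, value, label):
        self.ops.append(("branch_ne", reg, value, label))


class RegExpAssemblerAdapter:
    """Offers the regexp-level calls an imported engine might expect
    (hypothetical names) and forwards each one to the host assembler."""

    def __init__(self, masm):
        self.masm = masm

    def load_current_character(self, offset):
        self.masm.load8("current_char", offset)

    def check_not_character(self, char, on_mismatch):
        self.masm.branch_not_equal("current_char", ord(char), on_mismatch)
```

The imported code would talk only to the adapter, so platform differences stay behind the host assembler. That would hold only if the host assembler already covers every target platform, which is an assumption here.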

Personally, I like (3): it's not a small task, but it'd finally give us a
regexp engine that integrates well with the rest of the engine. This also
means we can dump JSC's macro-assembler (JM used it as well, but is also
gone) and use the one we wrote for the baseline/Ion JITs. It'd also
integrate much better than Yarr in terms of code style and data structures.
If we base it on irregexp, we should be able to avoid most pitfalls or
design problems.

What do people think?

Jan

[0] https://bugzilla.mozilla.org/show_bug.cgi?id=564953
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=929507
[2] https://bugs.webkit.org/show_bug.cgi?id=42264
[3] https://bugs.webkit.org/show_bug.cgi?id=122891
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=692009