The new parser will certainly be faster than the old one, mostly because it's now hackable. The old parser was untouchable for fear of breaking the world. This one is tested, perf-tested, documented and much better designed. May the optimizing begin!
-eric

On Mon, Jun 14, 2010 at 12:07 PM, Adam Barth <[email protected]> wrote:
> On Mon, Jun 14, 2010 at 11:05 AM, Oliver Hunt <[email protected]> wrote:
>> Have you done perf testing?
>
> Yes. We've been working with our parsing benchmark:
>
> http://trac.webkit.org/browser/trunk/WebCore/benchmarks/parser/html-parser.html
>
>> What's the change?
>
> Last time we measured, the new parser was ~1% slower than the old
> parser. I believe parsing accounts for <5% of PLT, so that
> corresponds to a <0.05% slowdown on PLT, which is, AFAIK,
> unmeasurable. We'll double check perf before we switch over.
>
> We think the new parser will end up being faster than the old parser.
> We've done just enough performance optimization to remove perf as a
> blocking issue for switching over. There's a bunch more we can do.
> For example, we're currently wasting a bunch of time converting
> new-style tokens into old-style tokens to feed them to the old tree
> constructor. Once we start working on phase 2 (the HTML5 tree
> constructor), we won't need to waste time there.
>
> Adam
>
>
>> On Jun 13, 2010, at 10:21 PM, Adam Barth wrote:
>>
>>> People of WebKit,
>>>
>>> As mentioned recently on webkit-dev, Eric, Tonyg, and I have been
>>> working on implementing the HTML5 parsing algorithm in WebKit:
>>>
>>> http://www.mail-archive.com/[email protected]/msg11472.html
>>>
>>> We're now ready to turn the new tokenization algorithm on by default
>>> (probably early this week). The new code passes all the existing
>>> LayoutTests, with the exception of roughly 40 tests that "expect"
>>> behavior that violates the HTML5 specification [1].
>>>
>>> There are some differences between the old parser and the HTML5
>>> parser. We've written up a brief document outlining those
>>> differences:
>>>
>>> https://docs.google.com/document/edit?id=1as5xYjyMSCph4960iz0-Kb7hZKf_L6f2vts57NMcVBI&hl=en
>>>
>>> If these differences cause real compatibility issues on the web, we
>>> should contribute this information to the working group so we can
>>> improve the specification. If these differences cause compatibility
>>> issues for WebKit-specific HTML (e.g., for Dashboard widgets), we
>>> might need to add a flag to support some subset of these parsing
>>> quirks for non-web uses of WebKit.
>>>
>>> Please be on the lookout for parsing-related regressions and CC Eric,
>>> Tonyg, and me on the bugs. There's still a lot of work to do
>>> (including implementing the tree construction algorithm), but turning
>>> the tokenization code on by default is an important milestone for the
>>> project.
>>>
>>> Happy parsing,
>>> Adam
>>>
>>> [1] See
>>> https://spreadsheets.google.com/ccc?key=0AppchfQ5mBrEdDFJUW5DOGNsdmtvZkN0ZmIzMjdaT0E&hl=en
>>> for details.
>>> _______________________________________________
>>> webkit-dev mailing list
>>> [email protected]
>>> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev

