Re: [webkit-dev] Feature Announcement: Moving HTML Parser off the Main Thread

2013-01-10 Thread Zoltan Herczeg
Parsing, especially JS parsing, still takes a large amount of time during page
loading. We tried to improve the preload scanner by moving it onto
another thread, but there was no gain (except in some special cases).
Synchronization between threads is surprisingly (ridiculously) costly; it is
usually worthwhile only for tasks that need quite a few million
instructions to execute (and tokenization takes far fewer in most
cases). For smaller tasks, SIMD instruction sets can help, which is
basically parallel execution on a single thread. Anyway, it is worth
trying, but it is really challenging to make it work in practice. Good
luck!
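
One rough way to see this trade-off is to time the same small task done
inline and done through a hand-off to a worker thread. The sketch below is
only an illustration in Python, not WebKit code: tokenize_chunk,
run_inline and run_via_worker are hypothetical names, the "tokenizer" is a
plain string split, and the absolute numbers will depend on the machine and
the runtime.

```python
import queue
import threading
import time


def tokenize_chunk(text):
    """Stand-in for a small unit of work (hypothetical, not WebKit's tokenizer)."""
    return text.split()


def run_inline(chunks):
    """Process every chunk on the calling thread."""
    return [tokenize_chunk(c) for c in chunks]


def run_via_worker(chunks):
    """Hand each chunk to a worker thread and wait for its result."""
    requests = queue.Queue()
    responses = queue.Queue()

    def worker():
        while True:
            item = requests.get()
            if item is None:  # sentinel: no more work
                return
            responses.put(tokenize_chunk(item))

    thread = threading.Thread(target=worker)
    thread.start()
    results = []
    for chunk in chunks:
        requests.put(chunk)              # one synchronization to send
        results.append(responses.get())  # one synchronization to receive
    requests.put(None)
    thread.join()
    return results


def timed(fn, chunks):
    """Return (result, elapsed seconds) for fn(chunks)."""
    start = time.perf_counter()
    result = fn(chunks)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    chunks = ["<p>hello world</p>"] * 10000
    inline_result, inline_s = timed(run_inline, chunks)
    worker_result, worker_s = timed(run_via_worker, chunks)
    assert inline_result == worker_result
    print(f"inline: {inline_s:.4f}s  via worker: {worker_s:.4f}s")
```

Both paths do identical work per chunk, so any difference in the printed
times comes from the per-chunk hand-off.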

Regards,
Zoltan

 On Jan 9, 2013, at 10:04 PM, Ian Hickson i...@hixie.ch wrote:

 On Wed, 9 Jan 2013, Eric Seidel wrote:

 The core goal is to reduce latency -- to free up the main thread for
 JavaScript and UI interaction -- which as you correctly note, cannot be
 moved off of the main thread due to the single thread of execution
 model of the web.

 Parsing and (maybe to a lesser extent) compiling JS can be moved off the
 main thread, though, right? That's probably worth examining too, if it
 hasn't already been done.

 100% agree.

 However, the same problem I brought up about tokenization applies here: a
 lot of JS functions are super cheap to parse and compile already, and the
 latency of doing so on the main thread is likely to be lower than the
 latency of chatting with another core.  I suspect this could be alleviated
 by (1) aggressively pipelining the work, where during page load or during
 heavy JS use the compilation thread always has a non-empty queue of work
 to do; this will mean that the latency of communication is paid only when
 the first compilation occurs, and (2) allowing the main thread to steal
 work from the compilation queue.  I'm not sure how to make (2) work well.
 For parsing it's actually harder since we rely heavily on the lazy parsing
 optimization: code is only parsed once we need it *right now* to run a
 function.  For compilation, it's somewhat easier: the most expensive
 compilation step is the third-tier optimizing JIT; we can delay this as
 long as we want, though the longer we delay it, the longer we spend running
 slower code.
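
The queue mechanics behind idea (2) above could look roughly like the
following. This is a minimal sketch under stated assumptions, not JSC code:
CompileQueue and its methods are hypothetical names, jobs are identified by
plain strings, and it shows only how a main thread might pull a specific
job back out of a shared queue, not how to decide when doing so pays off.

```python
import threading
from collections import deque


class CompileQueue:
    """Shared queue of pending compile jobs (hypothetical sketch)."""

    def __init__(self):
        self._jobs = deque()
        self._lock = threading.Lock()

    def push(self, name):
        """Main thread: enqueue a job for the background thread."""
        with self._lock:
            self._jobs.append(name)

    def take_next(self):
        """Background thread: take the oldest job, or None if the queue is empty."""
        with self._lock:
            return self._jobs.popleft() if self._jobs else None

    def steal(self, name):
        """Main thread: remove a job it needs right now.

        Returns True if the job was still queued (the caller should then
        compile it locally), False if another thread already took it.
        """
        with self._lock:
            try:
                self._jobs.remove(name)
            except ValueError:
                return False
            return True


if __name__ == "__main__":
    pending = CompileQueue()
    for job in ("f", "g", "h"):
        pending.push(job)
    print(pending.steal("g"), pending.take_next(), pending.steal("f"))
```

A single lock keeps the sketch simple; it also means every push, take and
steal pays a synchronization cost, which is exactly the overhead under
discussion in this thread.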

 Hence, to make parsing concurrent, the main problem is figuring out how to
 do predictive parsing: have a concurrent thread start parsing something
 just before we need it.  Without predictive parsing, making it concurrent
 would be a guaranteed loss since the main thread would just be stuck
 waiting for the thread to finish.

 To make optimized compiles concurrent without a regression, the main
 problem is ensuring that in those cases where we believe that the time
 taken to compile the function will be smaller than the time taken to awake
 the concurrent thread, we will instead just compile it on the main thread
 right away.  Though, if we could predict that a function was going to get
 hot in the future, we could speculatively tell a concurrent thread to
 compile it fully knowing that it won't wake up and do so until exactly
 when we would have otherwise invoked the compiler on the main thread (that
 is, it'll wake up and start compiling it once the main thread has executed
 the function enough times to get good profiling data).

 Anyway, you're absolutely right that this is an area that should be
 explored.

 -F



 --
 Ian Hickson   U+1047E)\._.,--,'``.fL
 http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
 Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
 ___
 webkit-dev mailing list
 webkit-dev@lists.webkit.org
 http://lists.webkit.org/mailman/listinfo/webkit-dev



Re: [webkit-dev] Feature Announcement: Moving HTML Parser off the Main Thread

2013-01-10 Thread Zoltan Herczeg
https://bugs.webkit.org/show_bug.cgi?id=63531

The work was done by Zoltan Horvath and Balazs Kelemen.

Regards,
Zoltan

 Hi Zoltan,

 I would be curious how you did the synchronization.  I've had some luck
 reducing synchronization costs before.

 Was the patch ever uploaded anywhere?

 -F




Re: [webkit-dev] Feature Announcement: Moving HTML Parser off the Main Thread

2013-01-10 Thread Maciej Stachowiak

I presume from your other comments that the goal of this work is 
responsiveness, rather than page load speed as such. I'm excited about the 
potential to improve responsiveness during page loading.

One question: what tests are you planning to use to validate whether this 
approach achieves its goals of better responsiveness?

The reason I ask is that this sounds like a significant increase in complexity, 
so we should be very confident that there is a real and major benefit. One 
thing I wonder about is how common it is to have enough of the page processed 
that the user could interact with it in principle, yet still have large parsing 
chunks remaining which would prevent that interaction from being smooth. 
Another thing I wonder about is whether yielding to the event loop more 
aggressively could achieve a similar benefit at a much lower complexity cost. 
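
For comparison, the "yield to the event loop more aggressively" alternative
can be sketched on a single thread. The example below uses Python's asyncio
purely as an illustration; parse_with_yields, input_handler and the budget
parameter are hypothetical, and the "parsing" is a plain string split.

```python
import asyncio


async def parse_with_yields(chunks, budget=4):
    """Process chunks on one thread, yielding to the event loop every `budget` chunks."""
    tokens = []
    for index, chunk in enumerate(chunks, start=1):
        tokens.extend(chunk.split())      # stand-in for real parsing work
        if index % budget == 0:
            await asyncio.sleep(0)        # let other pending tasks run
    return tokens


async def main():
    handled = []

    async def input_handler():
        # Stand-in for UI events that want time on the same thread.
        for n in range(3):
            handled.append(n)
            await asyncio.sleep(0)

    chunks = ["a b", "c d"] * 8
    tokens, _ = await asyncio.gather(parse_with_yields(chunks), input_handler())
    return tokens, handled


if __name__ == "__main__":
    tokens, handled = asyncio.run(main())
    print(len(tokens), handled)
```

The budget controls the trade-off: a smaller budget gives other tasks more
chances to run but adds more scheduling overhead to the parse.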

Having a test to drive the work would allow us to answer these types of 
questions. (It may also be that the test data you cited would already answer 
these questions but I didn't sufficiently understand it; if so, further 
explanation would be appreciated.)

Regards,
Maciej

On Jan 9, 2013, at 6:00 PM, Eric Seidel e...@webkit.org wrote:

 We're planning to move parts of the HTML Parser off of the main thread:
 https://bugs.webkit.org/show_bug.cgi?id=106127
 
 This is driven by our testing showing that HTML parsing on mobile is
 slow and long (causing user-visible delays averaging 10 frames /
 150ms).
 https://bug-106127-attachments.webkit.org/attachment.cgi?id=182002
 Complete data can be found at [1].
 
 Mozilla moved their parser onto a separate thread during their HTML5
 parser re-write:
 https://developer.mozilla.org/en-US/docs/Mozilla/Gecko/HTML_parser_threading
 
 We plan to take a slightly simpler approach, moving only Tokenizing
 off of the main thread:
 https://docs.google.com/drawings/d/1hwYyvkT7HFLAtTX_7LQp2lxA6LkaEWkXONmjtGCQjK0/edit
 The left is our current design, the middle is a tokenizer-only design,
 and the right is more like mozilla's threaded-parser design.
 
 Profiling shows Tokenizing accounts for about 10x the number of
 samples as TreeBuilding.  Including Antti's recent testing (.5% vs.
 3%):
 https://bugs.webkit.org/show_bug.cgi?id=106127#c10
 If after we do this we measure and find ourselves still spending a lot
 of main-thread time parsing, we'll move the TreeBuilder too. :)  (This
 work is a nicely separable sub-set of larger work needed to move the
 TreeBuilder.)
 
 We welcome your thoughts and comments.
 
 
 1. 
 https://docs.google.com/spreadsheet/ccc?key=0AlC4tS7Ao1fIdGtJTWlSaUItQ1hYaDFDcWkzeVAxOGc#gid=0
 (Epic thanks to Nat Duca for helping us collect that data.)


Re: [webkit-dev] Feature Announcement: Moving HTML Parser off the Main Thread

2013-01-10 Thread Antti Koivisto
When loading web pages, we are very frequently in a situation where we
already have the source data (HTML text here, but the same applies to
preloaded JavaScript, CSS, images, ...) and know we are likely to need it
soon, but can't actually use it for an indeterminate time. This happens
because pending external JS resources block the main parser (and pending
CSS resources block JS execution) for web compatibility reasons. In this
situation it makes sense to start processing the resources we have into forms
that are faster to use when they are eventually needed (like the token
stream here).

One thing we already do when the main parser gets blocked is preload
scanning. We look through the unparsed HTML source we have and trigger
loads for any resources found. It would be beneficial if this happened off
the main thread. We could do it when new data arrives in parallel with JS
execution and other time consuming engine work, potentially triggering
resource loads earlier.

I think a good first step here would be to share the tokens between the
preload scanner and the main parser and worry about the threading part
afterwards. We often parse the HTML source more or less twice so this is an
unquestionable win.
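
As a rough illustration of sharing tokens, the sketch below tokenizes the
source once and lets both a preload scan and a (greatly simplified) parser
consume the same token list. It is Python rather than WebCore code, all
names are hypothetical, and the regular expression is a deliberately naive
start-tag matcher, not an HTML tokenizer.

```python
import re
from collections import namedtuple

Token = namedtuple("Token", ["name", "attrs"])

# Deliberately naive start-tag matcher; real HTML tokenization is far more involved.
TAG_RE = re.compile(r'<([a-zA-Z][a-zA-Z0-9]*)((?:\s+[a-zA-Z-]+="[^"]*")*)\s*/?>')
ATTR_RE = re.compile(r'([a-zA-Z-]+)="([^"]*)"')


def tokenize(html):
    """Tokenize once; both consumers below reuse the result."""
    return [Token(m.group(1).lower(), dict(ATTR_RE.findall(m.group(2))))
            for m in TAG_RE.finditer(html)]


def preload_urls(tokens):
    """Preload-scanner role: collect resource URLs from existing tokens."""
    wanted = {"script": "src", "img": "src", "link": "href"}
    return [t.attrs[wanted[t.name]] for t in tokens
            if t.name in wanted and wanted[t.name] in t.attrs]


def build_outline(tokens):
    """Main-parser role (greatly simplified): consume the same tokens."""
    return [t.name for t in tokens]


if __name__ == "__main__":
    html = '<html><script src="a.js"></script><img src="b.png"></html>'
    tokens = tokenize(html)          # the source is scanned only once
    print(preload_urls(tokens))
    print(build_outline(tokens))
```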


  antti


On Thu, Jan 10, 2013 at 7:41 AM, Filip Pizlo fpi...@apple.com wrote:

 I think your biggest challenge will be ensuring that the latency of
 shoving things to another core and then shoving them back will be smaller
 than the latency of processing those same things on the main thread.

 For small documents, I expect concurrent tokenization to be a pure
 regression because the latency of waking up another thread to do just a
 small bit of work, plus the added cost of whatever synchronization
 operations will be needed to ensure safety, will involve more total work
 than just tokenizing locally.

 We certainly see this in the JSC parallel GC, and in line with traditional
 parallel GC design, we ensure that parallel threads only kick in when the
 main thread is unable to keep up with the work that it has created for
 itself.

 Do you have a vision for how to implement a similar self-throttling, where
 tokenizing continues on the main thread so long as it is cheap to do so?

 -Filip




Re: [webkit-dev] Feature Announcement: Moving HTML Parser off the Main Thread

2013-01-10 Thread Nat Duca
The data Eric and Adam were using comes from a Python library a few of us
have been developing called telemetry. It's basically a bunch of Python
that lets us write performance tests against any browser that speaks the
inspector websocket protocol. We're using it for a lot of "should we
parallelize X" questions, as well as regression-style "have our changes to
X stayed a win over time?" questions.

They might have other ways in mind to obtain this data that is more
webkit-y, but I figure a bit on how we got this far might be useful for
this mailing list.

Roughly, telemetry scripts connect up to a host and port where you've
arranged to have an inspector websocket listening, e.g. $MY_PHONE_IP:9222,
or google-chrome --remote-debugging-port=9222  telemetry
--browser=$LOCALHOST:9222. Once that's established, we have communication
with WebCore's InspectorAgent, and assuming we trust the agent, can do some
pretty powerful stuff from there.

The benchmark being discussed here [webkit_benchmark] navigates the browser
from page to page, enabling inspector's TimelineAgent as it does in order
to get performance data about the page load. We then postprocess that data
stream into a human consumable csv and there is [some amount] of rejoicing.
Assuming we trust inspector timeline [Pavel's done a number of fixes to
help us trust it more!] this gets pretty clean results, pretty easily.


A key challenge with telemetry has been getting stable runs on real-world
sites. The archive.org techniques are cool, but they don't capture some of
the big ones, like a logged-in gmail account. We've addressed this using
tonyg and simonjam's http://code.google.com/p/web-page-replay/. If the
browser under test supports web page replay [~= redirecting dns requests to
the replay server instead of the real site], then you can get stable,
repeatable runs against super complex real-world sites --- it's worked on
every site we've tried so far.


The core telemetry framework is here:
http://src.chromium.org/chrome/trunk/src/tools/telemetry/

It's in the chromium repo, but please don't hold that against it --- it's
movable, given interest.

The actual webkit benchmark is pretty simple, because most of the
functionality comes from telemetry:
https://codereview.chromium.org/11791043/


With the patch above landed, obtaining the benchmarking results that Eric
got against chrome should be ~= getting a telemetry checkout and doing:

  ./run_multipage_benchmarks --browser=canary webkit_benchmark page_sets/top_25.json

Or if you had an android with chrome on it:

  ./run_multipage_benchmarks --browser=android-chrome webkit_benchmark page_sets/top_25.json


Anyway, I'll leave it to Eric/Adam to speak to how this maps back into the
WebKit ecosystem. The use of inspector protocol makes it a theoretical
possibility on other ports, but I know some people get nervous (or run away
angrily!) when they hear that we're using Inspector as a perf data source.
 :)


- Nat



Re: [webkit-dev] Feature Announcement: Moving HTML Parser off the Main Thread

2013-01-10 Thread Tom Hudson
On Thu, Jan 10, 2013 at 8:37 AM, Maciej Stachowiak m...@apple.com wrote:

 The reason I ask is that this sounds like a significant increase in
 complexity, so we should be very confident that there is a real and major
 benefit. One thing I wonder about is how common it is to have enough of the
 page processed that the user could interact with it in principle, yet still
 have large parsing chunks remaining which would prevent that interaction
 from being smooth. Another thing I wonder about is whether yielding to the
 event loop more aggressively could achieve a similar benefit at a much
 lower complexity cost.


I don't want to let this point of Maciej's slip away: on mobile we may have
fewer cores than desktop, and we're paying a pretty high complexity burden
for multiple threads already; some of Nat's awesome recent work in Chromium
is too multithreaded for my comfort. I'd back-of-enveloped yielding during
page layout and guessed it wasn't worthwhile, but do we know that yielding
during parsing isn't?

Tom


Re: [webkit-dev] PSA: Migration plan to GStreamer 1.x

2013-01-10 Thread Dumez, Christophe
Hi,

FYI, WebKit-EFL is now building with gstreamer 1.0 by default [1].

[1] https://bugs.webkit.org/show_bug.cgi?id=106178

On Tue, Jan 8, 2013 at 10:09 PM, Ryan Ware w...@linux.intel.com wrote:



  -Original Message-
  From: webkit-dev-boun...@lists.webkit.org [mailto:webkit-dev-
  boun...@lists.webkit.org] On Behalf Of Simon Hausmann
  Sent: Tuesday, January 08, 2013 2:10 AM
  To: webkit-dev@lists.webkit.org
  Subject: Re: [webkit-dev] PSA: Migration plan to GStreamer 1.x
 
  On Tuesday, January 08, 2013 10:21:00 AM Philippe Normand wrote:
   Hi,
  
   This mail is mainly for the GTK, Qt and EFL port maintainers, I
   decided to post here instead of cross-posting to three mailing lists
   :)
  
   So there's been work to port the MediaPlayer and WebAudio GStreamer
   backends to the new GStreamer 1.x APIs. At the moment you can choose
   (well, for the GTK port at least) at build time if you want to use the
   0.10 or 1.x APIs.
  
   The issue is that GStreamer 0.10 is no longer actively maintained and
   the GStreamer developers/maintainers entirely focus on GStreamer 1.x.
  
   Moreover we currently don't have the manpower to maintain the 2 code
   paths in the WebKit/GStreamer platform layer. The GTK port buildbots
   already switched to 1.0 last month and I encourage Qt and EFL to do
   the same ASAP, at least for their buildbots.
  
   I'd like to propose we drop the GStreamer 0.10 support from WebKit
   once the next stable branch of GStreamer is released, it will be 1.2,
   scheduled somewhere around February.
 
  Sounds good to me.

 This will also be less problematic from a security perspective in the
 future, since it will be harder to get security updates for 0.10.

 Ryan





-- 
Christophe Dumez, PhD
Linux Software Engineer
Intel Finland Oy - Open Source Technology Center


Re: [webkit-dev] Constructors for DOM4 Events

2013-01-10 Thread Kentaro Hara
At TPAC there was no objection to DOM4 Event constructors (e.g. new
MouseEvent()).

Now DOM4 Event constructors are in the editor's draft:
http://html5labs.interoperabilitybridges.com/dom4events/
https://dvcs.w3.org/hg/d4e/raw-file/tip/source_respec.htm

Given the above, I am planning to implement them in WebKit (without
any flag). If you have any concern, please let me know.

Best Regards


On Mon, Oct 1, 2012 at 7:44 AM, Kentaro Hara hara...@chromium.org wrote:
 Since TPAC is less than a month away, I don't understand why we can't wait 
 for that discussion.

 Sounds reasonable. I'll wait for TPAC.

 I do support the idea in general, and I plan to be at TPAC and will advocate 
 for it.

 I'll be also going to TPAC. I would appreciate your support.



 On Mon, Oct 1, 2012 at 2:11 PM, Maciej Stachowiak m...@apple.com wrote:

 Since TPAC is less than a month away, I don't understand why we can't wait 
 for that discussion. I do support the idea in general, and I plan to be at 
 TPAC and will advocate for it.

 I understand that sometimes we need to move ahead of the spec. If there's a 
 reason not to wait a few extra weeks in this case, then please at least use 
 a prefix.

 Cheers,
 Maciej

 On Sep 30, 2012, at 6:32 PM, Kentaro Hara hara...@chromium.org wrote:

 TL;DR: Would it be OK to implement constructors for DOM4 Events in
 WebKit without waiting for the spec?


 == Background ==

 Events should have constructors. 'new XXXEvent()' is much easier than
 'e = document.createEvent(...); e.initXXXEvent(_a_lot_of_arguments_)'.
 We have already implemented constructors for a bunch of Events such as
 Event, CustomEvent, ProgressEvent, etc [5]. However, we have not yet
 implemented constructors for DOM4 Events (i.e. UIEvent, MouseEvent,
 KeyboardEvent, WheelEvent, TextEvent, CompositionEvent) because they
 are not yet speced.

 Recently PointerEvent was speced with [Constructor] [2]. Considering
 that PointerEvent inherits MouseEvent, now we want to support
 [Constructor] on MouseEvent too. In terms of implementation, it is
 possible to implement [Constructor] on PointerEvent without
 implementing [Constructor] on MouseEvent. However, implementing
 [Constructor] on both PointerEvent and MouseEvent would be best.

 == Rationale for implementing constructors for DOM4 Events ==

 I have been discussing this topic for one year, in www-dom@ [4] and a
 www.w3.org bug [3]. It looks like there is a consensus on introducing
 constructors for DOM4 Events. However, the spec is still a draft [1]
 and the www.w3.org bug [3] is marked as LATER. Last week I discussed
 the timeline of the spec with Jacob Rossi (a.k.a. a spec author of
 PointerEvent and DOM4 Events). According to him:

 - Their primary focus is on finishing DOM3 Events first.
 - With DOM3 Events in Candidate Recommendation, they are going to
 start working on the DOM4 Events. They will discuss it in TPAC.
 - They will introduce constructors to DOM4 Events.

 In summary, constructors for DOM4 Events are going to be speced, but
 it will take time. So I would like to implement them in WebKit a bit
 ahead of the spec (and thus implement PointerEvent constructors too).
 If you have any concern, please let me know.


 == References ==
 [1] The spec draft by Jacob Rossi:
 http://html5labs.interoperabilitybridges.com/dom4events/

 [2] The spec of Pointer Events:
 http://www.w3.org/Submission/pointer-events/

 [3] www.w3.org bug:
 https://www.w3.org/Bugs/Public/show_bug.cgi?id=14051

 [4] Discussion on www-dom@:
 http://lists.w3.org/Archives/Public/www-dom/2011OctDec/0081.html
 http://lists.w3.org/Archives/Public/www-dom/2012JanMar/0025.html

 [5] WebKit bug:
 https://bugs.webkit.org/show_bug.cgi?id=67824


 --
 Kentaro Hara, Tokyo, Japan (http://haraken.info)



 --
 Kentaro Hara, Tokyo, Japan (http://haraken.info)



-- 
Kentaro Hara, Tokyo, Japan


Re: [webkit-dev] Feature Announcement: Moving HTML Parser off the Main Thread

2013-01-10 Thread Adam Barth
Thanks everyone for your feedback.  Detailed responses inline.

On Wed, Jan 9, 2013 at 9:41 PM, Filip Pizlo fpi...@apple.com wrote:
 I think your biggest challenge will be ensuring that the latency of shoving 
 things to another core and then shoving them back will be smaller than the 
 latency of processing those same things on the main thread.

Yes.  That's something we know we have to worry about.  Given that we
need to retain the ability to parse HTML on the main thread to handle
document.write and innerHTML, we should be able to easily do A/B
comparisons to make sure we understand any performance trade-offs that
might arise.

 For small documents, I expect concurrent tokenization to be a pure regression 
 because the latency of waking up another thread to do just a small bit of 
 work, plus the added cost of whatever synchronization operations will be 
 needed to ensure safety, will involve more total work than just tokenizing 
 locally.

Once we have the ability to tokenize on a background thread, we can
examine cases like these and heuristically decide whether to use the
background thread or not at runtime.  As I wrote above, we'll need
this ability anyway, so keeping the ability to optimize these cases
shouldn't add any new constraints to the design.
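
A runtime heuristic of this kind could be as simple as a size cut-off. The
sketch below is only an illustration in Python: the threshold value and the
names choose_path and tokenize_adaptively are hypothetical, and a real
cut-off would have to come from measurement (for example, the A/B
comparisons mentioned above).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cut-off; a real value would have to be measured.
BACKGROUND_THRESHOLD_CHARS = 64 * 1024


def tokenize(text):
    """Stand-in for real tokenization."""
    return text.split()


def choose_path(text, threshold=BACKGROUND_THRESHOLD_CHARS):
    """Small inputs stay on the calling thread; large ones go to the background."""
    return "background" if len(text) >= threshold else "main"


def tokenize_adaptively(text, executor, threshold=BACKGROUND_THRESHOLD_CHARS):
    """Return (path taken, tokens) for the given input."""
    path = choose_path(text, threshold)
    if path == "main":
        return path, tokenize(text)
    # A real caller would keep doing other work instead of blocking here.
    future = executor.submit(tokenize, text)
    return path, future.result()


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as pool:
        print(tokenize_adaptively("<p>tiny</p>", pool)[0])
        print(tokenize_adaptively("<p>big</p> " * 10000, pool)[0])
```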

 We certainly see this in the JSC parallel GC, and in line with traditional 
 parallel GC design, we ensure that parallel threads only kick in when the 
 main thread is unable to keep up with the work that it has created for itself.

 Do you have a vision for how to implement a similar self-throttling, where 
 tokenizing continues on the main thread so long as it is cheap to do so?

It's certainly something we can tune in the optimization phase.  I
don't think we need a particular vision to be able to do it.  Given
that we want to implement speculative parsing (to replace preload
scanning---more on this below), we'll already have the ability to
checkpoint and restore the tokenizer state across threads.  Once you
have that primitive, it's easy to decide whether to continue
tokenization on the main thread or on a background thread.

On Wed, Jan 9, 2013 at 10:04 PM, Ian Hickson i...@hixie.ch wrote:
 Parsing and (maybe to a lesser extent) compiling JS can be moved off the
 main thread, though, right? That's probably worth examining too, if it
 hasn't already been done.

Yes, once we have the tokenizer running on a background thread, that
opens up the possibility of parsing other sorts of data on the
background thread as well.  For example, when the tokenizer encounters
an inline script block, you could imagine parsing the script on the
background thread as well so that the main thread has less work to do.
 (You could also imagine making the optimizations without a background
tokenizer, but the design constraints would be a bit different.)

On Thu, Jan 10, 2013 at 12:11 AM, Zoltan Herczeg zherc...@webkit.org wrote:
 Parsing, especially JS parsing still takes a large amount of time on page
 loading. We tried to improve the preload scanner by moving it into
 another thread, but there was no gain (except some special cases).
 Synchronization between threads is surprisingly (ridiculously) costly,
 usually worth for those tasks, which needs quite a few million
 instructions to be executed (and tokenization takes far less in most
 cases). For smaller tasks, SIMD instruction sets can help, which is
 basically a parallel execution on a single thread. Anyway it is worth
 trying, but it is really challenging to make it work in practice. Good
 luck!

This is something we're worried about and will need to be careful
about.  In the design we're proposing, preload scanning is replaced by
speculative parsing, so the overhead of the preload scanner is removed
entirely.  The way this works is as follows:

When running on the background thread, the tokenizer produces a queue
of PickledTokens.  As these tokens are queued, we can scan them to
kick off any preloads that we find.  Whenever the tokenizer queues a
token that creates a new insertion point (in the terminology of the
HTML specification), the tokenizer checkpoints itself but continues
tokenizing speculatively.  (Notice that tokens produced in this
situation are still scanned for preloads but might not ever actually
result in DOM being constructed.)

After the main thread has processed the token that created the
insertion point, if no characters were inserted, the main thread
continues processing PickledTokens that were created speculatively.  If
some characters were inserted, the main thread instead instructs the
tokenizer to roll back to that checkpoint and continue tokenizing in a
new state.  In this case, the queue of speculative tokens is
discarded.
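
The checkpoint-and-rollback flow described above can be sketched as a toy,
single-threaded Python model. This is only an illustration under stated
assumptions, not WebKit code: Tokenizer, tokenize_ahead, parse, and the
scripts mapping are hypothetical names, whitespace-separated words stand in
for HTML tokens, and the "background thread" is simulated by tokenizing
ahead in the same thread.

```python
import re
from collections import deque

TOKEN = re.compile(r"\S+")  # toy rule: a token is any run of non-whitespace


class Tokenizer:
    """Toy tokenizer whose whole state is an offset into the text."""

    def __init__(self, text, position=0):
        self.text = text
        self.position = position

    def next_token(self):
        match = TOKEN.search(self.text, self.position)
        if match is None:
            return None
        self.position = match.end()
        return match.group()


def tokenize_ahead(tokenizer, queue, scripts):
    """Stand-in for the background thread: tokenize to the end of input."""
    while (token := tokenizer.next_token()) is not None:
        # Checkpoint only after tokens that create an insertion point.
        checkpoint = tokenizer.position if token in scripts else None
        queue.append((token, checkpoint))


def parse(source, scripts):
    """Return (committed tokens, number of speculative tokens discarded).

    `scripts` (hypothetical) maps a script token to the text it writes at
    its insertion point; an empty string means it writes nothing.
    """
    tokenizer = Tokenizer(source)
    queue = deque()
    tokenize_ahead(tokenizer, queue, scripts)
    committed, discarded = [], 0
    while queue:
        token, checkpoint = queue.popleft()
        committed.append(token)
        written = scripts.get(token, "")
        if written:
            # Characters were inserted: drop the speculative tokens, splice
            # the new text in at the checkpoint, and tokenize again from there.
            discarded += len(queue)
            queue.clear()
            text = tokenizer.text
            spliced = text[:checkpoint] + " " + written + text[checkpoint:]
            tokenizer = Tokenizer(spliced, checkpoint)
            tokenize_ahead(tokenizer, queue, scripts)
    return committed, discarded
```

In this model, parse("<p> <script> <b> </b>", {"<script>": ""}) commits all
four tokens and discards nothing, while a script that writes "<i> </i>"
forces a rollback that throws away the two speculative tokens after it.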

Notice that in the common case, we execute JavaScript and tokenize
in parallel, something that's not possible with a main-thread
tokenizer.  Once the script is done executing, we expect it to be
common to be able to resume tree building immediately as the 

Re: [webkit-dev] Constructors for DOM4 Events

2013-01-10 Thread Maciej Stachowiak

+1 on the feature addition.

Please use a feature define so vendors can decide to ship the new functionality 
at a time of their choosing.

Cheers,
Maciej

On Jan 10, 2013, at 6:36 AM, Kentaro Hara hara...@chromium.org wrote:

 At TPAC there was no objection to DOM4 Event constructors (e.g. new
 MouseEvent()).
 
 Now DOM4 Event constructors are in the editor's draft:
 http://html5labs.interoperabilitybridges.com/dom4events/
 https://dvcs.w3.org/hg/d4e/raw-file/tip/source_respec.htm
 
 Given the above, I am planning to implement them in WebKit (without
 any flag). If you have any concern, please let me know.
 
 Best Regards
 
 
 On Mon, Oct 1, 2012 at 7:44 AM, Kentaro Hara hara...@chromium.org wrote:
 Since TPAC is less than a month away, I don't understand why we can't wait 
 for that discussion.
 
 Sounds reasonable. I'll wait for TPAC.
 
 I do support the idea in general, and I plan to be at TPAC and will 
 advocate for it.
 
 I'll be also going to TPAC. I would appreciate your support.
 
 
 
 On Mon, Oct 1, 2012 at 2:11 PM, Maciej Stachowiak m...@apple.com wrote:
 
 Since TPAC is less than a month away, I don't understand why we can't wait 
 for that discussion. I do support the idea in general, and I plan to be at 
 TPAC and will advocate for it.
 
 I understand that sometimes we need to move ahead of the spec. If there's a 
 reason not to wait a few extra weeks in this case, then please at least use 
 a prefix.
 
 Cheers,
 Maciej
 
 On Sep 30, 2012, at 6:32 PM, Kentaro Hara hara...@chromium.org wrote:
 
 TL;DR: Would it be OK to implement constructors for DOM4 Events in
 WebKit without waiting for the spec?
 
 
 == Background ==
 
 Events should have constructors. 'new XXXEvent()' is much easier than
 'e = document.createEvent(...); e.initXXXEvent(_a_lot_of_arguments_)'.
 We have already implemented constructors for a bunch of Events such as
 Event, CustomEvent, ProgressEvent, etc [5]. However, we have not yet
 implemented constructors for DOM4 Events (i.e. UIEvent, MouseEvent,
 KeyboardEvent, WheelEvent, TextEvent, CompositionEvent) because they
 are not yet speced.
 
 Recently PointerEvent was speced with [Constructor] [2]. Considering
 that PointerEvent inherits from MouseEvent, we now want to support
 [Constructor] on MouseEvent too. In terms of implementation, it is
 possible to implement
 [Constructor] on PointerEvent without implementing [Constructor] on
 MouseEvent. However, implementing [Constructor] on both PointerEvent
 and MouseEvent would be best.
 
 == Rationale for implementing constructors for DOM4 Events ==
 
 I have been discussing this topic for one year, in www-dom@ [4] and a
 www.w3.org bug [3]. It looks like there is a consensus on introducing
 constructors for DOM4 Events. However, the spec is still a draft [1]
 and the www.w3.org bug [3] is marked as LATER. Last week I discussed
 the timeline of the spec with Jacob Rossi (a spec author of
 PointerEvent and DOM4 Events). According to him:
 
 - Their primary focus is on finishing DOM3 Events first.
 - With DOM3 Events in Candidate Recommendation, they are going to
 start working on DOM4 Events. They will discuss it at TPAC.
 - They will introduce constructors to DOM4 Events.
 
 In summary, constructors for DOM4 Events are going to be speced, but
 it will take time. So I would like to implement them in WebKit a bit
 ahead of the spec (and thus implement PointerEvent constructors too).
 If you have any concern, please let me know.
 
 
 == References ==
 [1] The spec draft by Jacob Rossi:
 http://html5labs.interoperabilitybridges.com/dom4events/
 
 [2] The spec of Pointer Events:
 http://www.w3.org/Submission/pointer-events/
 
 [3] www.w3.org bug:
 https://www.w3.org/Bugs/Public/show_bug.cgi?id=14051
 
 [4] Discussion on www-dom@:
 http://lists.w3.org/Archives/Public/www-dom/2011OctDec/0081.html
 http://lists.w3.org/Archives/Public/www-dom/2012JanMar/0025.html
 
 [5] WebKit bug:
 https://bugs.webkit.org/show_bug.cgi?id=67824
 
 
 --
 Kentaro Hara, Tokyo, Japan (http://haraken.info)
 ___
 webkit-dev mailing list
 webkit-dev@lists.webkit.org
 http://lists.webkit.org/mailman/listinfo/webkit-dev
 
 
 
 --
 Kentaro Hara, Tokyo, Japan (http://haraken.info)
 
 
 
 -- 
 Kentaro Hara, Tokyo, Japan


Re: [webkit-dev] Feature Announcement: Moving HTML Parser off the Main Thread

2013-01-10 Thread Maciej Stachowiak

On Jan 10, 2013, at 12:07 PM, Adam Barth aba...@webkit.org wrote:

 
 On Thu, Jan 10, 2013 at 12:37 AM, Maciej Stachowiak m...@apple.com wrote:
 I presume from your other comments that the goal of this work is 
 responsiveness, rather than page load speed as such. I'm excited about the 
 potential to improve responsiveness during page loading.
 
 The goals are described in the first link Eric gave in his email:
 https://bugs.webkit.org/show_bug.cgi?id=106127#c0.  Specifically:
 
 ---8<---
 1) Moving parsing off the main thread could make web pages more
 responsive because the main thread is available for handling input
 events and executing JavaScript.
 2) Moving parsing off the main thread could make web pages load more
 quickly because WebCore can do other work in parallel with parsing
 HTML (such as parsing CSS or attaching elements to the render tree).

OK - what test (if any) will be used to test whether the page load speed goal 
is achieved?

 ---8<---
 
 One question: what tests are you planning to use to validate whether this 
 approach achieves its goals of better responsiveness?
 
 The tests we've run so far are also described in the first link Eric
 gave in his email: https://bugs.webkit.org/show_bug.cgi?id=106127.
 They suggest that there's a good deal of room for improvement in this
 area.  After we have a working implementation, we'll likely re-run
 those experiments and run other experiments to do an A/B comparison of
 the two approaches.  As Filip points out, we'll likely end up with a
 hybrid of the two designs that's optimized for handling various work
 loads.

I agree the test suggests there is room for improvement. From the description 
of how the test is run, I can think of two potential ways to improve how well 
it correlates with actual user-perceived responsiveness:

(1) It seems to look at the max parsing pause time without considering whether 
there's any content being shown that it's possible to interact with. If the 
longest pauses happen before meaningful content is visible, then reducing those 
pauses is unlikely to actually materially improve responsiveness, at least in 
models where web content processing happens in a separate process or thread 
from the UI. One possibility is to track the max parsing pause time starting 
from the first visually non-empty layout. That would better approximate how 
much actual user interaction is blocked.

(2) It might be helpful to track max and average pause time from non-parsing 
sources, for the sake of comparison.

These might result in a more accurate assessment of the benefits.
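
The two measurements suggested in (1) and (2) can be sketched in Python as
follows. This is a minimal illustration, not the existing test harness: the
function names and the (start_ms, duration_ms) sample format are
assumptions, as is the choice to count only pauses that start at or after
the first visually non-empty layout.

```python
def max_pause_ms(pauses, since_ms=0.0):
    """Longest pause that starts at or after `since_ms` (0.0 if there is none).

    `pauses` is an iterable of (start_ms, duration_ms) pairs; pass the time
    of the first visually non-empty layout as `since_ms` to ignore pauses
    that happen before any content is shown.
    """
    return max(
        (duration for start, duration in pauses if start >= since_ms),
        default=0.0,
    )


def mean_pause_ms(pauses):
    """Average pause duration (0.0 for an empty sample)."""
    durations = [duration for _, duration in pauses]
    return sum(durations) / len(durations) if durations else 0.0
```

For example, with made-up parsing pauses [(10, 120), (300, 40), (450, 75)]
and a first non-empty layout at 250 ms, the overall maximum is 120 ms but
the maximum after layout is 75 ms. Running the same two functions over
pauses from non-parsing sources gives the comparison suggested in (2).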

 
 The reason I ask is that this sounds like a significant increase in 
 complexity, so we should be very confident that there is a real and major 
 benefit. One thing I wonder about is how common it is to have enough of the 
 page processed that the user could interact with it in principle, yet still 
 have large parsing chunks remaining which would prevent that interaction 
 from being smooth.
 
 If you're interested in reducing the complexity of the parser, I'd
 recommend removing the NEW_XML code.  As previously discussed, that
 code creates significant complexity for zero benefit.

Tu quoque fallacy. From your glib reply, I get the impression that you are not 
giving the complexity cost of multithreading due consideration. I hope that is 
not actually the case and I merely caught you at a bad moment or something.

(And also we agreed to a drop dead date to remove the code which has either 
passed or is very close.)


 
 Another thing I wonder about is whether yielding to the event loop more 
 aggressively could achieve a similar benefit at a much lower complexity cost.
 
 Yielding to the event loop more could reduce the ParseHTML_max time,
 but it cannot reduce the ParseHTML time.  Generally speaking,
 yielding to the event loop is a trade-off between throughput (i.e.,
 page load time) and responsiveness.  Moving work to a background
 thread should let us achieve a better trade-off between these
 quantities than we're likely to be able to achieve by tuning the yield
 parameter alone.
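
A rough arithmetic model of that trade-off, in Python. It is a sketch under
simplifying assumptions, not a measurement: yield_model, chunk_ms, and the
fixed yield_overhead_ms per yield are hypothetical, and ParseHTML and
ParseHTML_max are modeled simply as total parse time and longest
uninterrupted parse chunk.

```python
import math


def yield_model(parse_ms, chunk_ms, yield_overhead_ms=0.0):
    """Return (longest pause, total parse time, elapsed time), all in ms.

    The parser works in chunks of at most `chunk_ms` and yields to the
    event loop between chunks, paying `yield_overhead_ms` per yield.
    """
    chunks = math.ceil(parse_ms / chunk_ms)
    longest_pause = min(parse_ms, chunk_ms)  # what yielding improves
    yields = max(chunks - 1, 0)
    elapsed = parse_ms + yields * yield_overhead_ms  # what yielding costs
    return longest_pause, parse_ms, elapsed
```

With 200 ms of parsing and a hypothetical 1 ms cost per yield, a 50 ms chunk
size cuts the longest pause from 200 ms to 50 ms, leaves the 200 ms of
main-thread parse work unchanged, and stretches elapsed time to 203 ms.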

I agree that is possible. But it also seems like improvements that don't
impose the complexity and hazards of multithreading in this area are worth
trying first: things such as retuning yielding and replacing the preload
scanner with (non-threaded) speculative pre-tokenizing, as suggested by Antti.
That would let us better assess the benefits of the threading itself.

 
 Having a test to drive the work would allow us to answer these types of 
 questions. (It may also be that the test data you cited would already answer 
 these questions but I didn't sufficiently understand it; if so, further 
 explanation would be appreciated.)
 
 If you're interested in building such a test, I would be interested in
 hearing the results.  We don't plan to build such a test at this time.

If you're actually planning to make a significant complexity-imposing 
architectural change 

[webkit-dev] commit-queue and JSC/WK2 specific changes

2013-01-10 Thread Ryosuke Niwa
Hi all,

As you might all know, the commit-queue uses the Chromium Linux port.
Consequently, any JavaScriptCore- or WebKit2-specific changes (and any
non-Chromium port-specific changes) are never tested. The commit-queue
doesn't even detect whether they build.

This is a source of confusion because many (new) contributors appear to
mistakenly think that the commit-queue ensures that a patch builds and passes
tests on all platforms, yet the commit-queue doesn't wait until the EWS bots
have processed a patch before landing it. As a result, I've seen quite a few
people landing patches that break JSC/WK2 via the commit-queue.

My initial proposal was to make the commit-queue wait until the EWS bots
catch up when landing port-specific or JSC/WK2-specific changes. However, Adam
thinks that's a bad idea (webkit.org/b/74776) because EWS bots are only
advisory and waiting for EWS bots slows things down.

Is this a problem worth finding a solution for? If so, do you have any
suggestions?

- R. Niwa

Re: [webkit-dev] commit-queue and JSC/WK2 specific changes

2013-01-10 Thread Adam Barth
The solution I'd recommend is to make the JavaScriptCore and/or
WebKit2 bots faster.  If those bots are able to complete their
processing before the commit-queue, then they'll stop the patch from
being committed by marking the patch commit-queue-.

Adam


On Thu, Jan 10, 2013 at 9:22 PM, Ryosuke Niwa rn...@webkit.org wrote:
 Hi all,

 As you might all know, the commit-queue uses the Chromium Linux port.
 Consequently, any JavaScriptCore- or WebKit2-specific changes (and any
 non-Chromium port-specific changes) are never tested. The commit-queue
 doesn't even detect whether they build.

 This is a source of confusion because many (new) contributors appear to
 mistakenly think that the commit-queue ensures that a patch builds and passes
 tests on all platforms, yet the commit-queue doesn't wait until the EWS bots
 have processed a patch before landing it. As a result, I've seen quite a few
 people landing patches that break JSC/WK2 via the commit-queue.

 My initial proposal was to make the commit-queue wait until the EWS bots
 catch up when landing port-specific or JSC/WK2-specific changes. However, Adam
 thinks that's a bad idea (webkit.org/b/74776) because EWS bots are only
 advisory and waiting for EWS bots slows things down.

 Is this a problem worth finding a solution for? If so, do you have any
 suggestions?

 - R. Niwa


Re: [webkit-dev] Feature Announcement: Moving HTML Parser off the Main Thread

2013-01-10 Thread Filip Pizlo
Adam,

Thanks for your detailed reply. Seems like you guys have a pretty good plan in 
place. 

I hope this works and produces a performance improvement. That being said this 
does look like a sufficiently complex work item that success is far from 
guaranteed. So to play devil's advocate, what is your plan if this doesn't 
work out?

I.e. are we talking about adding a bunch of threading support code in the 
optimistic hope that it makes things run fast, and then forgetting about it if 
it doesn't?  Or are you prepared to roll out any complexity that got landed if 
this does not ultimately live up to promise?  Or is this going to be one giant 
patch that only lands if it works?

I'm also trying to understand what would happen during the interim when this 
work is incomplete, we have thread-related goop in some critical paths, and we 
don't yet know if the WIP code is ever going to result in a speedup. And also, 
what will happen sometime from now if that code is never successfully optimized 
to the point where it is worth enabling. 

I appreciate that this sort of question can be asked of any performance work 
but in this particular case my gut tells me that this is going to result in 
significantly more complexity than the usual incremental performance work. So 
it's good to understand what plan B is. 

Probably a good answer to this sort of question would address some fears that 
people may have. If this work does lead to a performance win then probably 
everyone will be happy. But if it doesn't then it would be great to have a 
plan of retreat. 

-Filip

On Jan 10, 2013, at 12:07 PM, Adam Barth aba...@webkit.org wrote:

 Thanks everyone for your feedback.  Detailed responses inline.
 
 On Wed, Jan 9, 2013 at 9:41 PM, Filip Pizlo fpi...@apple.com wrote:
 I think your biggest challenge will be ensuring that the latency of shoving 
 things to another core and then shoving them back will be smaller than the 
 latency of processing those same things on the main thread.
 
 Yes.  That's something we know we have to worry about.  Given that we
 need to retain the ability to parse HTML on the main thread to handle
 document.write and innerHTML, we should be able to easily do A/B
 comparisons to make sure we understand any performance trade-offs that
 might arise.
 
 For small documents, I expect concurrent tokenization to be a pure 
 regression because the latency of waking up another thread to do just a 
 small bit of work, plus the added cost of whatever synchronization 
 operations will be needed to ensure safety, will involve more total work 
 than just tokenizing locally.
 
 Once we have the ability to tokenize on a background thread, we can
 examine cases like these and heuristically decide whether to use the
 background thread or not at runtime.  As I wrote above, we'll need
 this ability anyway, so keeping the ability to optimize these cases
 shouldn't add any new constraints to the design.
 
 We certainly see this in the JSC parallel GC, and in line with traditional 
 parallel GC design, we ensure that parallel threads only kick in when the 
 main thread is unable to keep up with the work that it has created for 
 itself.
 
 Do you have a vision for how to implement a similar self-throttling, where 
 tokenizing continues on the main thread so long as it is cheap to do so?
 
 It's certainly something we can tune in the optimization phase.  I
 don't think we need a particular vision to be able to do it.  Given
 that we want to implement speculative parsing (to replace preload
 scanning---more on this below), we'll already have the ability to
 checkpoint and restore the tokenizer state across threads.  Once you
 have that primitive, it's easy to decide whether to continue
 tokenization on the main thread or on a background thread.
 
 On Wed, Jan 9, 2013 at 10:04 PM, Ian Hickson i...@hixie.ch wrote:
 Parsing and (maybe to a lesser extent) compiling JS can be moved off the
 main thread, though, right? That's probably worth examining too, if it
 hasn't already been done.
 
 Yes, once we have the tokenizer running on a background thread, that
 opens up the possibility of parsing other sorts of data on the
 background thread as well.  For example, when the tokenizer encounters
 an inline script block, you could imagine parsing the script on the
 background thread as well so that the main thread has less work to do.
 (You could also imagine making the optimizations without a background
 tokenizer, but the design constraints would be a bit different.)
 
 On Thu, Jan 10, 2013 at 12:11 AM, Zoltan Herczeg zherc...@webkit.org wrote:
 Parsing, especially JS parsing, still takes a large amount of time during page
 loading. We tried to improve the preload scanner by moving it to
 another thread, but there was no gain (except in some special cases).
 Synchronization between threads is surprisingly (ridiculously) costly, and
 is usually worthwhile only for tasks that need quite a few million
 instructions to execute