Re: [webkit-dev] A Parallel Webkit?
> JS compilation is also done lazily in webkit so we don't ever end up with multiple pieces of code to compile concurrently.

Two of us have been focusing on speeding up bottlenecks like these to allow such synchronous interfaces. However, frameworks like TBB push towards having parallel tasks that are at least 10-100,000 cycles for a good reason; working at such small scales is for people who have the luxury of targeting farther ahead, and I wouldn't suggest doing it without help.

> JS execution and DOM manipulation are single threaded, and that thread is the UI thread, this fact cannot be changed.

> We could potentially do HTML parsing off the main thread. Firefox 4 has support for this. It's unclear how much of a win this would be.

What Adam is suggesting applies not only to HTML parsing, and it has seemed to me to be the pragmatic and likely path for the short term (next 1-2 years). E.g., create buffers (hidden by synchronous interfaces where possible) -- for example, queueing up calls from the parser to the selector matcher -- and then handle those in bulk. In other places, like Adam's example of a call to the parser itself, it's harder, but still possible (while being clean!) by approaches like passing around forceable futures. In a sense, such an approach is already the norm in the interface between JS and the DOM/layout (at least in Firefox): calls to layout etc. are buffered up to avoid repeated or unnecessary evaluation. While I don't think this approach will expose significant parallelism, I think it would match the available hardware -- if anyone is interested in doing such a limit study, we should talk!

Shared-state parallel scripting (and concurrent DOM) seem inevitable, but these sorts of things are the minority in the performance profile. E.g., when loading a page, about half the JavaScript time is spent in non-execution tasks like parsing and code generation. They're still a concern in the sense of Amdahl's law and of enabling new types of applications, but for getting the web to work satisfyingly on mobile, there seem to be bigger blockers.

Until research catches up with the algorithms in browsers, exposing and exploiting coarse-level concurrency between components would be my recommendation (outside of some more obvious algorithms).

Regards,
- Leo
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
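The buffering and "forceable future" idea from the message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not WebKit or Firefox code: `ForceableFuture`, `BufferedMatcher`, and the `match_batch` callback are hypothetical names, and the "selector matcher" is stood in for by an arbitrary function over a list of nodes.

```python
from typing import Callable, Dict, Generic, List, Optional, TypeVar

T = TypeVar("T")


class ForceableFuture(Generic[T]):
    """A deferred result: computed on the first force(), then cached."""

    def __init__(self, compute: Callable[[], T]) -> None:
        self._compute: Optional[Callable[[], T]] = compute
        self._value: Optional[T] = None

    def force(self) -> T:
        if self._compute is not None:
            self._value = self._compute()
            self._compute = None  # drop the closure; the value is cached now
        return self._value  # type: ignore[return-value]


class BufferedMatcher:
    """Queues requests behind a synchronous-looking interface.

    The caller gets a future immediately; the (hypothetical) bulk handler
    `match_batch` only runs when some future is forced or flush() is called.
    """

    def __init__(self, match_batch: Callable[[List[str]], List[str]]) -> None:
        self._match_batch = match_batch
        self._pending: List[str] = []
        self._results: Dict[str, str] = {}
        self.batches_run = 0

    def request(self, node: str) -> ForceableFuture[str]:
        self._pending.append(node)
        return ForceableFuture(lambda: self._result_for(node))

    def _result_for(self, node: str) -> str:
        if node not in self._results:
            self.flush()  # first force pays for the whole buffered batch
        return self._results[node]

    def flush(self) -> None:
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        # The bulk handler sees the whole batch at once, so it is free to
        # process it in parallel or to skip redundant work.
        self._results.update(zip(batch, self._match_batch(batch)))
        self.batches_run += 1
```

In this sketch, requesting "div" and "span" and then forcing either future runs the bulk handler once for both; the second force is a cache hit. The design choice is that callers keep a synchronous-looking interface while the batch boundary is where any parallelism would be introduced.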
Re: [webkit-dev] A Parallel Webkit?
> Shared-state parallel scripting (and concurrent DOM) seem inevitable

I don't know why you believe this -- there is no intention of having shared-state threading in ES, nor is there any desire in the ES technical committee to add any form of concurrency primitives to the language.

--Oliver
Re: [webkit-dev] A Parallel Webkit?
I don't think that question is very pertinent to the list; it's more of a long-term thing. The ES committee has little interest in it for the relevant future, and I agree with this given the other needs, available slack time, and induced complexity. Luckily, a somewhat parallel browser does not need it, as I've hopefully made clear.

On Jul 24, 2010, at 1:51 PM, Oliver Hunt oli...@apple.com wrote:

>> Shared-state parallel scripting (and concurrent DOM) seem inevitable
>
> I don't know why you believe this -- there is no intention of having shared-state threading in ES, nor is there any desire in the ES technical committee to add any form of concurrency primitives to the language.
>
> --Oliver
Re: [webkit-dev] A Parallel Webkit?
Hi Marchywka,

Thanks for your comments about the parallel browser. :)

Your point is that execution cannot benefit from multi-threading on multi-core hardware for the following reasons:

1) Inter-thread contention
2) Cache thrashing (false sharing)
3) Task offload overhead (bandwidth limitations, etc.)

Generally, all of these are true for any multi-threaded workload, whether or not it is a browser. Do you have any data showing that they are worse on mobile devices?

1) Inter-thread contention: There are solutions that can reduce contention. For example, you can select a suitable execution model (producer-consumer, bulk synchronous parallel) for a specific workload.

2) Cache thrashing (false sharing): The application has to take care of memory locality and reduce the working set.

3) Task offload overhead (bandwidth limitations, etc.): The task partitioner should offload coarse tasks to the device or accelerator to reduce communication overhead.

Thanks,
-Ying

2010/7/23 Mike Marchywka marchy...@hotmail.com

> I wasn't entirely sure what OP was after or if the reply below adequately addressed his interests. After looking at the link provided, I thought I would make a few comments that may or may not be of much benefit (for discussion) but that relate to observations on a few recent browsers on one series of mobile phones. [...]
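As a rough illustration of the bulk synchronous parallel model named in the reply above, here is a minimal sketch of a two-superstep prefix sum. The function name `bsp_prefix_sum` and the chunking are assumptions made for the example; nothing here comes from WebKit. Each worker writes only to its own slot, which is one simple way to avoid contention between threads. Note that in CPython the threads will not run CPU-bound work in parallel, so this shows the structure of the model rather than a speedup.

```python
import threading
from typing import List, Optional


def bsp_prefix_sum(chunks: List[List[int]]) -> List[int]:
    """Running totals over the concatenation of chunks, in two supersteps."""
    n = len(chunks)
    if n == 0:
        return []
    partial = [0] * n                            # one slot per worker
    out: List[Optional[List[int]]] = [None] * n  # one slot per worker
    barrier = threading.Barrier(n)

    def worker(i: int) -> None:
        # Superstep 1: purely local work on this worker's own chunk.
        partial[i] = sum(chunks[i])
        # Synchronization point: every partial sum is written before anyone
        # proceeds to read them.
        barrier.wait()
        # Superstep 2: read the other workers' results, write only our slot.
        running = sum(partial[:i])
        result = []
        for x in chunks[i]:
            running += x
            result.append(running)
        out[i] = result

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [x for chunk in out if chunk is not None for x in chunk]
```

For example, `bsp_prefix_sum([[1, 2], [3], [4, 5, 6]])` gives the running totals of 1 through 6. Communication happens only at the barrier, which is the property that makes the model attractive when per-message overhead is high.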
Re: [webkit-dev] A Parallel Webkit?
Thanks Andras! I will take a look at this.

-Ying

2010/7/23 Andras Becsi abe...@inf.u-szeged.hu

> Hi Ying, you might be looking for WebKit2, which is a non-blocking API layer for WebKit and aims to make WebKit more suitable for multicore systems. [...]
Re: [webkit-dev] A Parallel Webkit?
> I wasn't entirely sure what OP was after or if the reply below adequately addressed his interests.

WebKit2 seems to have little to do with taking advantage of parallel hardware in browser algorithms like lexing, parsing, selectors, JS compilation, JS execution, layout, DOM interactions, fonts, rendering, etc. There is some benefit: Sam King measured that bucketing coarse tasks at the process isolation level gives maybe 10-20% better utilization of cores (a la Google Chrome or predecessors like the OP browser, Charles Reis's work, Gazelle, ...) on a good workload. WebKit2's goal of supporting concurrency might be aimed at foundations for building parallelism into library code, such as further cleaning up the threading API or introducing lightweight task parallelism, but the description doesn't talk about such things.

In contrast, we're interested in magnitudes of speedup: parallelism -- memory parallelism (hw + sw prefetching to avoid cache misses), SIMD instructions, multiple cores, etc. -- and even sequential stuff (smaller data representation, balancing incrementalization/memoization, etc.). Parallelization is already standard for traditional media libraries within browsers (e.g., GPUs, SSE for painting and rendering); for maybe a third of our work, we're just expanding the scope of what algorithms should be tuned kernels in the HPC sense.

> I guess I would just make a few comments about your considerations and our experience. A somewhat different strategy than what you are proposing is to offload some tasks to a more capable device such as a server -- simply tokenizing HTML or compiling JS can be a big benefit in phone CPU and bandwidth (aka time and battery life).

Opera Mini is one stab at this, and the OnLive platform shows some of the potential here. However, this is for a limited deployment scenario, and I'd actually argue against it from a power, energy, and latency perspective for handheld devices (... assuming you can get parallelization to work).

> You don't need to dig too deeply into the literature to find non-monotonic graphs of execution time for some task vs number of cores (more can make things slower).

I agree -- it's tricky stuff. Worth keeping in mind that handhelds will be in the multicore camp, not manycore, for a while -- we're only now seeing dual-core ones: we don't need arbitrarily strong scaling. Furthermore, frameworks like TBB are in a position to automate making the cutoff (which is actually non-obvious, as you might want more threads than cores due to memory effects, hyperthreading, etc.). OTOH, for the processing style of browsers (sequences of little tasks), getting speedups isn't easy.

> I have seen this with transcoding and profiling on phone simulators -- parsing and compiling is a great way to use time and create lots of objects (in Java these have lots of overhead, and many phones only let you use Java; in any case, we know that temp objects are not free and fragment memory).

I'm actually surprised that a project like WebKit doesn't use, as far as I've been able to tell, many memory pooling etc. optimizations. We've been mindful of this stuff in our work -- it's fairly standard in the performance community.

> Another rate-limiting step has been the round trip delay to housekeep a connection or do a DNS lookup. Here a proxy with persistent connections properly implemented is a much bigger issue than optimized rendering or well-transcoded web pages AFAIK.

Both the network and the CPU need work. For laptops, the network is typically the bottleneck, and only recently has that been shifting in the smartphone space. Worth noting: even on a fast network and with local pages, profilers will show CPU bottlenecks.

> It may be worth considering a standard compiled page type, for example, rather than worrying about some of these other issues; cached compiled pages of course greatly reduce problems for everyone.

That's great and actually complementary -- parallel serialization of machine-generated formats is preferable to formats like HTML5. A lot of problems lie under the surface here, however: introducing a proxy somewhere introduces latency, there are no benefits for dynamically loaded components, etc. I actually view making in-browser algorithms faster as the conservative choice.

> It's important to remember that most of these things involve tradeoffs and there are many resources to consider. So, maybe you can make various arguments ("but with wifi, IO doesn't matter", "CPUs are only getting faster", or "memory is only getting cheaper") and battle out platitudes to defend a given approach

We have never said any of these things. Mobile browsers take too much time processing and the hardware is going parallel in multiple ways; we're just putting 1+1 together.

> but I wouldn't just point to one, like parallelism, and assume that will fix everything; indeed, it can make things worse.
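The "cutoff" discussed in the message above (going parallel only when a task is large enough to repay the scheduling overhead) can be sketched as below. This is a hand-rolled illustration, not TBB's API or its partitioning heuristics; the function name `map_with_cutoff`, the default `grain_size` of 1024 items, and `max_workers=2` (a stand-in for a dual-core handheld) are all assumptions chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def map_with_cutoff(
    fn: Callable[[A], B],
    items: Sequence[A],
    grain_size: int = 1024,  # assumed threshold; would need measuring
    max_workers: int = 2,    # assumed core count
) -> List[B]:
    """Apply fn to every item, going parallel only above a size cutoff."""
    if len(items) <= grain_size:
        # Too little work: thread hand-off would cost more than it saves,
        # so stay sequential.
        return [fn(x) for x in items]
    # Split into grain-sized chunks so each task is coarse enough to
    # amortize scheduling overhead.
    chunks = [items[i:i + grain_size] for i in range(0, len(items), grain_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields chunk results in submission order.
        parts = pool.map(lambda chunk: [fn(x) for x in chunk], chunks)
        return [y for part in parts for y in part]
```

With `grain_size=4`, a ten-item input is split into three chunks and processed by the pool, while a two-item input never touches the pool at all. Picking the threshold is the hard part, which is the point made above about frameworks automating it.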
Re: [webkit-dev] A Parallel Webkit?
On Jul 23, 2010, at 4:50 PM, Leo Meyerovich wrote:

>> I wasn't entirely sure what OP was after or if the reply below adequately addressed his interests.
>
> WebKit2 seems to have little to do with taking advantage of parallel hardware in browser algorithms like lexing, parsing, selectors, JS compilation, JS execution, layout, DOM interactions, fonts, rendering, etc.

JS execution and DOM manipulation are single-threaded, and that thread is the UI thread; this fact cannot be changed. JS compilation is also done lazily in WebKit, so we don't ever end up with multiple pieces of code to compile concurrently.

--Oliver
Re: [webkit-dev] A Parallel Webkit?
On Fri, Jul 23, 2010 at 5:27 PM, Oliver Hunt oli...@apple.com wrote:

> On Jul 23, 2010, at 4:50 PM, Leo Meyerovich wrote:
>
>>> I wasn't entirely sure what OP was after or if the reply below adequately addressed his interests.
>>
>> WebKit2 seems to have little to do with taking advantage of parallel hardware in browser algorithms like lexing, parsing, selectors, JS compilation, JS execution, layout, DOM interactions, fonts, rendering, etc.
>
> JS execution and DOM manipulation are single-threaded, and that thread is the UI thread; this fact cannot be changed. JS compilation is also done lazily in WebKit, so we don't ever end up with multiple pieces of code to compile concurrently.

We could potentially do HTML parsing off the main thread. Firefox 4 has support for this. It's unclear how much of a win this would be.

Adam
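To make the off-main-thread parsing idea above concrete, here is a toy sketch: a worker thread tokenizes the markup and hands tokens over a queue, while the tree is built only on the calling ("main") thread, so tree mutation stays single-threaded. This is not how Firefox 4 or WebKit implement it; `build_tree` and `tokenize_in_background` are hypothetical names, the tokenizer is Python's standard `html.parser.HTMLParser`, and the tree builder ignores real HTML rules such as void elements and error recovery.

```python
import queue
import threading
from html.parser import HTMLParser

_DONE = object()  # sentinel marking the end of the token stream


class _Tokenizer(HTMLParser):
    """Runs on the worker thread; emits simple (kind, value) tokens."""

    def __init__(self, out: "queue.Queue") -> None:
        super().__init__()
        self._out = out

    def handle_starttag(self, tag, attrs):
        self._out.put(("start", tag))

    def handle_endtag(self, tag):
        self._out.put(("end", tag))

    def handle_data(self, data):
        text = data.strip()
        if text:
            self._out.put(("text", text))


def tokenize_in_background(html: str, out: "queue.Queue") -> threading.Thread:
    def run() -> None:
        try:
            tokenizer = _Tokenizer(out)
            tokenizer.feed(html)
            tokenizer.close()
        finally:
            out.put(_DONE)  # always unblock the consumer

    worker = threading.Thread(target=run)
    worker.start()
    return worker


def build_tree(html: str) -> dict:
    """Consume tokens on the calling thread and build a nested dict tree."""
    tokens: "queue.Queue" = queue.Queue()
    worker = tokenize_in_background(html, tokens)
    root = {"tag": "#document", "children": []}
    stack = [root]
    while True:
        token = tokens.get()
        if token is _DONE:
            break
        kind, value = token
        if kind == "start":
            node = {"tag": value, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif kind == "end":
            if len(stack) > 1:
                stack.pop()
        else:  # text
            stack[-1]["children"].append(value)
    worker.join()
    return root
```

Calling `build_tree("<p>Hi <b>there</b></p>")` yields a document node whose single child is the `p` element containing the text and the nested `b` element. How much this buys in practice is exactly the open question raised above, since the consumer still does the tree construction serially.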
[webkit-dev] A Parallel Webkit?
Hi,

Is WebKit well parallelized to fit multicore architectures? And is there any status on a parallel WebKit? The same idea is described at http://www.eecs.berkeley.edu/~lmeyerov/projects/pbrowser/

Thanks,
-Ying
Re: [webkit-dev] A Parallel Webkit?
Hi Ying,

you might be looking for WebKit2, which is a non-blocking API layer for WebKit and aims to make WebKit more suitable for multicore systems. It supports the split-process model and the thread model as well. The API is currently under development for the Mac and Windows ports of WebKit (Safari), and the Qt port also tries to keep pace with WebKit2 development, but currently lags behind the Mac and Win versions a bit. There is a test browser called MiniBrowser; you can try it if you are interested.

You can find more information about WebKit2 at: http://trac.webkit.org/wiki/WebKit2

regards,
Andras

On 2010-07-22 17:43, gao ying wrote:

> Hi,
>
> Is WebKit well parallelized to fit multicore architectures? And is there any status on a parallel WebKit? The same idea is described at http://www.eecs.berkeley.edu/~lmeyerov/projects/pbrowser/
>
> Thanks,
> -Ying
Re: [webkit-dev] A Parallel Webkit?
I wasn't entirely sure what OP was after or if the reply below adequately addressed his interests. After looking at the link provided, I thought I would make a few comments that may or may not be of much benefit (for discussion) but that relate to observations on a few recent browsers on one series of mobile phones.

> Date: Thu, 22 Jul 2010 22:20:34 +0200
> From: abe...@inf.u-szeged.hu
> To: webkit-dev@lists.webkit.org
> Subject: Re: [webkit-dev] A Parallel Webkit?
>
> Hi Ying, you might be looking for WebKit2, which is a non-blocking API layer for WebKit and aims to make WebKit more suitable for multicore systems. [...]
>
> On 2010-07-22 17:43, gao ying wrote:
>
>> Hi,
>>
>> Is WebKit well parallelized to fit multicore architectures? And is there any status on a parallel WebKit? The same idea is described at http://www.eecs.berkeley.edu/~lmeyerov/projects/pbrowser/

I guess I would just make a few comments about your considerations and our experience. A somewhat different strategy than what you are proposing is to offload some tasks to a more capable device such as a server -- simply tokenizing HTML or compiling JS can be a big benefit in phone CPU and bandwidth (aka time and battery life). Many people rush to parallelism even with only one core, or may try to use many cores and then they compete with each other, often thrashing memory caches or, worse, going to VM (on smaller desktop computers like mine this is a problem).

You don't need to dig too deeply into the literature to find non-monotonic graphs of execution time for some task vs number of cores (more can make things slower). I have seen this with transcoding and profiling on phone simulators -- parsing and compiling is a great way to use time and create lots of objects (in Java these have lots of overhead, and many phones only let you use Java; in any case, we know that temp objects are not free and fragment memory).

Another rate-limiting step has been the round trip delay to housekeep a connection or do a DNS lookup. Here a proxy with persistent connections properly implemented is a much bigger issue than optimized rendering or well-transcoded web pages AFAIK. It may be worth considering a standard compiled page type, for example, rather than worrying about some of these other issues; cached compiled pages of course greatly reduce problems for everyone.

It's important to remember that most of these things involve tradeoffs and there are many resources to consider. So, maybe you can make various arguments ("but with wifi, IO doesn't matter", "CPUs are only getting faster", or "memory is only getting cheaper") and battle out platitudes to defend a given approach, but I wouldn't just point to one, like parallelism, and assume that will fix everything; indeed, it can make things worse. Making things smaller in a way you don't need to undo (say, using WinZip to download and then unzipping the HTML only to compile it, etc.) is potentially a benefit in any situation as long as radio usage requires power.

An immediate concern I would point to in regard to multithreading on desktop and mobile is the need to keep a responsive UI thread -- not sure if WebKit has addressed this fully, but I have seen this as a huge problem on my desktop, on mobile, and in my own mobile code that I (carefully, LOL) wrote myself.

>> Thanks,
>> -Ying