Re: [webkit-dev] Question regarding priorities of subresource content retrieval

2011-02-12 Thread Silvio Ventres
Finished testing a small patch: https://bugs.webkit.org/show_bug.cgi?id=54108
It only prioritizes script loading in the body (not sure yet why it
doesn't work in the head).
Assuming most external-load slowdowns are caused by JS in the body rather
than anything in the head, this should fix the most horrific worst cases.

Tested on http://solid.eqoppa.com/testlag2.html:
First paint time unpatched: 80 seconds
First paint time patched: 34 seconds

Real-world test: http://technorati.com:
First paint time unpatched: 5 seconds
First paint time patched: 2 seconds

Please help test and see if it breaks anything. I haven't seen any
breakage here yet :)

If someone wants to help see whether there is any more improvement to be
had by adding the heuristic for JS in the head or for CSS in both head
and body, please let me know.
For now, I was thinking about twiddling with
CachedResourceLoader::preload, but that might be too late if the
engine has already blocked on some CSS.

P.S. If patches are generally discussed off-list, please kindly point me
to the correct venue.

--
 silvio
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] Question regarding priorities of subresource content retrieval

2011-02-08 Thread Silvio Ventres
This argument - that the web developer is to blame for choosing a slow
ad/tracking/etc. server - is incorrect.
Web developers in general do not have any control over the ad provider
or, frankly, any other type of external functionality provider.
Google Analytics is a good case in point: you would not want most
of the world's web pages to suddenly hang if something happens inside
Google.

The web browser should clearly prioritize developer-controllable
resources over ones that are beyond the web developer's control.
Also, as an application run by the user and not by the developer, the
browser should arguably prioritize actual content over
pseudo-content whose purpose is functionality that is not visible to
the actual user, such as ad/tracker scripts. Based on current trends,
such actual content is more likely to be important when it is sourced
from the domain or a subdomain of the web page itself.

A domain check is a reasonable approximation that fits both purposes.
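
To be concrete, the kind of check I have in mind is roughly the following.
This is just a sketch with made-up helper names, not the actual patch, and a
real version would need the public-suffix list instead of a naive "last two
labels" rule:

// Sketch of the domain heuristic; illustrative only, not WebKit code.
#include <string>

enum DomainType { MainDomain, SameRegistrableDomain, ExternalDomain };

// Naive registrable-domain guess: the last two labels of the host.
// Good enough for example.com, wrong for suffixes like co.uk.
static std::string registrableDomain(const std::string& host)
{
    size_t lastDot = host.rfind('.');
    if (lastDot == std::string::npos || lastDot == 0)
        return host;
    size_t prevDot = host.rfind('.', lastDot - 1);
    return prevDot == std::string::npos ? host : host.substr(prevDot + 1);
}

// Classify a subresource host relative to the page's host.
DomainType classifyResourceHost(const std::string& pageHost, const std::string& resourceHost)
{
    if (resourceHost == pageHost)
        return MainDomain;
    if (registrableDomain(resourceHost) == registrableDomain(pageHost))
        return SameRegistrableDomain;
    return ExternalDomain;
}

Anything classified as ExternalDomain would be demoted before it is
scheduled; everything else would be left untouched.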

--
 silvio


On Tue, Feb 8, 2011 at 5:13 AM, Jerry Seeger vikin...@mac.com wrote:
 I'm reasonably sure that javascript in the header must be loaded 
 synchronously, as it might affect the rest of the load. This is why tools 
 like YSlow advise Web designers to move javascript loads that are not needed 
 for rendering until after the rest of the page loads.

 Blocking on loading the css is less clear-cut, as in some cases it could mean 
 several seconds of ugly page. I don't know if it's right or wrong, but a lot 
 of pages out there rely on the CSS being loaded before the page starts to 
 render to avoid terrible layout and the appearance of items meant to be 
 hidden for the seconds it takes the css to load.

 In general, while things could certainly be improved, it's up to the owner of 
 the page to not rely on a slow ad server, or build the page so the ads load 
 after the primary content.

 Jerry Seeger


 On Feb 7, 2011, at 5:47 PM, Silvio Ventres wrote:

 IE/Opera delay for only 4 seconds, same as Mobile Safari.
 The reason looks to be the URL of the script/CSS.
 If the URL is the same twice, Chrome/Firefox serialize the requests,
 while IE/Opera/Mobile Safari launch both requests simultaneously.

 Of course, requesting simultaneously doesn't fix anything, as you can
 see by trying a link-stuffed version at
 http://solid.eqoppa.com/testlag2.html

 This one has 45 css and 38 javascript links. It hangs all browsers nicely.
 The main point here is that this might be acceptable if the content were
 coming from the webpage's own domain.
 But the links are coming from a completely different place.

 This is exactly what makes browsing pages with any third-party
 analytics, tracking or ad add-ons so slow and frustrating.
 Fixing priorities in subresource download should make the experience
 considerably more interactive and fun.

 --
 silvio




Re: [webkit-dev] Question regarding priorities of subresource content retrieval

2011-02-08 Thread Silvio Ventres
Do you have any example of scripts or CSS that are externally sourced
and where the developer cares to reasonably optimize the web page?
The main use case of such external scripts currently is ads and
statistics gatherers for analysis. This, arguably, is not critical
content that the user is interested in.

If your argument is indeed that the Web developer should have control,
then, when you have no choice but to include external scripts (ads, for
example), you would probably hate for those to ruin the latency of your
website.
If you are talking about the http://muddledramblings.com/ website, for
example, you can clearly see that most scripts there are
domain-internal.
Do you deem your users' experience more or less important than Google
Analytics functionality? If Google Analytics hangs for, say, 4
seconds, would you like the user to wait, or to start reading while it
loads?

A change to the HTML standard might be a good idea, though the problem
here is that there are millions of pages on the net already, and
developers won't suddenly start changing them.

This heuristic would let users view 90% of the current Web
more interactively.
Keep in mind that at least 38% of all statistics are pulled out of thin
air :), but, really, please show at least two pages that this
heuristic would NOT work on.

--
 silvio

On Tue, Feb 8, 2011 at 6:52 PM, Jerry Seeger vikin...@mac.com wrote:
 My argument is less "it's the Web developer's fault" than "the Web 
 developer should have control". I am hardly a sophisticated Web developer but 
 I have javascript from a different  domain that must be loaded first and I 
 have Google analytics, which I should load after the rest of the page (though 
 to be honest I'm not sure I do after my redesign... hm). While I would love 
 it if there were standardized rules for which scripts would be loaded 
 synchronously and which wouldn't, I would hate it if one browser required me 
 to move my scripts to a different domain.

 Having said all that, I hate it when I have to wait for a resource 
 outside of my control, so I'd love to see a solution to this. If there were a 
 more reliable way than simple domain checking to prioritize content, that 
 would be fantastic. I think ideally this is something for the standards board 
 - perhaps an extension of the script and link tags to specify a priority, or 
 something like that.

 Jerry


 On Feb 8, 2011, at 2:23 AM, Silvio Ventres wrote:

 This argument - that the web developer is to blame for choosing a slow
 ad/tracking/etc. server - is incorrect.
 Web developers in general do not have any control over the ad provider
 or, frankly, any other type of external functionality provider.
 Google Analytics is a good case in point: you would not want most
 of the world's web pages to suddenly hang if something happens inside
 Google.

 The web browser should clearly prioritize developer-controllable
 resources over ones that are beyond the web developer's control.
 Also, as an application run by the user and not by the developer, the
 browser should arguably prioritize actual content over
 pseudo-content whose purpose is functionality that is not visible to
 the actual user, such as ad/tracker scripts. Based on current trends,
 such actual content is more likely to be important when it is sourced
 from the domain or a subdomain of the web page itself.

 A domain check is a reasonable approximation that fits both purposes.

 --
 silvio


 On Tue, Feb 8, 2011 at 5:13 AM, Jerry Seeger vikin...@mac.com wrote:
 I'm reasonably sure that javascript in the header must be loaded 
 synchronously, as it might affect the rest of the load. This is why tools 
 like YSlow advise Web designers to move javascript loads that are not 
 needed for rendering until after the rest of the page loads.

 Blocking on loading the css is less clear-cut, as in some cases it could 
 mean several seconds of ugly page. I don't know if it's right or wrong, but 
 a lot of pages out there rely on the CSS being loaded before the page 
 starts to render to avoid terrible layout and the appearance of items meant 
 to be hidden for the seconds it takes the css to load.

 In general, while things could certainly be improved, it's up to the owner 
 of the page to not rely on a slow ad server, or build the page so the ads 
 load after the primary content.

 Jerry Seeger


 On Feb 7, 2011, at 5:47 PM, Silvio Ventres wrote:

 IE/Opera delay for only 4 seconds, same as Mobile Safari.
 The reason looks to be the URL of the script/CSS.
 If the URL is the same twice, Chrome/Firefox serialize the requests,
 while IE/Opera/Mobile Safari launch both requests simultaneously.

 Of course, requesting simultaneously doesn't fix anything, as you can
 see by trying a link-stuffed version at
 http://solid.eqoppa.com/testlag2.html

 This one has 45 css and 38 javascript links. It hangs all browsers nicely.
 The main point here is that it might be acceptable if it's coming from
 the webpage domain itself

Re: [webkit-dev] Question regarding priorities of subresource content retrieval

2011-02-08 Thread Silvio Ventres
Indeed, the test case just shows the general problem.
It should be changed to include scripts/css sourced from different
places: same-subdomain, same-domain, cross-domain, CDN.
Of course, right now there will be no difference between those.

The bug you filed considers the same problem from another angle.
The only difference is that the domain heuristic might fix the problem
for most web pages.
So even if Google one day decides to delay all requests to DoubleClick
ads by 4 seconds, the whole web won't hang.

Regarding CDNs, as said before, these can be whitelisted, or left alone.

Currently, domain-internal render-critical scripts, CDN scripts, and
unimportant external scripts are all loaded at the same priority.
Give priority to the domain-internal render-critical scripts: that
covers most of the page content. Even if the CDN scripts are not given
priority, the user experience is _still_ better. In addition, the
web developer no longer needs to be afraid of a CDN script slowing
down his webpage, so he has _more_ incentive to use it.

Maybe a timer should be added, giving low-priority resources a
specific window to complete after the high-priority resources have
loaded: 100-200 msec. If an external resource is served fast enough
(CDN-based, for example), it will have loaded by then. If it's not fast
enough, why make the user wait?
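
A rough sketch of what I mean, with made-up names (nothing like the actual
ResourceLoadScheduler interface): once the high-priority resources are done,
arm a one-shot 100-200 ms timer, and when it fires, anything low-priority
that is still outstanding stops blocking first paint.

// Grace-period sketch; illustrative names only, not WebKit's scheduler.
#include <vector>

struct PendingLoad {
    bool highPriority; // same-domain, render-critical
    bool finished;
    bool deferred;     // no longer allowed to block first paint
};

// Imagined to be called when the ~100-200 ms grace timer fires, shortly
// after the last high-priority resource has finished loading.
void deferSlowLowPriorityLoads(std::vector<PendingLoad>& pendingLoads)
{
    for (size_t i = 0; i < pendingLoads.size(); ++i) {
        PendingLoad& load = pendingLoads[i];
        if (!load.highPriority && !load.finished) {
            // A fast external host (a well-served CDN, say) is already done
            // by now; whatever is still outstanding keeps loading in the
            // background but no longer delays rendering.
            load.deferred = true;
        }
    }
}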

Basically, it's a question of preference: would you like the user to
wait on something that you cannot control?

You mentioned that the parser stops when it encounters a linked
script/stylesheet reference. Can this be overridden?
Maybe move the external-domain scripts/CSS to the end of the HTML? That
could be tested with a browser extension.
I will look into it.

--
 silvio


On Tue, Feb 8, 2011 at 6:48 PM, Tony Gentilcore to...@chromium.org wrote:
 Your test case isn't really about prioritization. The HTML5 spec
 defines very specifically when parsing must stop. The two main cases
 are:
 1. Waiting for an external script to download
 2. Waiting for an external stylesheet to download when any script
 block is reached

 In these cases, the parser does not continue parsing the document to
 discover new subresources to download. However, as an optimization,
 the PreloadScanner speculatively scans the source (which it is not
 allowed to parse yet) for any subresources which should probably be
 downloaded. This way when parsing does continue the resources are
 already available or at least have a head start. So if we aren't able
 to scan ahead and at least discover these resources, prioritization is
 moot.

 Now, assume we have discovered all subresources on the page and could
 prioritize them altogether. I'm still not sure I'd buy your argument
 about resources from another domain being less important. Many sites
 use CDNs on different domains to download resources. Also, many sites
 include their JS libraries from common locations. In either of those
 cases, another domain could be holding the critical blocking resource.
 Perhaps it is worth experimenting with the heuristic you suggest, but
 I certainly don't think we can just assert that is the case.

 On Tue, Feb 8, 2011 at 2:23 AM, Silvio Ventres silvio.vent...@gmail.com 
 wrote:
 This argument - that the web developer is to blame for choosing a slow
 ad/tracking/etc. server - is incorrect.
 Web developers in general do not have any control over the ad provider
 or, frankly, any other type of external functionality provider.
 Google Analytics is a good case in point: you would not want most
 of the world's web pages to suddenly hang if something happens inside
 Google.

 The web browser should clearly prioritize developer-controllable
 resources over ones that are beyond the web developer's control.
 Also, as an application run by the user and not by the developer, the
 browser should arguably prioritize actual content over
 pseudo-content whose purpose is functionality that is not visible to
 the actual user, such as ad/tracker scripts. Based on current trends,
 such actual content is more likely to be important when it is sourced
 from the domain or a subdomain of the web page itself.

 A domain check is a reasonable approximation that fits both purposes.

 --
  silvio


 On Tue, Feb 8, 2011 at 5:13 AM, Jerry Seeger vikin...@mac.com wrote:
 I'm reasonably sure that javascript in the header must be loaded 
 synchronously, as it might affect the rest of the load. This is why tools 
 like YSlow advise Web designers to move javascript loads that are not 
 needed for rendering until after the rest of the page loads.

 Blocking on loading the css is less clear-cut, as in some cases it could 
 mean several seconds of ugly page. I don't know if it's right or wrong, but 
 a lot of pages out there rely on the CSS being loaded before the page 
 starts to render to avoid terrible layout and the appearance of items meant 
 to be hidden for the seconds it takes the css to load.

 In general, while things could certainly be improved, it's

[webkit-dev] Question regarding priorities of subresource content retrieval

2011-02-07 Thread Silvio Ventres
Hello.

Can someone point to where in the source code the subresource loading is
implemented, and to any documentation regarding its implementation - is
it a queue, child threads, or async functions?

The reason is that the current subresource loading seems to lack any
prioritization, and it often happens that some external 1x1-pixel
tracker or other similarly unimportant page resource blocks the
rendering of the page completely, leaving the user staring at
contacting ads.doubleclick.com with a blank page. This is very
frustrating, as the page render as a whole then depends on the slowest
part and cannot possibly be made faster by any hardware or software
optimization on the part of the page owner.

Thus, the proposal is this:
1. Rendering should only wait for the main HTML/CSS to load from the main
page domain (a page in the tumblr.com domain should wait for HTML/CSS
files from *.tumblr.com, but not from *.doubleclick.com).
2. All other content (besides that HTML/CSS) should be prioritized as
follows, with placeholders shown until the load is complete - possibly
adding one or more extra render passes, but increasing interactivity.

So, basic priorities:

10 (highest): HTML/CSS from the main domain (sites.tumblr.com/some_site.html)
9: JS/XHR from the main domain
8: HTML/CSS/JS from subdomains of the same domain (ads.tumblr.com/ad_serve.js)

7: Reserved for future use

6: IMG/media from the main domain (sites.tumblr.com/header.png)
5: IMG/media from subdomains of the same domain (ads.tumblr.com/banner1.png)

4: Optional* HTML/CSS/JS (text) from CDNs
3: Optional* IMG/media from CDNs

2: HTML/CSS/JS from other domains (*.doubleclick.com/link_210986cv3.php?refer=2323424)
1 (lowest): IMG from other domains (*.doubleclick.com/images/track_1x1.gif)

*4 and 3 are optional and would need some kind of whitelist of
well-known CDN domains.

This prioritization will reduce the latency between the page load
start and a usable render, so even if some external-domain subresource
is nonresponsive, interactivity will not suffer.
Maybe the priorities should be moved to a user-controllable setting,
where more fine-grained rules can be defined. Otherwise, maybe the HTML
standard could be extended to provide hints to the browser regarding the
preferred subresource loading order, which should of course be
user-overridable.
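
To make the table above concrete, here is how the mapping could look as
code - the names and the 1-10 scale are mine and purely illustrative, not
WebKit's actual priority types:

// Illustrative mapping of the proposed priority table; not WebKit types.
enum ProposedDomainType { MainDomain, SameDomainSubdomain, KnownCDN, OtherDomain };
enum ProposedResourceKind { MarkupOrCSS, ScriptOrXHR, ImageOrMedia };

int proposedPriority(ProposedDomainType domain, ProposedResourceKind kind)
{
    switch (domain) {
    case MainDomain:
        if (kind == MarkupOrCSS)
            return 10; // highest: render-critical and developer-controlled
        if (kind == ScriptOrXHR)
            return 9;
        return 6;      // images/media from the main domain
    case SameDomainSubdomain:
        return kind == ImageOrMedia ? 5 : 8;
    case KnownCDN:     // optional, whitelist-based
        return kind == ImageOrMedia ? 3 : 4;
    case OtherDomain:  // ads, trackers, anything uncontrolled
        return kind == ImageOrMedia ? 1 : 2;
    }
    return 2;
}

Priority 7 stays unused, matching the "reserved for future use" slot above.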

Thank you for reading.
This might be a big undertaking but the benefit for the user will be
seen instantly.

--
 silvio


Re: [webkit-dev] Question regarding priorities of subresource content retrieval

2011-02-07 Thread Silvio Ventres
The function doesn't seem to get any information regarding the domain the
resource is hosted at.

Calling some kind of setResourceDomainType() to set DOMAIN_TYPE to
enum(0=main domain, 1=subdomain within same domain, 2=CDN, 3=external
domain) and then providing that as an additional parameter to
defaultPriorityForResourceType() seems logical. I'm trying to see where
this domain-sensing function could be called earliest.
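
Something along these lines is what I mean - a rough sketch with made-up
names and values; the real defaultPriorityForResourceType() in
CachedResource.cpp only looks at the resource type today:

// Sketch only: illustrative names, not actual WebCore signatures.
enum ResourceDomainType {
    MainDomainType = 0,          // same host as the page
    SameDomainSubdomainType = 1, // different subdomain of the page's domain
    KnownCDNType = 2,            // optional, whitelist-based
    ExternalDomainType = 3       // everything else: ads, trackers, ...
};

// Hypothetical extra parameter on top of the existing type-based priority:
// the domain type acts as a demotion, so an external resource can never
// outrank a render-critical same-domain one.
int priorityForResource(int typeBasedPriority, ResourceDomainType domainType)
{
    switch (domainType) {
    case MainDomainType:
        return typeBasedPriority;     // keep full priority
    case SameDomainSubdomainType:
        return typeBasedPriority - 1; // slight demotion
    case KnownCDNType:
        return typeBasedPriority - 2; // still ahead of unknown external hosts
    case ExternalDomainType:
        return 1;                     // lowest: never worth blocking on
    }
    return typeBasedPriority;
}

The domain type would be computed once, as soon as both the document URL and
the request URL are known, and carried along with the request from there.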

Regarding the performance test: since it depends on multiple resources
with highly differing latency, it would have to rely on an external
resource. Does the PerformanceTests framework have some kind of latency
simulator?

Thanks for the swift replies, btw!

--
 silvio



On 2/7/11, Nate Chapin jap...@google.com wrote:
 The default prioritization is found here:
 http://trac.webkit.org/browser/trunk/Source/WebCore/loader/cache/CachedResource.cpp#L51.
 There are cases where we override this (e.g., I'm pretty sure we load
 favicons at a lower priority than other images).

 On Mon, Feb 7, 2011 at 11:44 AM, Adam Barth aba...@webkit.org wrote:

 There is already some amount of code that's involved with prioritizing
 subresource loads.  See

 http://trac.webkit.org/browser/trunk/Source/WebCore/loader/ResourceLoadScheduler.h
 and
 http://trac.webkit.org/browser/trunk/Source/WebCore/loader/cache/CachedResourceLoader.h
 .

 I suspect the prioritization algorithm could be improved.  A good
 first step is to create a benchmark illustrating the performance
 issues and then write patches that optimize the benchmark.  Please
 consider putting your performance test in
 http://trac.webkit.org/browser/trunk/PerformanceTests/ so that it's
 easy for others to work on as well.

 Adam


 On Mon, Feb 7, 2011 at 11:23 AM, Silvio Ventres
 silvio.vent...@gmail.com wrote:
  Hello.
 
  Can someone point where in the source code is the implementation of
  the subresources loading and some documentation regarding its
  implementation - as a queue or just child-threads or async functions?
 
  The reason is that the current subresource loading seems to lack any
  prioritization and it often occurs that some external 1x1 pixel
  tracker or other similarly unimportant page resources block the
  rendering of the page completely, and the user is left staring at
  contacting ads.doubleclick.com with a blank page. This is very
  frustrating as the page render as a whole then depends on the slowest
  part and cannot be possibly done faster by any optimizations in
  hardware or software on the part of the page owner.
 
  Thus, the proposition is this:
  1. Render should only wait for the main HTML/CSS to load from the main
  page domain (a page in tumblr.com domain should wait for html/css
  files from *.tumblr.com, but not from *.doubleclick.com).
  2. Other content load except HTML/CSS should be prioritized as
  follows, with placeholders shown until the load is complete - possibly
  adding one or more extra render passes, but increasing interactivity.
 
  So, basic priorities:
 
  10 = Highest:   HTML/CSS from main domain (
 sites.tumblr.com/some_site.html)
  9: JS/XHR from main domain
  8: HTML/CSS/JS from subdomains in the same domain (
 ads.tumblr.com/ad_serve.js)
 
  7. Reserved for future use
 
  6. IMG/media from main domain (sites.tumblr.com/header.png)
  5. IMG/media from subdomains in the same domain (
 ads.tumblr.com/banner1.png)
 
  4. Optional* HTML/CSS/JS (text) from CDNs
  3. Optional* IMG/media from CDNs
 
  2. HTML/CSS/JS from other domains
  (*.doubleclick.com/link_210986cv3.php?refer=2323424)
  1=Lowest. IMG from other domains (*.doubleclick.com/images/track_1x1.gif)
 
  *4 and 3 are optional and would need some kind of a whitelist of
  well-known CDN domains.
 
  This prioritization will reduce the latency between the page load
  start and a usable render, so even if some external-domain subresource
  is nonresponsive, interactivity will not suffer.
  Maybe the priorities should be moved to a user-controllable setting,
  where more fine-grained rules can be defined. Otherwise, maybe HTML
  standard can be extended to provide hints to the browser regarding the
  preferred subresource loading order, which should of course be
  user-overridable.
 
  Thank you for reading.
  This might be a big undertaking but the benefit for the user will be
  seen instantly.
 
  --
   silvio
 




Re: [webkit-dev] Question regarding priorities of subresource content retrieval

2011-02-07 Thread Silvio Ventres
IE/Opera delay for only 4 seconds, same as Mobile Safari.
The reason looks to be the URL of the script/CSS.
If the URL is the same twice, Chrome/Firefox serialize the requests,
while IE/Opera/Mobile Safari launch both requests simultaneously.

Of course, requesting simultaneously doesn't fix anything, as you can
see by trying a link-stuffed version at
http://solid.eqoppa.com/testlag2.html

This one has 45 css and 38 javascript links. It hangs all browsers nicely.
The main point here is that this might be acceptable if the content were
coming from the webpage's own domain.
But the links are coming from a completely different place.

This is exactly what makes browsing pages with any third-party
analytics, tracking or ad add-ons so slow and frustrating.
Fixing priorities in subresource download should make the experience
considerably more interactive and fun.

--
 silvio