Re: [whatwg] HTML resource packages

2010-08-10 Thread Mike Belshe
On Mon, Aug 9, 2010 at 1:40 PM, Justin Lebar justin.le...@gmail.com wrote:

  Can you provide the content of the page which you used in your
 whitepaper?
  (https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820)

 I'll post this to the bug when I get home tonight.  But your comments
 are astute -- the page I used is a pretty bad benchmark for a variety
 of reasons.  It sounds like you probably could hack up a much better
 one.

 a) Looks like pages were loaded exactly once, as per your notes?  How
  hard is it to run the tests long enough to get to a 95% confidence
 interval?

 Since I was running on a simulated network with no random parameters
 (e.g. no packet loss), there was very little variance in load time
 across runs.


I suspect you are right.  Still, it's good due diligence - especially for a
whitepaper :-)  The good news is that if it really is consistent, then it
should be easy...
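On the confidence-interval point: once repeated load-time samples exist, the interval itself is one line of arithmetic. A minimal Python sketch using the normal approximation (purely illustrative, not from the whitepaper's harness):

```python
import math
import statistics

def ci95(load_times_ms):
    """95% confidence interval for the mean page-load time, using the
    normal approximation (reasonable once you have ~30+ runs).
    Illustrative benchmarking helper only."""
    mean = statistics.mean(load_times_ms)
    stderr = statistics.stdev(load_times_ms) / math.sqrt(len(load_times_ms))
    return (mean - 1.96 * stderr, mean + 1.96 * stderr)
```

If the simulated network really is deterministic, the interval collapses to the mean; any real variance widens it.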




 d) What did you do about subdomains in the test?  I assume your test
  loaded from one subdomain?

 That's correct.

  I'm betting time-to-paint goes through the roof with resource bundles:-)

 It does right now because we don't support incremental extraction,
 which is why I didn't bother measuring time-to-paint.  The hope is
 that with incremental extraction, we won't take too much of a hit.


Well, here is the crux then.

What should browsers optimize for?  Should we take performance features
which optimize for PLT or time-to-first-paint or something else?  I have
spent a *ton* of time trying to answer this question (as have many others),
and this is just a tough one to answer.

For now, I believe the Chrome/WebKit teams are in agreement that sacrificing
time-to-first-render to decrease PLT is a bad idea.  I'm not sure what the
Firefox philosophy here is.

One thing we can do to better evaluate features is to simply always measure
both metrics.  If both metrics get better, then it is a clear win.  But
without recording both metrics, we just don't really know how to evaluate if
a feature is good or bad.

Sorry to send you through more work - I am not trying to nix your feature
:-(  I think it is great you are taking the time to study all of this.

Mike

 -Justin

 On Mon, Aug 9, 2010 at 1:30 PM, Mike Belshe m...@belshe.com wrote:
  Justin -
  Can you provide the content of the page which you used in your
 whitepaper?
  (https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820)
  I have a few concerns about the benchmark:
 a) Looks like pages were loaded exactly once, as per your notes?  How
  hard is it to run the tests long enough to get to a 95% confidence
 interval?
 b) As you note in the report, slow start will kill you.  I've verified
  this so many times it makes me sick.  If you try more combinations, I
  believe you'll see this.
 c) The 1.3MB of subresources in a single bundle seems unrealistic to
 me.
  On one hand you say that it's similar to CNN, but note that CNN has
  JS/CSS/images, not just thumbnails like your test.  Further, note that
 CNN
  pulls these resources from multiple domains; combining them into one
 domain
  may work, but certainly makes the test content very different from CNN.
  So
  the claim that it is somehow representative seems incorrect.   For more
  accurate data on what websites look like,
  see http://code.google.com/speed/articles/web-metrics.html
 d) What did you do about subdomains in the test?  I assume your test
  loaded from one subdomain?
 e) There is more to a browser than page-load-time.
  Time-to-first-paint
  is critical as well.  For instance, in WebKit and Chrome, we have
 specific
  heuristics which optimize for time-to-render instead of total page load.
   CNN is always cited as a bad page, but it's really not - it just has a
  lot of content, both below and above the fold.  When the user can
 interact
  with the page successfully, the user is happy.  In other words, I know I
 can
 make WebKit's PLT much faster by removing a couple of throttles.  But I
 also
  know that doing so worsens the user experience by delaying the time to
 first
  paint.  So - is it possible to measure both times?  I'm betting
  time-to-paint goes through the roof with resource bundles:-)
  If you provide the content, I'll try to run some tests.  It will take a
 few
  days.
  Mike
 
  On Mon, Aug 9, 2010 at 9:52 AM, Justin Lebar justin.le...@gmail.com
 wrote:
 
  On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor simetrical+...@gmail.com
  wrote:
   If UAs can assume that files with the same path
   are the same regardless of whether they came from a resource package
   or which, and they have all but a couple of the files cached, they
   could request those directly instead of from the resource package,
   even if a resource package is specified.
 
  These kinds of heuristics are far beyond the scope of resource
  packages as we're planning to implement them.  Again, I think this
  type of behavior is the 

Re: [whatwg] HTML resource packages

2010-08-10 Thread Boris Zbarsky

On 8/10/10 2:40 PM, Mike Belshe wrote:

For now, I believe the Chrome/WebKit teams are in agreement that
sacrificing time-to-first-render to decrease PLT is a bad idea.  I'm not
sure what the Firefox philosophy here is.


Fairly similar (though we have had people complain at us when we do in
fact incrementally load a page for 20s that WebKit just throws up on the
screen all at once after sitting there with a blank viewport for 7s, for
what it's worth).


-Boris


Re: [whatwg] HTML resource packages

2010-08-09 Thread Justin Lebar
On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor simetrical+...@gmail.com wrote:
 If UAs can assume that files with the same path
 are the same regardless of whether they came from a resource package
 or which, and they have all but a couple of the files cached, they
 could request those directly instead of from the resource package,
 even if a resource package is specified.

These kinds of heuristics are far beyond the scope of resource
packages as we're planning to implement them.  Again, I think this
type of behavior is the domain of a large change to the networking
stack, such as SPDY, not a small hack like resource packages.

-Justin

On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor simetrical+...@gmail.com wrote:
 On Fri, Aug 6, 2010 at 7:40 PM, Justin Lebar justin.le...@gmail.com wrote:
 I think this is a fair point.  But I'd suggest we consider the following:

 * It might be confusing for resources from a resource package to show
 up on a page which doesn't opt-in to resource packages in general or
 to that specific resource package.

 Only if the resource package contains a different file from the real
 one.  I suggest we treat this as a pathological case and accept that
 it will be broken and confusing -- or at least we consider how many
 extra optimizations we could make if we did accept that, before
 deciding whether the extra performance is worth the confusion.

 * There's no easy way to opt out of this behavior.  That is, if I
 explicitly *don't* want to load content cached from a resource
 package, I have to name that content differently.

 Why would you want that, if the files are the same anyway?

 * The avatars-on-a-forum use case is less convincing the more I think
 about it.  Certainly you'd want each page which displays many avatars
 to package up all the avatars into a single package.  So you wouldn't
 benefit from the suggested caching changes on those pages.

 I don't see why not.  If UAs can assume that files with the same path
 are the same regardless of whether they came from a resource package
 or which, and they have all but a couple of the files cached, they
 could request those directly instead of from the resource package,
 even if a resource package is specified.  So if twenty different
 people post on the page, and you've been browsing for a while and have
 eighteen of their avatars (this will be common, a handful of people
 tend to account for most posts in a given forum):

 1) With no resource packages, you fetch two separate avatars (but on
 earlier page views you suffered).

 2) With resource packages as you suggest, you fetch a whole resource
 package, 90% of which you don't need.  In fact, you have to fetch a
 resource package even if you have 100% of the avatars on the page!  No
 two pages will be likely to have the same resource package, so you
 can't share cache at all.

 3) With resource packages as I suggest, you fetch only two separate
 avatars, *and* you got the benefits of resource packages on earlier
 pages.  The UA gets to guess whether using resource packages would be
 a win on a case-by-case basis, so in particular, it should be able to
 perform strictly better than either (1) or (2), given decent
 heuristics.  E.g., the heuristic "fetch the resource package if I need
 at least two files, fetch the file if I only need one" will perform
 better than either (1) or (2) in any reasonable circumstance.

 I think this sort of situation will be fairly common.  Has anyone
 looked at a bunch of different types of web pages and done a breakdown
 of how many assets they have, and how they're reused across pages?  If
 we're talking about assets that are used only on one page (image
 search) or all pages (logos, shared scripts), your approach works
 fine, but not if they're used on a random mix of pages.  I think a lot
 of files will wind up being used on only particular subsets of pages.

 In general, I think we need something like SPDY to really address the
 problem of duplicated downloads.  I don't think resource packages can
 fix it with any caching policy.

 Certainly there are limits to what resource packages can do, but we
 can wind up closer to the limits or farther from them depending on the
 implementation details.
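The case-by-case heuristic Aryeh describes ("fetch the package if at least two members are uncached, else fetch files individually") can be sketched in a few lines. A hypothetical UA-side sketch, not part of the resource-packages spec; the name `package.zip` is a stand-in:

```python
def plan_fetches(page_resources, cached, package_members):
    """Hypothetical UA heuristic from the thread: fetch the whole
    resource package only when at least two of its members are missing
    from the cache; otherwise fetch the stragglers individually."""
    missing = [r for r in page_resources if r not in cached]
    missing_in_pkg = [r for r in missing if r in package_members]
    if len(missing_in_pkg) >= 2:
        # One package request replaces all the in-package misses.
        return ["package.zip"] + [r for r in missing
                                  if r not in package_members]
    return missing  # 0 or 1 in-package misses: individual requests win
```

E.g., with twenty avatars on the page and nineteen cached, only the single missing avatar is requested; with none cached, one package request covers them all.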



Re: [whatwg] HTML resource packages

2010-08-09 Thread Justin Lebar
 Can you provide the content of the page which you used in your whitepaper?
 (https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820)

I'll post this to the bug when I get home tonight.  But your comments
are astute -- the page I used is a pretty bad benchmark for a variety
of reasons.  It sounds like you probably could hack up a much better
one.

a) Looks like pages were loaded exactly once, as per your notes?  How
 hard is it to run the tests long enough to get to a 95% confidence interval?

Since I was running on a simulated network with no random parameters
(e.g. no packet loss), there was very little variance in load time
across runs.

d) What did you do about subdomains in the test?  I assume your test
 loaded from one subdomain?

That's correct.

 I'm betting time-to-paint goes through the roof with resource bundles:-)

It does right now because we don't support incremental extraction,
which is why I didn't bother measuring time-to-paint.  The hope is
that with incremental extraction, we won't take too much of a hit.

-Justin

On Mon, Aug 9, 2010 at 1:30 PM, Mike Belshe m...@belshe.com wrote:
 Justin -
 Can you provide the content of the page which you used in your whitepaper?
 (https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820)
 I have a few concerns about the benchmark:
    a) Looks like pages were loaded exactly once, as per your notes?  How
 hard is it to run the tests long enough to get to a 95% confidence interval?
    b) As you note in the report, slow start will kill you.  I've verified
 this so many times it makes me sick.  If you try more combinations, I
 believe you'll see this.
    c) The 1.3MB of subresources in a single bundle seems unrealistic to me.
  On one hand you say that it's similar to CNN, but note that CNN has
 JS/CSS/images, not just thumbnails like your test.  Further, note that CNN
 pulls these resources from multiple domains; combining them into one domain
 may work, but certainly makes the test content very different from CNN.  So
 the claim that it is somehow representative seems incorrect.   For more
 accurate data on what websites look like,
 see http://code.google.com/speed/articles/web-metrics.html
    d) What did you do about subdomains in the test?  I assume your test
 loaded from one subdomain?
    e) There is more to a browser than page-load-time.  Time-to-first-paint
 is critical as well.  For instance, in WebKit and Chrome, we have specific
 heuristics which optimize for time-to-render instead of total page load.
  CNN is always cited as a bad page, but it's really not - it just has a
 lot of content, both below and above the fold.  When the user can interact
 with the page successfully, the user is happy.  In other words, I know I can
 make WebKit's PLT much faster by removing a couple of throttles.  But I also
 know that doing so worsens the user experience by delaying the time to first
 paint.  So - is it possible to measure both times?  I'm betting
 time-to-paint goes through the roof with resource bundles:-)
 If you provide the content, I'll try to run some tests.  It will take a few
 days.
 Mike

 On Mon, Aug 9, 2010 at 9:52 AM, Justin Lebar justin.le...@gmail.com wrote:

 On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor simetrical+...@gmail.com
 wrote:
  If UAs can assume that files with the same path
  are the same regardless of whether they came from a resource package
  or which, and they have all but a couple of the files cached, they
  could request those directly instead of from the resource package,
  even if a resource package is specified.

 These kinds of heuristics are far beyond the scope of resource
 packages as we're planning to implement them.  Again, I think this
 type of behavior is the domain of a large change to the networking
 stack, such as SPDY, not a small hack like resource packages.

 -Justin

 On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor simetrical+...@gmail.com
 wrote:
  On Fri, Aug 6, 2010 at 7:40 PM, Justin Lebar justin.le...@gmail.com
  wrote:
  I think this is a fair point.  But I'd suggest we consider the
  following:
 
  * It might be confusing for resources from a resource package to show
  up on a page which doesn't opt-in to resource packages in general or
  to that specific resource package.
 
  Only if the resource package contains a different file from the real
  one.  I suggest we treat this as a pathological case and accept that
  it will be broken and confusing -- or at least we consider how many
  extra optimizations we could make if we did accept that, before
  deciding whether the extra performance is worth the confusion.
 
  * There's no easy way to opt out of this behavior.  That is, if I
  explicitly *don't* want to load content cached from a resource
  package, I have to name that content differently.
 
  Why would you want that, if the files are the same anyway?
 
  * The avatars-on-a-forum use case is less convincing the more I think
  about it.  Certainly you'd want each page 

Re: [whatwg] HTML resource packages

2010-08-09 Thread Boris Zbarsky

On 8/9/10 4:30 PM, Mike Belshe wrote:

CNN is always cited as a bad page, but it's really not - it just has a lot of 
content, both below and above the
fold.


It's a bad page because 1) It sends hundreds of kilobytes of content for 
no obvious reason whatsoever; most of it is unused and 2) it sends said 
content with no gzip compression.


-Boris


Re: [whatwg] HTML resource packages

2010-08-09 Thread Justin Lebar
The files I used for the rough benchmarks are available in a tarball
at [1].  Live pages are at [2] and [3].

[1] http://people.mozilla.org/~jlebar/respkg/test/benchmark_files.tgz
[2] http://people.mozilla.org/~jlebar/respkg/test/test-pkg.html
[3] http://people.mozilla.org/~jlebar/respkg/test/test-nopkg.html

-Justin

On Mon, Aug 9, 2010 at 1:40 PM, Justin Lebar justin.le...@gmail.com wrote:
 Can you provide the content of the page which you used in your whitepaper?
 (https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820)

 I'll post this to the bug when I get home tonight.  But your comments
 are astute -- the page I used is a pretty bad benchmark for a variety
 of reasons.  It sounds like you probably could hack up a much better
 one.

    a) Looks like pages were loaded exactly once, as per your notes?  How
 hard is it to run the tests long enough to get to a 95% confidence interval?

 Since I was running on a simulated network with no random parameters
 (e.g. no packet loss), there was very little variance in load time
 across runs.

    d) What did you do about subdomains in the test?  I assume your test
 loaded from one subdomain?

 That's correct.

 I'm betting time-to-paint goes through the roof with resource bundles:-)

 It does right now because we don't support incremental extraction,
 which is why I didn't bother measuring time-to-paint.  The hope is
 that with incremental extraction, we won't take too much of a hit.

 -Justin

 On Mon, Aug 9, 2010 at 1:30 PM, Mike Belshe m...@belshe.com wrote:
 Justin -
 Can you provide the content of the page which you used in your whitepaper?
 (https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820)
 I have a few concerns about the benchmark:
    a) Looks like pages were loaded exactly once, as per your notes?  How
 hard is it to run the tests long enough to get to a 95% confidence interval?
    b) As you note in the report, slow start will kill you.  I've verified
 this so many times it makes me sick.  If you try more combinations, I
 believe you'll see this.
    c) The 1.3MB of subresources in a single bundle seems unrealistic to me.
  On one hand you say that it's similar to CNN, but note that CNN has
 JS/CSS/images, not just thumbnails like your test.  Further, note that CNN
 pulls these resources from multiple domains; combining them into one domain
 may work, but certainly makes the test content very different from CNN.  So
 the claim that it is somehow representative seems incorrect.   For more
 accurate data on what websites look like,
 see http://code.google.com/speed/articles/web-metrics.html
    d) What did you do about subdomains in the test?  I assume your test
 loaded from one subdomain?
    e) There is more to a browser than page-load-time.  Time-to-first-paint
 is critical as well.  For instance, in WebKit and Chrome, we have specific
 heuristics which optimize for time-to-render instead of total page load.
  CNN is always cited as a bad page, but it's really not - it just has a
 lot of content, both below and above the fold.  When the user can interact
 with the page successfully, the user is happy.  In other words, I know I can
 make WebKit's PLT much faster by removing a couple of throttles.  But I also
 know that doing so worsens the user experience by delaying the time to first
 paint.  So - is it possible to measure both times?  I'm betting
 time-to-paint goes through the roof with resource bundles:-)
 If you provide the content, I'll try to run some tests.  It will take a few
 days.
 Mike

 On Mon, Aug 9, 2010 at 9:52 AM, Justin Lebar justin.le...@gmail.com wrote:

 On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor simetrical+...@gmail.com
 wrote:
  If UAs can assume that files with the same path
  are the same regardless of whether they came from a resource package
  or which, and they have all but a couple of the files cached, they
  could request those directly instead of from the resource package,
  even if a resource package is specified.

 These kinds of heuristics are far beyond the scope of resource
 packages as we're planning to implement them.  Again, I think this
 type of behavior is the domain of a large change to the networking
 stack, such as SPDY, not a small hack like resource packages.

 -Justin

 On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor simetrical+...@gmail.com
 wrote:
  On Fri, Aug 6, 2010 at 7:40 PM, Justin Lebar justin.le...@gmail.com
  wrote:
  I think this is a fair point.  But I'd suggest we consider the
  following:
 
  * It might be confusing for resources from a resource package to show
  up on a page which doesn't opt-in to resource packages in general or
  to that specific resource package.
 
  Only if the resource package contains a different file from the real
  one.  I suggest we treat this as a pathological case and accept that
  it will be broken and confusing -- or at least we consider how many
  extra optimizations we could make if we did accept that, before
  deciding whether the 

Re: [whatwg] HTML resource packages

2010-08-06 Thread Christoph Päper
Justin Lebar:
 Christoph Päper christoph.pae...@crissov.de wrote:
 
 Why do you want to put this on the HTML level (exclusively), not the HTTP 
 level?
 
 If you reference an image from a CSS file and include that CSS file in an 
 HTML file which uses resource packages, the image can be loaded from the 
 resource package.

Yeah, it’s still wrong. 

Resource packages in HTML seem okay for the image gallery use case (and then 
could be done with ‘link’), but they’re commonly inappropriate for anything 
referenced from ‘link’, ‘script’ and ‘style’ elements. Your remark on loading 
order just proves this point: you want resource packages referenced before 
‘head’. You should move one step further than the root element, i.e. to the 
transport layer.

Re: [whatwg] HTML resource packages

2010-08-06 Thread Justin Lebar
On Fri, Aug 6, 2010 at 12:46 AM, Christoph Päper
christoph.pae...@crissov.de wrote:
 Justin Lebar:
 Christoph Päper christoph.pae...@crissov.de wrote:

 Why do you want to put this on the HTML level (exclusively), not the HTTP 
 level?

 If you reference an image from a CSS file and include that CSS file in an 
 HTML file which uses resource packages, the image can be loaded from the 
 resource package.

 Yeah, it’s still wrong.

 Resource packages in HTML seem okay for the image gallery use case (and then 
 could be done with ‘link’), but they’re commonly inappropriate for anything 
 referenced from ‘link’, ‘script’ and ‘style’ elements. Your remark on loading 
 order just proves this point: you want resource packages referenced before 
 ‘head’. You should move one step further than the root element, i.e. to the 
 transport layer.

We want resource packages to work for people who don't have the
ability to set custom headers for their pages or who don't even know
what an HTTP header is.  I agree that it's a hack, but I don't
understand how putting the packages information in the html element
makes it inappropriate to load from a resource package resources
referenced in link, script, and style elements.

Is the issue just that the HTML file's |packages| attribute affects
what we load when we see @import url() in a separate CSS file?  This
seems like a feature, not a bug, to me.

SPDY will do this the Right Way, if we're patient.

-Justin


Re: [whatwg] HTML resource packages

2010-08-06 Thread Tab Atkins Jr.
On Fri, Aug 6, 2010 at 12:46 AM, Christoph Päper
christoph.pae...@crissov.de wrote:
 Justin Lebar:
 Christoph Päper christoph.pae...@crissov.de wrote:

 Why do you want to put this on the HTML level (exclusively), not the HTTP 
 level?

 If you reference an image from a CSS file and include that CSS file in an 
 HTML file which uses resource packages, the image can be loaded from the 
 resource package.

 Yeah, it’s still wrong.

 Resource packages in HTML seem okay for the image gallery use case (and then 
 could be done with ‘link’), but they’re commonly inappropriate for anything 
 referenced from ‘link’, ‘script’ and ‘style’ elements. Your remark on loading 
 order just proves this point: you want resource packages referenced before 
 ‘head’. You should move one step further than the root element, i.e. to the 
 transport layer.

This doesn't seem to make sense.  If you want resource packages
referenced before 'head', then the nearest appropriate location is
still 'html'.  Moving it up to the transport layer isn't *wrong*, but
it's not *necessary* in this case.

~TJ


Re: [whatwg] HTML resource packages

2010-08-06 Thread Aryeh Gregor
On Tue, Aug 3, 2010 at 8:31 PM, Justin Lebar justin.le...@gmail.com wrote:
 We at Mozilla are hoping to ship HTML resource packages in Firefox 4,
 and we wanted to get the WhatWG's feedback on the feature.

 For the impatient, the spec is here:

    http://people.mozilla.org/~jlebar/respkg/

I have some concerns about caching behavior here, which I've mentioned
before.  Consider a site that has a landing page that has lots of
first-time viewers.  To accelerate that page view, you might want to
add a resource package containing all the assets on the page, to speed
up views in the cold cache case.  Some of those assets will be reused
on other pages, and some will not.

When the user navigates to another page, what's supposed to happen?
If you hadn't used resource packages at all, they would have a hot
cache, so they'd get all the shared assets on every subsequent page
view for free.  But now they don't -- instead of the first view being
slow, it's the second view, when they leave the landing page.  This
isn't a big improvement.

So if resource packages don't share caches, you need to either give up
on caching, or put a given file only in one resource package on your
whole site.  The latter is not practical if pages use small, fairly
random subsets of your assets and it's not feasible to package them
all on every page view.  Think avatars on a web forum: you might have
20 different avatars displayed per page, from a pool of tens of
thousands or more.  Do you have to decide between not using resource
packages and not getting any caching?

You've said before that your goal in this requirement is
predictability -- if there's an inconsistency between different
resource packages or between a resource package and the real file, you
don't want users to get different results depending on what order they
visit the pages in.  This is fair enough, but I'm worried that the
caching problems this approach causes will make it more of a hindrance
than a benefit for a wide class of use-cases.  There's some possible
inconsistency anyway whenever caching is permitted at all, because if
the page provides incorrect caching headers, the UA might have an
out-of-date copy.  Also, different browsers will be inconsistent too,
until all UAs in common use have implemented resource packages -- some
will use the packaged file and some the real file.  Is the extra
inconsistency from letting the caches mix really too much to ask for
the cacheability benefits?  I don't think so.


Re: [whatwg] HTML resource packages

2010-08-06 Thread Justin Lebar
 So if resource packages don't share caches, you need to either give up
 on caching, [or] put a given file only in one resource package on your
 whole site.  The latter is not practical if pages use small, fairly
 random subsets of your assets and it's not feasible to package them
 all on every page view.  Think avatars on a web forum

I think this is a fair point.  But I'd suggest we consider the following:

* It might be confusing for resources from a resource package to show
up on a page which doesn't opt-in to resource packages in general or
to that specific resource package.

* There's no easy way to opt out of this behavior.  That is, if I
explicitly *don't* want to load content cached from a resource
package, I have to name that content differently.

* The avatars-on-a-forum use case is less convincing the more I think
about it.  Certainly you'd want each page which displays many avatars
to package up all the avatars into a single package.  So you wouldn't
benefit from the suggested caching changes on those pages.

You might benefit on a user profile page which just displays one
avatar.  You might try and be clever and leave the avatar out of the
profile page's resource package on the assumption that the UA already
has that avatar in its cache.  But then your page would load slower
for users who visited the profile page without first getting the
avatar from another resource package.

Maybe you'd benefit from the suggested changes if you'd half-deployed
resource packages on your site, so some pages had packages and others
didn't.  But I don't think that's a use case we should design for.

In general, I think we need something like SPDY to really address the
problem of duplicated downloads.  I don't think resource packages can
fix it with any caching policy.

-Justin

On Fri, Aug 6, 2010 at 2:17 PM, Aryeh Gregor simetrical+...@gmail.com wrote:
 On Tue, Aug 3, 2010 at 8:31 PM, Justin Lebar justin.le...@gmail.com wrote:
 We at Mozilla are hoping to ship HTML resource packages in Firefox 4,
 and we wanted to get the WhatWG's feedback on the feature.

 For the impatient, the spec is here:

    http://people.mozilla.org/~jlebar/respkg/

 I have some concerns about caching behavior here, which I've mentioned
 before.  Consider a site that has a landing page that has lots of
 first-time viewers.  To accelerate that page view, you might want to
 add a resource package containing all the assets on the page, to speed
 up views in the cold cache case.  Some of those assets will be reused
 on other pages, and some will not.

 When the user navigates to another page, what's supposed to happen?
 If you hadn't used resource packages at all, they would have a hot
 cache, so they'd get all the shared assets on every subsequent page
 view for free.  But now they don't -- instead of the first view being
 slow, it's the second view, when they leave the landing page.  This
 isn't a big improvement.

 So if resource packages don't share caches, you need to either give up
 on caching, or put a given file only in one resource package on your
 whole site.  The latter is not practical if pages use small, fairly
 random subsets of your assets and it's not feasible to package them
 all on every page view.  Think avatars on a web forum: you might have
 20 different avatars displayed per page, from a pool of tens of
 thousands or more.  Do you have to decide between not using resource
 packages and not getting any caching?

 You've said before that your goal in this requirement is
 predictability -- if there's an inconsistency between different
 resource packages or between a resource package and the real file, you
 don't want users to get different results depending on what order they
 visit the pages in.  This is fair enough, but I'm worried that the
 caching problems this approach causes will make it more of a hindrance
 than a benefit for a wide class of use-cases.  There's some possible
 inconsistency anyway whenever caching is permitted at all, because if
 the page provides incorrect caching headers, the UA might have an
 out-of-date copy.  Also, different browsers will be inconsistent too,
 until all UAs in common use have implemented resource packages -- some
 will use the packaged file and some the real file.  Is the extra
 inconsistency from letting the caches mix really too much to ask for
 the cacheability benefits?  I don't think so.



Re: [whatwg] HTML resource packages

2010-08-04 Thread Christoph Päper
Justin Lebar:

 We at Mozilla are hoping to ship HTML resource packages in Firefox 4,
http://people.mozilla.org/~jlebar/respkg/
| html packages='[pkg1.zip img1.png script.js styles/style.css]
| [static/pkg2.zip]'
 A page indicates in its html element that it uses one or more resource 
 packages (…).

Why do you want to put this on the HTML level (exclusively), not the HTTP level?
As far as I understand it, authors would usually put stylesheets, scripts and 
decorative images, but not HTML files into a resource package. These are 
usually common to several pages or the entire site or domain. Images might be 
referenced from within HTML or CSS files.

Why did you decide against <link rel=resource-package 
href=pkg1.zip#files='img1.png,…'/> or something like that? (The hash part is 
just guesswork.)

* Argument: What about incremental rendering? 
If there are, for instance, lots of (content) images in the resource file I 
will see them all at once as soon as the ZIP has been downloaded completely and 
decompressed, but with single files I would have seen them appear one after the 
other, which might have been enough.

Re: [whatwg] HTML resource packages

2010-08-04 Thread James May
On 4 August 2010 20:08, Christoph Päper christoph.pae...@crissov.de wrote:
 * Argument: What about incremental rendering?
 If there are, for instance, lots of (content) images in the resource file I 
 will see them all at once as soon as the ZIP has been downloaded completely 
 and decompressed, but with single files I would have seen them appear one 
 after the other, which might have been enough.

ZIP files are progressively renderable, dependent on file order.
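A minimal sketch (in Python, not anything from the proposal or Mozilla's implementation) of why file order matters: each zip entry is preceded by its own local file header, so a receiver can extract entries one by one as the bytes arrive, provided the archive avoids tricks like data descriptors:

```python
import io, struct, zipfile, zlib

def iter_zip_entries(stream):
    """Yield (name, bytes) for each entry as it is read sequentially.

    Sketch only: assumes entries avoid data descriptors (flag bit 3) and
    use stored or deflate compression, which a package built for
    streaming would arrange.
    """
    while True:
        if stream.read(4) != b"PK\x03\x04":      # next local file header?
            break                                # no: central directory reached
        (_ver, _flags, method, _t, _d, _crc,
         csize, _usize, nlen, xlen) = struct.unpack("<HHHHHIIIHH",
                                                    stream.read(26))
        name = stream.read(nlen).decode("cp437")
        stream.read(xlen)                        # skip extra field
        raw = stream.read(csize)
        yield name, (zlib.decompress(raw, -15) if method == 8 else raw)

# Build a small package with the stdlib, then "stream" it back out:
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
    z.writestr("img1.png", b"fake png bytes")
    z.writestr("style.css", b"body { color: red }")
buf.seek(0)
entries = list(iter_zip_entries(buf))            # each usable as it completes
```

Each yielded entry is usable before the rest of the archive has arrived, so putting the most important resources first in the zip gets them rendered first.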


Re: [whatwg] HTML resource packages

2010-08-04 Thread Diego Perini
On Wed, Aug 4, 2010 at 12:11 PM, James May wha...@fowlsmurf.net wrote:

 On 4 August 2010 20:08, Christoph Päper christoph.pae...@crissov.de
 wrote:
  * Argument: What about incremental rendering?
  If there are, for instance, lots of (content) images in the resource file
 I will see them all at once as soon as the ZIP has been downloaded
 completely and decompressed, but with single files I would have seen them
 appear one after the other, which might have been enough.

 ZIP files are progressively renderable, dependent on file order.


In my experience, gzip compression blocks browser rendering until the
compressed file has been received completely.

I believe this is the reason we should not compress the HTML source, just
its external binary components.

I don't think the browser can separately decompress each block of a chunked
transfer as it arrives; am I wrong?


Diego Perini


Re: [whatwg] HTML resource packages

2010-08-04 Thread Philip Taylor
On Wed, Aug 4, 2010 at 1:31 AM, Justin Lebar justin.le...@gmail.com wrote:
 We at Mozilla are hoping to ship HTML resource packages in Firefox 4,
 and we wanted to get the WhatWG's feedback on the feature.

 For the impatient, the spec is here:

    http://people.mozilla.org/~jlebar/respkg/

It seems a bit surprising that [pkg.zip img1.png img2.png] provides
more files than [pkg.zip img1.png] but *fewer* files than [pkg.zip]
(which includes all files). I can imagine people would write code
like:

  print "<html packages='[cached-image-thumbnails.zip " . (join " ",
@thumbnails_which_are_not_out_of_date) . "]'>";

(intending the package to be updated infrequently, and used only for
images that haven't been modified since the last package update), and
they would get completely the wrong behaviour when the list is empty.
So maybe "[pkg.zip]" should mean "no files" (vs "pkg.zip", which still
means "all files").


Filenames in zips are byte-strings, not Unicode-character-strings.
What should happen with non-ASCII in the zip's list of contents?
People will use standard zip programs and frequently end up with
various random character encodings in their files: would browsers
guess, decode as CP437, decode as UTF-8, or fail? Would they look
at the zip header's language encoding flag? etc.
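For a concrete illustration (CPython's zipfile module, not anything the proposal specifies): the stdlib implements exactly the fallback being asked about, decoding a name as UTF-8 only when the entry's language encoding flag (bit 11 of the general purpose flags) is set, and as CP437 otherwise:

```python
import io, zipfile

# Writing a non-ASCII name makes zipfile store it as UTF-8 and set the
# language encoding flag; readers without that rule would misdecode it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("naïve.css", "body { }")

with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as z:
    info = z.infolist()[0]
    utf8_flag = bool(info.flag_bits & 0x800)  # bit 11, set on write
```

Archives produced by older zip tools typically leave the flag unset even for non-CP437 encodings, which is where the guessing problem comes from.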


What happens if the document contains multiple <html> elements (not
all the root element)? (e.g. if it's XHTML, or the elements are added
by scripts.) The packages spec seems to assume there is only ever one.


The note at the end of 4.1 seems to be about avoiding problems like
http://evil.com/ saying:

<html packages="eviloverride.zip"> <!-- gets downloaded from evil.com -->
<base href="http://bank.com/">
<img src="http://bank.com/logo.png"> <!-- this shouldn't be
allowed to come from the .zip -->

Why is this particular example an important problem? If the attacker
wants to insert their own files into their own pages, they can just do
it directly without using packages. Since this is (I assume) only used
for resources like images and scripts and stylesheets, and not for a
hrefs or iframe hrefs, I don't see how it would let the attacker
circumvent any same-origin restrictions or do anything else dangerous.

The opposite way seems more dangerous, where evil.com says:

<html
packages="http://evil.com/redirect.cgi?http://secret-bank-intranet-server/packages.zip">
<img src="http://evil.com/logo.png">
<!-- now use canvas to read the pixel data of the secret logo,
since it was loaded from the evil.com origin -->

Is anything stopping that?

In 4.3 step 2: What is pkg-url initialised to? (The package href of p?)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] HTML resource packages

2010-08-04 Thread Kornel Lesiński
On 4 Aug 2010, at 11:46, Diego Perini wrote:

  * Argument: What about incremental rendering?
  If there are, for instance, lots of (content) images in the resource file I 
  will see them all at once as soon as the ZIP has been downloaded completely 
  and decompressed, but with single files I would have seen them appear one 
  after the other, which might have been enough.
 
  ZIP files are progressively renderable, dependent on file order.
 
 In my experience gzip compression is blocking browser rendering until the 
 compressed file has been received completely.
 
 I believe this is the reason we should not compress the HTML source, just its 
 external binary components.
 
 I don't think the browser can separately decompress each block of a chunked 
 transfer as it arrives, am I wrong ?

You are wrong. gzip compression is streamable, and browsers can uncompress parts 
of a gzipped file as it is downloaded. gzip only needs to buffer some data before 
returning an uncompressed chunk, but it's only a few KB. Chunks of gzipped data 
don't have to align with chunked HTTP encoding (those are independent layers).
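Kornel's claim is easy to check with Python's zlib (an illustration, not something from the thread): a gzip stream can be fed to a decompressor in arbitrarily sized chunks, and decompressed output comes back incrementally, long before the end of the stream:

```python
import gzip, io, zlib

compressed = gzip.compress(b"x" * 100_000)
d = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)  # 16+MAX_WBITS: gzip wrapper

out = bytearray()
stream = io.BytesIO(compressed)
while chunk := stream.read(1024):    # simulate chunks arriving off the network
    out += d.decompress(chunk)       # partial output per chunk, no full buffer
out += d.flush()
```

The chunk size here is arbitrary; it deliberately does not line up with any internal deflate block boundary, mirroring the point that HTTP chunking and gzip framing are independent layers.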

-- 
regards, Kornel



Re: [whatwg] HTML resource packages

2010-08-04 Thread timeless
People should probably consider reading the Web Apps Widgets working
group archives (they're public) about widget packaging.

There are long discussions about zip and gzip, etc.

http://www.w3.org/TR/widgets/#zip-archive

Especially http://www.w3.org/TR/widgets/#character-sets covers character sets.

As for zip streaming / gzip streaming...
Officially, zip technically has ways to construct archives which are
painful. In practice I don't think that's a real problem (beyond that
user agents would need to ensure they fail any packages which abuse
those features).
People tend to come late to the game and ask "why didn't you use
gzip?". The general short answer is that gzip doesn't provide a file
container format at all, and browsers tend to already support zip. So
the cost of using zip is negligible, whereas adding something else
which is messy (e.g. tar, star, pax) is painful. And if you think that
tar is well specified, I have a bridge I'd like to sell you.


Re: [whatwg] HTML resource packages

2010-08-04 Thread Justin Lebar
 Brett Zamir bret...@yahoo.com wrote:
 1) I think it would be nice to see explicit confirmation in the spec that 
 this works with offline caching.

Yes.  I'll do that.

 2) Could data files such as .txt, .json, or .xml files be used as part of
 such a package as well?

 3) Can XMLHttpRequest be made to reference such files and get them from the
 cache, and if so, when referencing only a zip in the packages attribute, can
 XMLHttpRequest access files in the zip not spelled out by a tag like link/?
 I think this would be quite powerful/avoid duplication, even if it adds
 functionality (like other HTML5 features) which would not be available to
 older browsers.

This is tricky.  The problem is: if you have an <img> on a page which might be
served from a resource package, we'll block the download of the
image until we can either serve the request from a resource package or be sure
that no package contains the image.

I can imagine this behavior being confusing with XMLHttpRequests.  On the other
hand, it could certainly be powerful when used correctly.

I think the natural thing is to go ahead and treat things requested by an
XMLHttpRequest the same as anything else on a page and retrieve them from
packages where possible.  If you really don't want your XMLHttpRequest to block on
a resource package, you can always use a POST.  But I need to investigate more
to determine whether this makes sense.

 4) Could such a protocol also be made to accommodate profiles of packages,
 e.g., by a namespace being allowable somewhere for each package?

This sounds way outside the scope of what we're trying to do with resource
packages.  I'm all for designing for the future, but I don't think we want to
introduce the complexity even of these namespaces unless we intend to use them
immediately.

 Maciej Stachowiak m...@apple.com wrote:

 Have you done any performance testing of this feature, and if so can you 
 share any of that data?

There's a document (PDF) with some rough performance numbers in the bug:

https://bugzilla.mozilla.org/attachment.cgi?id=455820

Although the results are preliminary, I think doing much more than this on a
simulated network for a test page might be going a bit overboard.  Results from
real pages over real networks would be much more meaningful at this point.

 Separately, I am curious to hear how http headers are handled; it's a TODO in
 the spec, and what the TODO says seems poor for the Content-Type header in
 particular. It would make it hard to use package resources in any context
 that looks at the MIME type rather than always sniffing. Any thoughts on
 this?

The intent is for UAs to sniff the content-type of anything coming from a
resource package, so I think that TODO needs to be turned on its head: The UA
shouldn't apply any of the response headers from the resource package to its
elements.

 Christoph Päper christoph.pae...@crissov.de wrote:
 A page indicates in its <html> element that it uses one or more resource 
 packages (…).

 Why do you want to put this on the HTML level (exclusively), not the HTTP 
 level?
 ...
 Images might be referenced from within HTML or CSS files.

If you reference an image from a CSS file and include that CSS file in an HTML
file which uses resource packages, the image can be loaded from the resource
package.

 Why did you decide against <link rel="resource-package"
 href="pkg1.zip#files='img1.png,…'"/> or something like that? (The hash part
 is just guesswork.)

We actually originally spec'ed resource packages with the <link> tag, but we
encountered some difficulties with this.  For example, it led to confusing
behavior when a resource package was defined after a <link rel='javascript'>.
Do we load the script from the network, or do we wait until we've received the
whole <head> before loading any scripts?

Resource packages as a link also interacted poorly with Mozilla's speculative
parsing algorithm, which tries to download resources before we run the page's
scripts.  We probably could have come up with semantics which didn't run into
problems with our own speculative parsing implementation, but we realized it
would be difficult to spec it in such a way that we didn't make things very
difficult for *someone*.

 * Argument: What about incremental rendering?

The spec (and our implementation in Firefox) cares deeply about incremental
rendering.  Although the zip format isn't strictly suitable for incremental
extraction, I defined alternate semantics in the spec which should work.

Zip is better than tar-gz for this kind of thing for two reasons:

 * Zip file headers are uncompressed, so you don't have to extract the whole
   file in order to tell what's inside.

 * Entries in a zip file are individually compressed.  Although this might
   cause you to compress less effectively, you can compress all your files
   ahead of time and construct a zip file on the fly pretty cheaply.
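The second point can be sketched in Python (hypothetical helper names, not part of any spec): deflate each file once, then assemble a package per request by writing fresh headers around the precompressed bytes. The sketch hard-codes zeroed timestamps, ASCII names, no extra fields, and the deflate method:

```python
import io, struct, zipfile, zlib

def precompress(data):
    """Deflate once, ahead of time; the result is reusable across many zips."""
    c = zlib.compressobj(9, zlib.DEFLATED, -15)  # raw deflate, as zip entries use
    return c.compress(data) + c.flush(), zlib.crc32(data), len(data)

def build_zip(entries):
    """Assemble a zip from (name, precompress(data)) pairs without recompressing."""
    out, central, offset = bytearray(), bytearray(), 0
    for name, (deflated, crc, usize) in entries:
        n = name.encode("ascii")
        # Local file header: version 2.0, no flags, method 8 (deflate).
        local = struct.pack("<IHHHHHIIIHH", 0x04034B50, 20, 0, 8, 0, 0,
                            crc, len(deflated), usize, len(n), 0)
        # Matching central directory record, pointing back at the local header.
        central += struct.pack("<IHHHHHHIIIHHHHHII", 0x02014B50, 20, 20, 0, 8,
                               0, 0, crc, len(deflated), usize, len(n),
                               0, 0, 0, 0, 0, offset) + n
        out += local + n + deflated
        offset += len(local) + len(n) + len(deflated)
    # End-of-central-directory record closes the archive.
    out += central + struct.pack("<IHHHHIIH", 0x06054B50, 0, 0, len(entries),
                                 len(entries), len(central), offset, 0)
    return bytes(out)

# Per-request assembly: only headers are written; compression already happened.
pkg = build_zip([("img1.png", precompress(b"fake png bytes")),
                 ("style.css", precompress(b"body { color: red }"))])
```

The per-request cost is just concatenation plus a few dozen header bytes per entry, which is what makes on-the-fly packages for things like image search results plausible.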

 Philip Taylor excors+wha...@gmail.com wrote:
 It seems a bit surprising that 

Re: [whatwg] HTML resource packages

2010-08-04 Thread Philip Taylor
On Wed, Aug 4, 2010 at 9:01 PM, Justin Lebar justin.le...@gmail.com wrote:
 What happens if the document contains multiple html elements (not
 all the root element)? (e.g. if it's XHTML, or the elements are added
 by scripts). The packages spec seems to assume there is only ever one.

 The packages attribute should work like the manifest attribute currently 
 works.
 I don't see language in the cache manifest section of HTML5 (6.6) specifying
 what happens when there are multiple <html> elements, so I hope I don't need to
 specify this either.  :)

http://whatwg.org/html#attr-html-manifest says:

  The manifest attribute only has an effect during the early stages
of document load. Changing the attribute dynamically thus has no
effect (and thus, no DOM API is provided for this attribute).

Its effect is triggered from http://whatwg.org/html#parser-appcache
(html token in the before html insertion mode) or from
http://whatwg.org/html#read-xml , so it will only ever run for the
root html element of the document.

The packages attribute is defined as running "Whenever the packages
attribute is changed (including when the document is first loaded, if
its html element has a packages attribute)", so it's not the same.
If you do want it to work the same then you'll need to hook into the
parser and ignore dynamic updates.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] HTML resource packages

2010-08-04 Thread Justin Lebar
 If you do want it to work the same then you'll need to hook into the
 parser and ignore dynamic updates.

Indeed.  And since I explicitly *do* want dynamic updates, it'll need to change.

Thanks.

On Wed, Aug 4, 2010 at 1:32 PM, Philip Taylor excors+wha...@gmail.com wrote:
 On Wed, Aug 4, 2010 at 9:01 PM, Justin Lebar justin.le...@gmail.com wrote:
 What happens if the document contains multiple html elements (not
 all the root element)? (e.g. if it's XHTML, or the elements are added
 by scripts). The packages spec seems to assume there is only ever one.

 The packages attribute should work like the manifest attribute currently 
 works.
 I don't see language in the cache manifest section of HTML5 (6.6) specifying
 what happens when there are multiple <html> elements, so I hope I don't need to
 specify this either.  :)

 http://whatwg.org/html#attr-html-manifest says:

  The manifest attribute only has an effect during the early stages
 of document load. Changing the attribute dynamically thus has no
 effect (and thus, no DOM API is provided for this attribute).

 Its effect is triggered from http://whatwg.org/html#parser-appcache
 (html token in the before html insertion mode) or from
 http://whatwg.org/html#read-xml , so it will only ever run for the
 root html element of the document.

 The packages attribute is defined as running Whenever the packages
 attribute is changed (including when the document is first loaded, if
 its html element has a packages attribute), so it's not the same.
 If you do want it to work the same then you'll need to hook into the
 parser and ignore dynamic updates.

 --
 Philip Taylor
 exc...@gmail.com



Re: [whatwg] HTML resource packages

2010-08-04 Thread Diego Perini
2010/8/4 Kornel Lesiński kor...@geekhood.net

 On 4 Aug 2010, at 11:46, Diego Perini wrote:

   * Argument: What about incremental rendering?
   If there are, for instance, lots of (content) images in the resource
 file I will see them all at once as soon as the ZIP has been downloaded
 completely and decompressed, but with single files I would have seen them
 appear one after the other, which might have been enough.
 
  ZIP files are progressively renderable, dependant on file order.
 
  In my experience gzip compression is blocking browser rendering until
 the compressed file has been received completely.
 
  I believe this is the reason we should not compress the HTML source, just
 its external binary components.
 
  I don't think the browser can separately decompress each block of a
 chunked transfer as it arrives, am I wrong ?

 You are wrong. gzip compression is streamable and browsers can uncompress
 parts of gzipped file as it is downloaded. gzip only needs to buffer some
 data before returning uncompressed chunk, but it's only few KB. Chunks of
 gzipped data don't have to align with chunked HTTP encoding (those are
 independent layers).


Thank you for the information and for correcting my statements. I just tried
it, and chunked transfer and gzip compression can definitely happen at the
same time.

The problem is that I see a strange effect on my pages if I enable Apache's
SetOutputFilter DEFLATE: the pages' progress and rendering are different.

It works well with PHP's zlib.output_compression, with more or less no visible
changes from the uncompressed case. I will have to dig into what makes the
difference.

Diego Perini



 --
 regards, Kornel




[whatwg] HTML resource packages

2010-08-03 Thread Justin Lebar
We at Mozilla are hoping to ship HTML resource packages in Firefox 4,
and we wanted to get the WhatWG's feedback on the feature.

For the impatient, the spec is here:

http://people.mozilla.org/~jlebar/respkg/

and the bug (complete with builds you can try and some preliminary
performance numbers) is here:

https://bugzilla.mozilla.org/show_bug.cgi?id=529208


You can think of resource packages as image spriting 2.0.  A page
indicates in its <html> element that it uses one or more resource
packages (which are just zip files).  Then when that page requests a
resource (be it an image, a css file, a script, or whatever), the
browser first checks whether one of the packages contains the
requested resource.  If so, the browser uses the resource out of the
package instead of making a separate HTTP request for the resource.

There's more detail than that, of course.  Hopefully it's
(mostly) clear in the spec.

I envision two classes of users of resource packages.  I'll call the
first resource-constrained developers.  These developers care about
how fast their page is (who doesn't?), but can't spend weeks speeding
up their page.  For these developers, resource packages are an easy
way to make their pages faster without going through the pain of
spriting their images and packaging their js/css.

The other class of users are the resource-unconstrained developers;
think Google or Facebook.  These developers have already put a huge
amount of effort into making their pages fast, and a naive application
of resource packages is unlikely to make them any faster.  But these
developers may be able to use resource packages cleverly to gain
speedups.  In particular, nobody (to my knowledge) currently sprites
content images, such as the results of an image search.  A determined
set of developers should be able to construct resource packages for
image search results on the fly and save some HTTP requests.


So we can avoid rehashing here the common objections to resource
packages, here's a brief overview of the arguments I've heard against
the feature and my responses.

* Argument: Packaging isn't the way forward.  When you change one
resource in a package you have to change the whole package and so the
user has to re-download all the bits when most of what was in their
cache would have been fine.

This is of course correct, but we don't think it eliminates the
utility of resource packages.  The resource-constrained developer is
probably happy with anything which speeds up page loads, even if it's
not optimal when one part of the page changes.  And the
resource-unconstrained developer probably won't find resource packages
too useful for non-dynamic content, so caching isn't an issue in that
case.

* Argument: We can already package things pretty well.  Mozilla should
instead be focusing on improving caching (or something else).

I'd contend that we don't package particularly well in general.  The
Facebook homepage loads 100 separate resources on a cold cache, and
they certainly care about speed.  But anyway, this is just one
project.  We're also looking at caching.  :)

* Argument: Isn't this subsumed by HTTP pipelining?

Mostly.  But we can't turn on HTTP pipelining because transparent
proxies break it.

Resource packages have the further benefit that they allow page
authors to explicitly set the order in which the UA will download the
resources -- with pipelining, an important resource might get stuck
behind a large, unimportant resource, while with resource packages,
the UA always downloads resources in the order they appear in the zip
file.

Last, my understanding is that the HTTP pipeline isn't particularly
deep, so perhaps resource packages fill the TCP pipe better on
high-latency connections.  I haven't looked into this, though.

* Argument: What about SPDY?

I think SPDY should subsume resource packages.  But its deployment
will require changes to both web clients and servers, so it will
probably take a while after it's released before it's available on all
web servers.  And we have no idea when to expect SPDY to be ready for
production.  Resource packages, in contrast, are something we can have
Right Now.

Additionally, since resource packages are backwards-compatible -- a
page which specifies resource packages should display just fine in a
browser which doesn't support them -- we should be able to turn off
resource packages in the future if we decide we don't want them
anymore.


We'd love to hear what you think of the specification and our implementation.

-Justin


Re: [whatwg] HTML resource packages

2010-08-03 Thread Tab Atkins Jr.
On Tue, Aug 3, 2010 at 5:31 PM, Justin Lebar justin.le...@gmail.com wrote:
 We at Mozilla are hoping to ship HTML resource packages in Firefox 4,
 and we wanted to get the WhatWG's feedback on the feature.

 For the impatient, the spec is here:

    http://people.mozilla.org/~jlebar/respkg/

 and the bug (complete with builds you can try and some preliminary
 performance numbers) is here:

    https://bugzilla.mozilla.org/show_bug.cgi?id=529208
[snip]
 We'd love to hear what you think of the specification and our implementation.

I love it!  You guys seem to have hit all the big points while
sidestepping the obvious problems.

Resource packages as Sprites 2.0 makes me very, very happy (and also
more confident about removing any attempt at a spriting solution from
CSS - it's the wrong layer).

~TJ


Re: [whatwg] HTML resource packages

2010-08-03 Thread Brett Zamir

 This is and was a great idea. A few points/questions:

1) I think it would be nice to see explicit confirmation in the spec 
that this works with offline caching.


2) Could data files such as .txt, .json, or .xml files be used as part 
of such a package as well?


3) Can XMLHttpRequest be made to reference such files and get them from 
the cache, and if so, when referencing only a zip in the packages 
attribute, can XMLHttpRequest access files in the zip not spelled out by 
a tag like link/? I think this would be quite powerful/avoid 
duplication, even if it adds functionality (like other HTML5 features) 
which would not be available to older browsers.


4) Could such a protocol also be made to accommodate profiles of 
packages, e.g., by a namespace being allowable somewhere for each package?


Thus, if a package is specified as say being under the XProc (XML 
Pipelining) namespace profile, the browser would know it could 
confidently look for a manifest file with a given name and act 
accordingly if the profile were eventually formalized through future 
specifications or implemented by general purpose scripting libraries or 
browser extensions, etc.


Another example would be if a file packaging format were referenced by a 
page, allowing, along with a set of files, a manifest format like METS 
to be specified and downloaded, describing a sitemap for a package of 
files (perhaps to be added immediately to the user's IndexedDB database, 
navigated Gopher-like, etc.) and then made navigable online or offline 
if the files were included in the zip, thus allowing a single HTTP 
request to download a whole site (e.g., if a site offered a collection 
of books).


And manifest files might be made to specify which files should be 
updated at a specific time independently of the package (e.g., checking 
periodically for an updated manifest file outside of a zip which could 
point to newer versions).


Note: the above is not asking browsers to implement any such additional 
complex functionality here and now; rather, it is just to allow for the 
possibility of automated discovery of package files having a particular 
structure (e.g., with specifically named manifest files to indicate how 
to interpret the package contents) by providing a programmatically 
accessible namespace for each package which could be unique per 
application and interpreted in particular ways, including by general 
purpose JavaScript libraries. This is not talking about adding 
namespaces to HTML itself, but rather for specifying package profiles.


Such extensibility would, as far as I can see it, allow for some very 
powerful declarative styles of programming in relation to handling of 
multiple files (whether resource files, data files, or complete pages), 
while piggybacking on the proposal's ability to minimize the HTTP 
requests needed to get them.


best wishes,
Brett


On 8/4/2010 8:31 AM, Justin Lebar wrote:

We at Mozilla are hoping to ship HTML resource packages in Firefox 4,
and we wanted to get the WhatWG's feedback on the feature.

For the impatient, the spec is here:

 http://people.mozilla.org/~jlebar/respkg/

and the bug (complete with builds you can try and some preliminary
performance numbers) is here:

 https://bugzilla.mozilla.org/show_bug.cgi?id=529208


You can think of resource packages as image spriting 2.0.  A page
indicates in its <html> element that it uses one or more resource
packages (which are just zip files).  Then when that page requests a
resource (be it an image, a css file, a script, or whatever), the
browser first checks whether one of the packages contains the
requested resource.  If so, the browser uses the resource out of the
package instead of making a separate HTTP request for the resource.

There's more detail than that, of course.  Hopefully it's
(mostly) clear in the spec.

I envision two classes of users of resource packages.  I'll call the
first resource-constrained developers.  These developers care about
how fast their page is (who doesn't?), but can't spend weeks speeding
up their page.  For these developers, resource packages are an easy
way to make their pages faster without going through the pain of
spriting their images and packaging their js/css.

The other class of users are the resource-unconstrained developers;
think Google or Facebook.  These developers have already put a huge
amount of effort into making their pages fast, and a naive application
of resource packages is unlikely to make them any faster.  But these
developers may be able to use resource packages cleverly to gain
speedups.  In particular, nobody (to my knowledge) currently sprites
content images, such as the results of an image search.  A determined
set of developers should be able to construct resource packages for
image search results on the fly and save some HTTP requests.


So we can avoid rehashing here the common objections to resource
packages, here's a brief overview of the arguments I've 

Re: [whatwg] HTML resource packages

2010-08-03 Thread Maciej Stachowiak

On Aug 3, 2010, at 5:31 PM, Justin Lebar wrote:

 We at Mozilla are hoping to ship HTML resource packages in Firefox 4,
 and we wanted to get the WhatWG's feedback on the feature.
 
 For the impatient, the spec is here:
 
http://people.mozilla.org/~jlebar/respkg/
 
 and the bug (complete with builds you can try and some preliminary
 performance numbers) is here:
 
https://bugzilla.mozilla.org/show_bug.cgi?id=529208

Have you done any performance testing of this feature, and if so can you share 
any of that data?

I'm particularly interested in:

* Effect of using a resource package on page-load time, in the initial fully 
uncached case.
* Effect of using a resource package on page-load time, in the case where the 
resources in the package have expired but have not changed.
* Effect of using a resource package on page-load time, in the case where the 
resources in the package have expired and a subset of them have changed. (This 
could still be a win for packages.)
* Effect of using a resource package on page-load time, in the case where 
everything in the package is cached.

These are probably most interesting under high-latency network conditions (real 
or simulated). You address these points qualitatively in your comments but I'd 
love to see some numbers. That would make it easier to evaluate the performance 
tradeoffs.


Separately, I am curious to hear how http headers are handled; it's a TODO in 
the spec, and what the TODO says seems poor for the Content-Type header in 
particular. It would make it hard to use package resources in any context that 
looks at the MIME type rather than always sniffing. Any thoughts on this?


In general I am in favor of features that can improve page load times and which 
are 


Cheers,
Maciej