On Jul 13, 2009, at 3:01 PM, Aryeh Gregor wrote:
How about you have an extra HTTP header like "X-Content-Hash"? This
could provide a SHA256 hash (or something else that looks safe for
now, progressively upgradeable) of the content. The browser can keep
its cached copies of these files indexed by hash. If it tries
downloading a file, and notices that the hash is the same as a file
already downloaded, it can terminate the HTTP connection and use the
existing file (even if it's from a different site). It will then
proceed as though it had actually downloaded the file: e.g., it will
respect the Expires headers separately (two sites might serve the same
file but have different expectations about how likely it is to
change).
I think that's brilliant. It's a cache that works across all sites.
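To make that concrete, here is a rough sketch of how I picture the browser side. Everything in it is an assumption for illustration: the class name `ContentHashCache`, the `sha256=<hex>` header value format, and the choice to verify the body against the advertised hash before sharing it are mine, not part of the proposal.

```python
import hashlib
from typing import Callable, Dict, Optional


class ContentHashCache:
    """Toy cache keyed by content hash instead of URL (hypothetical)."""

    def __init__(self) -> None:
        self.bodies: Dict[str, bytes] = {}    # hash -> body, shared across sites
        self.expires: Dict[str, float] = {}   # url -> expiry, kept per URL
        self.downloads = 0                    # how many bodies were really fetched

    def fetch(self, url: str, content_hash: Optional[str], expires_at: float,
              download_body: Callable[[], bytes]) -> bytes:
        # Each URL keeps its own expiry, even when the bytes are shared.
        self.expires[url] = expires_at

        # Hash already known: skip the body (stands in for closing the connection).
        if content_hash is not None and content_hash in self.bodies:
            return self.bodies[content_hash]

        body = download_body()
        self.downloads += 1

        # Assumption: only index the body if it really matches the advertised
        # hash, so a wrong header cannot poison the shared cache.
        actual = "sha256=" + hashlib.sha256(body).hexdigest()
        if content_hash == actual:
            self.bodies[actual] = body
        return body


cache = ContentHashCache()
jquery = b"/* pretend this is jquery.min.js */"
digest = "sha256=" + hashlib.sha256(jquery).hexdigest()

# Two different sites serve the same bytes under different URLs.
a = cache.fetch("http://site-a.example/js/jquery.min.js", digest, 1000.0, lambda: jquery)
b = cache.fetch("http://site-b.example/lib/jq.js", digest, 5000.0, lambda: jquery)
print(cache.downloads)  # 1
```

The second fetch never calls its download function, yet each URL still has its own expiry entry, which matches the "respect the Expires headers separately" part of the idea.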
On Jul 13, 2009, at 3:01 PM, Aryeh Gregor wrote:
The most obvious place to solve this seems to be HTTP, not HTML. HTTP
is closer to the resource itself. If you do something with HTML, like
an extra <link> attribute, then you're going to get authors updating
the HTML but not the thing it points to or vice versa. An ETag-like
solution would be implemented either in the web server or whatever
script is serving the content, and those should always know whether
the file has changed. (Modulo pathological behavior like something
changing the file and then forging the mtime/ctime.)
I agree.
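On the server side this could be as small as hashing the bytes that are about to be sent, much like generating an ETag. A minimal sketch, assuming a hypothetical helper `content_hash_headers` and a `sha256=<hex>` value format (neither is specified anywhere; the algorithm prefix is just one way to leave room for upgrading the hash later):

```python
import hashlib
from typing import Dict


def content_hash_headers(body: bytes) -> Dict[str, str]:
    """Build the extra response header from the exact bytes being served."""
    digest = hashlib.sha256(body).hexdigest()
    # Assumed format: algorithm name, "=", hex digest.
    return {"X-Content-Hash": "sha256=" + digest}


script = b"function hello() { return 'hi'; }"
headers = content_hash_headers(script)
print(headers["X-Content-Hash"])
```

Because the value is derived from the content itself, it cannot drift out of sync with the file the way a hand-maintained HTML attribute could.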
On Jul 13, 2009, at 4:20 PM, Aryeh Gregor wrote:
Does anyone have statistics on how useful this would be in real life?
I suspect only marginally.
I think this is the most important aspect of this idea. We don't know
yet whether this is worth doing.
It would likely only prevent the initial download of some files.
However, some of the framework downloads are getting hefty. (These
are the sizes I just pulled from Google's hosted libraries for the
latest versions.)
91K dojo.xd.js
79K ext-core.js
182K jquery-ui.min.js
56K jquery.min.js
127K prototype.js
2.6K scriptaculous.js
10K swfobject.js
27K yuiloader-min.js
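For a sense of scale, the sizes listed above can be added up. This is only a quick sketch: the figures are the rounded sizes from the list, so the total is approximate, and no single page would load every one of these libraries.

```python
# Sizes in KB, copied from the list above.
sizes_kb = {
    "dojo.xd.js": 91,
    "ext-core.js": 79,
    "jquery-ui.min.js": 182,
    "jquery.min.js": 56,
    "prototype.js": 127,
    "scriptaculous.js": 2.6,
    "swfobject.js": 10,
    "yuiloader-min.js": 27,
}

total_kb = round(sum(sizes_kb.values()), 1)
largest = max(sizes_kb, key=sizes_kb.get)
print(total_kb, largest)  # 574.6 jquery-ui.min.js
```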
I'm guessing that mobile browsers would benefit from not needing to
download a few of those 100KB files and tie up an HTTP connection
to do so. And speaking of mobile, there are some mobile-specific web
application frameworks (I'm thinking of ones for iPhone web apps) that
don't have a single cache point (like Google) and that weigh in pretty
heavily.
Real statistics would make it obvious whether or not this is a good
idea. But even so, I like the HTTP idea, because at that point it's
just a more efficient way to cache files: by content, across the
entire web, rather than site-specific and by name/URL.
- Joe