On Jul 13, 2009, at 3:01 PM, Aryeh Gregor wrote:
How about you have an extra HTTP header like "X-Content-Hash"?  This
could provide a SHA256 hash (or something else that looks safe for
now, progressively upgradeable) of the content.  The browser can keep
its cached copies of these files indexed by hash.  If it tries
downloading a file, and notices that the hash is the same as a file
already downloaded, it can terminate the HTTP connection and use the
existing file (even if it's from a different site).  It will then
proceed as though it had actually downloaded the file: e.g., it will
respect the Expires headers separately (two sites might serve the same
file but have different expectations about how likely it is to
change).


I think that's brilliant.  It's a cache that works across all sites.
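
A minimal sketch of how a browser-side cache indexed by hash might look, in Python. The class and method names here (HashIndexedCache, store, lookup_by_hash) are made up for illustration, and the header is the hypothetical "X-Content-Hash" from the proposal. Bodies are stored once per hash, while expiry is tracked per URL so each site's Expires header is still respected separately. The sketch assumes the browser hashes downloaded bodies itself instead of trusting the header value when storing.

```python
import hashlib
from dataclasses import dataclass, field


def content_hash(body: bytes) -> str:
    """Hex SHA-256 digest of a response body."""
    return hashlib.sha256(body).hexdigest()


@dataclass
class HashIndexedCache:
    # Each distinct body is stored once, keyed by its content hash.
    bodies: dict = field(default_factory=dict)
    # Freshness is tracked per URL: url -> (hash, expiry timestamp).
    expires: dict = field(default_factory=dict)

    def store(self, url: str, body: bytes, expires_at: float) -> str:
        """Record a fully downloaded body; the hash is computed locally."""
        digest = content_hash(body)
        self.bodies[digest] = body
        self.expires[url] = (digest, expires_at)
        return digest

    def lookup_by_hash(self, url: str, digest: str, expires_at: float):
        """Call once the response headers arrive, before reading the body.

        If the advertised hash is already cached, the caller can drop the
        connection and use the returned body. Returns None on a miss.
        """
        body = self.bodies.get(digest)
        if body is not None:
            # Behave as though this URL had been downloaded normally.
            self.expires[url] = (digest, expires_at)
        return body
```

For example, after store() is called for a copy of jquery.min.js fetched from one site, a lookup_by_hash() for a different site's URL advertising the same hash returns the cached bytes, and that second URL gets its own expiry entry.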

On Jul 13, 2009, at 3:01 PM, Aryeh Gregor wrote:
The most obvious place to solve this seems to be HTTP, not HTML.  HTTP
is closer to the resource itself.  If you do something with HTML, like
an extra <link> attribute, then you're going to get authors updating
the HTML but not the thing it points to or vice versa.  An ETag-like
solution would be implemented either in the web server or whatever
script is serving the content, and those should always know whether
the file has changed.  (Modulo pathological behavior like something
changing the file and then forging the mtime/ctime.)

I agree.
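
On the server side, the point is that the hash is derived from the bytes actually being served, so it cannot drift out of sync the way a hand-maintained HTML attribute could. A rough Python sketch follows; the function name response_headers and the "algorithm=digest" value format are assumptions for illustration (the prefix is one possible way to keep the hash upgradeable), not anything specified in this thread.

```python
import hashlib


def response_headers(body: bytes, algorithm: str = "sha256") -> dict:
    """Build headers for a response, hashing the exact bytes being sent.

    "X-Content-Hash" is the hypothetical header from the proposal; the
    "algorithm=digest" value format is an assumption made for this sketch.
    """
    digest = hashlib.new(algorithm, body).hexdigest()
    return {
        "Content-Length": str(len(body)),
        "X-Content-Hash": f"{algorithm}={digest}",
    }
```

Because the value is recomputed from the body, editing the file changes the header automatically, much as an ETag generated by the web server would.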


On Jul 13, 2009, at 4:20 PM, Aryeh Gregor wrote:
Does anyone have statistics on how useful this would be in real life?
I suspect only marginally.

I think this is the most important aspect of this idea. We don't yet know if this is worth doing.

It would likely only prevent the initial download of some files. However, some of the initial framework sizes are getting hefty. (These are the sizes I just pulled from Google's hosted libraries for the latest versions.)

    91K  dojo.xd.js
    79K  ext-core.js
   182K  jquery-ui.min.js
    56K  jquery.min.js
   127K  prototype.js
   2.6K  scriptaculous.js
    10K  swfobject.js
    27K  yuiloader-min.js

I'm guessing that mobile browsers would benefit from not needing to download a few of those 100KB files and use up an HTTP connection to do so. And speaking of mobile, there are some mobile-specific web application frameworks (I'm thinking of ones for iPhone web apps) that weigh in pretty heavily and don't have a single cache point (like Google).

Real statistics would make it obvious whether or not this is a good idea. But even so, I like the HTTP idea, because at that point it's just a more efficient way to cache files: by content, across the entire web, rather than per site and by name/URL.

- Joe
