Re: [whatwg] AppCache-related e-mails
On Tue, Aug 2, 2011 at 5:23 PM, Michael Nordman micha...@google.com wrote:

On Mon, 13 Jun 2011, Michael Nordman wrote:

Let's say there's a page in the cache to be used as a fallback resource that refers to the manifest by relative URL...

    <html manifest='x'>

Depending on the URL that invokes the fallback resource, 'x' will be resolved to different absolute URLs. When it doesn't match the actual manifest URL, the fallback resource will get tagged as FOREIGN and will no longer be used to satisfy main resource loads.

I'm not sure if this is a bug in Chrome or a bug in the appcache spec just yet. I'm pretty certain that Safari will have the same behavior as Chrome in this respect (the same bug). The value of the manifest attribute is interpreted as relative to the location of the loaded document in Chrome and all WebKit-based browsers, and that value is used to detect foreignness.

The workaround/solution for this is to NOT put a manifest attribute in the html tag of the fallback resource (or to put either an absolute URL or a host-relative URL as the manifest attribute value).

Or just make sure you always use relative URLs, even in the manifest. I don't really understand the problem here. Can you elaborate further?

Suppose the fallback resource is set up like this...

    FALLBACK:
    / FallbackPage.html

... and that page contains a relative link to the manifest in its html tag like so...

    <html manifest=file.manifest>

Any server request that fails under / will get FallbackPage.html in response. For example...

    /SomePage.html

When the fallback is used in this case, the manifest URL will be interpreted as /file.manifest.

    /Some/Other/Page.html

And in this case the manifest URL will be interpreted as /Some/Other/file.manifest.

On Fri, 1 Jul 2011, Michael Nordman wrote:

Cross-origin resources listed in the CACHE section aren't retrieved with the 'Origin' header

This is incorrect. They are fetched with the origin of the manifest. What makes you say no Origin header is included?

I don't see mention of that in the draft? If that were the case then this wouldn't be an issue.

I'm not familiar with CORS usage. Do cross-origin subresource loads of all kinds (.js, .css, .png) carry the Origin header? I can imagine a server implementation that would examine the Origin header up front and, if it didn't like what it saw, instead of computing the response without the origin listed in the Access-Control-Allow-Origin response header... it just wouldn't compute the response body and would return an empty response without the origin listed in the Access-Control-Allow-Origin response header. If general subresource loads aren't sent with the Origin header, fetching all manifest-listed resources with that header set could cause problems.

According to some documentation over at mozilla'land, the value of the Origin header is different depending on the source of the request.

https://wiki.mozilla.org/Security/Origin#When_Origin_is_served_.28and_when_it_is_.22null.22.29

So I think including Origin: manifestUrlOrigin when fetching all resources to populate an appcache could be the source of some subtle bugs.
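The FallbackPage.html example in this message can be reproduced with ordinary relative-URL resolution. The following is only a sketch: it uses Python's `urllib.parse.urljoin` as a stand-in for the browser's resolver, and the host `example.com` and the helper name `resolve_manifest` are assumptions made for illustration.

```python
from urllib.parse import urljoin


def resolve_manifest(document_url: str, manifest_attr: str) -> str:
    """Resolve a manifest attribute value against the URL of the loaded document."""
    return urljoin(document_url, manifest_attr)


ORIGIN = "http://example.com"  # hypothetical host, not from the thread
ACTUAL_MANIFEST = ORIGIN + "/file.manifest"

# The same fallback page is served for both failing URLs, but the relative
# attribute value resolves differently depending on which URL was requested.
for failed_path in ("/SomePage.html", "/Some/Other/Page.html"):
    resolved = resolve_manifest(ORIGIN + failed_path, "file.manifest")
    verdict = "matches" if resolved == ACTUAL_MANIFEST else "would be tagged FOREIGN"
    print(failed_path, "->", resolved, "(" + verdict + ")")

# A host-relative value resolves to the same manifest from any path.
print(resolve_manifest(ORIGIN + "/Some/Other/Page.html", "/file.manifest"))
```

With a host-relative or absolute attribute value the resolved URL no longer depends on the failing request's path, which is the workaround described above.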
Re: [whatwg] AppCache-related e-mails
On Tue, Aug 2, 2011 at 4:55 PM, Ian Hickson i...@hixie.ch wrote:

On Tue, 2 Aug 2011, Michael Nordman wrote:

If you actively want to seek out old manifests, sure, but what's the use case for doing that? It would be like trying to actively evict things from HTTP caches.

You should talk to some app developers. View source on Angry Birds for a use case; they are doing this to get rid of stale versions tied to old manifest URLs.

But why? I couldn't figure out the use case from the source you mention.

This is a message I recently received from a different developer using the appcache who would also like to see more in the way of being able to manage the set of appcaches in the system. Please see the use cases listed towards the end.

Hi Michael, Greg.

I'm writing to advise you of a requirement I'd like to see appcache fulfill in the medium term. We've spoken about it before, but only in the general context of 'what would you like to see in the future'. No releases are gated on this feature, so I guess we're talking M15 or thereabouts. Feel free to cross-post this to a list you deem relevant for wider review and discussion.

The feature is a JavaScript API to enable the creation, enumeration, update, and deletion of appcaches on the current origin. Calls might look something like this:

    /** Creates a new cache or updates an existing one with the given manifest URL.
        Manifest URL must be in the same origin as the JS */
    createOrUpdateCache(String manifestUri, completionCallback, errorCallback);

    /** Enumerates the caches present on the current origin */
    enumerateCaches(CacheEnumerationCallback callback, ErrorCallback errorCallback);

    interface CacheEnumerationCallback {
      void handleEvent(Cache[] caches);
    }

    interface Cache {
      String getManifestUri();
      number getSizeInBytes();
      String getManifestAsText();
      String[] getMasterEntryUris();
      String[] getExplicitEntryUris();
      FallbackEntry[] getFallbackEntries();
      String[] getNetworkWhitelistUris();
      boolean isNetworkWhitelistOpen();
      DateTime getCreationTime();
      DateTime getLastManifestFetchTime(); // The last time the manifest was fetched and checked for differences
      DateTime getLastUpdateTime(); // The last time a manifest fetch caused an actual update
      DateTime getLastAccessTime(); // The last time the cache actually bound to a browsing context
      // Maybe some APIs to signal whether the cache is currently being updated, and
      // whether there is currently a running browsing context bound to it.
      void delete(... some callbacks ...); // Probably fails if there's a running browsing context bound to the cache
      void update(... some callbacks ...); // I guess a no-op if an update is currently in progress or maybe even if it happened very recently
    }

    interface FallbackEntry {
      String getTriggerUri();
      String getTargetUri();
    }

Additional characteristics:

* Must be usable from pages not themselves bound to an appcache, as long as they are served from the same origin as the caches being operated on.
* Must work from workers, shared workers, and background pages, again subject to a same-origin check.

The above is a very rough sketch, and needs a bunch of work, but illustrates the features we'd find useful. An obvious flaw is that it doesn't fit in with the system of progress events etc. on the current API, but there are probably many others. View it mainly as a list of requirements.
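To make the intent of the enumeration and timestamp fields more concrete, here is a hedged sketch of how a maintenance pass might use them. It is not the proposed API: `CacheInfo` and `plan_maintenance` are hypothetical names, and the model keeps only the manifest URI and last manifest fetch time from the interface sketched above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CacheInfo:
    """Hypothetical record mirroring two fields of the sketched Cache interface."""
    manifest_uri: str
    last_manifest_fetch_time: datetime


def plan_maintenance(existing, needed_uris, now, min_interval):
    """Return (to_create, to_update, to_delete) lists of manifest URIs."""
    existing_by_uri = {cache.manifest_uri: cache for cache in existing}
    existing_uris = set(existing_by_uri)
    needed = set(needed_uris)

    to_create = sorted(needed - existing_uris)   # required but not yet cached
    to_delete = sorted(existing_uris - needed)   # cached but no longer required
    # Only re-check manifests that have not been fetched recently.
    to_update = sorted(
        uri for uri in needed & existing_uris
        if now - existing_by_uri[uri].last_manifest_fetch_time >= min_interval
    )
    return to_create, to_update, to_delete


now = datetime(2011, 8, 2, 12, 0)
existing = [
    CacheInfo("/editor-v1.manifest", now - timedelta(hours=5)),
    CacheInfo("/editor-v2.manifest", now - timedelta(minutes=10)),
    CacheInfo("/old.manifest", now - timedelta(days=30)),
]
needed = {"/editor-v1.manifest", "/editor-v2.manifest", "/viewer.manifest"}
print(plan_maintenance(existing, needed, now, timedelta(hours=1)))
```

The manifest names and the one-hour interval are made up; the point is only that enumeration plus timestamps would let an application compute create, update, and delete sets without the hidden iframes mentioned elsewhere in the thread.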
Our use cases are as follows:

* Docs maintains a set of appcaches which it uses for various purposes. Each editor, for example, has a cache. There are also cases where different documents require different versions of the same editor.
* The set of caches required on a particular browser depends on the documents synced there. A given set of documents will require a particular (much smaller) set of caches to open. The set of caches required on a given browser is therefore dynamic, changing as documents enter and leave the set of those synchronized.
* Each time anybody opens a docs property, and perhaps during the lifetimes of some of them, we perform a procedure called 'appcache maintenance', which ensures that the caches necessary for the current set of documents are synced. This is a fairly nasty process involving many iframes, but it works. We would like, however, to make this code much simpler, not have it involve the iframes, and make the process of piping progress events back to the host application less awful. Right now it's such a pain we're not bothering with it.
* We'd like to perform appcache maintenance on existing caches less often, reducing server load. The timestamps included above would allow us to do that.
* When an appcache is no longer needed by the current set of documents, it is currently just left there. We would like to be able to clean it up.
* We would like to be able to perform our appcache maintenance procedure from a shared worker, as we have one that can bring new documents into storage. Right now that is
Re: [whatwg] AppCache-related e-mails
On the subject of diagnostics for appcache:

On Wed, 8 Jun 2011, Patrick Mueller wrote:
On Wed, Jun 8, 2011 at 15:21, Ian Hickson i...@hixie.ch wrote:
On Tue, 1 Feb 2011, Patrick Mueller wrote:

I just tested Chrome beta this morning and saw nothing interesting in appcache error events; however, progress events have now grown loaded and total properties (I think those were the names, and I think they're new-ish). That's nice, as I can provide a progress meter during cache load/reload. I wouldn't mind having the URL of the resource being loaded (that was just loaded?) as well as those numbers. And for errors it would be nice to know, in the case of an error caused by a cache manifest entry 404'ing (or otherwise being unavailable), what URL it was, the HTTP error code if appropriate, etc.

In theory, we don't want to expose this information because it can be used to introspect intranets.

I never considered that introspect-intranets angle. I guess the thought is that a rogue site could send a manifest with pointers to files inside someone's intranet, and then get someone inside that intranet to load that manifest, at which point JavaScript could have access to which URLs returned 200's, etc. Nasty.

Right.

Is this just an issue if the manifest or originating document's origin is different from that of a file listed in the manifest itself? Perhaps errors on these entries would have less diagnostic data available for them, perhaps no diagnostic data. That would kind of fit with other cross-origin access capabilities.

That might work.

What kind of information would be most useful? Should it be in the same format from every browser or should it be detailed and freeform?

Start with URL, because we know a URL was involved. Then allow for an optional vendor-specific freeform message.

Vendor-specific messages end up being parsed by scripts, and shortly after that we end up having to hard-code those messages as the spec. So I'd rather not add a freeform message! What is the URL for? Can you describe the way this information would be used in a user interface or however it would be used? I'm just trying to make sure we address the actual problems that need addressing.

Regarding TLS and cross-origin requests:

On Thu, 16 Jun 2011, Michael Nordman wrote:
On Tue, 8 Feb 2011, Michael Nordman wrote:

Just had an offline discussion about this and I think the answer can be much simpler than what's been proposed so far. All we have to do for cross-origin HTTPS resources is respect the cache-control no-store header.

Let me explain the rationale... first let's back up to the motivation for the restrictions on HTTPS. They're there to defeat attacks that involve physical access to the client system, so the attacker cannot look at the cross-origin HTTPS data stored in the appcache on disk. But the regular disk cache stores HTTPS data provided the cache-control header doesn't say no-store, so excluding this data from appcaching does nothing to defeat that attack.

Maybe the spec changes to make are...

1) Examine the cache-control header for all cross-origin resources (not just HTTPS), and only allow them if they don't contain the no-store directive.
2) Remove the special-case restriction that is currently in place only for HTTPS cross-origin resources.

On Wed, 30 Mar 2011, Michael Nordman wrote:

Fyi: This change has been made in Chrome.

* respect no-store headers for cross-origin resources (only for HTTPS)
* allow HTTPS cross-origin resources to be listed in a manifest hosted on HTTPS

This seems reasonable. Done.

I had proposed respecting the no-store directive only for cross-origin resources. The current draft is examining the no-store directive for all resources without regard for their origin. The intent behind the proposed change was to allow authors to continue to override the no-store header for resources in their origin, and to disallow that override only for cross-origin resources. The proposed change is less likely to break existing apps, and I think there are valid use cases for the existing behavior where no-store can be overridden by explicit inclusion in an appcache.

I guess we can restrict no-store to cross-origin HTTPS resources, but it seems far easier to explain that no-store in general is honoured. Otherwise you end up with these weird situations where some resources can be cached and some can't, and the only reason one can or can't be stored is where the manifest is, but only if it has no-store, etc... It gets rather confusing. Also, what use cases are there for specifying no-store that don't apply across all resources?

On the topic of appcache being used to cache everything but the main page:

On Wed, 29 Jun 2011, Felix Halim wrote:
On Thu, Jun 9, 2011 at 3:21 AM, Ian Hickson i...@hixie.ch wrote: If
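The three no-store policies discussed in the TLS and cross-origin part of this message differ only in which resources the directive applies to. The sketch below contrasts them. It is an illustration, not spec text: the policy names, the helper names, and the example hosts are assumptions, and origin comparison is simplified (default ports are not normalized).

```python
from urllib.parse import urlsplit


def origin_of(url: str):
    """Simplified origin tuple (scheme, host, port); default ports are not normalized."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def has_no_store(cache_control: str) -> bool:
    """True if the Cache-Control header value contains the no-store directive."""
    return "no-store" in [d.strip().lower() for d in cache_control.split(",")]


def may_store(resource_url: str, manifest_url: str, cache_control: str, policy: str) -> bool:
    """Decide whether a manifest-listed resource may be stored under a given policy."""
    if not has_no_store(cache_control):
        return True
    cross_origin = origin_of(resource_url) != origin_of(manifest_url)
    if policy == "all":                 # no-store honoured for every resource
        return False
    if policy == "cross-origin":        # same-origin no-store can still be overridden
        return not cross_origin
    if policy == "cross-origin-https":  # honoured only for cross-origin HTTPS resources
        return not (cross_origin and urlsplit(resource_url).scheme == "https")
    raise ValueError("unknown policy: " + policy)


manifest = "https://app.example.com/app.manifest"  # hypothetical URLs
for policy in ("all", "cross-origin", "cross-origin-https"):
    print(policy,
          may_store("https://app.example.com/a.js", manifest, "no-store", policy),
          may_store("https://cdn.example.net/a.js", manifest, "no-store", policy))
```

Under the "all" policy the answer never depends on where the manifest lives, which is the simplicity argument made above; the other two policies keep the same-origin override at the cost of an origin check.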
Re: [whatwg] AppCache-related e-mails
A common request that maybe we can agree upon is the ability to list the manifests that are cached and to delete them via script. Something like...

    String[] window.applicationCache.getManifests(); // returns appcache manifests for the origin
    void window.applicationCache.deleteManifest(manifestUrl);

This is trivial to do already; just return 404s for all the manifests you no longer want to keep around.

It involves creating hidden iframes loaded with pages that refer to the manifests to be deleted; straightforward but gunky.

0. [DONE] A means of not invoking the fallback resource for some error responses that would generally result in the fallback resource being returned. An additional response header would suit their needs... something like...

    x-chromium-appcache-fallback-override: disallow-fallback

If a response header is present with that value, the fallback response would not be returned.

http://code.google.com/p/chromium/issues/detail?id=82066

What's the use case? When would you ever want to show the user an error yet really desire to indicate that it's an error and not a 200 OK response?

Google Docs. Instead of seeing a fallback page that erroneously says "You must be offline and this document is not available.", they wanted to show the actual error page generated by the server in the case of a deleted document or when the user doesn't have rights to access that doc.
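The fallback-override behaviour described in this message amounts to a small decision rule. A minimal sketch follows, under stated assumptions: the set of responses that normally trigger a fallback is simplified here to "network failure or status 400 and above", and the function name `should_use_fallback` is hypothetical. Only the header name and value come from the message.

```python
FALLBACK_OVERRIDE_HEADER = "x-chromium-appcache-fallback-override"


def should_use_fallback(status, headers) -> bool:
    """Decide whether the cached fallback page should replace the server response.

    status is the HTTP status code, or None for a network failure (simplification).
    headers is a mapping of response header names to values.
    """
    if status is not None and status < 400:
        return False  # successful response: nothing to fall back from
    normalized = {name.lower(): value.strip().lower() for name, value in headers.items()}
    # The server can opt out, so its own error page is shown instead.
    return normalized.get(FALLBACK_OVERRIDE_HEADER) != "disallow-fallback"


print(should_use_fallback(None, {}))  # offline: use the fallback page
print(should_use_fallback(404, {}))   # plain error: use the fallback page
print(should_use_fallback(
    403, {"X-Chromium-AppCache-Fallback-Override": "disallow-fallback"}))  # show server error page
```

This keeps the real status code visible to other consumers of the URL while still letting the browser show the server's own error page, which is the Google Docs use case given above.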
Re: [whatwg] AppCache-related e-mails
On Tue, 2 Aug 2011, Michael Nordman wrote:

A common request that maybe we can agree upon is the ability to list the manifests that are cached and to delete them via script. Something like...

    String[] window.applicationCache.getManifests(); // returns appcache manifests for the origin
    void window.applicationCache.deleteManifest(manifestUrl);

This is trivial to do already; just return 404s for all the manifests you no longer want to keep around.

It involves creating hidden iframes loaded with pages that refer to the manifests to be deleted; straightforward but gunky.

If you actively want to seek out old manifests, sure, but what's the use case for doing that? It would be like trying to actively evict things from HTTP caches.

0. [DONE] A means of not invoking the fallback resource for some error responses that would generally result in the fallback resource being returned. An additional response header would suit their needs... something like...

    x-chromium-appcache-fallback-override: disallow-fallback

If a response header is present with that value, the fallback response would not be returned.

http://code.google.com/p/chromium/issues/detail?id=82066

What's the use case? When would you ever want to show the user an error yet really desire to indicate that it's an error and not a 200 OK response?

Google Docs. Instead of seeing a fallback page that erroneously says "You must be offline and this document is not available.", they wanted to show the actual error page generated by the server in the case of a deleted document or when the user doesn't have rights to access that doc.

I don't see what's wrong with using 200 OK for that case.

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] AppCache-related e-mails
On Tue, Aug 2, 2011 at 4:40 PM, Ian Hickson i...@hixie.ch wrote:

On Tue, 2 Aug 2011, Michael Nordman wrote:

A common request that maybe we can agree upon is the ability to list the manifests that are cached and to delete them via script. Something like...

    String[] window.applicationCache.getManifests(); // returns appcache manifests for the origin
    void window.applicationCache.deleteManifest(manifestUrl);

This is trivial to do already; just return 404s for all the manifests you no longer want to keep around.

It involves creating hidden iframes loaded with pages that refer to the manifests to be deleted; straightforward but gunky.

If you actively want to seek out old manifests, sure, but what's the use case for doing that? It would be like trying to actively evict things from HTTP caches.

0. [DONE] A means of not invoking the fallback resource for some error responses that would generally result in the fallback resource being returned. An additional response header would suit their needs... something like...

    x-chromium-appcache-fallback-override: disallow-fallback

If a response header is present with that value, the fallback response would not be returned.

http://code.google.com/p/chromium/issues/detail?id=82066

What's the use case? When would you ever want to show the user an error yet really desire to indicate that it's an error and not a 200 OK response?

Google Docs. Instead of seeing a fallback page that erroneously says "You must be offline and this document is not available.", they wanted to show the actual error page generated by the server in the case of a deleted document or when the user doesn't have rights to access that doc.

I don't see what's wrong with using 200 OK for that case.

You should talk to the app developers. I think there are other consumers of these URLs besides the browser. Changing the status code to 200 would break those other consumers.
Re: [whatwg] AppCache-related e-mails
On Tue, Aug 2, 2011 at 4:40 PM, Ian Hickson i...@hixie.ch wrote:

On Tue, 2 Aug 2011, Michael Nordman wrote:

A common request that maybe we can agree upon is the ability to list the manifests that are cached and to delete them via script. Something like...

    String[] window.applicationCache.getManifests(); // returns appcache manifests for the origin
    void window.applicationCache.deleteManifest(manifestUrl);

This is trivial to do already; just return 404s for all the manifests you no longer want to keep around.

It involves creating hidden iframes loaded with pages that refer to the manifests to be deleted; straightforward but gunky.

If you actively want to seek out old manifests, sure, but what's the use case for doing that? It would be like trying to actively evict things from HTTP caches.

You should talk to some app developers. View source on Angry Birds for a use case; they are doing this to get rid of stale versions tied to old manifest URLs.

0. [DONE] A means of not invoking the fallback resource for some error responses that would generally result in the fallback resource being returned. An additional response header would suit their needs... something like...

    x-chromium-appcache-fallback-override: disallow-fallback

If a response header is present with that value, the fallback response would not be returned.

http://code.google.com/p/chromium/issues/detail?id=82066

What's the use case? When would you ever want to show the user an error yet really desire to indicate that it's an error and not a 200 OK response?

Google Docs. Instead of seeing a fallback page that erroneously says "You must be offline and this document is not available.", they wanted to show the actual error page generated by the server in the case of a deleted document or when the user doesn't have rights to access that doc.

I don't see what's wrong with using 200 OK for that case.
Re: [whatwg] AppCache-related e-mails
On Tue, 2 Aug 2011, Michael Nordman wrote:

If you actively want to seek out old manifests, sure, but what's the use case for doing that? It would be like trying to actively evict things from HTTP caches.

You should talk to some app developers. View source on Angry Birds for a use case; they are doing this to get rid of stale versions tied to old manifest URLs.

But why? I couldn't figure out the use case from the source you mention.

--
Ian Hickson
Re: [whatwg] AppCache-related e-mails
On 29 June 2011 at 05:27, Felix Halim wrote:

Suppose the content of the main page changes very often (like a news site). In this case, you don't want to cache the main page since the users want to see the latest main page, not the cached one, when they open the main page later.

Did you also check ESI?
http://www.w3.org/TR/esi-lang

For example in
http://symfony.com/doc/2.0/book/http_cache.html#edge-side-includes

--
Karl Dubost - http://dev.opera.com/
Developer Relations Tools, Opera Software
Re: [whatwg] AppCache-related e-mails
On Thu, 7 Jul 2011 at 05:30, Felix Halim wrote:
On Thu, Jul 7, 2011 at 3:57 AM, Karl Dubost ka...@opera.com wrote:

http://uhunt.felix-halim.net/id/339

I'll look into your site when I've slept, but FYI, you're mandated to provide a title for your document. You should probably provide a title of uHunt, and append to the title's innerHTML as further information becomes available. [/nitpick]
Re: [whatwg] AppCache-related e-mails
Felix,

On 29 June 2011 at 05:27, Felix Halim wrote:

Suppose the content of the main page changes very often (like a news site). In this case, you don't want to cache the main page since the users want to see the latest main page, not the cached one, when they open the main page later.

Is there a web site which exhibits exactly the issue you are mentioning? Or could you set up a mini web site exhibiting the issue? I have read the full thread, and I still do not understand what you are trying to solve.

HTTP cache is about setting user interactions. There are no good values, just the values you decide would make sense. HTTP cache can already handle a lot of cases (offline/online) without even using AppCache, specifically when it is only content.

--
Karl Dubost - http://dev.opera.com/
Developer Relations Tools, Opera Software
Re: [whatwg] AppCache-related e-mails
On Thu, Jul 7, 2011 at 3:57 AM, Karl Dubost ka...@opera.com wrote:

Felix,

On 29 June 2011 at 05:27, Felix Halim wrote:

Suppose the content of the main page changes very often (like a news site). In this case, you don't want to cache the main page since the users want to see the latest main page, not the cached one, when they open the main page later.

Is there a web site which exhibits exactly the issue you are mentioning? Or could you set up a mini web site exhibiting the issue? I have read the full thread, and I still do not understand what you are trying to solve.

HTTP cache is about setting user interactions. There are no good values, just the values you decide would make sense. HTTP cache can already handle a lot of cases (offline/online) without even using AppCache, specifically when it is only content.

This is a real example. I built a site like:

http://uhunt.felix-halim.net/id/339

That one is mine, and there are other IDs like:

http://uhunt.felix-halim.net/id/32900
http://uhunt.felix-halim.net/id/1133

And thousands of other IDs. Usually people look at a few dozen IDs and not all thousands of them. Each ID has large, unique, frequently changing data attached to it (about 400KB).

Obviously, if I do a clean separation, and store the static framework in AppCache and the frequently changing data in localStorage, I can only cache the data of 10 IDs or so.

What I want is a 5MB pageStorage quota per page ID, so that I can store the frequently changing data in it rather than in the shared localStorage, which uses the 5MB domain quota. In this case, any user can view a lot more IDs without having to worry about exceeding the localStorage quota, as long as I know that each page takes far less than 5MB.

Of course I can implement my own cache eviction, like deleting from localStorage the IDs that are viewed less. But this job is best left to the browsers. Browsers can remove any page that is not viewed anymore, along with the pageStorage associated with it.

Is that clear enough on why we need pageStorage?

Now the problem is, how do you use AppCache + pageStorage? They conflict with each other in terms of URL. I can use AppCache to cache the static framework I have at a URL like:

http://uhunt.felix-halim.net/id

Then a pageStorage can be created for each different hashbang:

http://uhunt.felix-halim.net/id#339

That will give me 5MB for id = 339. And:

http://uhunt.felix-halim.net/id#32900

That will give me ANOTHER 5MB for id = 32900, and so on. Then the browser can decide which URLs are less frequently accessed and destroy the pageStorage associated with them if the browser has no space left.

Even if my script is malicious and I create an unlimited number of hashbangs to get unlimited quota, the browser can just remove entries and store only, let's say, the 100 latest or most frequently used hashbangs. So it should be perfectly fine to have pageStorage attached to a hashbang value.

This will help web application developers cleanly separate the static from the dynamic without having to worry about managing their cache replacement policies, or about the 5MB limit of localStorage or any other storage! This will also help the browser to dissect what's static and what's dynamic! It's a WIN-WIN strategy for all.

I think everybody knows what will happen if I directly use AppCache on this URL:

http://uhunt.felix-halim.net/id/339

I will have to refresh twice to get the latest statistics of my page!

Now, if somehow AppCache can make the main page ONLINE (that is, so that I don't need to refresh twice), then all the discussion of pageStorage and quota above becomes meaningless!

So, my proposal is either to make the main page of the AppCache ONLINE, or to support pageStorage for hashbangs.

Do you have suggestions on this?

Felix Halim
Re: [whatwg] AppCache-related e-mails
On Thu, Jul 7, 2011 at 1:30 PM, Felix Halim felix.ha...@gmail.com wrote:

So, my proposal is either to make the main page of the AppCache ONLINE, or to support pageStorage for hashbangs.

Now that I think about pageStorage again, actually we don't need hashbangs! We can just say:

    pageStorage['339'] = { here is my 5 MB JSON data for 339 }
    pageStorage['32900'] = { here is my 5 MB JSON data for 32900 }

That should work perfectly well, and the browser can silently destroy the content of any of the less used IDs, ANYTIME. So the usage is to never assume the content of pageStorage exists, and to treat it purely as a cache that can be gone at any time.

So, yes, we can use pageStorage on any page, associated with the page URL without the hashbang (as if the hashbang were stripped off). The quota per key/value pair is 5 MB and entries can be removed by the browser at any time.

How about that? This can fulfil my need to get a clean separation.

Felix Halim
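The semantics proposed here (a per-entry size limit, plus silent eviction of less used entries by the browser) resemble a bounded least-recently-used cache. The following sketch models that idea in Python. The class name `PageStorage`, the eviction policy, and the 100-entry cap are assumptions for illustration; no browser API of this shape is implied.

```python
from collections import OrderedDict


class PageStorage:
    """Toy model of the proposed pageStorage: an evictable key/value cache."""

    def __init__(self, max_entries=100, max_value_bytes=5 * 1024 * 1024):
        self.max_entries = max_entries
        self.max_value_bytes = max_value_bytes
        self._entries = OrderedDict()  # least recently used entry comes first

    def put(self, key: str, value: str) -> None:
        if len(value.encode("utf-8")) > self.max_value_bytes:
            raise ValueError("value exceeds the per-entry quota")
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # silently drop the least recently used

    def get(self, key: str):
        """Return the stored value, or None if it was never stored or was evicted."""
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # reading counts as use
        return self._entries[key]


storage = PageStorage(max_entries=2)
storage.put("339", '{"stats": "..."}')
storage.put("32900", '{"stats": "..."}')
storage.get("339")                      # touch 339 so it is the most recently used
storage.put("1133", '{"stats": "..."}')
print(storage.get("32900"))             # None: evicted, so the page must refetch
```

Callers always have to handle the `None` case, which matches the point above that content in pageStorage must never be assumed to exist.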
Re: [whatwg] AppCache-related e-mails
If you have 100mb of junk, it won't fit in my browser's http cache either. And that's a good thing, there are other sites I visit that are more important. However, a browser is within its rights to detect that its user uses a site so heavily as to justify increasing that site's cache allocation. That's a QoI detail. Note that if 70% of your 100mb is duplicated framework, it's possible that a better implementation of your site could fit into a 25mb cache...
Re: [whatwg] AppCache-related e-mails
On Sun, Jul 3, 2011 at 3:17 PM, timeless timel...@gmail.com wrote:

If you have 100mb of junk, it won't fit in my browser's http cache either. And that's a good thing, there are other sites I visit that are more important. However, a browser is within its rights to detect that its user uses a site so heavily as to justify increasing that site's cache allocation. That's a QoI detail.

Yes, the quota system can be improved heuristically.

Note that if 70% of your 100mb is duplicated framework, it's possible that a better implementation of your site could fit into a 25mb cache...

The only way to remove the duplication is to use a single URL for the web app (then apply AppCache to that URL), and use a shebang to uniquely identify the page:

http://bla/page#!id=10

Then all the non-duplicated data is stored in localStorage/indexedDB. I would love this; however, as long as the quota system is still 5MB (or an equivalent page storage quota), I don't feel inclined to make those changes to my site yet. The browser vendors have to move first to design better quota systems.

Or is there any other way to cleanly do the separation without a shebang? FYI, I want my page to be linkable (referenced / bookmarked) from other sites.

Btw, does anyone know why Facebook abandoned the usage of shebang?

Felix Halim
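For readers unfamiliar with the `#!id=10` form, the page identifier lives entirely in the URL fragment, which the browser does not send to the server, so only client-side script can act on it. A small sketch of extracting it, using Python's `urllib.parse` for illustration (the helper name `parse_hashbang` is an assumption):

```python
from urllib.parse import parse_qs, urlsplit


def parse_hashbang(url: str) -> dict:
    """Extract key/value pairs from a '#!key=value' fragment, or {} if there is none."""
    fragment = urlsplit(url).fragment
    if not fragment.startswith("!"):
        return {}
    return {key: values[0] for key, values in parse_qs(fragment[1:]).items()}


print(parse_hashbang("http://bla/page#!id=10"))
print(parse_hashbang("http://bla/page?id=10"))  # the id is in the query, not the fragment
```

Because every such URL shares the same path, a single AppCache entry for `http://bla/page` would cover all IDs, which is the de-duplication being discussed.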
Re: [whatwg] AppCache-related e-mails
Felix Halim felix.ha...@gmail.com wrote on Sun, 3 Jul 2011 15:41:54 +0800: Btw, does anyone know why Facebook abandoned the usage of shebang? If they did so, then rightly so. Hashbangs are a thoroughly bad idea: http://isolani.co.uk/blog/javascript/BreakingTheWebWithHashBangs -- Nils Dagsson Moskopp // erlehmann http://dieweltistgarnichtso.net
Re: [whatwg] AppCache-related e-mails
On Sun, Jul 3, 2011 at 8:21 PM, Nils Dagsson Moskopp n...@dieweltistgarnichtso.net wrote: Felix Halim felix.ha...@gmail.com wrote on Sun, 3 Jul 2011 15:41:54 Btw, does anyone know why Facebook abandoned the usage of shebang? If they did so, then rightly so. Hashbangs are a thoroughly bad idea: http://isolani.co.uk/blog/javascript/BreakingTheWebWithHashBangs After reading that article, it seems that restructuring the website to separate the static from the dynamic isn't worth much after all, unless there is a method other than hash-bang. The follow-up question is: is there another way to achieve a clean separation without using hash-bang? Another question is: how do we bookmark/link AppCached pages without using hash-bang? I cannot find any discussion about it. Felix Halim
Re: [whatwg] AppCache-related e-mails
Felix Halim felix.ha...@gmail.com wrote on Sun, 3 Jul 2011 23:16:16 +0800: […] After reading that article, it seems that restructuring the website to separate the static from the dynamic isn't worth much after all, unless there is a method other than hash-bang. The follow-up question is: is there another way to achieve a clean separation without using hash-bang? You may be looking for this: http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html? -- Nils Dagsson Moskopp // erlehmann http://dieweltistgarnichtso.net
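A rough sketch of how the history API linked above could stand in for hashbang URLs, assuming a page that keeps its item id in the query string. The `hashbangToClean` and `showItem` helpers and the `render` callback are hypothetical names; `history.pushState` is the call defined in the referenced specification. This only illustrates the URL handling and does not change how AppCache treats each distinct URL.

```javascript
// Converts an old hashbang link such as http://bla/page#!id=10
// into a plain URL such as http://bla/page?id=10.
// Any existing query string is replaced, which is a simplification.
function hashbangToClean(href) {
  const url = new URL(href);
  if (!url.hash.startsWith("#!")) return href; // nothing to convert
  url.search = "?" + url.hash.slice(2);
  url.hash = "";
  return url.toString();
}

// Swaps in the content for an item and, in a browser, updates the
// address bar without reloading the page.
function showItem(id, render) {
  render(id); // hypothetical callback that displays the item
  if (typeof history !== "undefined" && history.pushState) {
    history.pushState({ id }, "", "?id=" + encodeURIComponent(id));
  }
}
```

With this shape, links of the form http://bla/page?id=10 stay bookmarkable, and old hashbang links can be rewritten once on load.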
Re: [whatwg] AppCache-related e-mails
On Sat, Jul 2, 2011 at 8:14 AM, Bjartur Thorlacius svartma...@gmail.com wrote: On Fri, 1 Jul 2011 03:22, Felix Halim wrote: I'm looking for a solution that doesn't require modifying anything except adding a manifest. I recommend fixing your website. As others have stated, this has practical benefits, in the online as well as the offline case. I don't mind fixing my website, if I really have to! If AppCache had an option to always view the main page online, I wouldn't have to do anything. however, if we don't have pageStorage, even if we have a clean dynamic separation, it will quickly run out of space if we use localStorage since the localStorage quota is per domain. Nobody's forcing you to use localStorage. How do you figure using pageStorage or localStorage will be less work than using iframes or other linking methods already proposed? It's not the amount of work that matters, it's the quota I'm talking about. Let's see an example: I have a dynamic page with this url: http://bla/page?id=10 The content inside is changing very frequently, let's say every hour. Of course, I want the browser to cache the latest version. Then specify the applicable HTTP headers with informative values. HTTP caching hasn't stopped working, nor is it barred from improving. There is space for implementations to improve while complying with current specifications. All you have to do is split dynamic resources from static, read the RFC and send the appropriate headers. Of course this method has the drawback of requiring a request/response pair for every resource transferred over HTTP. Remember that I also want those URLs to be available even if the user is offline. HTTP Cache is not that powerful, AppCache is. In that case, my cleanly separated static and dynamic will have no effect! Because all the statics get duplicated for each App Cache. It will be the same as if I don't have the framework! I'm not following your line of thinking. 
Why do you insist on using an App Cache for each page rather than a shared cache for all your resources? I do want to use shared cache for shared resources and page cache for non-shared resources (unique to that page). However, the non-shared resources will become too large to fit in 5MB quota. Remember I have different non-shared content for id=10, id=11, ..., id=10, I don't think that will fit in localStorage. Are you certain that users wish to archive every single dynamic resource they fetch from your site? Disposition of any significant amount of storage should be in the hands of the user, if indirectly through the user agent. Take handhelds. Users only view the resources they want. When they viewed it, I want it to be there for offline use or for performance reasons. I expect the users only view (and cache) few hundreds of them. They cannot cache what they didn't view / open. It is OK for the browser to not cache it if it doesn't any storage left. I am satisfied if there is a page storage quota of 5MB given per page (not per domain). This will solve all my problems (of course by restructuring my site). If only I can store the dynamic content into a pageStorage (assuming different URL - including the shebang bookmark has different pageStorage), then I won't be running out of storage if I keep one page within 5MB. So And you're sure this is a good thing, because? Because currently, browsers can handle a page content 5MB very fast. I think it is OK for a page (not a domain) to have 5MB data quota. If you are building games, perhaps need more than that (it has to go to the web store to get unlimited permission). However, for regular pages, 5MB currently is more than enough. 5MB per domain is too small! http://bla/page#!id=10 You *can't* allocate a quota per URI fragment, as a script in the page could create new ones as wanted. Yes I know, that was only for an example to point out that: If I use shared cache: http://bla/page I will run out of quota quickly. 
If I use parameters like this: http://bla/page?id=10 I will have to refresh TWICE to get the latest content (annoying). If I can use: http://bla/page#!id=10 I get the best of both worlds, that is I have shared static cache, and I won't run out of quota for the non-shared-dynamic cache since the quota is 5MB per hash value. I know that this has a security hole that the script can just generate random url to get more quota. My suggestion is to give quota to hash value for the first time the page is loaded, so a later script modification will be linked to the original hash value's quota. So, we have seen how the AppCache fails to satisfy certain usecase and how pageStorage is needed to make the alternative solution works. Show how either the HTTP specification or common practice forbids HTTP caches from satisfy your use cases. I think it's clear that HTTP Cache is inferior to AppCache. What HTTP Cache can, it can be overridden by AppCache. AppCache
Re: [whatwg] AppCache-related e-mails
A common request that maybe we can agree upon is the ability to list the manifests that are cached and to delete them via script. Something like...

    String[] window.applicationCache.getManifests(); // returns appcache manifest for the origin
    void window.applicationCache.deleteManifest(manifestUrl);

I think it's clear from this discussion (and others) that the overall appcache feature set leaves something to be desired, but it's less clear how to best satisfy the desirements. Until there is some clarity, it's hard to see how the community is going to make progress. Personally, I think what's needed to move things forward is for browser vendors to do some independent innovating to see what works and what doesn't work.

@Hixie... any idea when the appcache feature set will be up for a growth spurt? I think there's an appetite for another round of features among the offline app developers that I communicate with.

There's been some recent interest here in pursuing a means of programmatically producing a response instead of just returning static content.

Who implements it currently? Is there a test suite? Those are the main things that would gate a dramatic addition of new features.

Well, nobody yet; but I have a roadmap in mind that builds up to that. Much of the discussion in this thread has been on the second item. Mobile developers are particularly interested in 2, to avoid HTTP cache churn and the cost of HTTP cache validation. In this roadmap, you can see that it would also allow pages vended from servers to make use of executable intercept handlers.

-1. [DONE] Support for cross-origin HTTPS resources. http://code.google.com/p/chromium/issues/detail?id=69594

0. [DONE] A means of not invoking the fallback resource for some error responses that would generally result in the fallback resource being returned. An additional response header would suit their needs... something like...

    x-chromium-appcache-fallback-override: disallow-fallback

If a response header is present with that value, the fallback response would not be returned. http://code.google.com/p/chromium/issues/detail?id=82066

1. [UNDER CONFUSING DISCUSSION] Allow a syntax that associates a page with an application cache but does not add that page to the cache. A common feature request also mentioned on the whatwg list, but it's not getting any engagement from other browser vendors or the spec writer (which is kind of frustrating). The premise is to allow pages vended from a server to take advantage of the resources in an application cache when loading subresources. A perfectly reasonable request, http useManifest='x'.

2. Introduce a new manifest file section to INTERCEPT requests into a prefix matched url namespace and satisfy them with a cached resource. The resulting page would be free to interpret the location url and act accordingly based on the path and query elements beyond the prefix matched url string. This section would be similar to the FALLBACK section in that prefix matching is involved, but different in that instead of being used only in the case of a network/server error, the cached INTERCEPT resource would be used immediately w/o first going to the server.

    INTERCEPT:
    urlprefix redirect newlocationurl
    urlprefix return cachedresourceurl

Here's where the INTERCEPT namespace could fit into the changes to the network model.

    if (url is EXPLICITLY_CACHED) // exact match
      return cached_response;
    if (url is in NETWORK namespace) // prefix match
      return network_response_as_usual;
    if (url is in INTERCEPT namespace) // prefix match, this is the new section
      return handle_intercepted_request_accordingly
    if (url is in FALLBACK namespace) // prefix match
      return network_response_but_fallback_where_needed;
    if (ONLINE_WILDCARD)
      return network_response;
    otherwise
      return synthesized_error_response;

3. Allow INTERCEPT cached resources to be executable. Instead of simply returning the cached resource or redirect in response to the request, load it into a background worker context (if not already loaded) and invoke a function in that context to asynchronously compute response headers and body based on the request headers (including cookie) and body. The background worker would have access to various local storage facilities (fileSystem, indexed/sqlDBs) as well as the ability to make network requests via XHR.

    INTERCEPT:
    urlprefix execute cachedexecutableresourceurl

4. Create a syntax to allow FALLBACK resources to be similarly executable in a background worker context.

5. Some kind of auto-update policy where the appcache is refreshed w/o the app running.

There are a couple of features that are not on this list that I want to call out:

* The ability to add(url) and remove(url) the appcache is not on the list. FileSystem urls cover a lot of this already, and the ability to cache adhoc resources and later load them via http urls could be composed out of the filesystem and
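The lookup order from item 2 of the roadmap can be written out as runnable code. This is a sketch under stated assumptions: the `manifest` object shape, the `resolveRequest` and `longestPrefix` names, the returned action strings, and the choice of longest-prefix matching are all illustrative, and INTERCEPT is a proposed section rather than part of the current spec.

```javascript
// Hypothetical model of the proposed lookup order; names and shapes are illustrative.
function longestPrefix(url, prefixes) {
  // Returns the longest namespace prefix that matches url, or null.
  let best = null;
  for (const p of prefixes) {
    if (url.startsWith(p) && (best === null || p.length > best.length)) best = p;
  }
  return best;
}

function resolveRequest(url, manifest) {
  if (manifest.cache.includes(url)) {                     // exact match
    return { action: "cached", resource: url };
  }
  if (longestPrefix(url, manifest.network) !== null) {    // prefix match
    return { action: "network" };
  }
  const hit = longestPrefix(url, Object.keys(manifest.intercept));
  if (hit !== null) {                                     // proposed new section
    const rule = manifest.intercept[hit];                 // { verb, target }
    return { action: rule.verb, resource: rule.target };
  }
  const fb = longestPrefix(url, Object.keys(manifest.fallback));
  if (fb !== null) {                                      // network first, fallback on failure
    return { action: "network-with-fallback", resource: manifest.fallback[fb] };
  }
  if (manifest.onlineWildcard) return { action: "network" };
  return { action: "error" };                             // synthesized error response
}
```

Placing INTERCEPT after NETWORK but before FALLBACK mirrors the order in the message: an intercepted URL is answered from the cache without a server round trip, while a FALLBACK URL still tries the network first.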
Re: [whatwg] AppCache-related e-mails
On Fri, 1 Jul 2011 03:22, Felix Halim wrote: I'm looking for a solution that doesn't require modifying anything except adding a manifest. I recommend fixing your website. As others have stated, this has practical benefits, in the online as well as the offline case. As I said before, separating dynamic from the static will work, Great! however, if we don't have pageStorage, even if we have a clean dynamic separation, it will quickly run out of space if we use localStorage since the localStorage quota is per domain. Nobody's forcing you to use localStorage. How do you figure using pageStorage or localStorage will be less work than using iframes or other linking methods already proposed? Let's see an example: I have a dynamic page with this url: http://bla/page?id=10 The content inside is changing very frequently, let's say every hour. Of course, I want the browser to cache the latest version. Then specify the applicable HTTP headers with informative values. HTTP caching hasn't stopped working, nor is it barred from improving. There is space for implementations to improve while complying with current specifications. All you have to do is split dynamic resources from static, read the RFC and send the appropriate headers. Of course this method has the drawback of requiring a request/response pair for every resource transferred over HTTP. So, it seemed that AppCache is a perfect fit... AppCache is no magic bullet. Don't use it if you figure it isn't a perfect fit. I then add the manifest to enable the App Cache, and what do I get? Every time I open that URL every hour, I ALWAYS see the STALE version (the 1 hour late version). Then a few seconds (or minutes) later (depending on when the AppCache gets updated), I refresh and get the latest content. Annoying, right? 
FYI, HTTP has already resolved this issue, by forbidding implementations from returning a stale version by default under normal situations or without warning In this case, I better off NOT to use App Cache, since it brings the old content everytime. Right. Bad App Cache. Now, let see the alternative: I build a framework to separate the dynamic from the static. I have to make it so that only ONE MAIN PAGE get cached by the app cache. So, my URL can NO LONGER BE: http://bla/page?id=10 But it has to change to: http://bla/page#!id=10 Why do I have to do this? it's because if I DON'T, then each page will be stored on different App Cache, and the stale by one still occurs! That is, http://bla/page?id=10 and http://bla/page?id=11 Will be on DIFFERENT AppCache! In that case, my cleanly separated static and dynamic will have no effect! Because all the statics get duplicated for each App Cache. It will be the same as if I don't have the framework! I'm not following your line of thinking. Why do you insist on using an App Cache for each page rather than a shared cache for all your resources? So, to make the AppCache only cache one static framework, I have to make my page such that it is served under ONE url: http://bla/page Then take the #!id=10 as non url (or ajax bookmark). This way, the AppCache will only cache ONE of my static framework, and MANY dynamic content inside it. Guess what? All the incoming links from other blogs are now broken! Of course I can make a redirect, but redirect is AGAINST making the web faster! I think Facebook did the #! thing a while ago, then they abandoned it, why? Ok now I'm happy with my framework and the redirect, and guess what? Soon, I have other pages with #!id=11, #!id=12, ..., #!id=1. All of them are important and I wan't to cache them and I uses the localStorage (or indexedDB) to cache the dynamic content of those pages. 
Note that even though the dynamic content is dynamic it doesn't mean that: http://bla/page?id=10 has shared data with http://bla/page?id=11 It can be totally different unrelated dynamic content. id=10 dynamic content is entirely different from id=11 dynamic content. However, since I use localStorage to cache the dynamic content, ALL OF THEM are limited to the quota of my domain. My 5MB localStorage domain quota will quickly run out of space. Are you certain that users wish to archive every single dynamic resource they fetch from your site? Disposition of any significant amount of storage should be in the hands of the user, if indirectly through the user agent. Take handhelds. If only I can store the dynamic content into a pageStorage (assuming different URL - including the shebang bookmark has different pageStorage), then I won't be running out of storage if I keep one page within 5MB. So And you're sure this is a good thing, because? http://bla/page#!id=10 You *can't* allocate a quota per URI fragment, as a script in the page could create new ones as wanted. Then I would be very happy with the new framework. Since it will store very compact static App and very compact dynamic content. It's a win win for everyone, nothing
Re: [whatwg] AppCache-related e-mails
Ask HTTP implementors to store a potentially stale fallback copy for offline use when an authoritative copy is unavailable. Even HTTP caches are allowed to return stale responses as long as they warn their clients (so they can warn their clients or fetch an authoritative copy via another route). Browsers should keep copies of the most used entries for offline use. It's probably a matter of minor tweaking, considering that mainstream browsers support offline modes already. From http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.1.5: In some cases, the operator of a cache MAY choose to configure it to return stale responses even when not requested by clients. This decision ought not be made lightly, but may be necessary for reasons of availability or performance, especially when the cache is poorly connected to the origin server. Whenever a cache returns a stale response, it MUST mark it as such (using a Warning header) enabling the client software to alert the user that there might be a potential problem. P.S. Your hypothetical major overhaul should probably involve splitting the dynamic content into separate resources linked to from a static main page/index using iframes.
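A sketch of the RFC 2616 behaviour quoted above, assuming a toy cache whose entries carry an `expires` timestamp. The `lookup` function, the entry shape, and the `originReachable` flag are hypothetical; warn-code 110 ("Response is stale") is the one defined in RFC 2616 section 14.46.

```javascript
// Toy cache lookup: serves a stale copy only when the origin cannot be
// reached, and marks it with a Warning header as section 13.1.5 requires.
function lookup(cache, url, now, originReachable) {
  const entry = cache.get(url);
  if (!entry) return null;                     // nothing stored: go to the network
  if (now < entry.expires) {
    return { body: entry.body, headers: {} };  // fresh: serve without a warning
  }
  if (originReachable) return null;            // stale but online: revalidate instead
  return {
    body: entry.body,
    headers: { Warning: '110 - "Response is stale"' },
  };
}
```

A browser receiving the `Warning` header could then tell the user that the page shown may be out of date.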
Re: [whatwg] AppCache-related e-mails
It's possible to build a main page so that it can update its content using a subresource. You can use iframes, javascript (including json), xmlhttprequests, or other things to do this. Nothing requires you to have a monolythic main page which is incapable of dynamically updating itself. ... If I visit your page on May 1st and sit there for two months, does your page really just want to continue to show me the same content when I glance at it on July 1st? It can show other content if it wants to, and in order to save bandwidth costs, it should avoid resending the framework which shouldn't be changing. Once your page works well for this case, it should work well for app-cache. On 6/29/11, Felix Halim felix.ha...@gmail.com wrote: On Thu, Jun 9, 2011 at 3:21 AM, Ian Hickson i...@hixie.ch wrote: If you're not loading the main page from the cache, what does this gain you that regular HTTP caching doesn't? Suppose the content of the main page change very often (like news site). In this case, you don't want to cache the main page since the users want to see the latest main page, not the cached ones when they open the main page later. However, should the network connectivity is down, the user should be presented with the cached main page. This problem can be solved by having the main page to NOT include the news content, but only a static template. The news content is fetched dynamically through XHR and stored in localStorage. However, this complicates the news site (a major redesign of the website is necessary). It would be far easier if there is an option in the MANIFEST file to NOT CACHE the main page. So that the behavior is exactly like caching, but it is far stronger, since the rest of the resources (css, js, images, etc... are never re-fetched from the network). The current HTTP Caching still checks whether the resources are modified, but in app cache, we can explicitly say that they are not modified unless we change the manifest hash. 
So, in this case, HTML5 App Cache can help make regular online websites far faster, as well as provide offline access should the network is down (or the server is down). This would make the online news site feels online when it's online and offline when it's offline. I don't think HTTP Cache can serve the content if the network / server is down. If the main page is always cached, then the next time the user visits the main page, it will (almost) always see the STALE content of the main page. Then a split second later, the main page refreshes with the most up-to-date version, which is very annoying to the users. On Mon, 14 Feb 2011, Felix Halim wrote: I have a use case where it is preferable that the main page is not cached: Suppose you have a main page that changes based on it's ID: http://example.com/page.php?id=10 The appCache will store each main page with different id in separate cache, which is undesirable! And we DON'T want to cache the main pages, since the content differs significantly (think of it as a forum website). The idea of the appcache feature is to enable offline usage. If you don't want it cached, how is it going to work offline? It will work offline when the network or the server is down? In such case, the latest (cached) main page is shown. I wasn't very clear when I say the main page should not be cached. I was saying, we should still keep the main page cached, but always show the online (non cached) main page if the network and the server are alive. The main goal here is NOT to make the page offline, but to cache the resources that the page uses (i.e, .js, .css, images, etc...) that are very likely to be IMMUTABLE (particularly the jQuery.js and jQueryUI css+images that almost every sites uses!). Appcache only adds one feature: The ability to work offline. Everything else that appcache does is already possible with regular HTTP caching. So if you don't want to work offline, just use regular HTTP caching. 
HTTP Caching requires server modifications on altering the headers and is a non option for users that have no control on the server side. Also, many servers are mostly mis-configured on how to send the correct headers and some proxies may alter them on its way to the client. It would be great to be able to specify what to CACHE and what not in the MANIFEST in the HTML file no matter what HTTP Caching says! HTML5 App Cache here works as the complement for web-developers that cannot do HTTP Caching. Moreover, some HTTP Caching strategies do requires round-trip to the servers which can be hundred of milliseconds slower! If we specify everything in the manifest file, no such round-trip ever necessary. In fact, we can do even better than that by not fetching the MANIFEST itself by including an (optional) manifest's HASH inside the HTML like: html useManifest=my.manifest manifestHash=asdfasdfasd If not specified, then the my.manifest will always be checked
Re: [whatwg] AppCache-related e-mails
On Fri, Jul 1, 2011 at 12:40 AM, timeless timel...@gmail.com wrote: It's possible to build a main page so that it can update its content using a subresource. You can use iframes, javascript (including json), xmlhttprequests, or other things to do this. Those are another option besides using localStorage. Again, those things requires restructuring your website. I'm looking for a solution that doesn't require modifying anything except adding a manifest. Nothing requires you to have a monolythic main page which is incapable of dynamically updating itself. ... If I visit your page on May 1st and sit there for two months, does your page really just want to continue to show me the same content when I glance at it on July 1st? It can show other content if it wants to, and in order to save bandwidth costs, it should avoid resending the framework which shouldn't be changing. Once your page works well for this case, it should work well for app-cache. As I said before, separating dynamic from the static will work, however, if we don't have pageStorage, even we have a clean dynamic separation, it will quickly run out of space if we use localStorage since the localStorage quota is per domain. Let's see an example: I have a dynamic page with this url: http://bla/page?id=10 The content inside is changing very frequently, lets say every hour. Of course, I want the browser to cache the latest version. So, it seemed that AppCache is a perfect fit... I then add the manifest to enable the App Cache, and what do I get? Everytime I open that URL every hour, I ALWAYS see the STALE version (the 1 hour late version). Then few seconds (or minutes) later (depend on when the AppCache gets updated), I refresh, then I got the latest content. Annoying, right? In this case, I better off NOT to use App Cache, since it brings the old content everytime. This is why most people says please DON'T cache the main page. 
Now, let see the alternative: I build a framework to separate the dynamic from the static. I have to make it so that only ONE MAIN PAGE get cached by the app cache. So, my URL can NO LONGER BE: http://bla/page?id=10 But it has to change to: http://bla/page#!id=10 Why do I have to do this? it's because if I DON'T, then each page will be stored on different App Cache, and the stale by one still occurs! That is, http://bla/page?id=10 and http://bla/page?id=11 Will be on DIFFERENT AppCache! In that case, my cleanly separated static and dynamic will have no effect! Because all the statics get duplicated for each App Cache. It will be the same as if I don't have the framework! So, to make the AppCache only cache one static framework, I have to make my page such that it is served under ONE url: http://bla/page Then take the #!id=10 as non url (or ajax bookmark). This way, the AppCache will only cache ONE of my static framework, and MANY dynamic content inside it. Guess what? All the incoming links from other blogs are now broken! Of course I can make a redirect, but redirect is AGAINST making the web faster! I think Facebook did the #! thing a while ago, then they abandoned it, why? Ok now I'm happy with my framework and the redirect, and guess what? Soon, I have other pages with #!id=11, #!id=12, ..., #!id=1. All of them are important and I wan't to cache them and I uses the localStorage (or indexedDB) to cache the dynamic content of those pages. Note that even though the dynamic content is dynamic it doesn't mean that: http://bla/page?id=10 has shared data with http://bla/page?id=11 It can be totally different unrelated dynamic content. id=10 dynamic content is entirely different from id=11 dynamic content. However, since I use localStorage to cache the dynamic content, ALL OF THEM are limited to the quota of my domain. My 5MB localStorage domain quota will quickly run out of space. 
If only I can store the dynamic content into a pageStorage (assuming different URL - including the shebang bookmark has different pageStorage), then I won't be running out of storage if I keep one page within 5MB. So http://bla/page#!id=10 has 5 MB pageStorage quota, and http://bla/page#!id=11 also has 5 MB pageStorage quota, etc... Then I would be very happy with the new framework. Since it will store very compact static App and very compact dynamic content. It's a win win for everyone, nothing is wasted. But, if I don't have pageStorage quota, my beautifully separated the dynamic from the static framework will be useless since the localStorage DOMAIN QUOTA will kill me. So, we have seen how the AppCache fails to satisfy certain usecase and how pageStorage is needed to make the alternative solution works. Here, I propose a solution: AppCache should COMPLEMENT HTTP Cache so that the main page is not cached (you know this is not literally what it means). With that solution, I don't have to do ANYTHING on my original site to make it work (except adding a manifest to my original page). I can still
Re: [whatwg] AppCache-related e-mails
On Thu, Jun 9, 2011 at 3:21 AM, Ian Hickson i...@hixie.ch wrote: If you're not loading the main page from the cache, what does this gain you that regular HTTP caching doesn't? Suppose the content of the main page change very often (like news site). In this case, you don't want to cache the main page since the users want to see the latest main page, not the cached ones when they open the main page later. However, should the network connectivity is down, the user should be presented with the cached main page. This problem can be solved by having the main page to NOT include the news content, but only a static template. The news content is fetched dynamically through XHR and stored in localStorage. However, this complicates the news site (a major redesign of the website is necessary). It would be far easier if there is an option in the MANIFEST file to NOT CACHE the main page. So that the behavior is exactly like caching, but it is far stronger, since the rest of the resources (css, js, images, etc... are never re-fetched from the network). The current HTTP Caching still checks whether the resources are modified, but in app cache, we can explicitly say that they are not modified unless we change the manifest hash. So, in this case, HTML5 App Cache can help make regular online websites far faster, as well as provide offline access should the network is down (or the server is down). This would make the online news site feels online when it's online and offline when it's offline. I don't think HTTP Cache can serve the content if the network / server is down. If the main page is always cached, then the next time the user visits the main page, it will (almost) always see the STALE content of the main page. Then a split second later, the main page refreshes with the most up-to-date version, which is very annoying to the users. 
On Mon, 14 Feb 2011, Felix Halim wrote: I have a use case where it is preferable that the main page is not cached: Suppose you have a main page that changes based on it's ID: http://example.com/page.php?id=10 The appCache will store each main page with different id in separate cache, which is undesirable! And we DON'T want to cache the main pages, since the content differs significantly (think of it as a forum website). The idea of the appcache feature is to enable offline usage. If you don't want it cached, how is it going to work offline? It will work offline when the network or the server is down? In such case, the latest (cached) main page is shown. I wasn't very clear when I say the main page should not be cached. I was saying, we should still keep the main page cached, but always show the online (non cached) main page if the network and the server are alive. The main goal here is NOT to make the page offline, but to cache the resources that the page uses (i.e, .js, .css, images, etc...) that are very likely to be IMMUTABLE (particularly the jQuery.js and jQueryUI css+images that almost every sites uses!). Appcache only adds one feature: The ability to work offline. Everything else that appcache does is already possible with regular HTTP caching. So if you don't want to work offline, just use regular HTTP caching. HTTP Caching requires server modifications on altering the headers and is a non option for users that have no control on the server side. Also, many servers are mostly mis-configured on how to send the correct headers and some proxies may alter them on its way to the client. It would be great to be able to specify what to CACHE and what not in the MANIFEST in the HTML file no matter what HTTP Caching says! HTML5 App Cache here works as the complement for web-developers that cannot do HTTP Caching. Moreover, some HTTP Caching strategies do requires round-trip to the servers which can be hundred of milliseconds slower! 
If we specify everything in the manifest file, no such round-trip ever necessary. In fact, we can do even better than that by not fetching the MANIFEST itself by including an (optional) manifest's HASH inside the HTML like: html useManifest=my.manifest manifestHash=asdfasdfasd If not specified, then the my.manifest will always be checked for modifications. Or i would like to update this file, or any file else, i would like to update, on demand. Not sure what this means. I think it means that we should be able to selectively update any file in the manifest, rather than blindly updating everything if the manifest's hash changes. The ability to selectively update the cached files is very appealing. If your resources are 5 MB, and you know you only want to update on a small file of 1KB... I believe the way the current App Cache updates everything if the manifest file changes is just too inefficient. You can say it can be no worse than HTTP Caching, but it can be made far better! The application cache is very powerful. But it is very disappointing, that it is only useful for static pages. With a little
Re: [whatwg] AppCache-related e-mails
On Tue, 8 Feb 2011, Michael Nordman wrote: Just had an offline discussion about this and I think the answer can be much simpler than what's been proposed so far. All we have to do for cross-origin HTTPS resources is respect the cache-control no-store header. Let me explain the rationale... first let's back up to the motivation for the restrictions on HTTPS. They're there to defeat attacks that involve physical access to the client system, so the attacker cannot look at the cross-origin HTTPS data stored in the appcache on disk. But the regular disk cache stores HTTPS data provided the cache-control header doesn't say no-store, so excluding this data from appcaching does nothing to defeat that attack. Maybe the spec changes to make are... 1) Examine the cache-control header for all cross-origin resources (not just HTTPS), and only allow them if they don't contain the no-store directive. 2) Remove the special-case restriction that is currently in place only for HTTPS cross-origin resources. On Wed, 30 Mar 2011, Michael Nordman wrote: Fyi: This change has been made in chrome. * respect no-store headers for cross-origin resources (only for HTTPS) * allow HTTPS cross-origin resources to be listed in manifest hosted on HTTPS This seems reasonable. Done. But... I just looked at the current draft of the spec and I think it reflects a greater change than the one I had proposed. I had proposed respecting the no-store directive only for cross-origin resources. The current draft is examining the no-store directive for all resources without regard for their origin. The intent behind the proposed change was to allow authors to continue to override the no-store header for resources in their origin, and to disallow that override only for cross-origin resources. The proposed change is less likely to break existing apps, and I think there are valid use cases for the existing behavior where no-store can be overridden by explicit inclusion in an appcache.
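As a rough illustration of the rule proposed in this message (no-store honoured only for cross-origin resources, with same-origin resources still overridable by listing them in the manifest), here is a minimal sketch. It follows the proposal as described here, not the spec draft under discussion; the helper names and the example hosts are assumptions made for illustration.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str):
    """Return the (scheme, host, port) tuple used here to compare origins."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    return (scheme, parts.hostname, parts.port or DEFAULT_PORTS.get(scheme))


def has_no_store(cache_control: str) -> bool:
    """True if the Cache-Control header value contains the no-store directive."""
    return "no-store" in [d.strip().lower() for d in cache_control.split(",")]


def may_appcache(manifest_url: str, resource_url: str, cache_control: str = "") -> bool:
    """Proposed rule: same-origin entries are always allowed; cross-origin
    entries are allowed only when the response does not say no-store."""
    if origin(manifest_url) == origin(resource_url):
        return True  # the author may override no-store for their own origin
    return not has_no_store(cache_control)


manifest = "https://mail.example.com/app.manifest"  # hypothetical manifest URL
print(may_appcache(manifest, "https://static.example.net/lib.js", "public, max-age=3600"))
print(may_appcache(manifest, "https://static.example.net/key.js", "private, no-store"))
print(may_appcache(manifest, "https://mail.example.com/inbox.js", "no-store"))
```

The draft described at the end of the message would instead apply the no-store check to every resource, which in this sketch amounts to dropping the same-origin branch.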
[whatwg] AppCache-related e-mails
On Mon, 31 Jan 2011, Michael Nordman wrote: On Mon, Jan 31, 2011 at 4:20 PM, Ian Hickson i...@hixie.ch wrote: On Thu, 13 Jan 2011, Michael Nordman wrote: AppCache feature request: An https manifest should be able to list resources from other https origins. I've got some app developers asking for this feature. Currently, it's explicitly disallowed by the spec for valid security reasons, but there are also valid reasons to have this capability, like a webapp that uses resources hosted on gstatic. Seems like a robots.txt-like scheme where a site like gstatic can declare that it's OK to appcache me from elsewhere is needed. I've opened a chromium bug for this here... http://code.google.com/p/chromium/issues/detail?id=69594 Why do the valid security reasons not apply in this case? The vendors of originA and originB have expressed that it's OK for one to appcache resources of the other. In practical terms this is to support a single application being hosted on multiple 'origins'. Google gstatic.com for one example... http://superuser.com/questions/64716/what-is-gstatic-com If I understand the reason for the restrictions on HTTPS as the following... The requirement is intended to prevent hostile.example.com from forcing content from checkout.google.com to be stored onto the user's machine, so that a later offline attack involving grabbing the user's laptop cannot retrieve the information. That doesn't apply in this case because gstatic.com is not hostile to gmail.com. [...suggestion to use CORS...] On Mon, 31 Jan 2011, Jonas Sicking wrote: On Mon, Jan 31, 2011 at 2:57 PM, Michael Nordman micha...@google.com wrote: I don't fully understand your emphasis on the implied semantics of a CORS request. You say it *only* means a site can read the response. I don't see that in the draft spec.
Cross-origin XHR may have been the big motivation behind CORS, but the mechanisms described in the spec appear agnostic with regard to use cases and the abstract section seems to invite additional use cases. The spec does say what the Access-Control-Allow-Origin header means. You're trying to modify that meaning. Consider things from a web author's point of view. The author develops a website, bunnies.com, which contains an HTML page which performs same-site, and thus trusted, XHR requests. The HTML page additionally exposes an API based on postMessage to allow parent frames to communicate with it. As specced, this isn't possible. Nothing from an appcache is ever run with the origin privileges of an origin other than the cache manifest's origin. Since the site exposes various useful HTTP APIs it further adds Access-Control-Allow-Origin: origin Access-Control-Allow-Credentials: true to a set of the URLs on the site. Including the url of the static HTML page. This is safe per CORS: since the HTML page is static, there is no information leakage that doesn't happen through a normal server-to-server request anyway. However, with the modification you are proposing, an attacker site could forever pin this page in the user's app-cache. This means that if there is a security bug in the page, the attacker site could exploit that security problem forever since any javascript in the page will continue to run in the security context of bunnies.com. So all of a sudden the CORS headers that the site added have now had a severe security impact. That's why I'm harping on the semantics. Another issue is that if a site *is* willing to allow resources to be pinned in the app-cache of another site, it might still not be willing to share the contents of those resources with everyone. If we reuse the existing CORS headers to express "is allowed to be app-cache pinned", then we can't satisfy that use case.
For example, a website could create an HTML page which embeds a user-specific key and exposes a postMessage-based API for third-party sites to encrypt/decrypt content using that user's key. To allow this to happen for offline apps it wants to allow the HTML page to be pinned in a third-party app-cache. But it doesn't want to expose the actual key to the third-party sites. If CORS was used to allow cache-pinning, this wouldn't be possible. Well this problem doesn't exist for HTML pages, since they wouldn't ever run from the appcache, so the above wouldn't work anyway. But your concern is valid for, e.g., an image: if we use CORS to allow pinning HTTPS resources, there'd be no way to allow an HTTPS resource to be pinned without granting read access to that resource as well. On Tue, 8 Feb 2011, Michael Nordman wrote: Just had an offline discussion about this and I think the answer can be much simpler than what's been proposed so far. All we have to do for cross-origin HTTPS resources is respect the cache-control no-store header. Let me explain the rationale...