Re: [whatwg] Persistent storage is critically flawed.
Ian Hickson wrote:
> Note that the problems you raise also exist (and have long existed) with cookies; at least the storage APIs default to a safe state in the general case instead of defaulting to an unsafe state.

In what way do the storage APIs default to a safe state? What unsafe state is the alternative? You've lost me. Compared to cookies, storage seems less safe: the default cookie access is to the setting host only, a case that does not even exist with global storage. To publish a cookie to a wider family of hosts the setting host must explicitly set a domain on the cookie. (Ditto path, but that turns out to be illusory protection due to the DOM same-origin policy.) Web-app developers aren't complaining that they can't read cookies they need from sub-domains; they're complaining when private cookies leak or when they're fooled by a cookie injected at a higher domain (e.g. .co.uk cookies).

Let me throw out two alternatives for providing private persistent storage, neither of which complicates the API (though they may complicate the implementation).

The first piggy-backs on the domain vs. host cookie concept as applied to entire storage objects. Each host would have a private persistent storage object that could only be accessed by that host; shared objects would need to be explicitly named. There should be a difference in how the two types are named:

a) Using the cookie domain nomenclature to indicate the similar concepts, www.foo.com could represent the host storage only accessible to that host, and a leading '.' in .www.foo.com would represent a shared storage area. You could argue that people will forget the dot as they do with cookie domains, but they only do so with cookies because UAs let them get away with it.

b) Another choice would be to make globalStorage[''] magic and mean the private storage for the current host.
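[A toy sketch of how the two naming conventions proposed above might behave. This is not any shipped API: the GlobalStorage class, its open() method, and the access rules are all invented here purely to illustrate the leading-dot convention from option (a) and the per-host private area.]

```javascript
// Hypothetical model of the proposal: an empty name (or the exact host)
// opens a private per-host area; a leading '.' names a shared area that
// any host under that domain may open.
class GlobalStorage {
  constructor() { this.areas = new Map(); }

  // 'host' is the origin host of the page making the access.
  open(host, name) {
    if (name === '' || name === host) {
      // Private area: keyed to the exact host, never shared.
      return this._area('host:' + host);
    }
    if (name.startsWith('.') && ('.' + host).endsWith(name)) {
      // Shared area: only hosts at or under the named domain qualify.
      return this._area('shared:' + name);
    }
    throw new Error('access denied: ' + host + ' -> ' + name);
  }

  _area(key) {
    if (!this.areas.has(key)) this.areas.set(key, new Map());
    return this.areas.get(key);
  }
}

const gs = new GlobalStorage();
gs.open('www.foo.com', '').set('secret', 1);        // private to www.foo.com
gs.open('www.foo.com', '.foo.com').set('pref', 2);  // shared under .foo.com
```

Under this sketch, forgetting the dot simply yields a private area rather than accidentally publishing data, which is the safe-by-default behaviour Dan is arguing for.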
No one is going to implement universally accessible storage (the spec even recommends against it), so you could just take that out of the spec and reuse it for this. All other named areas would be shared as described by the spec.

The second alternative would be to have private and shared storage items within a single storage area. I know you weren't keen on adding another attribute like secure; what if instead there was a convention such that keys which start with an underscore are private and can only be accessed if the current host matches the storage area domain?

My personal preference is for 1b -- use globalStorage[''] as the non-shared storage area.

-Dan Veditz
Re: [whatwg] Persistent storage is critically flawed.
Ian Hickson said (among other things):
> It seems that what you are suggesting is that foo.example.com cannot trust example.com, because example.com could then steal data from foo.example.com. But there's a much simpler attack scenario for example.com: it can just take over foo.example.com directly. For example, it could insert new HTML code containing script tags (which is exactly what geocities.com does today, for example!), or it could change the DNS entries (which is what, e.g., dyndns.org could do). There is an implicit trust relationship here already. There is no point making the storage APIs more secure than the DNS and Web servers they rely on. That would be like putting a $500 padlock on a paper screen.

I interpret this comment as: since there is already a hole in the hull of our boat, it doesn't matter if we drill some more. The proposal and your justification make too many assumptions about the [site owner / server owner / DNS provider] relationships and/or security that are unverifiable. If I run a server at books.jump.to then I accept that they COULD redirect my domain or even insert code, but I also expect that I could DETECT IT and possibly sue for breach of contract. That's the key flaw in your argument - all of the exploits above are easy to detect - but no hacking or tampering is required for an untrusted party to access shared global storage. All that's required is a single page anywhere on jump.to, at any time, performing a simple walk over the storage array - something which could easily be disguised as a legitimate action. That is the crux of my concern - not that the proposal allows new forms of abuse, but that it makes existing abuses easier to implement and harder to detect and remove.

I'm not going to respond to all your points individually since most amount to 'Sure there are problems, but UAs will fix them for us' or 'we'll fix it later'. I can only take your word for that.
Besides, most of the proposal's flaws can be resolved with something like the following:

== THE 'RIGHT' WAY TO STORE PRIVATE USER DATA ==

Remove ALL trust assumptions based on the domain name and use public/private certificates to sign data in and out of storage. This would also allow IP-based hosts to use storage. Remember, our objectives for persistent storage are simply:

- To store an object on a client and retrieve it later, even if the 'session' has since been closed.
- To allow trusted site(s) access to a previously stored client-side data object (such as a user document).

It's quite a simple requirement that is only complicated by the standard's lame definition of what a 'trusted' site is. I absolutely insist that trust can never be inferred from DNS or IP information. In fact, even the site author's own domain is somewhat suspect, since it can change ownership if not renewed (it happened to me when a registrar screwed me over). Therefore we need a system of credentials based on the site owner. Fortunately this is similar to the problem that SSL site certificates solve. Since we already have a way of obtaining and verifying certificates, it should not be a big stretch to extend this to private storage. It wouldn't even need to be as complex as an SSL cert, since we are only trying to establish that the site trying to access the key possesses the same private or group certificate as the site setting it. Provided each site can have multiple certs, all the requirements of the spec can be met without bleeding out data to arbitrary third parties and dodgy ISPs. Sure, your hosting service _could_ steal your private keys, but that is unlikely to go undetected for long and would qualify as a crime in most countries (forgery, theft, fraud - take your pick). Anyway, that's just a basic outline, but it is FUNDAMENTALLY better than the one the draft proposes.
It requires nothing from the UA except the ability to perform certificate validation, and nothing from the site author other than a way to generate and protect private certificates and send signed data. I could go into more detail and even draft a sample implementation should anyone be serious about pursuing this idea. On the other hand, if Jim is right and the authors of the storage proposal are really just pushing for a better user-tracking system under the guise of a user feature, then this argument is already over. Do whatever you like and I'll make sure it's turned off in my browser.

Shannon
Web Developer
Re: [whatwg] Persistent storage is critically flawed.
Ian Hickson wrote:
> This is mentioned in the Security and privacy section; the third bullet point here for example suggests blocking access to public storage areas: http://whatwg.org/specs/web-apps/current-work/#user-tracking

I did read the suggestions and I know the authors have given these issues thought. However, my concern is that the solutions are all 'suggestions' rather than rules. I believe the standard should be more definitive to eliminate the potential for browser inconsistencies.

> Yes, there's an entire section of the spec discussing this in detail, with suggested solutions.

Again, the key word here is 'suggest'.

> Indeed, the spec suggests blocking such access.

Suggest. See where I'm going with this? The spec is too loose.

> There generally is; but for the two cases where there are not, see: http://whatwg.org/specs/web-apps/current-work/#storage ...and: http://whatwg.org/specs/web-apps/current-work/#storage0 Basically, for the few cases where an author doesn't control his subdomain space, he should be careful. But this goes without saying. The same requirement (that authors be responsible) applies to all Web technologies, for example CGI script authors must be careful not to allow SQL injection attacks, must check Referer headers, must ensure POST/GET requests are handled appropriately, and so forth.

As I pointed out, this only gives control to the parent domain, not the child, without regard for the real-world political relationship between the two. Also, the implication here is that the 'parent' domain is more trustworthy and important than the child - that it should always be able to read a subdomain's private user data. The spec doesn't give the developer a chance to be responsible when it hands out user data to anybody in the domain hierarchy without regard for whether they are a single, trusted entity or not. Don't blame the programmer when the spec dictates who can read and write the data with no regard for the author's preferences.
CGI scripts generally do not have this limitation, so your analogy is irrelevant.

> Indeed; users of geocities.com shouldn't be using this service, and geocities themselves should put their data (if any) in a private subdomain space.

Geocities and other free-hosting sites generally have a low server-side storage allowance. This means these sites have a _greater_ need for persistent storage than 'real' domains.

> It doesn't. The solution for mysite.geocities.com is to get their own domain.

That's a bit presumptuous. In fact it's downright offensive. The user may have valid reasons for not buying a domain. Is it the WHATWG's role to dictate hosting requirements in a web standard?

> The spec was written in conjunction with UA vendors. It is realistic for UA vendors to provide a hardcoded list of TLDs; in fact, there is significant work underway to create such a list (and have it be regularly updated). That work was originally started for use for HTTP Cookie implementations, which have similar problems, but would be very useful for Storage API implementations (although, again as noted in the draft, not imperative for a secure implementation if the author is responsible).

I accept that such a list is probably the answer; however, I believe the list should itself be standardised before becoming part of a web standard - otherwise, more UA inconsistency.

> One could create much more complex APIs, naturally, but I do not see that this would solve the problems. It wouldn't solve the issue of authors who don't understand the security implications of their code, for instance. It also wouldn't prevent the security issue you mentioned -- why couldn't all *.geocities.com sites cooperate to violate the user's privacy? Or *.co.uk sites, for that matter?
> (Note that it is already possible today to do such tracking with cookies; in fact it's already possible today even without cookies if you use Referer tracking, and even without Referer tracking one can use IP and User-Agent fingerprinting combined with log analysis to perform quite thorough tracking.)

None of those techniques are reliable. My own weblogs show most users have the Referer field turned off. Cookies can be safely deleted after every session without a major impact on site function (I may have to log in again). IP tracking is mitigated by proxies and NATs. The trouble with this proposal is that it would allow important data to get lumped in with tracking data, when the spec suggests that UAs should only delete the storage when explicitly asked to do so. I don't have a solution to this other than to revoke this proposal or prevent the sharing of storage between sites. I accept tracking is inevitable, but we shouldn't be making it easier either.

> Certainly one could add a .readonly field or some such to storage data items, or even fully fledged ACL APIs, but I don't think that should be available in a first version, and I'm not sure it's really useful in later versions.
Re: [whatwg] Persistent storage is critically flawed.
On 28/08/06, Shannon Baker [EMAIL PROTECTED] wrote:
> I accept tracking is inevitable but we shouldn't be making it easier either.

You have to remember that the WHAT-WG individual is a Google employee, a company that now relies on accurate tracking of details, so don't be surprised that any proposal makes tracking easier and harder to circumvent. It's probably a design requirement. Of course, like all WHAT-WG stuff, there is no explanation of the problems that are attempting to be solved with any of it, so it's impossible to really know.

Jim.
Re: [whatwg] Persistent storage is critically flawed.
On 8/28/06, Jim Ley [EMAIL PROTECTED] wrote:
> On 28/08/06, Shannon Baker [EMAIL PROTECTED] wrote:
> > I accept tracking is inevitable but we shouldn't be making it easier either.
> You have to remember that the WHAT-WG individual is a Google employee, a company that now relies on accurate tracking of details, so don't be surprised that any proposal makes tracking easier and harder to circumvent.

Well, if the WHAT-WG individual wasn't a Google employee, but an employee of Microsoft or Mozilla or Opera or any random government, would that change the above text? I don't think so. So I don't think that text is implying much, other than that there aren't very many 'neutral' organizations involved in writing specifications for the web.

> It's probably a design requirement, of course like all WHAT-WG stuff, there is no explanation of the problems that are attempting to be solved with any of the stuff, so it's impossible to really know.

From http://www.whatwg.org/specs/web-apps/current-work/#introduction2:

"The first is designed for scenarios where the user is carrying out a single transaction, but could be carrying out multiple transactions in different windows at the same time. Cookies don't really handle this case well. For example, a user could be buying plane tickets in two different windows, using the same site. If the site used cookies to keep track of which ticket the user was buying, then as the user clicked from page to page in both windows, the ticket currently being purchased would leak from one window to the other, potentially causing the user to buy two tickets for the same flight without really noticing."

and:

"The second storage mechanism is designed for storage that spans multiple windows, and lasts beyond the current session. In particular, Web applications may wish to store megabytes of user data, such as entire user-authored documents or a user's mailbox, on the client side for performance reasons."
"Again, cookies do not handle this case well, because they are transmitted with every request."

Those seem to me to be two use cases of problems that are attempting to be solved, no?

Regards,
Martijn
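[The plane-ticket scenario quoted from the spec reduces to a simple toy model: a cookie jar is one shared mutable store per site, while a session store is one store per window. Everything below — the variable names, the Map-based stores — is illustrative only, not the actual browser machinery.]

```javascript
// One cookie jar is shared by every window on the same site...
const cookieJar = new Map();
// ...whereas each window gets its own session store.
const sessionStore = () => new Map();

const windowA = sessionStore();
const windowB = sessionStore();

// Using cookies: window B's ticket choice silently clobbers window A's.
cookieJar.set('ticket', 'FLIGHT-101'); // set while browsing in window A
cookieJar.set('ticket', 'FLIGHT-202'); // set while browsing in window B

// Using per-window session storage: each window keeps its own ticket.
windowA.set('ticket', 'FLIGHT-101');
windowB.set('ticket', 'FLIGHT-202');
```

After the cookie writes, both windows see 'FLIGHT-202', which is exactly the leak the spec's first use case describes; the per-window stores do not interfere.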
Re: [whatwg] Persistent storage is critically flawed.
On Mon, 28 Aug 2006, Shannon Baker wrote:
> > This is mentioned in the Security and privacy section; the third bullet point here for example suggests blocking access to public storage areas: http://whatwg.org/specs/web-apps/current-work/#user-tracking
> I did read the suggestions and I know the authors have given these issues thought. However, my concern is that the solutions are all 'suggestions' rather than rules. I believe the standard should be more definitive to eliminate the potential for browser inconsistencies.

The problem is that the solution is to use a list that doesn't exist yet. If the list existed and was firmly established and proved usable, then we could require its use, but since it is still being developed (by the people trying to implement the Storage APIs), we can't really require it.

> > Basically, for the few cases where an author doesn't control his subdomain space, he should be careful. But this goes without saying. The same requirement (that authors be responsible) applies to all Web technologies, for example CGI script authors must be careful not to allow SQL injection attacks, must check Referer headers, must ensure POST/GET requests are handled appropriately, and so forth.
> As I pointed out this only gives control to the parent domain, not the child, without regard for the real-world political relationship between the two. Also the implication here is that the 'parent' domain is more trustworthy and important than the child - that it should always be able to read a subdomain's private user data. The spec doesn't give the developer a chance to be responsible when it hands out user data to anybody in the domain hierarchy without regard for whether they are a single, trusted entity or not. Don't blame the programmer when the spec dictates who can read and write the data with no regard for the author's preferences. CGI scripts generally do not have this limitation, so your analogy is irrelevant.
It seems that what you are suggesting is that foo.example.com cannot trust example.com, because example.com could then steal data from foo.example.com. But there's a much simpler attack scenario for example.com: it can just take over foo.example.com directly. For example, it could insert new HTML code containing script tags (which is exactly what geocities.com does today, for example!), or it could change the DNS entries (which is what, e.g., dyndns.org could do). There is an implicit trust relationship here already. There is no point making the storage APIs more secure than the DNS and Web servers they rely on. That would be like putting a $500 padlock on a paper screen.

> > Indeed; users of geocities.com shouldn't be using this service, and geocities themselves should put their data (if any) in a private subdomain space.
> Geocities and other free-hosting sites generally have a low server-side storage allowance. This means these sites have a _greater_ need for persistent storage than 'real' domains.

They can use it if they want. It just won't be secure. This is true regardless of how we design the API, since the Web server can insert arbitrary content into their site.

> > It doesn't. The solution for mysite.geocities.com is to get their own domain.
> That's a bit presumptuous. In fact it's downright offensive. The user may have valid reasons for not buying a domain. Is it the WHATWG's role to dictate hosting requirements in a web standard?

I'm just stating a fact of life. If you want a secure data storage mechanism, you don't host your site on a system where you don't trust the hosting provider.

> I accept that such a list is probably the answer, however I believe the list should itself be standardised before becoming part of a web standard - otherwise more UA inconsistency.

I think we should change the spec once the list is ready, yes. This isn't yet the case, though.
In the meantime, I don't think it's wise for us to restrict the possible security solutions; a UA vendor might come up with a better (and more scalable) solution. Note that the problems you raise also exist (and have long existed) with cookies; at least the storage APIs default to a safe state in the general case instead of defaulting to an unsafe state.

One could create much more complex APIs, naturally, but I do not see that this would solve the problems. It wouldn't solve the issue of authors who don't understand the security implications of their code, for instance. It also wouldn't prevent the security issue you mentioned -- why couldn't all *.geocities.com sites cooperate to violate the user's privacy? Or *.co.uk sites, for that matter?

(Note that it is already possible today to do such tracking with cookies; in fact it's already possible today even without cookies if you use Referer tracking, and even without Referer tracking one can use IP and User-Agent fingerprinting combined with log analysis to perform quite thorough tracking.)
Re: [whatwg] Persistent storage is critically flawed.
On Sun, 27 Aug 2006 19:11:17 +0700, Shannon Baker [EMAIL PROTECTED] wrote:
> But why bother? This whole problem is easily solved by allowing data to be stored with an access control list (ACL). For example the site developer should be able to specify that a data object be available to '*.example.com' and 'fred.geocities.com' only. How this is done (as a string or array) is irrelevant to this post, but it must be done rather than relying on implicit trust where none exists.

While there are serious risks associated with global storage, I don't see how replacing the global storage with arbitrary ACLs on data items will help reduce them. All those advertisers etc. can store a data item accessible to *, can't they?

-- 
Alexey Feldgendler [EMAIL PROTECTED]
[ICQ: 115226275] http://feldgendler.livejournal.com
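[The ACL matching being debated might look like the following sketch. The pattern format ('*', '*.example.com', exact host) and the function name are assumptions made for illustration; no spec defines them. Note that it also demonstrates Alexey's objection: an ACL of ['*'] is exactly as open as the global store.]

```javascript
// Hypothetical per-item ACL check: does 'host' appear in the ACL?
function aclAllows(acl, host) {
  return acl.some(pattern => {
    if (pattern === '*') return true;         // public to everyone
    if (pattern.startsWith('*.')) {
      // '*.example.com' matches any host strictly under example.com
      return host.endsWith(pattern.slice(1));
    }
    return host === pattern;                  // exact host only
  });
}
```

So ACLs let a careful author scope data tightly, but nothing stops a tracker from choosing the widest possible scope, which is Alexey's point.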
Re: [whatwg] Persistent storage is critically flawed.
On 8/27/06, Shannon Baker [EMAIL PROTECTED] wrote:
> == 1: Authors' failure to handle the implications of global storage. ==
> First let's talk about the global store (globalStorage['']) which is accessible from ALL domains.

This is mentioned in the Security and privacy section; the third bullet point here for example suggests blocking access to public storage areas: http://whatwg.org/specs/web-apps/current-work/#user-tracking

> Did anyone stop to really consider the implications of this? I mean, sure, the standard implies that UAs should deal with the security implications of this themselves, but what if they don't? Let's say a UA does allow access to this global storage; what would we expect to find in this storage space? Does the author really believe that this will only be used for sharing preferences between domains for the benefit of the user? Hell no! It's going to look like this:
>
> KEY                    VALUE
> adsense3wd4ghgtut9jhn  kjh234kj23u4y2j34234hkj234hkj23h4k234k234  -- Advertiser user tracking
> johnyizcool            I Kickerz Azz!!                            -- Attention freak
> USconspiracy           911 was an inside job. Tell everybody!     -- Political activist
> UScitID                kh546jkh45856456h45iu6y46j45j6h54kj6h45k6  -- Government spying
> GodsLove.com           Warning! This user supports abortion.      -- Vigilante user tracking

Yes, there's an entire section of the spec discussing this in detail, with suggested solutions.

> What possible use could this storage region ever have to a legitimate site? Especially when sensible UAs will just block it anyway? I for one do not want my browser becoming some sort of global 'graffiti wall' written on by every website I visit. Truthfully I cannot come up with a single legitimate use for the 'global' or 'com' regions that cannot be handled by per-domain storage or global storage with ACLs (see next point).

Indeed, the spec suggests blocking such access.

> == 2: Naive access controls which will result in guaranteed privacy violations.
> == The standard advocates the two-way sharing of data between domains and subdomains - namely that host.example.com should share data with the servers at 'www.host.example.com', 'example.com', and all servers rooted at '.com'. In its own words: "Each domain and each subdomain has its own separate storage area. Subdomains can access the storage areas of parent domains, and domains can access the storage areas of subdomains." My objection to this is similar to my objection to the 'global' storage space - it's totally naive. The whole scheme is based on the unfounded belief that there is a guaranteed trust relationship available between the parties controlling each of these domains.

There generally is; but for the two cases where there are not, see: http://whatwg.org/specs/web-apps/current-work/#storage ...and: http://whatwg.org/specs/web-apps/current-work/#storage0

Basically, for the few cases where an author doesn't control his subdomain space, he should be careful. But this goes without saying. The same requirement (that authors be responsible) applies to all Web technologies; for example, CGI script authors must be careful not to allow SQL injection attacks, must check Referer headers, must ensure POST/GET requests are handled appropriately, and so forth.

> Sure, one may be reliant on another for DNS redirection, but that hardly implies that one wishes to share potentially confidential data with the other. As the author themselves stated, there is no guarantee that users of geocities.com sub-domains wish their users' data to be shared with GeoCities.

Indeed; users of geocities.com shouldn't be using this service, and geocities themselves should put their data (if any) in a private subdomain space.

> The author states that geocities could mitigate this risk with a fake sub-domain, but how does that help the owner of mysite.geocities.com?

It doesn't. The solution for mysite.geocities.com is to get their own domain.
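[The access rule quoted from the draft — subdomains can reach parent-domain areas and vice versa — can be sketched mechanically. The function names below are mine, not the spec's; they only illustrate why mysite.geocities.com and geocities.com end up sharing areas under this rule.]

```javascript
// List the storage areas a host can reach under the quoted rule:
// its own area plus every ancestor domain's area.
function ancestorAreas(host) {
  const labels = host.split('.');
  const areas = [];
  for (let i = 0; i < labels.length; i++) {
    areas.push(labels.slice(i).join('.'));
  }
  return areas;
}

// Two hosts can reach each other's areas whenever one is a
// domain-suffix (ancestor) of the other.
function canShare(a, b) {
  return ancestorAreas(a).includes(b) || ancestorAreas(b).includes(a);
}
```

Note that while two sibling subdomains cannot reach each other's areas directly, both can reach the shared parent area (here 'geocities.com', and even 'com'), which is exactly the implicit-trust channel Shannon objects to.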
> The author implies that UAs should deal with this themselves and fails to provide any REALISTIC guidelines for them to do so (sure, let's hardcode all the TLDs and free hosting providers).

The spec was written in conjunction with UA vendors. It is realistic for UA vendors to provide a hardcoded list of TLDs; in fact, there is significant work underway to create such a list (and have it be regularly updated). That work was originally started for use for HTTP Cookie implementations, which have similar problems, but would be very useful for Storage API implementations (although, again as noted in the draft, not imperative for a secure implementation if the author is responsible).

> What annoys me is that the author acknowledges the issue and then passes the buck to browser manufacturers as though it's their problem and they should solve it in any (incompatible or non-compliant) way they like.

Any solution must be compliant, by definition; regarding compatibility, it