Re: [whatwg] Persistent storage is critically flawed.

2006-09-04 Thread Daniel Veditz
Ian Hickson wrote:
 Note that the problems you raise also exist (and have long existed) with 
 cookies; at least the storage APIs default to a safe state in the general 
 case instead of defaulting to an unsafe state.

In what way do the storage APIs default to a safe state? What unsafe
state is the alternative? You've lost me.

Compared to cookies, storage seems less safe: the default cookie access
is to the setting host only, a case that does not even exist with global
storage. To publish a cookie to a wider family of hosts, the setting host
must explicitly set a domain on the cookie. (Ditto for path, but that turns
out to be illusory protection due to the DOM same-origin policy.)
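
For illustration, the distinction in script (this is just standard
cookie behaviour):

    // Host-only cookie: no domain attribute, so it goes back to
    // www.foo.com and nowhere else.
    document.cookie = "pref=compact";
    // Domain cookie: the setting host opts in explicitly to share it
    // with the whole *.foo.com family.
    document.cookie = "pref=compact; domain=.foo.com";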

Web-app developers aren't complaining that they can't read the cookies
they need from sub-domains; they're complaining when private cookies leak
or when they're fooled by a cookie injected at a higher domain (e.g. .co.uk
cookies).

Let me throw out two alternatives for providing private persistent
storage, neither of which complicates the API (though they may complicate
the implementation).

The first piggy-backs on the domain-vs-host cookie concept as applied to
entire storage objects. Each host would have a private persistent
storage object that could only be accessed by that host; shared objects
would need to be explicitly named. There should be a difference in how
the two types are named (see the sketch after this list):
  a) Use the cookie domain nomenclature to indicate the similar
     concepts: www.foo.com would represent the host storage, only
     accessible to that host, while a leading '.' in .www.foo.com
     would represent a shared storage area. You could argue that
     people will forget the dot as they do with cookie domains,
     but they only do with cookies because UAs let them get away
     with it.
  b) Another choice would be to make globalStorage[''] magic and
     mean the private storage for the current host. No one is going
     to implement universally accessible storage (the spec even
     recommends against it), so you could just take that out of the
     spec and reuse it for this. All other named areas would be
     shared as described by the spec.
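
In script, the two namings might read roughly like this (hypothetical
semantics sketched against the draft's globalStorage object; none of
this is in the spec today):

    // (a) cookie-style nomenclature:
    globalStorage['www.foo.com']    // host storage, this host only
    globalStorage['.www.foo.com']   // shared area, note the leading dot

    // (b) reuse the universal area as the private one:
    globalStorage['']               // private storage for the current host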

The second alternative would be to have private and shared storage items
within a single storage area. I know you weren't keen on adding another
attribute like 'secure'; what if instead there were a convention, such as:
keys which start with an underscore are private and can only be
accessed if the current host matches the storage area's domain? (See the
sketch below.)
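
Roughly, under that convention (again hypothetical, assuming the
draft's Storage interface):

    var area = globalStorage['foo.com'];
    area.setItem('_token', 'abc123');  // private: readable only when the
                                       // current host is exactly foo.com
    area.setItem('theme', 'dark');     // shared, as the spec already allows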

My personal preference is for 1b -- use globalStorage[''] as the
non-shared storage area.

-Dan Veditz




Re: [whatwg] Persistent storage is critically flawed.

2006-08-29 Thread Shannon Baker

Ian Hickson said (among other things):
It seems that what you are suggesting is that foo.example.com cannot trust 
example.com, because example.com could then steal data from 
foo.example.com. But there's a much simpler attack scenario for 
example.com: it can just take over foo.example.com directly. For example, 
it could insert new HTML code containing script tags (which is exactly 
what geocities.com does today, for example!), or it could change the DNS 
entries (which is what, e.g., dyndns.org could do).


There is an implicit trust relationship here already. There is no point 
making the storage APIs more secure than the DNS and Web servers they rely 
on. That would be like putting a $500 padlock on a paper screen.


I interpret this comment as: since there is already a hole in the hull
of our boat, it doesn't matter if we drill some more. The proposal and
your justification make too many assumptions about the site owner /
server owner / DNS provider relationships and/or security, assumptions
that are unverifiable. If I run a server at books.jump.to then I accept
that they COULD redirect my domain or even insert code, but I also expect
that I could DETECT IT and possibly sue for breach of contract. That's
the key flaw in your argument: all of the exploits above are easy to
detect, but no hacking or tampering is required for an untrusted party to
access shared global storage. All that's required is a single page
anywhere on jump.to, at any time, performing a simple walk over the
storage array - something which could easily be disguised as a legitimate
action. That is the crux of my concern - not that the proposal allows new
forms of abuse, but that it makes existing abuses easier to implement and
harder to detect and remove.


I'm not going to respond to all your points individually, since most
amount to 'sure there are problems, but UAs will fix them for us' or
'we'll fix it later'. I can only take your word for that. Besides, most
of the proposal's flaws can be resolved with something like the following:


== THE 'RIGHT' WAY TO STORE PRIVATE USER DATA ==
Remove ALL trust assumptions based on the domain name and use
public/private certificates to sign data in and out of storage. This
would also allow IP-based hosts to use storage. Remember, our objectives
for persistent storage are simply:


- To store an object on a client and retrieve it later, even if the
  'session' has since been closed.
- To allow trusted site(s) access to a previously stored client-side
  data object (such as a user document).


It's quite a simple requirement that is only complicated by the
standard's lame definition of what a 'trusted' site is. I absolutely
insist that trust can never be inferred from DNS or IP information. In
fact even the site author's own domain is somewhat suspect, since it can
change ownership if not renewed (it happened to me when a registrar
screwed me over). Therefore we need a system of credentials based on the
site owner. Fortunately this is similar to the problem that SSL site
certificates solve. Since we already have a way of obtaining and
verifying certificates, it should not be a big stretch to extend this to
private storage. It wouldn't even need to be as complex as an SSL cert,
since we are only trying to establish that the site trying to access the
key possesses the same private or group certificate as the site that set
it. Provided each site can have multiple certs, all the requirements
of the spec can be met without bleeding out data to arbitrary
third parties and dodgy ISPs. Sure, your hosting service _could_ steal
your private keys, but that is unlikely to go undetected for long and
would qualify as a crime in most countries (forgery, theft, fraud - take
your pick).
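
To make the shape concrete, here is a rough sketch in script. sign()
and verify() are hypothetical stand-ins for whatever certificate
machinery the UA would expose (the toy stubs below only make the sketch
self-contained); the point is that access turns on possession of the
certificate, not on the domain name:

    // Hypothetical: storage access gated on certificates, not DNS.
    function storeSigned(area, key, value, cert) {
      area.setItem(key, sign(cert, value));
    }
    function readSigned(area, key, cert) {
      // Yields the value only if the reader holds the same private
      // or group certificate that signed it.
      return verify(cert, area.getItem(key));
    }

    // Toy stubs standing in for real UA-side crypto:
    function sign(cert, value) { return cert + ':' + value; }
    function verify(cert, raw) {
      var prefix = cert + ':';
      return raw && raw.indexOf(prefix) === 0 ? raw.slice(prefix.length) : null;
    }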


Anyway, that's just a basic outline, but it is FUNDAMENTALLY better than
the one the draft proposes. It requires nothing from the UA except the
ability to perform certificate validation, and nothing from the site
author other than a way to generate and protect private certificates and
send signed data. I could go into more detail and even draft a sample
implementation should anyone be serious about pursuing this idea.


On the other hand, if Jim is right and the authors of the storage
proposal are really just pushing for a better user-tracking system under
the guise of a user feature, then this argument is already over. Do
whatever you like and I'll make sure it's turned off in my browser.


Shannon
Web Developer








Re: [whatwg] Persistent storage is critically flawed.

2006-08-28 Thread Shannon Baker

Ian Hickson wrote:


This is mentioned in the Security and privacy section; the third
bullet point here for example suggests blocking access to public
storage areas:

  http://whatwg.org/specs/web-apps/current-work/#user-tracking

I did read the suggestions and I know the authors have given these 
issues thought. However, my concern is that the solutions are all 
'suggestions' rather than rules. I believe the standard should be more 
definitive to eliminate the potential for browser inconsistencies.



Yes, there's an entire section of the spec discussing this in detail,
with suggested solutions.


Again, the key word here is 'suggest'.


Indeed, the spec suggests blocking such access.


Suggest. See where I'm going with this? The spec is too loose.


There generally is; but for the two cases where there are not, see:

  http://whatwg.org/specs/web-apps/current-work/#storage

...and:

  http://whatwg.org/specs/web-apps/current-work/#storage0

Basically, for the few cases where an author doesn't control his
subdomain space, he should be careful. But this goes without saying.
The same requirement (that authors be responsible) applies to all Web
technologies, for example CGI script authors must be careful not to
allow SQL injection attacks, must check Referer headers, must ensure
POST/GET requests are handled appropriately, and so forth.

As I pointed out, this only gives control to the parent domain, not the
child, without regard for the real-world political relationship between
the two. Also, the implication here is that the 'parent' domain is more
trustworthy and important than the child - that it should always be able
to read a subdomain's private user data. The spec doesn't give the
developer a chance to be responsible when it hands out user data to
anybody in the domain hierarchy without regard for whether they are a
single, trusted entity or not. Don't blame the programmer when the spec
dictates who can read and write the data with no regard for the author's
preferences. CGI scripts generally do not have this limitation, so your
analogy is irrelevant.
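
Concretely, as I read the draft, nothing stops this (a sketch assuming
the draft's globalStorage semantics; the key name is made up):

    // Script running on a page served from example.com:
    var childArea = globalStorage['foo.example.com'];
    // Permitted by the spec: domains can access the storage areas
    // of their subdomains.
    var data = childArea.getItem('privateUserData');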



Indeed; users at geocities.com shouldn't be using this service, and
geocities themselves should put their data (if any) in a private
subdomain space.

Geocities and other free-hosting sites generally have a low server-side
storage allowance. This means these sites have a _greater_ need for
persistent storage than 'real' domains.


It doesn't. The solution for mysite.geocities.com is to get their own
domain.

That's a bit presumptuous. In fact it's downright offensive. The user
may have valid reasons for not buying a domain. Is it the WHATWG's role
to dictate hosting requirements in a web standard?



The spec was written in conjunction with UA vendors. It is realistic
for UA vendors to provide a hardcoded list of TLDs; in fact, there is
significant work underway to create such a list (and have it be
regularly updated). That work was originally started for use in HTTP
Cookie implementations, which have similar problems, but would be very
useful for Storage API implementations (although, again as noted in
the draft, not imperative for a secure implementation if the author is
responsible).

I accept that such a list is probably the answer; however, I believe the
list should itself be standardised before becoming part of a web
standard - otherwise, more UA inconsistency.



One could create much more complex APIs, naturally, but I do not see
that this would solve the problems. It wouldn't solve the issue of
authors who don't understand the security implications of their code,
for instance. It also wouldn't prevent the security issue you
mentioned -- why couldn't all *.geocities.com sites cooperate to
violate the user's privacy? Or *.co.uk sites, for that matter? (Note
that it is already possible today to do such tracking with cookies; in
fact it's already possible today even without cookies if you use
Referer tracking, and even without Referer tracking one can use IP and
User-Agent fingerprinting combined with log analysis to perform quite
thorough tracking.)

None of those techniques are reliable. My own weblogs show most users
have the referer field turned off. Cookies can be safely deleted after
every session without a major impact on site function (I may have to
log in again). IP tracking is mitigated by proxies and NATs. The trouble
with this proposal is that it would allow important data to get lumped
in with tracking data, when the spec suggests that UAs should only
delete the storage when explicitly asked to do so. I don't have a
solution to this other than to revoke this proposal or prevent the
sharing of storage between sites. I accept tracking is inevitable, but
we shouldn't be making it easier either.



Certainly one could add a .readonly field or some such to storage data
items, or even fully fledged ACL APIs, but I don't think that should
be available in a first version, and I'm not sure it's really useful
in later versions 

Re: [whatwg] Persistent storage is critically flawed.

2006-08-28 Thread Jim Ley

On 28/08/06, Shannon Baker [EMAIL PROTECTED] wrote:

I accept tracking is inevitable but we
shouldn't be making it easier either.


You have to remember that the WHAT-WG individual is an employee of
Google, a company that now relies on accurate tracking of details, so
don't be surprised if any proposal makes tracking easier and harder to
circumvent.

It's probably a design requirement. Of course, like all WHAT-WG stuff,
there is no explanation of the problems that are attempting to be
solved with any of the stuff, so it's impossible to really know.

Jim.


Re: [whatwg] Persistent storage is critically flawed.

2006-08-28 Thread Martijn

On 8/28/06, Jim Ley [EMAIL PROTECTED] wrote:

On 28/08/06, Shannon Baker [EMAIL PROTECTED] wrote:
 I accept tracking is inevitable but we
 shouldn't be making it easier either.

You have to remember that the WHAT-WG individual is an employee of
Google, a company that now relies on accurate tracking of details, so
don't be surprised if any proposal makes tracking easier and harder to
circumvent.


Well, if the WHAT-WG individual wasn't a Google employee, but an
employee of Microsoft or Mozilla or Opera or any random government,
would that change the above text? I don't think so. So I don't think
that text is implying much, other than that there aren't very many
'neutral' organizations involved in writing specifications for the
web.


It's probably a design requirement. Of course, like all WHAT-WG stuff,
there is no explanation of the problems that are attempting to be
solved with any of the stuff, so it's impossible to really know.


From:
http://www.whatwg.org/specs/web-apps/current-work/#introduction2

The first is designed for scenarios where the user is carrying out a
single transaction, but could be carrying out multiple transactions in
different windows at the same time.

Cookies don't really handle this case well. For example, a user could
be buying plane tickets in two different windows, using the same site.
If the site used cookies to keep track of which ticket the user was
buying, then as the user clicked from page to page in both windows,
the ticket currently being purchased would leak from one window to
the other, potentially causing the user to buy two tickets for the
same flight without really noticing.


and:

The second storage mechanism is designed for storage that spans
multiple windows, and lasts beyond the current session. In particular,
Web applications may wish to store megabytes of user data, such as
entire user-authored documents or a user's mailbox, on the client side,
for performance reasons.

Again, cookies do not handle this case well, because they are
transmitted with every request.


Those seem to me to be two use cases of problems that are attempting to
be solved, no?
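
The first case maps onto something like this in script (a minimal
sketch, assuming the draft's sessionStorage object and its Storage
interface; the key name is just for illustration):

    // Each window has its own sessionStorage, so two windows open on
    // the same site can track different tickets without interfering.
    sessionStorage.setItem('currentTicket', 'ticket-38271');
    // Later, visible in this window only:
    var ticket = sessionStorage.getItem('currentTicket');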

Regards,
Martijn





Re: [whatwg] Persistent storage is critically flawed.

2006-08-28 Thread Ian Hickson
On Mon, 28 Aug 2006, Shannon Baker wrote:
  
  This is mentioned in the Security and privacy section; the third 
  bullet point here for example suggests blocking access to public 
  storage areas:
  
http://whatwg.org/specs/web-apps/current-work/#user-tracking
 
 I did read the suggestions and I know the authors have given these 
 issues thought. However, my concern is that the solutions are all 
 'suggestions' rather than rules. I believe the standard should be more 
 definitive to eliminate the potential for browser inconsistencies.

The problem is that the solution is to use a list that doesn't exist yet. 
If the list existed and was firmly established and proved usable, then we 
could require its use, but since it is still being developed (by the 
people trying to implement the Storage APIs), we can't really require it.


  Basically, for the few cases where an author doesn't control his 
  subdomain space, he should be careful. But this goes without saying. 
  The same requirement (that authors be responsible) applies to all Web 
  technologies, for example CGI script authors must be careful not to 
  allow SQL injection attacks, must check Referer headers, must ensure 
  POST/GET requests are handled appropriately, and so forth.
 
 As I pointed out, this only gives control to the parent domain, not the 
 child, without regard for the real-world political relationship between 
 the two. Also, the implication here is that the 'parent' domain is more 
 trustworthy and important than the child - that it should always be able 
 to read a subdomain's private user data. The spec doesn't give the 
 developer a chance to be responsible when it hands out user data to 
 anybody in the domain hierarchy without regard for whether they are a 
 single, trusted entity or not. Don't blame the programmer when the spec 
 dictates who can read and write the data with no regard for the author's 
 preferences. CGI scripts generally do not have this limitation, so your 
 analogy is irrelevant.

It seems that what you are suggesting is that foo.example.com cannot trust 
example.com, because example.com could then steal data from 
foo.example.com. But there's a much simpler attack scenario for 
example.com: it can just take over foo.example.com directly. For example, 
it could insert new HTML code containing script tags (which is exactly 
what geocities.com does today, for example!), or it could change the DNS 
entries (which is what, e.g., dyndns.org could do).

There is an implicit trust relationship here already. There is no point 
making the storage APIs more secure than the DNS and Web servers they rely 
on. That would be like putting a $500 padlock on a paper screen.


  Indeed; users at geocities.com shouldn't be using this service, and 
  geocities themselves should put their data (if any) in a private 
  subdomain space.

 Geocities and other free-hosting sites generally have a low server-side 
 storage allowance. This means these sites have a _greater_ need for 
 persistent storage than 'real' domains.

They can use it if they want. It just won't be secure. This is true 
regardless of how we design the API, since the Web server can insert 
arbitrary content into their site.


  It doesn't. The solution for mysite.geocities.com is to get their own 
  domain.

 That's a bit presumptuous. In fact it's downright offensive. The user 
 may have valid reasons for not buying a domain. Is it the WHATWG's role 
 to dictate hosting requirements in a web standard?

I'm just stating a fact of life. If you want a secure data storage 
mechanism, you don't host your site on a system where you don't trust the 
hosting provider.


 I accept that such a list is probably the answer; however, I believe the 
 list should itself be standardised before becoming part of a web 
 standard - otherwise, more UA inconsistency.

I think we should change the spec once the list is ready, yes. This isn't 
yet the case, though. In the meantime, I don't think it's wise for us to 
restrict the possible security solutions; a UA vendor might come up with a 
better (and more scalable) solution.

Note that the problems you raise also exist (and have long existed) with 
cookies; at least the storage APIs default to a safe state in the general 
case instead of defaulting to an unsafe state.


  One could create much more complex APIs, naturally, but I do not see 
  that this would solve the problems. It wouldn't solve the issue of 
  authors who don't understand the security implications of their code, 
  for instance. It also wouldn't prevent the security issue you 
  mentioned -- why couldn't all *.geocities.com sites cooperate to 
  violate the user's privacy? Or *.co.uk sites, for that matter? (Note 
  that it is already possible today to do such tracking with cookies; in 
  fact it's already possible today even without cookies if you use 
  Referer tracking, and even without Referer tracking one can use IP and 
  User-Agent fingerprinting combined with 

Re: [whatwg] Persistent storage is critically flawed.

2006-08-27 Thread Alexey Feldgendler
On Sun, 27 Aug 2006 19:11:17 +0700, Shannon Baker [EMAIL PROTECTED] wrote:

 But why bother? This whole problem is easily solved by allowing data to
 be stored with an access control list (ACL). For example the site
 developer should be able to specify that a data object be available to
 '*.example.com' and 'fred.geocities.com' only. How this is done (as a
 string or array) is irrelevant to this post but it must be done rather
 than relying on implicit trust where none exists.

While there are serious risks associated with global storage, I don't see
how replacing the global storage with arbitrary ACLs on data items will
help reduce them. All those advertisers etc. can store a data item
accessible to *, can't they?
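
For instance, with a hypothetical ACL-taking setItem (no such third
argument exists in the draft; this is only the shape of the proposal as
I read it), nothing stops the advertiser case:

    // Readable only by the listed origins, per the proposal:
    storage.setItem('profile', data, ['*.example.com', 'fred.geocities.com']);
    // ...but an advertiser can just as easily opt everyone in:
    storage.setItem('trackerID', id, ['*']);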


-- 
Alexey Feldgendler [EMAIL PROTECTED]
[ICQ: 115226275] http://feldgendler.livejournal.com


Re: [whatwg] Persistent storage is critically flawed.

2006-08-27 Thread Ian Hickson

On 8/27/06, Shannon Baker [EMAIL PROTECTED] wrote:


== 1: Authors' failure to handle the implications of global storage. ==
First let's talk about the global store (globalStorage['']), which is
accessible from ALL domains.


This is mentioned in the Security and privacy section; the third
bullet point here for example suggests blocking access to public
storage areas:

  http://whatwg.org/specs/web-apps/current-work/#user-tracking



Did anyone stop to really consider the implications of this? I mean,
sure, the standard implies that UAs should deal with the security
implications of this themselves, but what if they don't? Let's say a UA
does allow access to this global storage: what would we expect to find
in this storage space? Does the author really believe that this will
only be used for sharing preferences between domains for the benefit of
the user? Hell no! It's going to look like this:

KEY            VALUE
adsense        3wd4ghgtut9jhnkjh234kj23u4y2j34234hkj234hkj23h4k234k234
                                          -- Advertiser user tracking
johnyizcool    I Kickerz Azz!!            -- Attention freak
USconspiracy   911 was an inside job. Tell everybody!
                                          -- Political activist
UScitID        kh546jkh45856456h45iu6y46j45j6h54kj6h45k6
                                          -- Government spying
GodsLove.com   Warning! This user supports abortion.
                                          -- Vigilante user tracking


Yes, there's an entire section of the spec discussing this in detail,
with suggested solutions.



What possible use could this storage region ever have to a legitimate
site? Especially when sensible UAs will just block it anyway? I for one
do not want my browser becoming some sort of global 'graffiti wall',
written on by every website I visit. Truthfully, I cannot come up with a
single legitimate use for the 'global' or 'com' regions that cannot be
handled by per-domain storage or global storage with ACLs (see next point).


Indeed, the spec suggests blocking such access.



== 2: Naive access controls which will result in guaranteed privacy
violations. ==
The standard advocates the two-way sharing of data between domains and
subdomains - namely, that host.example.com should share data with the
servers at 'www.host.example.com', 'example.com', and all servers rooted
at '.com'. In its own words: "Each domain and each subdomain has its own
separate storage area. Subdomains can access the storage areas of parent
domains, and domains can access the storage areas of subdomains."

My objection to this is similar to my objection to the 'global' storage
space - It's totally naive. The whole scheme is based on the unfounded
belief that there is a guaranteed trust relationship available between
the parties controlling each of these domains.


There generally is; but for the two cases where there are not, see:

  http://whatwg.org/specs/web-apps/current-work/#storage

...and:

  http://whatwg.org/specs/web-apps/current-work/#storage0

Basically, for the few cases where an author doesn't control his
subdomain space, he should be careful. But this goes without saying.
The same requirement (that authors be responsible) applies to all Web
technologies, for example CGI script authors must be careful not to
allow SQL injection attacks, must check Referer headers, must ensure
POST/GET requests are handled appropriately, and so forth.



Sure, one may be reliant
on another for DNS redirection, but that hardly implies that one wishes
to share potentially confidential data with the other. As the author
themselves stated, there is no guarantee that users of geocities.com
sub-domains wish their users' data to be shared with GeoCities.


Indeed; users at geocities.com shouldn't be using this service, and
geocities themselves should put their data (if any) in a private
subdomain space.



The
author states that geocities could mitigate this risk with a fake
sub-domain, but how does that help the owner of mysite.geocities.com?


It doesn't. The solution for mysite.geocities.com is to get their own domain.



The
author implies that UAs should deal with this themselves and fails to
provide any REALISTIC guidelines for them to do so (sure, let's hardcode
all the TLDs and free hosting providers).


The spec was written in conjunction with UA vendors. It is realistic
for UA vendors to provide a hardcoded list of TLDs; in fact, there is
significant work underway to create such a list (and have it be
regularly updated). That work was originally started for use in HTTP
Cookie implementations, which have similar problems, but would be very
useful for Storage API implementations (although, again as noted in
the draft, not imperative for a secure implementation if the author is
responsible).



What annoys me is that the
author acknowledges the issue and then passes the buck to browser
manufacturers, as though it's their problem and they should solve it in
any (incompatible or non-compliant) way they like.


Any solution must be compliant, by definition; regarding
compatibility, it