Thanks - a few replies below.
On 14/06/2009, at 10:24 PM, Anne van Kesteren wrote:
On Sun, 14 Jun 2009 03:58:31 +0200, Mark Nottingham <[email protected]>
wrote:
As I said, I have raised substantive issues before:
http://lists.w3.org/Archives/Public/public-appformats/2008Jan/0226.html
and don't believe they were formally addressed (note that I'm using
Process terminology here). That experience didn't lead me to believe
that it was worth spending the time to track the specification
closely.
What do you mean? We replied in a timely manner and attempted to
address all of your issues until you were either satisfied with the
response or stopped replying.
Your perception of what happened then is different than mine. At any
rate, there's not much point in arguing about it now. I think it's
unfortunate that CORS is designed how it is, but here we are...
Content providers wanted the flexibility of not having to list every
header in advance, both so debugging headers and such would not have
to be exposed and to reduce the payload.
Which content providers? How much extra payload do you really expect
this to be?
I believe Google, Microsoft, Mozilla, and SitePen.
I don't know how much payload would be saved, but that was not the
only reason.
The others being?
Implementors did not want a blacklist. The attack vector is the server
inadvertently exposing headers it did not want to.
Has this been discussed in depth before? If so, do you have a ref? I
think it deserves some serious discussion if not.
It has not been exhaustively discussed, I think, although it
certainly has been discussed. There have been plans for allowing the
list to be extended if the server explicitly opts in to more headers.
Some prior discussion on response headers:
http://lists.w3.org/Archives/Public/public-webapi/2008Apr/thread.html#msg58
So, the crux of the motivation seems to be Ian's:
I don't think we should change this without a better reason. There's
no reason to believe that some servers don't have information in the
headers that shouldn't be seen by third-parties, and it's the kind
of thing that would be really easy to miss when securing a page for
third-party access.
It seems odd to me that you're willing to expose all of the data in
the response, but almost none of the metadata (headers). Taking a
quick look through the message header registry, a number of candidates
that would be useful -- if you allowed them -- come up, including:
Age, Allow, Alternates, Content-Disposition, Content-Encoding,
Content-ID, Content-Length, Content-Location, Content-MD5, Content-Range,
Content-Script-Type, Content-Style-Type, Date, Link, Location, P3P,
PICS-Label, Retry-After, Server, Vary, Warning
Note that many of these are critical to understanding the message, and
disallowing many of the others will rule out many applications.
For example, Content-Length will tell the application whether or not
the response is complete; Content-MD5 is another integrity check; as
noted by others, both Content-Location and Location carry useful
information; Age and Date allow an application to determine how long
something has been cached; P3P and PICS-Label carry metadata that some
applications may have need for; Vary tells applications what
dimensions the response varies across; Warning carries information
about how the response has been cached, and Link looks to have a
number of use cases queued up.
This is not a complete list; adding these headers to your whitelist
will not resolve this issue.
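To make the point concrete, here is a minimal sketch of the kind of
filtering under discussion. It assumes the whitelist is the six simple
response headers as the draft is understood here (Cache-Control,
Content-Language, Content-Type, Expires, Last-Modified, Pragma). The
function name visible_headers and the extra_allowed opt-in parameter are
hypothetical, standing in for the planned server opt-in mentioned above;
none of this is text from the specification.

```python
# Assumed whitelist: the simple response headers, compared case-insensitively.
SIMPLE_RESPONSE_HEADERS = {
    "cache-control",
    "content-language",
    "content-type",
    "expires",
    "last-modified",
    "pragma",
}


def visible_headers(response_headers, extra_allowed=()):
    """Return only the headers a cross-origin caller would be shown.

    extra_allowed is a hypothetical per-response opt-in list supplied by
    the server; it is an illustration, not a defined mechanism.
    """
    allowed = SIMPLE_RESPONSE_HEADERS | {name.lower() for name in extra_allowed}
    return {
        name: value
        for name, value in response_headers.items()
        if name.lower() in allowed
    }


response = {
    "Content-Type": "application/json",
    "Content-Length": "1234",
    "Age": "30",
    "X-Debug-Trace": "internal",
}

# Without an opt-in, Content-Length and Age are hidden along with the
# debugging header.
print(visible_headers(response))

# With a hypothetical opt-in, the server exposes Age but still hides the rest.
print(visible_headers(response, extra_allowed=["Age"]))
```

Under these assumptions the application cannot check Content-Length or
Age unless the server names them explicitly, which is the trade-off
being argued over.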
Here's a thread on request headers:
http://lists.w3.org/Archives/Public/public-appformats/2008Feb/thread.html#msg168
Similar concerns seem to apply here. You mention theoretical attacks/
risks a lot, and people who disagree are asked to prove a negative --
an unrealistically high bar.
* Chattiness - The protocol set out here requires a pre-flight request
every time a new URL is used; this will force Web sites to tunnel
requests for different resources over the same URL for performance/
efficiency reasons, and as such is not in tune with the Web
architecture. A much more scalable approach would be to define a "map"
of the Web site/origin to define what cross-site requests are allowed
where (in the style of robots.txt et al; see also the work being done
on host-meta, XRDS and similar). I made this comment on an older draft
a long time ago, and have still not received a satisfactory response.
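As a rough illustration of the chattiness concern, the sketch below
counts extra round trips under two simplified models: one pre-flight per
distinct URL (assuming each result is cached after first use), versus
one policy fetch per origin for a site-wide "map". Both function names
and the example URLs are hypothetical, and the models ignore cache
expiry and request methods.

```python
from urllib.parse import urlsplit


def preflights_per_url(urls):
    """Count pre-flights if each distinct URL needs its own, cached after first use."""
    return len(set(urls))


def policy_fetches_per_origin(urls):
    """Count policy fetches if one site-wide map per origin covered every URL."""
    origins = set()
    for url in urls:
        parts = urlsplit(url)
        origins.add((parts.scheme, parts.netloc))
    return len(origins)


requests_made = [
    "http://api.example/items/1",
    "http://api.example/items/2",
    "http://api.example/items/1",
    "http://api.example/users/7",
]

print(preflights_per_url(requests_made))
print(policy_fetches_per_origin(requests_made))
```

The gap between the two counts grows with the number of distinct
resources, which is the incentive to tunnel everything through one URL.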
See crossdomain.xml. It is a security nightmare. Especially when a
single origin is being used for several APIs.
Waving your hands and saying "security" is not a substantial
response.
Ok,
http://lists.w3.org/Archives/Public/public-appformats/2008Feb/0050.html
http://lists.w3.org/Archives/Public/public-appformats/2008Feb/0052.html
It's interesting that you point that out. Reading the entire thread,
it seemed like there was consensus on a requirement there, but it
wasn't added to the document. Why not?
http://lists.w3.org/Archives/Public/public-appformats/2008Jan/thread.html#msg248
Ah, yes the infamous "I disagree with much of the Web arch" thread.
Enough said, I think.
http://www.w3.org/2005/06/tracker/waf/issues/22
http://lists.w3.org/Archives/Public/public-appformats/2008Jan/thread.html#msg303
Where is the sentence that the resolution of that issue refers to?
Added servers. It's not clear to me how to rewrite the specification in
a way that does not leave gaps. If you can find another editor who can
do that for us that'd be ok I suppose.
I don't think saying (roughly) "that's the best we can do with limited
resources" is a substantial response either, but I have a feeling it'll
be accepted nevertheless :-/
By and large it seems like an editorial request and as editor I do
not see how to address it.
Understood. It's up to the Director to decide if the document meets
the minimum requirements for Recommendation.
* Conformance Criteria - "A conformant server is one that..." -->
"A conformant resource is one that..."
I haven't done this yet. Does it still make sense to talk about a
server processing model if we do this?
Probably "resource processing model..."
I started making changes in that direction and it did not make a
whole lot of sense. E.g. how would you rephrase "In response to a
simple cross-origin request or actual request the server indicates
whether or not to share the resource."?
"In response to either an actual request or a simple cross-origin
request, the resource indicates whether or not to share the response."
although in this particular case, "... the server indicates whether or
not to share the response" would work as well (since it's unambiguous).
The distinction is more important when talking about scoping -- e.g.,
does a decision apply to a single response, all responses from a
resource (as identified by a URI), or all responses from a server.
* Generic Cross-Origin Request Algorithms - for clarity, can this be
split up into separate subsections?
I added spacing instead. Does this work?
My personal preference would be subsections, to make sure they're
distinct.
I left it as is for now. If more people want this I'll make the
change.
Ack.
--
Mark Nottingham http://www.mnot.net/