On 11.11.2010, at 11:38, Julian Reschke wrote:

> I don't think the IETF will ever approve a standard where the encoding 
> depends on the recipient's locale, with no reliable way to find out upfront 
> what that locale is.

Yes, that makes good sense to me.

Note that Safari doesn't rely on the OS locale (other than for picking its 
original default browser encoding, which can then be changed by the user). 
Surely, some people are allergic to the idea of a default browser encoding too, 
but it's unavoidable in practice - we can't interpret untagged content as 
Latin-1.

> I disagree that "raw bytes" are a de facto standard; they do not interoperate 
> across UAs (see above)...


I think that we agree about technical details and empirical data now, but 
describe them differently.

Surely, there is no way (that I'm aware of) to guarantee a correct downloaded 
file name in all browsers for all users. A lot of server operators only care 
about users in their own country, and can reasonably (i.e. with negligible cost 
to business) rely on the Windows locale being set. They can just send raw bytes 
in the language's default encoding in Content-Disposition, and that works for 
them and their clients. For all I know, that's what almost everyone does, and 
it's "interoperable" for them.

Global operators like Google or Yahoo obviously want to cover many languages at 
once, and they just send different HTTP headers to different browsers. That's 
not great, but it's unavoidable unless IE changes: whether by changing its 
interpretation of raw bytes or by implementing RFC 5987, IE would have to change.

> The spec (RFC 2616) already says that raw bytes are ISO-8859-1, so UAs 
> overriding this are in violation of the spec (IMHO).


Yes, that's why I'd really welcome a spec that's closer to reality in this 
regard. No browser whose vendor cares about markets not covered by Latin-1 can 
actually treat raw bytes in Content-Disposition as ISO-8859-1. No server 
operator who wants to serve downloadable content in those markets can stick to 
ISO-8859-1.

> Introducing a separate parameter (filename*) that doesn't carry the legacy 
> problems is in my opinion the best way to move forward.


As a browser implementor, I don't have a strong opinion about filename*. The 
actual content I see on the Web uses raw bytes in Content-Disposition, so I 
mostly care about that being adequately specified, so that at least non-IE 
browsers could all work the same. Firefox and Safari are already pretty close. 
It's unfortunate if Chrome does not implement this fallback scheme.
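
For comparison with raw bytes, here is a minimal Python sketch of what a 
server would emit for the filename* form from RFC 5987. The function name 
content_disposition is hypothetical, and the sketch assumes the value is always 
sent as UTF-8 with an empty language tag; it does not emit a legacy filename 
fallback parameter.

```python
from urllib.parse import quote

# Characters RFC 5987 allows unescaped in addition to letters and digits
# (the "attr-char" set); everything else is percent-encoded.
ATTR_CHAR_EXTRAS = "!#$&+-.^_`|~"

def content_disposition(filename: str) -> str:
    """Build a Content-Disposition value using the filename* parameter.

    Hypothetical helper: encodes the name as UTF-8, percent-escapes every
    byte outside attr-char, and leaves the language tag empty.
    """
    encoded = quote(filename, safe=ATTR_CHAR_EXTRAS, encoding="utf-8")
    return f"attachment; filename*=UTF-8''{encoded}"

print(content_disposition("\u20ac rates"))
```

The result is pure ASCII on the wire, which is why it sidesteps the question 
of how a recipient interprets raw bytes.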

Generally speaking, having no custom encoding is better than having an opaque 
custom encoding. In my opinion, the ideal situation would be for servers to 
send raw UTF-8, and for clients to do what Safari and Firefox do (try UTF-8, 
then fall back to other encodings). This may be unachievable in practice, in 
which case interoperability via opaque RFC2231-style encoding is a lesser evil.
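
The "try UTF-8, then fall back" idea can be sketched in a few lines of Python. 
This is only an illustration, not the actual Safari or Firefox code: the 
function name decode_filename is hypothetical, and the fallback encoding 
(cp1251 here) stands in for whatever default encoding the browser is 
configured with.

```python
def decode_filename(raw: bytes, fallback_encoding: str = "cp1251") -> str:
    """Decode raw Content-Disposition filename bytes.

    Strict UTF-8 is tried first; only if the bytes are not valid UTF-8 is
    the (assumed) browser default encoding used instead.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: treat the bytes as the legacy default encoding.
        # Undecodable bytes become U+FFFD rather than raising.
        return raw.decode(fallback_encoding, errors="replace")

name = "\u043e\u0442\u0447\u0451\u0442.txt"   # Russian word for "report" + .txt
print(decode_filename(name.encode("utf-8")))   # UTF-8 path
print(decode_filename(name.encode("cp1251")))  # fallback path
```

Trying UTF-8 first works because text in legacy single-byte encodings very 
rarely happens to form a valid UTF-8 byte sequence, so a server that sends raw 
UTF-8 and one that sends the locale's legacy encoding can both be handled.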

- WBR, Alexey Proskuryakov

_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
