Re: [whatwg] [encoding] utf-16

2012-01-03 Thread Leif Halvard Silli
Henri Sivonen, Mon Jan 2 07:43:07 PST 2012 On Fri, Dec 30, 2011 at 12:54 PM, Anne van Kesteren wrote: And why should there be UTF-16 sniffing? The reason why Gecko detects BOMless Basic Latin-only UTF-16 regardless of the heuristic detector mode is

Re: [whatwg] [encoding] utf-16

2012-01-03 Thread Leif Halvard Silli
Leif Halvard Silli, Tue, 3 Jan 2012 23:51:52 +0100: Henri Sivonen, Mon Jan 2 07:43:07 PST 2012 On Fri, Dec 30, 2011 at 12:54 PM, Anne van Kesteren wrote: And why should there be UTF-16 sniffing? The reason why Gecko detects BOMless Basic Latin-only UTF-16 regardless of the heuristic

Re: [whatwg] [encoding] utf-16

2012-01-02 Thread Henri Sivonen
On Fri, Dec 30, 2011 at 12:54 PM, Anne van Kesteren ann...@opera.com wrote: And why should there be UTF-16 sniffing? The reason why Gecko detects BOMless Basic Latin-only UTF-16 regardless of the heuristic detector mode is https://bugzilla.mozilla.org/show_bug.cgi?id=631751 It's quite possible
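For readers unfamiliar with what detecting "BOMless Basic Latin-only UTF-16" might involve, here is a minimal sketch in Python. It is not Gecko's code and makes no claim about how the linked bug was fixed; the function name `sniff_basic_latin_utf16` and the exact rule (every code unit has one zero byte and one non-zero ASCII byte) are assumptions chosen for illustration.

```python
from typing import Optional


def sniff_basic_latin_utf16(data: bytes) -> Optional[str]:
    """Guess the byte order of BOM-less UTF-16 that holds only Basic Latin.

    Hypothetical heuristic: in such data every 16-bit code unit consists of
    one zero byte and one byte in the range 0x01-0x7F. Which position holds
    the zero byte reveals the byte order.
    """
    if len(data) < 2 or len(data) % 2:
        return None  # empty or odd-length input cannot be whole code units

    even, odd = data[0::2], data[1::2]

    def all_ascii(chunk: bytes) -> bool:
        return all(0 < b < 0x80 for b in chunk)

    def all_zero(chunk: bytes) -> bool:
        return not any(chunk)

    if all_ascii(even) and all_zero(odd):
        return "utf-16-le"  # low byte first, e.g. 7A 00
    if all_zero(even) and all_ascii(odd):
        return "utf-16-be"  # high byte first, e.g. 00 7A
    return None  # not recognisably Basic Latin-only UTF-16
```

A real detector would presumably look at only a prefix of the stream and would have to weigh false positives against other legacy encodings; this sketch ignores both.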

Re: [whatwg] [encoding] utf-16

2012-01-02 Thread Henri Sivonen
On Tue, Dec 27, 2011 at 4:52 PM, Anne van Kesteren ann...@opera.com wrote: I ran some utf-16 tests using 007A as input data, optionally preceded by FFFE or FEFF, and with utf-16, utf-16le, and utf-16be declared in the Content-Type header I suggest testing with zero, one, two and three BOMs.

Re: [whatwg] [encoding] utf-16

2011-12-30 Thread Anne van Kesteren
On Fri, 30 Dec 2011 05:51:16 +0100, Leif Halvard Silli xn--mlform-iua@målform.no wrote: The Trident cache behaviour is a symptom of its overall UTF-16 behaviour: Apart from reading the BOM, it doesn't do any UTF-16 sniffing. I suspect that you want Opera/Firefox to become as bad at 'getting'

Re: [whatwg] [encoding] utf-16

2011-12-30 Thread Leif Halvard Silli
Anne van Kesteren Fri, 30 Dec 2011 11:54:34 +0100 On Fri, 30 Dec 2011 05:51:16 +0100, Leif Halvard Silli: The Trident cache behaviour is a symptom of its overall UTF-16 behaviour: Apart from reading the BOM, it doesn't do any UTF-16 sniffing. I suspect that you want Opera/Firefox to become

Re: [whatwg] [encoding] utf-16

2011-12-28 Thread Anne van Kesteren
On Wed, 28 Dec 2011 03:38:53 +0100, Boris Zbarsky bzbar...@mit.edu wrote: One interesting question here: Does this apply to web-facing things only, or also to MUAs? In my experience charset behavior for the two might well differ I have not tested MUAs. If someone can help out with

Re: [whatwg] [encoding] utf-16

2011-12-28 Thread Anne van Kesteren
On Wed, 28 Dec 2011 03:20:26 +0100, Leif Halvard Silli xn--mlform-iua@målform.no wrote: By default you supposedly mean default, before error handling/heuristic detection. Relevance: On the real Web, no browser fails to display utf-16 as often as WebKit - its defaulting behavior notwithstanding

Re: [whatwg] [encoding] utf-16

2011-12-28 Thread Leif Halvard Silli
Anne van Kesteren Tue Dec 27 06:52:01 PST 2011: I spotted a shortcoming in your testing: I ran some utf-16 tests using 007A as input data, optionally preceded by FFFE or FEFF, and with utf-16, utf-16le, and utf-16be declared in the Content-Type header. For WebKit I tested both Safari

[whatwg] [encoding] utf-16

2011-12-28 Thread Leif Halvard Silli
Anne van Kesteren Wed Dec 28 01:05:48 PST 2011: On Wed, 28 Dec 2011 03:20:26 +0100, Leif Halvard Silli wrote: By default you supposedly mean default, before error handling/heuristic detection. Relevance: On the real Web, no browser fails to display utf-16 as often as WebKit - its defaulting

[whatwg] [encoding] utf-16

2011-12-27 Thread Anne van Kesteren
I ran some utf-16 tests using 007A as input data, optionally preceded by FFFE or FEFF, and with utf-16, utf-16le, and utf-16be declared in the Content-Type header. For WebKit I tested both Safari 5.1.2 and Chrome 17.0.963.12. Trident is Internet Explorer 9 on Windows 7. Presto is Opera
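The test matrix described here (the bytes 00 7A, optionally preceded by FF FE or FE FF, each served under three charset labels) can be enumerated with a short Python sketch. The media type `text/plain` and the dictionary layout are assumptions made for illustration; the message does not say which media type the tests were served with or how they were generated.

```python
from itertools import product

# Input data from the message: the two bytes 00 7A.
PAYLOAD = bytes.fromhex("007A")

# Optional two-byte prefixes named in the message.
BOMS = {
    "none": b"",
    "FFFE": bytes.fromhex("FFFE"),
    "FEFF": bytes.fromhex("FEFF"),
}

# Labels declared in the Content-Type header.
LABELS = ["utf-16", "utf-16le", "utf-16be"]


def test_cases():
    """Yield one case per (prefix, label) combination: 3 x 3 = 9 cases."""
    for (bom_name, bom), label in product(BOMS.items(), LABELS):
        yield {
            "bom": bom_name,
            # text/plain is an assumed media type, not taken from the thread.
            "content_type": f"text/plain; charset={label}",
            "body": bom + PAYLOAD,
        }


cases = list(test_cases())
```

Each entry in `cases` pairs a response body with the header it would be served under; what a given browser engine renders for it is exactly what the tests set out to observe.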

Re: [whatwg] [encoding] utf-16

2011-12-27 Thread Leif Halvard Silli
Hi Anne. Overall, your findings correspond with mine, which are based on http://malform.no/testing/utf/. I also agree with the direction of the conclusions, but I would like the encodings document to make some distinctions that it currently doesn't - and which you have not proposed either.

Re: [whatwg] [encoding] utf-16

2011-12-27 Thread Boris Zbarsky
On 12/27/11 9:20 PM, Leif Halvard Silli wrote: I think http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html should follow Trident/WebKit. Specifically: utf-16 defaults to utf-16le in absence of a BOM. One interesting question here: Does this apply to web-facing things only, or also to
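The rule proposed in the quoted text, that the label utf-16 falls back to utf-16le when there is no BOM, can be sketched in Python as follows. This is only an illustration of that rule, not text from the encoding document; the function name `decode_utf16` and the choice to let a BOM take precedence and be stripped are assumptions.

```python
import codecs


def decode_utf16(data: bytes, default: str = "utf-16-le") -> str:
    """Decode bytes labelled "utf-16".

    Assumed behaviour: a leading BOM decides the byte order and is dropped;
    without a BOM the data is decoded as little-endian, as proposed.
    """
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return data[2:].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return data[2:].decode("utf-16-be")
    return data.decode(default)
```

Under this rule the BOM-less bytes 7A 00 decode to "z", while the BOM-less bytes 00 7A decode to U+7A00 rather than "z".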