Re: [whatwg] Is EBCDIC support needed for not breaking the Web?

2008-06-02 Thread Henri Sivonen

On Jun 2, 2008, at 04:27, Benjamin Smedberg wrote:


Henri Sivonen wrote:

Firefox and Opera being able to get away with not supporting EBCDIC  
flavors suggests that EBCDIC-based encodings cannot be particularly  
Web-relevant. Even if saying that browsers MUST NOT support them  
might end up being a dead letter, it seems that it would be  
feasible to say that browsers SHOULD NOT support them or at least  
MUST NOT let a heuristic detector guess EBCDIC (for security  
reasons).


Gecko does support UTF-7 and will continue to do so because UTF-7 is  
still in use as a character set for mail encoding and multi-part  
MIME documents.


Does/will Gecko support UTF-7 as a possible heuristic detector guess  
on the Web/HTTP side?
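
To make the concern behind this question concrete, here is a minimal
sketch in Python. It is an illustration only, not a description of what
Gecko does. It assumes a hypothetical server-side filter, filter_allows,
that rejects payloads containing a literal "<" byte, and it uses Python's
built-in utf-7 codec to stand in for a user agent whose detector guessed
UTF-7.

```python
# Hypothetical gatekeeper: rejects any payload containing a literal "<" byte.
def filter_allows(payload: bytes) -> bool:
    return b"<" not in payload

# "+ADw-" and "+AD4-" are the UTF-7 spellings of "<" and ">".
payload = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

# The filter only ever sees harmless-looking ASCII bytes.
passed_filter = filter_allows(payload)

# A consumer that treats the same bytes as UTF-7 sees markup.
decoded = payload.decode("utf-7")

print(passed_filter)
print(decoded)
```

The filter and the decoder disagree about what the bytes mean, which is
the mismatch a heuristic guess of UTF-7 could introduce.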


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




[whatwg] Is EBCDIC support needed for not breaking the Web?

2008-06-01 Thread Henri Sivonen
The HTML5 draft says that authors should not use EBCDIC-based  
encodings. This is more lax than the rule for CESU-8, UTF-7, BOCU-1  
and SCSU, which authors must not use and user agents must not support.


In general, now that UTF-8 exists and is ubiquitously supported,  
proliferation of encodings is costly and doesn't expand the  
expressiveness of HTML, which is parsed into a Unicode DOM anyway.  
Moreover, encodings that are not ASCII supersets are potential  
security risks, since the string "script" may be represented by  
different bytes than in ASCII, leading to potential privilege  
escalation if a server-side gatekeeper and a user agent give different  
meanings to the bytes.
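
As a rough illustration of the gatekeeper mismatch described above, the
following Python sketch encodes the same markup as ASCII and as IBM037
(Python's cp037 codec, one EBCDIC flavor). The function
naive_gatekeeper_allows is a hypothetical filter invented for this
example; real filters differ, but any filter that matches ASCII bytes
would have the same blind spot.

```python
# Hypothetical server-side filter that looks for the ASCII bytes of "<script".
def naive_gatekeeper_allows(payload: bytes) -> bool:
    return b"<script" not in payload

markup = "<script>alert(1)</script>"

ascii_bytes = markup.encode("ascii")
ebcdic_bytes = markup.encode("cp037")  # IBM037, an EBCDIC-based encoding

# The ASCII form is caught; the EBCDIC form shares no "<script" byte pattern.
print(naive_gatekeeper_allows(ascii_bytes))
print(naive_gatekeeper_allows(ebcdic_bytes))

# A consumer that decodes the bytes as IBM037 recovers the original markup.
print(ebcdic_bytes.decode("cp037") == markup)
```

The point is only that the two sides assign different meanings to the
same bytes. The sketch does not model how any particular server or
browser behaves.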


For these reasons, if EBCDIC-based encodings don't need to be  
supported in order to Support Existing Content, it would be beneficial  
never to add support for them and, thus, ban them like CESU-8, UTF-7,  
BOCU-1 and SCSU.


I asked Hixie for examples of sites or browsers that require/support  
EBCDIC-based encodings. He had none. I examined the encoding menus of  
Firefox 3b5, Safari 3.1 and Opera 9.5 beta (on Leopard) and IE8 beta 1  
(on English XP SP3). None of them expose EBCDIC-based encodings in the  
UI. (All the IBM encodings Firefox exposes turn out to be ASCII-based.)


This makes me wonder: Do the top browsers support any EBCDIC-based  
encodings but just without exposing them in the UI? If not, can there  
be any notable EBCDIC-based Web content?


I suspect that EBCDIC isn't actually Web-relevant.

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




Re: [whatwg] Is EBCDIC support needed for not breaking the Web?

2008-06-01 Thread Bjoern Hoehrmann
* Henri Sivonen wrote:
This makes me wonder: Do the top browsers support any EBCDIC-based  
encodings but just without exposing them in the UI? If not, can there  
be any notable EBCDIC-based Web content?

Internet Explorer should support any character encoding Windows supports
(see the advanced tab in `control International`), which includes many
EBCDIC encodings. See eg. http://www.websitedev.de/temp/ebcdic-cp-us.txt
for an example. It seems to me [EMAIL PROTECTED] would have been
a better place to ask your questions than the mailing lists you picked.
-- 
Björn Höhrmann · mailto:[EMAIL PROTECTED] · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 


Re: [whatwg] Is EBCDIC support needed for not breaking the Web?

2008-06-01 Thread Henri Sivonen

On Jun 1, 2008, at 17:25, Bjoern Hoehrmann wrote:


* Henri Sivonen wrote:

This makes me wonder: Do the top browsers support any EBCDIC-based
encodings but just without exposing them in the UI? If not, can there
be any notable EBCDIC-based Web content?


Internet Explorer should support any character encoding Windows supports
(see the advanced tab in `control International`), which includes many
EBCDIC encodings. See eg. http://www.websitedev.de/temp/ebcdic-cp-us.txt
for an example.


Thanks.

Philip Taylor made a test case:
http://philip.html5.org/demos/charset/ebcdic/charsets.html

It shows that browsers that use general-purpose decoder libraries (IE  
and Safari) support some EBCDIC flavors but browsers that roll their  
own decoders (Firefox and Opera) don't.


Firefox and Opera being able to get away with not supporting EBCDIC  
flavors suggests that EBCDIC-based encodings cannot be particularly  
Web-relevant. Even if saying that browsers MUST NOT support them might  
end up being a dead letter, it seems that it would be feasible to say  
that browsers SHOULD NOT support them or at least MUST NOT let a  
heuristic detector guess EBCDIC (for security reasons).
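
One way such a restriction could look in practice is sketched below in
Python. This is an assumption-laden illustration, not a proposal for
spec text: is_ascii_superset and allowed_guesses are hypothetical names,
and the candidate list is made up. The idea is that a detector only
keeps guesses whose bytes 0x00-0x7F decode exactly as they do in ASCII.

```python
ASCII_BYTES = bytes(range(128))
ASCII_TEXT = "".join(chr(i) for i in range(128))

def is_ascii_superset(encoding: str) -> bool:
    """True if bytes 0x00-0x7F decode to the same code points as in ASCII."""
    try:
        return ASCII_BYTES.decode(encoding) == ASCII_TEXT
    except (UnicodeDecodeError, LookupError):
        # Undecodable input or unknown codec name: treat as not allowed.
        return False

def allowed_guesses(candidates):
    """Drop detector candidates that are not ASCII supersets."""
    return [name for name in candidates if is_ascii_superset(name)]

# Made-up candidate list; cp037 and cp500 are EBCDIC-based codecs.
guesses = allowed_guesses(["utf-8", "latin-1", "cp037", "cp500", "cp1252"])
print(guesses)
```

This byte-wise check is only a rough test. It is not a reliable test
for stateful encodings such as UTF-7, which would need to be excluded
by name.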


(Also, I think I'm going to remove EBCDIC support from Validator.nu.)


It seems to me [EMAIL PROTECTED] would have been
a better place to ask your questions than the mailing lists you picked.


So many lists. :-( CCed that one, too, just in case.

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




Re: [whatwg] Is EBCDIC support needed for not breaking the Web?

2008-06-01 Thread David Gerard
[just to whatwg]

2008/6/1 Henri Sivonen [EMAIL PROTECTED]:

 Philip Taylor made a test case:
 http://philip.html5.org/demos/charset/ebcdic/charsets.html
 It shows that browsers that use general-purpose decoder libraries (IE and
 Safari) support some EBCDIC flavors but browsers that roll their own
 decoders (Firefox and Opera) don't.


I just loaded that test page in Firefox 3 on Linux (Mozilla/5.0 (X11;
U; Linux i686; en-US; rv:1.9pre) Gecko/2008052604 Minefield/3.0pre)
and the accented characters appear to work in the EBCDIC encodings ...


- d.