Why 17 planes? (was: Re: Why 11 planes?)

2012-11-27 Thread Martin J. Dürst
Well, first, it is 17 planes (or have we switched to using hexadecimal numbers on the Unicode list already?). Second, of course this is in connection with UTF-16. I wasn't involved when UTF-16 was created, but it must have become clear that 2^16 (^ denotes exponentiation (to the power of))
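
The link between UTF-16 and the figure of 17 planes can be checked with a little arithmetic. The following is a minimal sketch, not part of the original message; the names `plane_count` and `decode_pair` are made up for illustration. It assumes the standard UTF-16 surrogate ranges (high surrogates U+D800..U+DBFF, low surrogates U+DC00..U+DFFF).

```python
# Illustrative sketch only: names below are hypothetical, not from the thread.
HIGH_SURROGATES = range(0xD800, 0xDC00)  # 1024 high (lead) surrogates
LOW_SURROGATES = range(0xDC00, 0xE000)   # 1024 low (trail) surrogates
PLANE_SIZE = 2 ** 16                     # code points per plane


def plane_count() -> int:
    """Planes reachable with UTF-16: the BMP plus everything surrogate pairs address."""
    supplementary = len(HIGH_SURROGATES) * len(LOW_SURROGATES)  # 2^10 * 2^10 = 2^20
    return 1 + supplementary // PLANE_SIZE


def decode_pair(high: int, low: int) -> int:
    """Combine one high and one low surrogate into a supplementary code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

With these definitions, the last possible pair (0xDBFF, 0xDFFF) decodes to U+10FFFF, the final code point of the 17th plane.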

Re: xkcd: ‮LTR

2012-11-27 Thread Simon Montagu
On 11/26/2012 08:42 PM, Marc Durdin wrote: Somewhat ironically, both Firefox and Internet Explorer, on my machine at least, detect this page is encoded with ISO-8859-1 and cp-1252 respectively, instead of UTF-8. It seems they both ignore the XML prolog which is the only place where the encoding

Re: xkcd: ‮LTR

2012-11-27 Thread Behnam Esfahbod ZWNJ
Simon, There's no sign of HTML5 on that page. The head of the file matches all XHTML 1.1 requirements and passes all checks on validator.w3.org. Now, why would Firefox follow anything from HTML5 spec here? -Behnam On Tue, Nov 27, 2012 at 3:37 AM, Simon Montagu smont...@smontagu.org wrote: On

Re: cp1252 decoder implementation

2012-11-27 Thread Martin J. Dürst
On 2012/11/17 12:54, Buck Golemon wrote: On Fri, Nov 16, 2012 at 4:11 PM, Doug Ewell d...@ewellic.org wrote: Buck Golemon wrote: Is it incorrect to say that 0x81 is a non-semantic byte in cp1252, and to map it to the equally-non-semantic U+81? U+0081 (there are always at least four

First known use of the word, email (1978)

2012-11-27 Thread N. Ganesan
There are interviews in Tamil and English language media about V. A. Shiva Ayyadurai and his work in high school and later with respect to electronic mail. A statement issued by MIT will be useful to make things clear. http://tech.mit.edu/V132/N5/corrections.html A brief published on Jan. 11

Re: xkcd: ‮LTR

2012-11-27 Thread Simon Montagu
On 11/27/2012 11:19 AM, Behnam Esfahbod ZWNJ wrote: Simon, There's no sign of HTML5 on that page. The head of the file matches all XHTML 1.1 requirements and passes all checks on validator.w3.org http://validator.w3.org. Now, why would Firefox follow anything from HTML5 spec here? As I

Re: data for cp1252

2012-11-27 Thread Peter Krefting
Buck Golemon b...@yelp.com: In summary, all browsers agree that it decodes to U+81. Opera initially thought it was undefined, but changed their mind in version 12 (the current version). Yes, it was changed to become compatible with a number of web sites that depended on that behaviour.
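
The browser behaviour described here (byte 0x81 decoding to U+0081) differs from Python's built-in `cp1252` codec, which treats 0x81 as undefined and raises `UnicodeDecodeError`. The following is a rough sketch of a browser-style decoder under that assumption; `decode_cp1252_web` is a hypothetical helper name, not an API from any library or from the thread.

```python
def decode_cp1252_web(data: bytes) -> str:
    """Decode cp1252 bytes, passing unassigned bytes through as C1 controls.

    Hypothetical helper: bytes that Python's cp1252 codec leaves undefined
    (such as 0x81) are mapped to the code point with the same numeric value,
    which is the behaviour the thread attributes to browsers.
    """
    out = []
    for b in data:
        try:
            out.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            out.append(chr(b))  # e.g. 0x81 -> U+0081
    return "".join(out)
```

Decoding byte by byte is slow but keeps the fallback rule easy to see; defined bytes such as 0x80 still map to their usual cp1252 characters (U+20AC for 0x80).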

Re: xkcd: LTR

2012-11-27 Thread Philippe Verdy
HTML5 does not reference the Content-Type: text/html header as enough to qualify as meaning HTML5. HTML5 **requires** its own prolog (i.e. its basic document declaration **within** the document itself, for the HTML syntax, or its FULL document declaration for the XML/XHTML syntax). So Firefox is

Re: Why 17 planes? (was: Re: Why 11 planes?)

2012-11-27 Thread Philippe Verdy
That's a valid computation if the extension was limited to use only 2-surrogate encodings for supplementary planes. If we could use 3-surrogate encodings, you'd need 3*2^n surrogates to encode 2^(3*n) new codepoints. With n=10 (like today), this requires a total of 3072 surrogates, and you
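
The arithmetic in this message can be restated as a short sketch. It assumes, as the message does, that a k-surrogate sequence draws each position from its own disjoint block of 2^n surrogate code units; the function names are hypothetical and chosen only for illustration.

```python
def surrogates_needed(k: int, n: int) -> int:
    """Surrogate code units reserved for k-unit sequences: k disjoint blocks of 2^n."""
    return k * 2 ** n


def code_points_encoded(k: int, n: int) -> int:
    """New code points addressable by k-unit sequences with n payload bits per unit."""
    return 2 ** (k * n)
```

For k=2 and n=10 this gives today's 2048 surrogates and 2^20 supplementary code points; for k=3 and n=10 it gives the 3072 surrogates mentioned above and 2^30 code points.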

Re: Why 17 planes? (was: Re: Why 11 planes?)

2012-11-27 Thread Philippe Verdy
Note that the **current bet** that the existing 17 planes will be sufficient is valid only if there's no international desire to encode something other than just what is in the current focus of Unicode. Say (for example) that the WIPO absolutely wants to encode corporate logos. Or ISO or the IETF

Re: First known use of the word, email (1978)

2012-11-27 Thread Clive Hohberger
You might want to look at Wikipedia entry E-mail. There was a formal timeshare messaging system: 1978 – EMAIL at University of Medicine and Dentistry of New Jersey http://en.wikipedia.org/wiki/University_of_Medicine_and_Dentistry_of_New_Jersey [36] http://en.wikipedia.org/wiki/E-mail#cite_note-37

Re: First known use of the word, email (1978)

2012-11-27 Thread John D. Burger
What has this to do with Unicode??? - John Burger MITRE On Nov 27, 2012, at 05:14 , N. Ganesan wrote: There are interviews in Tamil and English language media about V. A. Shiva Ayyadurai and his work in high school and later with respect to electronic mail. A statement issued by MIT

Re: xkcd: LTR

2012-11-27 Thread Philippe Verdy
I've never said that user agents had to write the prolog. It's the reverse: yes, authors have to write a prolog (but the prolog is perfect here, so this is not the fault of the author). Why do they have to use this prolog? Exactly because user agents will have to read it (not write it), as it is

Re: xkcd: LTR

2012-11-27 Thread Philippe Verdy
Also you make a confusion in the sense that HTML5 must be able to parse HTML4. This is true, but this does not mean that they will be able to render it fully. HTML5 is not fully upward compatible with past versions (and the case of the identification of encodings is an example where it is

Re: First known use of the word, email (1978)

2012-11-27 Thread Michael Everson
On 27 Nov 2012, at 14:31, John D. Burger j...@mitre.org wrote: What has this to do with Unicode??? U+1F4E7 U+1F455 Michael Everson * http://www.evertype.com/

Re: First known use of the word, email (1978)

2012-11-27 Thread Michael Everson
On 27 Nov 2012, at 13:53, Clive Hohberger cp...@case.edu wrote: BTW, the routine capitalization of 'E' in E-mail came in the 1990's from William Safire's On Language column in the NY Times newspaper: He made the analogy with T-shirt. The T in T-shirt is capitalized because of the shape of

Re: Why 17 planes? (was: Re: Why 11 planes?)

2012-11-27 Thread William_J_G Overington
On Tuesday 27 November 2012, Philippe Verdy verd...@wanadoo.fr wrote: It is not complicated to parse in the forward direction, but for the backward direction, it means that when you see the final low surrogate, you still need to roll back to the previous position: it can only be a
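
For comparison, backward parsing in standard UTF-16 (two-unit pairs only, not the hypothetical longer sequences under discussion) needs to look back at most one extra code unit. The sketch below illustrates that step; `prev_code_point` is a made-up name, and the function assumes `units` is a list of 16-bit code unit values and that `i` is at least 1.

```python
def prev_code_point(units: list[int], i: int) -> tuple[int, int]:
    """Return (code_point, start_index) for the code point ending just before index i.

    Illustrative only: handles standard UTF-16 pairs; an unpaired surrogate
    is returned as-is rather than rejected.
    """
    low = units[i - 1]
    if 0xDC00 <= low <= 0xDFFF and i >= 2:
        high = units[i - 2]
        if 0xD800 <= high <= 0xDBFF:
            # A low surrogate preceded by a high surrogate: step back two units.
            return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00), i - 2
    # BMP code unit (or an unpaired surrogate): step back one unit.
    return low, i - 1
```

Because high and low surrogates occupy disjoint ranges, a single look-behind is enough to resynchronise; the message above concerns how that property changes once sequences can be longer than two units.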

Re: xkcd: LTR

2012-11-27 Thread Leif Halvard Silli
Philippe Verdy, Tue, 27 Nov 2012 15:39:43 +0100: I've never said that user agents had to write the prolog. It's the reverse: yes, authors have to write a prolog (but the prolog is perfect here, so this is not the fault of the author). XML has (or more correctly: can have) a prolog. HTML does

Re: xkcd: LTR

2012-11-27 Thread Khaled Hosny
Looks OK here, but that is probably FreeType doing its magic as usual. Regards, Khaled On Tue, Nov 27, 2012 at 02:29:45AM +0100, Philippe Verdy wrote: Also I really don't like the Deseret font: {font-family: CMU; src: url(CMUSerif-Roman.ttf) format(truetype);} that you have inserted in your

RE: Why 17 planes? (was: Re: Why 11 planes?)

2012-11-27 Thread Whistler, Ken
There isn't an actual problem here which needs a solution, satisfactory or otherwise. The persistence of the "17 planes may not be enough" meme on this list is an interesting phenomenon in itself, but has no practical impact on any of the actual ongoing work on maintenance of the encoding

RE: First known use of the word, email (1978)

2012-11-27 Thread Joe
German scholars have traced it back at least to the early Middle Ages: Email im frühen Mittelalter von Günther Haseloff http://books.google.com/books/about/Email_im_frühen_Mittelalter.html?id=H7RJAQAAIAAJ Jo(k)e

Re: First known use of the word, email (1978)

2012-11-27 Thread Clive Hohberger
In modern German, it means enamel 2012/11/27 Joe j...@unicode.org German scholars have traced it back at least to the early Middle Ages: Email im frühen Mittelalter von Günther Haseloff http://books.google.com/books/about/Email_im_frühen_Mittelalter.html?id=H7RJAQAAIAAJ

Re: xkcd: LTR

2012-11-27 Thread Philippe Verdy
Ah! I see the problem now: the XHTML file is being served as HTML instead of XHTML (but this is not invalid for XHTML 1). But anyway you're also right that the XML prolog found is NOT valid for HTML5 when the file is served as HTML instead of XHTML. This should immediately trigger the fact

Re: xkcd: LTR

2012-11-27 Thread Philippe Verdy
No. FreeType is not involved here in the ugly rendering (on screen) under Windows of the unhinted CMU font provided by the page. Maybe this looks OK on Mac (if Safari is autohinting the font itself, even though the font itself is not hinted; I'm not sure that Safari on MacOS processes TTF fonts

Re: xkcd: LTR

2012-11-27 Thread Leif Halvard Silli
Philippe Verdy, Tue, 27 Nov 2012 21:07:31 +0100: Ah! I see the problem now: the XHTML file is being served as HTML instead of XHTML (but this is not invalid for XHTML 1). Both SGML-based HTML4 and XML-based XHTML 1 operate with syntax rules that are not, and have never been, compatible

Re: xkcd: LTR

2012-11-27 Thread Asmus Freytag
On 11/27/2012 5:39 AM, Masatoshi Kimura wrote: (2012/11/27 20:27), Philippe Verdy wrote: Could you please stop spreading an unfounded rumor such as Firefox is wrong because it ignores the lacking of HTML5 prolog? Getting Philippe to stop spreading unfounded anything is a near impossible

Re: xkcd: LTR

2012-11-27 Thread Philippe Verdy
2012/11/27 Leif Halvard Silli xn--mlform-...@xn--mlform-iua.no The fact that XHTML 1 permits the XML prolog regardless how the document is served, is just a shortcoming of the XHTML 1 specification. No, it was by design. Making HTML an application of XML. Only XML but with all rules of XML.

Re: Why 17 planes?

2012-11-27 Thread Martin J. Dürst
To this, my mother would say: Why keep it simple when we can make it complicated? Regards, Martin. On 2012/11/27 21:01, Philippe Verdy wrote: That's a valid computation if the extension was limited to use only 2-surrogate encodings for supplementary planes. If we could use 3-surrogate

Re: First known use of the word, email (1978)

2012-11-27 Thread N. Ganesan
On Tue, Nov 27, 2012 at 5:53 AM, Clive Hohberger cp...@case.edu wrote: You might want to look at Wikipedia entry E-mail. There was a formal timeshare messaging system: 1978 – EMAIL at University of Medicine and Dentistry of New Jersey[36] This is Shiva Ayyadurai's program written in 1978.

Re: First known use of the word, email (1978)

2012-11-27 Thread Doug Ewell
Not only does this have nothing to do with Unicode, but who cares? Grumpily, -- Doug Ewell | Thornton, Colorado, USA http://www.ewellic.org | @DougEwell From: N. Ganesan Sent: Tuesday, November 27, 2012 18:35 To: Clive Hohberger Cc: Indic Discussion List ; Unicode Mailing List Subject: Re:

Re: LTR

2012-11-27 Thread Doug Ewell
So using the xml:lang=en-Dsrt pseudo-attribute remains a good suggestion to allow a CSS stylesheet to avoid referencing the CMU font on Windows and MacOS when displaying the Latin text (using xml:lang=en) and to allow the same stylesheet to specify a much better Deseret font for Windows (Segoe

Re: xkcd: LTR

2012-11-27 Thread Leif Halvard Silli
Philippe Verdy, Wed, 28 Nov 2012 01:10:45 +0100: 2012/11/27 Leif Halvard Silli The fact that XHTML 1 permits the XML prolog regardless how the document is served, is just a shortcoming of the XHTML 1 specification. No, it was by design. Making HTML an application of XML. Only XML but with 

Re: xkcd: LTR

2012-11-27 Thread Philippe Verdy
2012/11/28 Leif Halvard Silli xn--mlform-...@xn--mlform-iua.no For a new version of the validator, one that asks more of those questions, please try http://validator.w3.org/nu/ - it happens to be developed for the most part by one of the Firefox developers, btw. And it allows you to check

Re: xkcd: LTR

2012-11-27 Thread Philippe Verdy
detects a violation of the required extended prolog (sorry, the HTML5 document declaration, which is not a valid document declaration for XHTML or for HTML4 or before or even for SGML, due to the unspecified schema after the schema short name), it should catch this exception to try

Re: First known use of the word, email (1978)

2012-11-27 Thread Philippe Verdy
On Tue, Nov 27, 2012 at 5:53 AM, Clive Hohberger cp...@case.edu wrote: BTW, the routine capitalization of 'E' in E-mail came in the 1990's from William Safire's On Language column in the NY Times newspaper: He made the analogy with T-shirt. Is this capitalisation of T-shirt mandatory? (of

Re: xkcd: LTR

2012-11-27 Thread Leif Halvard Silli
Philippe Verdy, Wed, 28 Nov 2012 04:23:10 +0100: 2012/11/28 Leif Halvard Silli xn--mlform-...@xn--mlform-iua.no For a new version of the validator, one that asks more of those questions, please try http://validator.w3.org/nu/ - it happens to be developed for the most part by one of the Firefox

Re: Why 17 planes?

2012-11-27 Thread Doug Ewell
Philippe Verdy wrote: And it will still remain enough place in the remaining planes to define later a few more surrogates of a new type, if really needed for a future, upward compatible, standard if it ever comes to reality — such as having an open registry of corporate logos or glyph designs,

Re: First known use of the word, email (1978)

2012-11-27 Thread Doug Ewell
Philippe Verdy wrote: Is this capitalisation of T-shirt mandatory? (of course the shape of the letter recalls the shape of the suit) I've frequently seen t-shirt (sometimes tee-shirt as well) when the term was lexicalized, with a clear pronunciation and understanding by itself, without

Re: xkcd: LTR

2012-11-27 Thread Leif Halvard Silli
Philippe Verdy, Wed, 28 Nov 2012 04:50:06 +0100: detects a violation of the required extended prolog (sorry, the HTML5 document declaration, which is not a valid document declaration for XHTML or for HTML4 or before or even for SGML, due to the unspecified schema after the schema short name),