Re: UTF-8 isn't the default for HTML (was: xkcd: LTR)

2012-11-29 Thread Philippe Verdy
So we would be in a case where it's impossible to guarantee full compatibility or interoperability between the two concurrent standards from the same standards body, while promising the best interoperability with past flavors of HTML (those past flavors are still not in the past given that two of

Re: Old Cyrillic Yest

2012-11-29 Thread Michael Everson
On 29 Nov 2012, at 08:57, QSJN 4 UKR qsjn4...@gmail.com wrote: Yes, maybe, probably. Truly different glyph is the NARROW YEST. Truly special character name has the BROAD YEST, YAKORNOYE YEST, while the NARROW as well as the modern UKRAINIAN є is just IE or YEST. Well, I don't know, would you

Re: Why 17 planes?

2012-11-29 Thread William_J_G Overington
On Wednesday 28 November 2012, Doug Ewell d...@ewellic.org wrote: William_J_G Overington wjgo underscore 10009 at btinternet dot com wrote: For example, there is my research on communication through the language barrier... No, stop right there. This is an excellent example of

Re: UTF-8 isn't the default for HTML (was: xkcd: LTR)

2012-11-29 Thread Leif Halvard Silli
Philippe Verdy, Thu, 29 Nov 2012 10:11:13 +0100: So we would be in a case where it's impossible to guarantee full compatibility or interoperability between the two concurrent standards from the same standards body, while promising the best interoperability with past flavors of HTML (those

Re: UTF-8 isn't the default for HTML (was: xkcd: LTR)

2012-11-29 Thread Philippe Verdy
You're wrong. XHTML1 is integrated in the W3C validator and recognized automatically. The document you cite in the XHTML1 specs has just not been updated. http://validator.w3.org/check?uri=http%3A%2F%2Fwww.xn--elqus623b.net%2FXKCD%2F1137.html&charset=%28detect+automatically%29&doctype=Inline&group=0

Re: UTF-8 isn't the default for HTML (was: xkcd: LTR)

2012-11-29 Thread Leif Halvard Silli
Philippe Verdy, Thu, 29 Nov 2012 13:26:28 +0100: You're wrong. XHTML1 is integrated in the W3C validator and recognized automatically. Indeed, yes. What I meant by 'doesn't integrate XHTML1' was that Unicorn doesn't 100% adhere to the two sections of XHTML1 that I quoted.[1][2] The document

Re: Why 17 planes?

2012-11-29 Thread Philippe Verdy
2012/11/28 Doug Ewell d...@ewellic.org Using the PUA to extend Unicode substantially beyond what a character encoding standard is supposed to be, and (especially) expecting others to adopt that non-character PUA usage, or expecting it to be ipso facto a step toward formal encoding, is

Re: UTF-8 isn't the default for HTML (was: xkcd: LTR)

2012-11-29 Thread Philippe Verdy
And you forget the important part of Appendix A: *Consequence*: Remember, however, that when the XML declaration is not included in a document, AND the character encoding is not specified by a higher level protocol such as HTTP, the document can only use the default character encodings UTF-8 or

Re: Caret

2012-11-29 Thread Philippe Verdy
Damn! I'm not even sure styling will still work as soon as you have substituted f before i by f.ligaleft; the substitution disables the possibility to position a caret there: you're no longer working at the character level but at the glyph level, and matching logical positions and positions in

Re: UTF-8 isn't the default for HTML (was: xkcd: LTR)

2012-11-29 Thread Leif Halvard Silli
Philippe Verdy, Thu, 29 Nov 2012 14:24:29 +0100: And you forget the important part of Appendix A: *Consequence*: Remember, however, that when the XML declaration is not included in a document, AND the character encoding is not specified by a higher level protocol such as HTTP, the document

Re: Caret

2012-11-29 Thread Philippe Verdy
Another complication: it is probably possible to style components of a ligature (or any sequence of characters where glyph substitution/positioning is expected to occur) with just distinct colors or backgrounds, using such techniques based on component glyph metrics, as long as style does not change the

Re: UTF-8 isn't the default for HTML (was: xkcd: LTR)

2012-11-29 Thread Philippe Verdy
2012/11/29 Leif Halvard Silli xn--mlform-...@xn--mlform-iua.no Philippe Verdy, Thu, 29 Nov 2012 14:24:29 +0100: ... But why ? Isn't UTF-8 (or alternatively UTF-16) already the default encoding of XHTML? If not, then we should file a bug in the W3C Validator for not honoring the

Re: Why 17 planes?

2012-11-29 Thread Doug Ewell
William_J_G Overington wjgo underscore 10009 at btinternet dot com wrote: Do NOT try to make this system conceptually part of Unicode. Well, consider please the following example, from a simulation, of the text of a plain text email.   Margaret Gattenford [...] Embedding these items

Re: UTF-8 isn't the default for HTML (was: xkcd: LTR)

2012-11-29 Thread Philippe Verdy
In my opinion, from HTML5, and not XHTML5, there should also exist a leading prolog like ?html version=5.0 encoding=utf-8 For XHTML5, we will continue using the XML prolog; but it *may* be followed by the html prolog, without needing to repeat the optional encoding pseudo-attribute, which XML

Re: LTR

2012-11-29 Thread Doug Ewell
Philippe Verdy verdy underscore p at wanadoo dot fr wrote: In all the ensuing discussion about this page, did anyone notice the typo in the Deseret cartoon? The non-mirrored question mark ? Nope, that's in the original cartoon too. U+003F isn't bidi-mirrored anyway. Try again. There's a

Re: UTF-8 isn't the default for HTML (was: xkcd: LTR)

2012-11-29 Thread Leif Halvard Silli
Philippe Verdy, Thu, 29 Nov 2012 16:10:14 +0100: Thanks a lot, this was really hard to see and understand, because I was only reading the XHTML specs, and the Validator did not complain. Glad to find we are on the same page! Philippe Verdy, Thu, 29 Nov 2012 16:27:13 +0100: ?html version=5.0

Re: UTF-8 isn't the default for HTML (was: xkcd: LTR)

2012-11-29 Thread Philippe Verdy
- Method 1 (the BOM) is only good for UTF-16; it is not reliable for UTF-8, which is still the default for XHTML (and where the BOM is not always present). - Method 2 works sometimes, but is not practical for many servers that you can't configure to change their content-type for specific pages all

Re: UTF-8 isn't the default for HTML (was: xkcd: LTR)

2012-11-29 Thread Leif Halvard Silli
Philippe Verdy, Thu, 29 Nov 2012 19:11:42 +0100: 2012/11/29 Leif Halvard Silli: Philippe Verdy, Thu, 29 Nov 2012 16:27:13 +0100: ?html version=5.0 encoding=utf-8 Thus I can guarantee you that your idea about a method number 9 is not going to be met with enthusiasm. - Method 5 is where ?

Re: UTF-8 isn't the default for HTML (was: xkcd: LTR)

2012-11-29 Thread Philippe Verdy
Note that I challenge the term "conforming" you use, given that HTML5 is still not released, so its conformance is still not formally defined. The nu validator is still explicitly marked by the W3C as experimental. 2012/11/29 Leif Halvard Silli xn--mlform-...@xn--mlform-iua.no HTML5 already have

Re: ‮LTR

2012-11-29 Thread John H. Jenkins
I double-checked *very* carefully, and I did't see anything wrong at all. :-) You got sharp eyes there, Doug. On 2012年11月28日, at 下午10:58, Doug Ewell d...@ewellic.org wrote: John H. Jenkins wrote: Or, if one prefers: http://www.井作恆.net/XKCD/1137.html In all the ensuing discussion

Re: LTR

2012-11-29 Thread Philippe Verdy
The only minor typo that I see is the lowercase e in U+202e. But this is not written in Deseret, and is similar to the Latin version. An exclamation point in the Latin version becomes a full dot in Deseret: a minor typo as well, I think, but still a difference. The Latin word « DIDN'T » (reversed)

RE: LTR

2012-11-29 Thread Doug Ewell
Philippe Verdy verdy underscore p at wanadoo dot fr wrote: The Latin word « DIDN'T » (reversed) seems to be transliterated to Deseret as « DID'T » (reversed), missing an N You're about two hours behind John, but that's it. -- Doug Ewell | Thornton, Colorado, USA http://www.ewellic.org |

Re: Request for Review: draft-slevinski-signwriting-text

2012-11-29 Thread Doug Ewell
Steve Slevinski slevin at signpuddle dot net wrote: I have documented a text encoding for an unusual script that is used by an international community. The script uses a 2-dimensional plain text encoding with ASCII and Unicode PUA. draft-slevinski-signwriting-text

Re: Request for Review: draft-slevinski-signwriting-text

2012-11-29 Thread Philippe Verdy
From basic analysis, the posted IETF draft has nothing to do with Unicode encoding. However it remains related to it, because it presents the relations between the characters that are to be encoded in the UCS, and their additional properties within the focus of their classification and behavior