Re: Feedback on the proposal to change U+FFFD generation when decoding ill-formed UTF-8

2017-05-23 Thread Jonathan Coxhead via Unicode
On 18/05/2017 1:58 am, Alastair Houghton via Unicode wrote: On 18 May 2017, at 07:18, Henri Sivonen via Unicode wrote: the decision complicates U+FFFD generation when validating UTF-8 by state machine. It *really* doesn’t. Even if you’re hell bent on using a pure state
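The question in this thread is how many U+FFFD characters a decoder should emit for one ill-formed sequence. As a rough illustration only, the sketch below shows how to observe what one particular decoder does (Python's built-in UTF-8 codec with errors="replace"). The helper name decode_lossy and the sample byte strings are assumptions made for this example; nothing here is taken from the proposal or from the messages quoted above.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def decode_lossy(data: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD wherever the input is ill-formed."""
    return data.decode("utf-8", errors="replace")


# 0xFF can never appear in well-formed UTF-8, so it is replaced on its own.
single_bad_byte = decode_lossy(b"a\xffb")

# A three-byte sequence cut short after two bytes, followed by ASCII "x".
# How many U+FFFD this yields is the policy choice under discussion.
truncated = decode_lossy(b"\xe2\x82x")

print(single_bad_byte.count(REPLACEMENT), truncated.count(REPLACEMENT))
```

Running the same inputs through different decoders is a quick way to see whether they agree on the number of replacement characters.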

Is this the oldest d20 on Earth?

2014-09-20 Thread Jonathan Coxhead
Here's an icosahedral dice from the Ptolemaic period: http://www.metmuseum.org/collection/the-collection-online/search/551070 I find myself idly wondering whether the identities of the characters are all known and encoded ... Cheers —Jonathan ___

Re: Draft Proposal to encode the English Phonotypic Alphabet

2010-07-05 Thread Jonathan Coxhead
Just curious ... Is there a free font anywhere that contains these characters? Either in the PUA of a Unicode font or encoded somehow in an 8-bit font? Thanks Jonathan Coxhead 72 Rock Harbor Ln, Foster City CA 94404 +1-650-430-6564 (m) On 2010-06-30 9:26 am, Mark Davis ☕ wrote

Re: CSS3, Unicode BIDI, and Vertical Text Layout

2004-10-20 Thread Jonathan Coxhead
. Rtl and ltr are only meaningful in horizontal writing. But the LTR text would be rotated clockwise, and the RTL anticlockwise, wouldn't it? So there is *an* interaction---just not the same one. -- Jonathan Coxhead, Sunnyvale CA

Re: Saudi-Arabian Copyright sign

2004-09-22 Thread Jonathan Coxhead
in the desert. -- Jonathan Coxhead, Sunnyvale CA USA

Re: Unicode Encoding Illustration

2004-08-19 Thread Jonathan Coxhead
titled UTF-16 should have contents FE FF 44 37 or FF FE 37 44. The box titled UTF-32 should have contents 00 00 FE FF 00 00 44 37 or FF FE 00 00 37 44 00 00. Cheers ... -- Jonathan Coxhead
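The byte sequences quoted above can be checked mechanically. The sketch below assumes the character in the illustration is U+4437 (inferred from the bytes 44 37; the original diagram is not shown here) and prepends the byte order mark by hand for each byte order. The helper hex_bytes is a made-up name for this example.

```python
import codecs

ch = "\u4437"  # assumed code point, inferred from the bytes 44 37


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


# BOM followed by the character, in big-endian and little-endian order.
utf16_be = hex_bytes(codecs.BOM_UTF16_BE + ch.encode("utf-16-be"))
utf16_le = hex_bytes(codecs.BOM_UTF16_LE + ch.encode("utf-16-le"))
utf32_be = hex_bytes(codecs.BOM_UTF32_BE + ch.encode("utf-32-be"))
utf32_le = hex_bytes(codecs.BOM_UTF32_LE + ch.encode("utf-32-le"))

for label, value in [("UTF-16", utf16_be), ("UTF-16", utf16_le),
                     ("UTF-32", utf32_be), ("UTF-32", utf32_le)]:
    print(label, value)
```

Under that assumption the four results match the four byte strings given in the message.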

Re: Proposal to encode dominoes and other game symbols

2004-06-11 Thread Jonathan Coxhead
Michael, especially given the minimal examples of usage and justification for encoding provided in the proposal. Andrew -- Jonathan Coxhead, Kucinich for President! http://www.kucinich.us

RE: Complex Combining

2003-12-01 Thread Jonathan Coxhead
My take on Cleanicode, the Atomic Theory of Unicode, can be found at http://www.doves.demon.co.uk/atomic.html. It is very much a software engineer's view of character coding. The characters START GROUP and POP DIRECTIONAL FORMATTING are used as brackets. Yes, it could involve arbitrary

Re: Backslash n [OT] was Line Separator and Paragraph Separator

2003-10-22 Thread Jonathan Coxhead
On 22 Oct 2003, at 6:53, John Cowan wrote: Kent Karlsson scripsit: All of CR, LF, CR, LF, NEL, LS, PS, and EOF(!). (Assuming that the encoding of the text file is recognised.) XML 1.0 treats CR, LF, and CR, LF as line terminators and reports them as LF. XML 1.1 will treat CR, LF,

Re: Backslash n [OT] was Line Separator and Paragraph Separator

2003-10-21 Thread Jonathan Coxhead
On 21 Oct 2003, at 12:01, Jill Ramonsky wrote: I would be more than grateful if someone could point me in the direction of a DEFINITIVE specification which claims this is not the case, that the interpretation of \n as anything other than LF may be considered conformant behaviour. The C

Re: Yerushala(y)im - or Biblical Hebrew

2003-07-28 Thread Jonathan Coxhead
On 28 Jul 2003, at 16:49, Kenneth Whistler wrote: Part of the specification of the Unicode normalization algorithm is idempotency *across* versions, so that addition of new characters to the standard, which require extensions of the tables for decomposition, recomposition, and composition

Re: BOM's at Beginning of Web Pages?

2003-02-18 Thread Jonathan Coxhead
That's a very long-winded way of writing it! How about this:

    #!/usr/bin/perl -pi~ -0777
    # program to remove a leading UTF-8 BOM from a file
    # works both STDIN -> STDOUT and on the spot (with filename as argument)
    s/^\xEF\xBB\xBF//s;

which uses perl's -p, -i and -0
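For readers who do not use Perl, the same idea (drop the three bytes EF BB BF only when they begin the data) can be sketched in Python. This is an illustrative equivalent, not part of the original message; the function name strip_utf8_bom is an assumption for this example, and it works on bytes already read into memory rather than editing a file in place.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


with_bom = strip_utf8_bom(b"\xef\xbb\xbfhello")
without_bom = strip_utf8_bom(b"hello")
# Only a BOM at the very start is removed; later occurrences are left alone.
inner_bom = strip_utf8_bom(b"a\xef\xbb\xbfb")
print(with_bom, without_bom, inner_bom)
```

Like the anchored regular expression in the Perl one-liner, the startswith check touches only the beginning of the input.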

Re: changing scripts

2002-07-29 Thread Jonathan Coxhead
On 26 Jul 2002, at 23:23, Curtis Clark wrote: Are you saying that, even though Unicode defines U+0027 as punctuation, other, I could use it as a glottal stop and create a locale that would treat it as a letter (and still be Unicode compliant, whatever that is?). If my name is

Re: logical order

2002-07-26 Thread Jonathan Coxhead
The way these concepts were explained to me was as visual order (the order as you see it with your eyes, as defined by the writing system) and aural order (the order you hear it with your ears, as defined by pronunciation of the spoken language). Neither of these is more or less logical

Re: ISO 3166 (country codes) Maintenance Agency Web pages move

2002-02-27 Thread Jonathan Coxhead
On 27 Feb 2002, at 14:42, John Cowan wrote: [EMAIL PROTECTED] wrote: There is a point of view that says that *all* such identifiers (for countries, languages, etc.) should just be randomly generated strings for the kind of reasons mentioned. (Not that I'm arguing for that.) It is

Re: Shape of the US Dollar Sign

2001-10-02 Thread Jonathan Coxhead
I can't resist transcribing the following, which is a quotation from _Love_and_Sleep_ by John Crowley (Bantam Books, 1994). (It's fiction.) |There are many Monarchs, and many Princes, but only one Emperor. Rudolf | II, King in his own right of Hungary and Bohemia, Archduke of Austria,

[unicode] Re: Unicode editing

2001-03-28 Thread Jonathan Coxhead
On 28 Mar 01, at 12:02, Marco Cimarosti wrote: struct MyWysiwygGlyph { wchar_t GlyphCode; int EmbeddingLevel; }; I think that Roozbeh had something quite similar in mind. Yes. I was not sure that if that's enough, but after this

[unicode] Re: removing compromises from unicode (WCode)

2001-03-23 Thread Jonathan Coxhead
It would be very entertaining to do the same job with the ideographs (down to the radical level) and count the number of atoms. I suspect the resulting "character set" would contain less than 2000 atoms altogether. MichKa replied ... More than just entertaining, one would definitely find

Re: Unicode complaints

2001-03-15 Thread Jonathan Coxhead
Suzanne M Topping wrote, In hunting around for negative opinions about Unicode, ... Markus Scherer wrote, Let me add one complaint to your list: Thai is not stored/used in logical order in Unicode. and Michael Kaplan wrote, And your suggestion for characters that sort

Re: Updates to Technical Reports

2001-03-13 Thread Jonathan Coxhead
UTR 19 paragraph D36c(a) contains a reference to 'UTF-32BE' that should read 'UTF-32', I think.

Re: Names of planes, and request for sneak preview

2000-07-11 Thread Jonathan Coxhead
Oh, by the way, if 12 is a dozen and 144 is a gross, what are 16 and 256? 272

RE: Difference between EM QUAD and EM SPACE

2000-07-10 Thread Jonathan Coxhead
In TeX, the difference is that an EM QUAD (\qquad) and an EN QUAD (\quad) provide spaces that are legitimate breakpoints for lines within a paragraph; while EM SPACE, EN SPACE (\enspace) and THIN SPACE (\thinspace) produce horizontal space that cannot cause a line-break. My assumption