Re: The Unicode Standard and ISO

2018-06-07 Thread Marcel Schneider via Unicode
On Thu, 17 May 2018 09:43:28 -0700, Asmus Freytag via Unicode wrote: > > On 5/17/2018 8:08 AM, Martinho Fernandes via Unicode wrote: > > Hello, > > > > There are several mentions of synchronization with related standards in > > unicode.org, e.g. in https://www.unicode.org/versions/index.html, and

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Joan Montané via Unicode
2018-06-04 21:49 GMT+02:00 Manish Goregaokar via Unicode < unicode@unicode.org>: > Hi, > > The Rust community is considering > adding non-ascii > identifiers, which follow UAX #31 > (XID_Start XID_Continue*, with

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Richard Wordingham via Unicode
On Thu, 7 Jun 2018 13:32:13 +0200 Joan Montané via Unicode wrote: > 2018-06-04 21:49 GMT+02:00 Manish Goregaokar via Unicode < > unicode@unicode.org>: > * Ŀ, LATIN CAPITAL LETTER L WITH MIDDLE DOT NFKC decomposes > to LATIN CAPITAL LETTER L (U+004C) MIDDLE DOT (U+00B7): > * ŀ, LATIN SMALL
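The decomposition quoted above can be checked with Python's standard `unicodedata` module. This is only a small sketch; the helper name `nfkc_codepoints` is made up for illustration and is not part of the thread.

```python
import unicodedata


def nfkc_codepoints(text: str) -> list[str]:
    """Return the NFKC form of `text` as a list of U+XXXX labels."""
    normalized = unicodedata.normalize("NFKC", text)
    return [f"U+{ord(ch):04X}" for ch in normalized]


# U+013F LATIN CAPITAL LETTER L WITH MIDDLE DOT
print(nfkc_codepoints("\u013F"))
# U+0140 LATIN SMALL LETTER L WITH MIDDLE DOT
print(nfkc_codepoints("\u0140"))
```

Both characters come out as a base letter followed by U+00B7 MIDDLE DOT, which is the sequence the message is concerned with.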

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Mark Davis ☕️ via Unicode
Got it, thanks. Mark On Thu, Jun 7, 2018 at 3:29 PM, Richard Wordingham via Unicode < unicode@unicode.org> wrote: > On Thu, 7 Jun 2018 10:42:46 +0200 > Mark Davis ☕️ via Unicode wrote: > > > > The proposal also asks for identifiers to be treated as equivalent > > > under > > NFKC. > > > > The

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Philippe Verdy via Unicode
If you intend to allow all the standard orthography of common languages, you would also need to support apostrophes and regular hyphens in identifiers, including those from ASCII ! The Catalan middle dot is just a compact variant of the hyphen, it should have better been a diacritic, but the

Re: The Unicode Standard and ISO

2018-06-07 Thread Michael Everson via Unicode
On 7 Jun 2018, at 14:20, Mark Davis ☕️ via Unicode wrote: > > A few facts. > >> > ... Consortium refused till now to synchronize UCA and ISO/IEC 14651. > > ISO/IEC 14651 and Unicode have longstanding cooperation. Ken Whistler could > speak to the synchronization level in more detail, but the

Re: The Unicode Standard and ISO

2018-06-07 Thread Mark Davis ☕️ via Unicode
A few facts. > ... Consortium refused till now to synchronize UCA and ISO/IEC 14651. ISO/IEC 14651 and Unicode have longstanding cooperation. Ken Whistler could speak to the synchronization level in more detail, but the above statement is inaccurate. > ... For another part it [sync with ISO/IEC 

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Richard Wordingham via Unicode
On Thu, 7 Jun 2018 10:42:46 +0200 Mark Davis ☕️ via Unicode wrote: > > The proposal also asks for identifiers to be treated as equivalent > > under > NFKC. > > The guidance in #31 may not be clear. It is not to replace > identifiers as typed in by the user by their NFKC equivalent. It is >

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Alastair Houghton via Unicode
On 7 Jun 2018, at 15:51, Frédéric Grosshans via Unicode wrote: > >> IMO the major issue with non-ASCII identifiers is not a technical one, but >> rather that it runs the risk of fragmenting the developer community. >> Everyone can *type* ASCII and everyone can read Latin characters (for >>

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Frédéric Grosshans via Unicode
On 06/06/2018 at 11:29, Alastair Houghton via Unicode wrote: On 4 Jun 2018, at 20:49, Manish Goregaokar via Unicode wrote: The Rust community is considering adding non-ascii identifiers, which follow UAX #31 (XID_Start XID_Continue*, with tweaks). The proposal also asks for identifiers to

Re: The Unicode Standard and ISO

2018-06-07 Thread Marcel Schneider via Unicode
On Thu, 7 Jun 2018 15:20:29 +0200, Mark Davis ☕️ via Unicode wrote: > > A few facts.  > > > ... Consortium refused till now to synchronize UCA and ISO/IEC 14651. > > ISO/IEC 14651 and Unicode have longstanding cooperation. Ken Whistler could > speak to the > synchronization level in more detail,

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Frédéric Grosshans via Unicode
On 07/06/2018 at 18:01, Alastair Houghton wrote: I appreciate that the upshot of the Anglicised world of software engineering is that native English speakers have an advantage, and those for whom Latin isn’t

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Asmus Freytag via Unicode
On 6/7/2018 9:01 AM, Alastair Houghton via Unicode wrote: But please don’t misunderstand; I am not — and have not been — arguing against non-ASCII identifiers. We were asked whether there were any problems. These are problems (or perhaps we might

Re: The Unicode Standard and ISO

2018-06-07 Thread Marcel Schneider via Unicode
On Thu, 17 May 2018 22:26:15 +, Peter Constable via Unicode wrote: […] > Hence, from an ISO perspective, ISO 10646 is the only standard for which > on-going > synchronization with Unicode is needed or relevant. This point of view is fueled by the Unicode Standard being traditionally

RE: The Unicode Standard and ISO

2018-06-07 Thread Erkki I. Kolehmainen via Unicode
I cannot but fully agree with Mark and Michael. Sincerely Erkki I. Kolehmainen Mannerheimintie 75 B 37, 00270 Helsinki, Finland Mob: +358 400 825 943 -Original message- From: Unicode On behalf of Michael Everson via Unicode Sent: Thursday, 7 June 2018 16.29

Re: The Unicode Standard and ISO

2018-06-07 Thread Philippe Verdy via Unicode
2018-06-07 21:13 GMT+02:00 Marcel Schneider via Unicode : > On Thu, 17 May 2018 22:26:15 +, Peter Constable via Unicode wrote: > […] > > Hence, from an ISO perspective, ISO 10646 is the only standard for which > on-going > > synchronization with Unicode is needed or relevant. > > This point

Re: Hyphenation Markup

2018-06-07 Thread Richard Wordingham via Unicode
On Sat, 2 Jun 2018 05:44:29 +0100 Richard Wordingham via Unicode wrote: > In Latin text, one can indicate permissible line break opportunities > between grapheme clusters by inserting U+00AD SOFT HYPHEN. What > low-end schemes, if any, exist for such mark-up within grapheme > clusters? It
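For the Latin-script case described above, a minimal sketch of marking break opportunities with U+00AD might look like this. The function name `mark_breaks` and the break positions are hypothetical; in practice the positions would come from a hyphenation dictionary or algorithm, which is outside this sketch.

```python
SOFT_HYPHEN = "\u00AD"  # U+00AD SOFT HYPHEN


def mark_breaks(word: str, positions: list[int]) -> str:
    """Insert a soft hyphen before each string index listed in `positions`."""
    pieces = []
    previous = 0
    for position in sorted(positions):
        pieces.append(word[previous:position])
        previous = position
    pieces.append(word[previous:])
    return SOFT_HYPHEN.join(pieces)


# Illustrative positions only: hy-phen-ation
marked = mark_breaks("hyphenation", [2, 6])
print(repr(marked))
```

Removing every U+00AD from the result gives back the original word, so the markup is reversible.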

RE: The Unicode Standard and ISO

2018-06-07 Thread Marcel Schneider via Unicode
On Thu, 7 Jun 2018 22:46:12 +0300, Erkki I. Kolehmainen via Unicode wrote: > > I cannot but fully agree with Mark and Michael. > > Sincerely > Thank you for confirming. All witnesses concur to invalidate the statement about uniqueness of ISO/IEC 10646 ‐ Unicode synchrony. — After being

Re: The Unicode Standard and ISO

2018-06-07 Thread Marcel Schneider via Unicode
On Fri, 8 Jun 2018 00:43:04 +0200, Philippe Verdy via Unicode wrote: [cited mail] > > The "normative names" are in fact normative only as a forward reference > to the ISO/IEC repertoire because it insists that these names are essential > part > of the stable encoding policy which was then

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Richard Wordingham via Unicode
On Tue, 5 Jun 2018 01:37:47 +0100 Richard Wordingham via Unicode wrote: > The decomposed > form that looks the same is นํ้า . > The problem is that for sane results, needs > special handling. This sequence is also often untypable - part of the > protection against Thai homographs. I've been

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Mark Davis ☕️ via Unicode
> The proposal also asks for identifiers to be treated as equivalent under NFKC. The guidance in #31 may not be clear. It is not to replace identifiers as typed in by the user by their NFKC equivalent. It is rather to internally *identify* two identifiers (as typed in by the user) as being the
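As far as the visible part of this message goes, the idea is to keep identifiers as typed and use the NFKC form only as an internal comparison key. A rough sketch of that approach follows; the `SymbolTable` class is a hypothetical illustration, not anything from UAX #31 or the Rust proposal.

```python
import unicodedata


class SymbolTable:
    """Stores identifiers as typed, but matches them by their NFKC form."""

    def __init__(self) -> None:
        self._entries: dict[str, tuple[str, object]] = {}

    @staticmethod
    def _key(name: str) -> str:
        # The NFKC form is used only for comparison, never shown to the user.
        return unicodedata.normalize("NFKC", name)

    def declare(self, name: str, value: object) -> None:
        self._entries[self._key(name)] = (name, value)

    def lookup(self, name: str) -> tuple[str, object]:
        """Return (spelling as originally typed, value)."""
        return self._entries[self._key(name)]


table = SymbolTable()
table.declare("\uFF41bc", 1)   # starts with U+FF41 FULLWIDTH LATIN SMALL LETTER A
print(table.lookup("abc"))     # found via the shared NFKC key
```

The lookup with plain ASCII `abc` finds the entry, while the stored spelling is still the one the user typed.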

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Alastair Houghton via Unicode
On 6 Jun 2018, at 17:50, Manish Goregaokar wrote: > > I think the recommendation to use ASCII as much as possible is implicit there. It would be a very good idea to make it explicit. Even for English speakers, there may be a temptation to use characters that are hard to distinguish or hard to

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Hans Åberg via Unicode
> On 7 Jun 2018, at 03:56, Asmus Freytag via Unicode > wrote: > > On 6/6/2018 2:25 PM, Hans Åberg via Unicode wrote: >>> On 4 Jun 2018, at 21:49, Manish Goregaokar via Unicode >>> wrote: >>> >>> The Rust community is considering adding non-ascii identifiers, which >>> follow UAX #31

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Philippe Verdy via Unicode
In my opinion the usual constant is most often shown as "𝜋" (curly serifs, slightly slanted) in mathematical articles and books (and in TeX), but rarely as "π" (sans-serif). There's a tradition of using handwriting for this symbol on blackboards (not always with serifs, but still often slanted).
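How the two forms relate under NFKC can be checked with a short sketch. It assumes the slanted pi meant here is U+1D70B MATHEMATICAL ITALIC SMALL PI, and the helper name `same_under_nfkc` is made up for illustration.

```python
import unicodedata

MATH_ITALIC_PI = "\U0001D70B"  # assumed to be the slanted pi under discussion
GREEK_PI = "\u03C0"            # GREEK SMALL LETTER PI


def same_under_nfkc(a: str, b: str) -> bool:
    """True if the two strings have the same NFKC normal form."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


print(unicodedata.name(MATH_ITALIC_PI))
print(same_under_nfkc(MATH_ITALIC_PI, GREEK_PI))
```

Under NFKC equivalence the two are treated as the same identifier, so a language that compares identifiers this way cannot keep them as distinct names.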

Unicode 11.0.0: BidiMirroring.txt

2018-06-07 Thread Marcel Schneider via Unicode
In the wake of the new release, may we discuss the reason why UTC persisted in recommending that 3 pairs of mathematical symbols featuring tildes are mirrored in low-end support by glyph-exchange bidi-mirroring, with the result that legibility of tildes is challenged, as demonstrated for

Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-07 Thread Hans Åberg via Unicode
Now that the distinction is possible, it is recommended to do that. My original question was directed to the OP, whether it is deliberate. And they are confusables only to those not accustomed to it. > On 7 Jun 2018, at 12:05, Philippe Verdy wrote: > > In my opinion the usual constant is