Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-22 Thread Philippe Verdy
So the variation of shape (vertical stick, half-ring, curved comma, or classical 5-shaped) and its attachment or not (below the letter) is perceived as being less significant than the horizontal placement of the cedilla (centered or below the right-most stem). Only Romanian seems to insist on using

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-05 Thread Denis Jacquerye
On Thu, Jul 4, 2013 at 12:07 PM, Michael Everson ever...@evertype.com wrote: On 4 Jul 2013, at 03:56, Phillips, Addison addi...@lab126.com wrote: I don't disagree with the potential need for changing the decomposition. That discussion seems clear and is only muddled by talking about variant,

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-05 Thread Philippe Verdy
All this discussion is going nowhere. What would be more decisive would be the fact that these shapes for cedillas had contrasting uses in any language. As far as I can tell, this has not been demonstrated (not even in Romanian). So the proposal is to disunify characters that are already encoded,

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-05 Thread Martin J. Dürst
On 2013/07/05 16:04, Denis Jacquerye wrote: On Thu, Jul 4, 2013 at 12:07 PM, Michael Everson ever...@evertype.com wrote: The problem is in pretending that a cedilla and a comma below are equivalent because in some script fonts in France or Turkey routinely write some sort of

RE: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-05 Thread Erkki I Kolehmainen
: unicode Unicode Discussion Subject: Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below) ... I fought this battle back when I supported the Romanian disunification of their letters from the Turkish ones. We're just finishing the job now, as far as I can see. Michael Everson * http

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-05 Thread Michael Everson
On 5 Jul 2013, at 11:27, Erkki I Kolehmainen e...@iki.fi wrote: And I'm sorry for having supported you then, since the Romanians claimed at the time that they could not live with a font variation, since they needed to be able to have a distinction between s and t with cedilla and comma below
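
For reference, the Romanian distinction mentioned above is visible in the character data itself. The following is only a small sketch using Python's standard unicodedata module (the helper name combining_marks is made up for illustration and is not from the thread): the comma-below letters decompose to U+0326, the cedilla letters to U+0327, so the two sets are not canonically equivalent.

```python
import unicodedata

def combining_marks(ch):
    """Hypothetical helper: combining marks found in the NFD form of ch."""
    nfd = unicodedata.normalize("NFD", ch)
    return [f"U+{ord(c):04X}" for c in nfd if unicodedata.combining(c)]

S_COMMA = "\u0219"    # LATIN SMALL LETTER S WITH COMMA BELOW
S_CEDILLA = "\u015F"  # LATIN SMALL LETTER S WITH CEDILLA
T_COMMA = "\u021B"    # LATIN SMALL LETTER T WITH COMMA BELOW
T_CEDILLA = "\u0163"  # LATIN SMALL LETTER T WITH CEDILLA

for ch in (S_COMMA, S_CEDILLA, T_COMMA, T_CEDILLA):
    print(unicodedata.name(ch), combining_marks(ch))

# Normalization never maps one set onto the other.
same_after_nfc = (unicodedata.normalize("NFC", S_COMMA)
                  == unicodedata.normalize("NFC", S_CEDILLA))
print("s-comma equals s-cedilla after NFC:", same_after_nfc)
```

Because the two marks are separate characters, a font alone cannot merge or split them; the choice is made in the text.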

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-05 Thread Philippe Verdy
2013/7/5 Michael Everson ever...@evertype.com On 5 Jul 2013, at 08:04, Denis Jacquerye moy...@gmail.com wrote: The problem is in pretending that a cedilla and a comma below are equivalent because in some script fonts in France or Turkey routinely write some sort of undifferentiated tick for

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-04 Thread Denis Jacquerye
On Thu, Jul 4, 2013 at 2:42 AM, Lisa Moore li...@us.ibm.com wrote: And it's a pretty easy guess that there are quite a few more users with Japanese and Chinese filenames in the same file system than users with Latvian and Marshallese filenames in the same file system, both because both

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-04 Thread Michael Everson
On 4 Jul 2013, at 03:56, Phillips, Addison addi...@lab126.com wrote: I don't disagree with the potential need for changing the decomposition. That discussion seems clear and is only muddled by talking about variant, language sensitive rendering. That isn't the only consideration, right? No,

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-03 Thread Martin J. Dürst
On 2013/06/22 0:32, Michael Everson wrote: On 21 Jun 2013, at 16:20, Khaled Hosny khaledho...@eglug.org wrote: Yeah, I don't believe that you can language-tag individual file names for such display as that is markup. Why do you need to? You only need one language, it is not like file names

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-03 Thread Michael Everson
On 3 Jul 2013, at 09:52, Martin J. Dürst due...@it.aoyama.ac.jp wrote: Quite a few people might expect their Japanese filenames to appear with a Japanese font/with Japanese glyph variants, and their Chinese filenames to appear with a Chinese font/Chinese glyph variants. But that's never how

RE: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-03 Thread Phillips, Addison
Martin wrote: Quite a few people might expect their Japanese filenames to appear with a Japanese font/with Japanese glyph variants, and their Chinese filenames to appear with a Chinese font/Chinese glyph variants. But that's never how this was planned, and that's not how it works today.

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-03 Thread Asmus Freytag
On 7/3/2013 2:04 AM, Michael Everson wrote: On 3 Jul 2013, at 09:52, Martin J. Dürst due...@it.aoyama.ac.jp wrote: Quite a few people might expect their Japanese filenames to appear with a Japanese font/with Japanese glyph variants, and their Chinese filenames to appear with a Chinese

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-07-03 Thread Lisa Moore
And it's a pretty easy guess that there are quite a few more users with Japanese and Chinese filenames in the same file system than users with Latvian and Marshallese filenames in the same file system, both because both Chinese and Japanese are used by many more people than Latvian or

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-26 Thread Ilya Zakharevich
On Thu, Jun 20, 2013 at 09:27:49AM +0100, Michael Everson wrote: On 19 Jun 2013, at 18:24, Richard Wordingham richard.wording...@ntlworld.com wrote: The X11 restriction of one character per key stroke is not so easy to get round. Get them to fix X11. It looks like you think that X11 is

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Denis Jacquerye
About positioning: Michael, you mentioned the issue of positioning of the diacritic, this is a font issue not a character issue. I mentioned Navajo ogonek because that is how it solves the issue of positioning, custom Navajo fonts have centered ogoneks. Locale aware fonts and applications could do

Aw: Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Jörg Knappen
My opinion on the cedilla mess is the following: * Add preemptively LATIN [CAPITAL|LOWERCASE] LETTER * WITH CEDILLA ATTACHED for every Latvian/Livonian character currently in Unicode. (Don't use terms like MARSHALLESE [CAPITAL|LOWERCASE] LETTER [MN] -- such entities don't exist from a character

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Michael Everson
On 21 Jun 2013, at 07:01, Denis Jacquerye moy...@gmail.com wrote: About positioning: Michael, you mentioned the issue of positioning of the diacritic, this is a font issue not a character issue. I mentioned Navajo ogonek because that is how it solves the issue of positioning, custom Navajo

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Michael Everson
On 21 Jun 2013, at 08:23, Jörg Knappen jknap...@web.de wrote: My opinion on the cedilla mess is the following: * Add preemptively LATIN [CAPITAL|LOWERCASE] LETTER * WITH CEDILLA ATTACHED for every Latvian/Livonian character currently in Unicode. Why? Latvian and Livonian don't use letters

Aw: Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Jörg Knappen
Michael Everson wrote: My opinion on the cedilla mess is the following: * Add preemptively LATIN [CAPITAL|LOWERCASE] LETTER * WITH CEDILLA ATTACHED for every Latvian/Livonian character currently in Unicode. Why? Latvian and Livonian don't use letters with proper cedilla attached. Maybe

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Michael Everson
On 21 Jun 2013, at 09:09, Jörg Knappen jknap...@web.de wrote: * Add preemptively LATIN [CAPITAL|LOWERCASE] LETTER * WITH CEDILLA ATTACHED for every Latvian/Livonian character currently in Unicode. Why? Latvian and Livonian don't use letters with proper cedilla attached. Maybe my English

Aw: Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Jörg Knappen
Michael Everson wrote: * Add preemptively LATIN [CAPITAL|LOWERCASE] LETTER * WITH CEDILLA ATTACHED for every Latvian/Livonian character currently in Unicode. Why? Latvian and Livonian don't use letters with proper cedilla attached. Maybe my English wasn't perfect here; of course I think

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Dominikus Dittes Scherkl
On 21.06.2013 11:17, Jörg Knappen wrote: The first reason is to solve this problem completely and not only to resolve a Latvian-Marshallese conflict and leave some other exceptions for later. The second reason is that the letter g, k, l, m, r

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Michael Everson
On 21 Jun 2013, at 11:26, Dominikus Dittes Scherkl lyrate...@gmx.de wrote: Why not instead encoding a new combining MARSHALLESE CEDILLA that ought to be used with g, k, l, m, r and their uppercase counterparts? Because then there would be three confusable ways of writing all this data. N WITH

Aw: Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Jörg Knappen
Dominikus Dittes Scherkl wrote: Why not instead encoding a new combining MARSHALLESE CEDILLA that ought to be used with g, k, l, m, r and their uppercase counterparts? This is not a good idea, because the combining MARSHALLESE CEDILLA can be combined with the letter C, too. This creates all

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Khaled Hosny
On Fri, Jun 21, 2013 at 02:27:38PM +0100, Michael Everson wrote: On 21 Jun 2013, at 14:06, Denis Jacquerye moy...@gmail.com wrote: It is not the character model that is not reliable, it is the application. If your application doesn't support locale settings and locale specific font

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Michael Everson
On 21 Jun 2013, at 15:56, Khaled Hosny khaledho...@eglug.org wrote: Try this in the file system. The file system embeds visual rendering of text? You probably mean the file manager The Finder. my file manager shows me locale-dependent glyph variants without any special setup (apart

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Khaled Hosny
On Fri, Jun 21, 2013 at 04:00:20PM +0100, Michael Everson wrote: Yeah, I don't believe that you can language-tag individual file names for such display as that is markup. Why do you need to? You only need one language, it is not like file names are multilingual high quality text books where

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Michael Everson
On 21 Jun 2013, at 16:20, Khaled Hosny khaledho...@eglug.org wrote: Yeah, I don't believe that you can language-tag individual file names for such display as that is markup. Why do you need to? You only need one language, it is not like file names are multilingual high quality text books

Re: Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread Philippe Verdy
This will also happen with the new confusable introduced by a new separate and undecomposable letter... I don't see where your point is. Already Marshallese documents are encoded using the existing cedilla, even if you don't like it, and it still works correctly for most users, even if the cedilla

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-21 Thread CE Whitehead
Hi. I personally do not see why these supplemental characters cannot be created, as done for other Latin-1 characters (http://www.unicode.org/charts/PDF/U0080.pdf). I'm a bit confused though, and have a question: how are characters from current Marshallese texts and from new texts with the

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-20 Thread Julian Bradfield
On 2013-06-19, Richard Wordingham richard.wording...@ntlworld.com wrote: The X11 restriction of one character per key stroke is not so easy to get round. Some applications don't cooperate with work-arounds such as ibus, and I find ibus unreliable enough that I want alternative methods for

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-20 Thread Michael Everson
On 19 Jun 2013, at 18:24, Richard Wordingham richard.wording...@ntlworld.com wrote: The X11 restriction of one character per key stroke is not so easy to get round. Get them to fix X11. Michael Everson * http://www.evertype.com/

Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Denis Jacquerye
Marshallese uses the letters L/l, M/m, N/n, and O/o with cedilla. The Ad Hoc http://www.unicode.org/L2/L2013/13128-latvian-marshal-adhoc.pdf concluded that encoding LATIN CAPITAL LETTER MARSHALLESE L WITH CEDILLA LATIN SMALL LETTER MARSHALLESE L WITH CEDILLA LATIN CAPITAL LETTER MARSHALLESE N
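
As background to the post above, here is a rough sketch (Python, standard unicodedata module; the helper name nfc_length is made up for illustration) of how the four Marshallese letter pairs behave with the existing COMBINING CEDILLA U+0327. It assumes only the current Unicode character data: L and N with cedilla have precomposed characters, M and O with cedilla do not.

```python
import unicodedata

CEDILLA = "\u0327"  # COMBINING CEDILLA

def nfc_length(base):
    """Hypothetical helper: length of base + U+0327 after NFC composition."""
    return len(unicodedata.normalize("NFC", base + CEDILLA))

for base in "LlNnMmOo":
    composed = unicodedata.normalize("NFC", base + CEDILLA)
    code_points = " ".join(f"U+{ord(c):04X}" for c in composed)
    print(base, "+ U+0327 ->", code_points)

# L/l and N/n compose to a single precomposed character (shared with Latvian);
# M/m and O/o stay as base letter plus combining mark.
```

This is why the same diacritic ends up handled differently for M/O than for L/N, which is the question raised later in the thread.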

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Denis Jacquerye
On Wed, Jun 19, 2013 at 7:54 AM, Denis Jacquerye moy...@gmail.com wrote: Marshallese uses the letters L/l, M/m, N/n, and O/o with cedilla. The Ad Hoc http://www.unicode.org/L2/L2013/13128-latvian-marshal-adhoc.pdf concluded that encoding LATIN CAPITAL LETTER MARSHALLESE L WITH CEDILLA LATIN

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Michael Everson
On 19 Jun 2013, at 07:54, Denis Jacquerye moy...@gmail.com wrote: Marshallese uses the letters L/l, M/m, N/n, and O/o with cedilla. The Ad Hoc http://www.unicode.org/L2/L2013/13128-latvian-marshal-adhoc.pdf concluded that encoding LATIN CAPITAL LETTER MARSHALLESE L WITH CEDILLA LATIN SMALL

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Philippe Verdy
Do you mean that these characters will be encoded without permitting canonical decompositions (because it would violate the equivalences with Latvian/Livonian)? If so, what you want is just to explicitly say that these cedillas for Marshallese (or other uses) should be attached and not

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Michael Everson
On 19 Jun 2013, at 09:04, Denis Jacquerye moy...@gmail.com wrote: Furthermore, the cedilla can also have a proper cedilla form as opposed to the Latvian or Livonian comma below form in transliteration systems. This has nothing to do with the Marshallese/Latvian conflict, though. ALA-LC

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Michael Everson
On 19 Jun 2013, at 09:59, Jörg Knappen jknap...@web.de wrote: Somehow, the compromise solution found at the ad hoc meeting sounds fishy, because there is no such thing as LATIN CAPITAL LETTER MARSHALLESE L or LATIN SMALL LETTER MARSHALLESE N (to be equipped with a cedilla). It is not the

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Philippe Verdy
2013/6/19 Michael Everson ever...@evertype.com On 19 Jun 2013, at 09:59, Jörg Knappen jknap...@web.de wrote: Somehow, the compromise solution found at the ad hoc meeting sounds fishy, because there is no such thing as LATIN CAPITAL LETTER MARSHALLESE L or LATIN SMALL LETTER MARSHALLESE N

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Denis Jacquerye
On Wed, Jun 19, 2013 at 9:12 AM, Michael Everson ever...@evertype.com wrote: On 19 Jun 2013, at 07:54, Denis Jacquerye moy...@gmail.com wrote: [...] How would one rationalize using one diacritic U+0327 with M/m and O/o but not with L/l and N/n in Marshallese? The same way one would

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Szelp, A. Sz.
The COMMA BELOW / CEDILLA problem is typically something that probably cannot be solved in Unicode in a way to satisfy every possible aspect. These problems are an artifact of the historical development of Unicode, and as a standard, stability issues seem to be high priority. Higher priority

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Michael Everson
On 19 Jun 2013, at 13:41, Denis Jacquerye moy...@gmail.com wrote: The same way one would rationalize using precomposed ãẽĩñõũỹ (aeinouy with tilde) but a necessarily de-composed g̃ (g with tilde) in Guaraní. This is wrong: ãẽĩñõũỹ normalize to use U+0303 in NFD, so they canonically
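
The normalization point in the reply above can be checked directly. This is a minimal sketch with Python's standard unicodedata module (the helper name decomposes_to_tilde is made up for illustration): each precomposed letter with tilde decomposes to base letter plus U+0303 in NFD, and g with tilde has no precomposed form, so it stays as two code points even in NFC.

```python
import unicodedata

TILDE = "\u0303"  # COMBINING TILDE

# Precomposed a, e, i, n, o, u, y with tilde, written as escapes.
PRECOMPOSED = "\u00E3\u1EBD\u0129\u00F1\u00F5\u0169\u1EF9"

def decomposes_to_tilde(ch):
    """Hypothetical helper: True if NFD(ch) is a base letter followed by U+0303."""
    nfd = unicodedata.normalize("NFD", ch)
    return len(nfd) == 2 and nfd[1] == TILDE

for ch in PRECOMPOSED:
    print(f"U+{ord(ch):04X}", decomposes_to_tilde(ch))

# g with tilde has no precomposed character, so NFC leaves it decomposed.
G_TILDE = "g" + TILDE
g_stays_decomposed = unicodedata.normalize("NFC", G_TILDE) == G_TILDE
print("g + U+0303 unchanged by NFC:", g_stays_decomposed)
```

In NFD, all eight letters therefore use the same combining character, whether or not a precomposed form exists.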

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Asmus Freytag
On 6/19/2013 6:36 AM, Michael Everson wrote: Only in text which has been decomposed. Not all text gets decomposed. All text may get decomposed without warning. As data is shipped around and processed in various parts of a distributed system, nobody can make any safe assumptions on the

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Asmus Freytag
On 6/19/2013 6:36 AM, Michael Everson wrote: The issue of cedilla can easily be solved at a higher level, font technologies like OpenType can easily display glyphs in Latvian or Livonian and different glyphs for Marshallese. Only in environments which permit language tagging. I'd like

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Richard Wordingham
On Wed, 19 Jun 2013 14:36:16 +0100 Michael Everson ever...@evertype.com wrote: On 19 Jun 2013, at 13:41, Denis Jacquerye moy...@gmail.com wrote: If we don't want additional confusing characters maybe we should have CGJ, ZWJ or ZWNJ + combining cedilla (or any other similar sequence) to

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-19 Thread Richard Wordingham
On Wed, 19 Jun 2013 09:12:30 +0100 Michael Everson ever...@evertype.com wrote: First off, what does a Marshallese keyboard look like anyway? Second, well, maybe, but I am still convinced that this is the best solution. Keyboards aren't that hard to implement. The X11 restriction of one