Combining latin small letters with diacritics

2012-03-05 Thread Denis Jacquerye
Hi, could the following be decomposed instead of being encoded as single characters? COMBINING LATIN SMALL LETTER A WITH DIAERESIS, COMBINING LATIN SMALL LETTER O WITH DIAERESIS, COMBINING LATIN SMALL LETTER U WITH DIAERESIS. The phonetic alphabet of Gilliéron and Rousselot used in the ''Atlas

Zero-width joiner won't join

2012-03-05 Thread Andreas Prilop
I think the zero-width joiner (ZWJ, U+200D) should join regardless of typeface. But Internet Explorer 8 won't join if the ZWJ is taken from a different font than the surrounding text. In MS Windows, the font Mangal contains the zero-width joiner but not Arabic letters. When I specify font-family: Mangal

Re: Combining latin small letters with diacritics

2012-03-05 Thread Philippe Verdy
You can do that if you wish. This is part of the standard. Look at the existing canonical decomposition mappings in the UCD (or just look at them in the charts which display them). Note that this will not make any difference for any conforming Unicode process. For example you can freely
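As a rough illustration of the canonical decomposition mappings referred to here, the following Python sketch uses the standard unicodedata module. The choice of U+00E4 as the example character is an assumption made for illustration; it is not taken from the message.

```python
import unicodedata

# Example character chosen for illustration (not from the thread).
precomposed = "\u00e4"   # LATIN SMALL LETTER A WITH DIAERESIS
decomposed = "a\u0308"   # LATIN SMALL LETTER A + COMBINING DIAERESIS

# Canonical decomposition mapping as recorded in the UCD.
mapping = unicodedata.decomposition(precomposed)

# Normalization converts between the two canonically equivalent forms.
nfd = unicodedata.normalize("NFD", precomposed)
nfc = unicodedata.normalize("NFC", decomposed)
equivalent = (nfd == decomposed) and (nfc == precomposed)

print(mapping, equivalent)
```

Because the two spellings normalize to each other, a process that compares text after normalization should treat them as the same string.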

Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)

2012-03-05 Thread Denis Jacquerye
On Tue, Feb 28, 2012 at 4:00 AM, Philippe Verdy verd...@wanadoo.fr wrote: I am looking for the codes or assignment status of the Cyrillic letter OE/oe (ligatured) as used in Selkup (exactly similar to the Latin pair). This character pair has been part of the registration nr. 223 (in 1998)

Re: Combining latin small letters with diacritics

2012-03-05 Thread Denis Jacquerye
On Mon, Mar 5, 2012 at 7:29 PM, Philippe Verdy verd...@wanadoo.fr wrote: You can do that if you wish. This is part of the standard. Look at the existing canonical decomposition mappings in the UCD (or just look at them in the charts which display them). Note that this will not make any

Re: Combining latin small letters with diacritics

2012-03-05 Thread Philippe Verdy
On 5 March 2012 at 18:33, Denis Jacquerye moy...@gmail.com wrote: [1] pp.19-24 http://www.archive.org/stream/atlaslinguistnot00gilluoft#page/18/mode/2up I note an interesting character in your page: the « open g » used to denote the « g dur français » shown in the middle of the page on the

Re: Combining latin small letters with diacritics

2012-03-05 Thread Philippe Verdy
My question really is whether they could not be seen as combacombcombdiaeresis/comb, etc. Where the shape of combdiaeresis/comb is contextual. Sorry, I did not understand the question. Anyway I don't see the exact problem you may find in this case. There are other stacked diacritics in this

Re: Combining latin small letters with diacritics

2012-03-05 Thread Michael Everson
On 5 Mar 2012, at 18:48, Denis Jacquerye wrote: My question really is whether they could not be seen as combacombcombdiaeresis/comb, etc. Where the shape of combdiaeresis/comb is contextual. No, because both the combining-a and the combining-diaeresis are bound to the base letter; the

Re: Combining latin small letters with diacritics

2012-03-05 Thread Philippe Verdy
So what do you propose? - Encoding the new precomposed pairs as a new combining character (there may be a lot of candidate pairs to encode, especially in the Latin script), - or encoding a variation of the existing diacritic to mean that they are bound to a first-level of diacritic (here a

Re: Combining latin small letters with diacritics

2012-03-05 Thread Philippe Verdy
Note that the first alternative is the one used in the DAM for encoding a separate COMBINING LATIN SMALL LETTER A/O/U WITH DIAERESIS. But the document cited by Denis gives a much more productive way that allows stacking any kind of letters with their diacritics. There won't be enough space in the

Re: Zero-width joiner won't join

2012-03-05 Thread Philippe Verdy
I desperately need a browser that allows displaying the list of characters at least using numeric character references. For now it is always unclear what is encoded in such an HTML attachment, and we need to use an external tool like od to see this information. It would help if you created such
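One possible way to get the kind of listing asked for here, without resorting to od, is to rewrite each character as a hexadecimal numeric character reference. This is only a sketch: the function name to_ncr and the sample string are hypothetical and do not come from the thread or its attachment.

```python
def to_ncr(text: str) -> str:
    """Return text with every character written as a hex numeric character reference."""
    return "".join(f"&#x{ord(ch):04X};" for ch in text)


# Hypothetical sample: ARABIC LETTER LAM followed by ZERO WIDTH JOINER.
sample = "\u0644\u200d"
dumped = to_ncr(sample)
print(dumped)
```

Invisible characters such as U+200D then show up explicitly in the output instead of vanishing in the rendered page.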

Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)

2012-03-05 Thread Benjamin M Scarborough
On Mon, Mar 5, 2012 at 19:35, Denis Jacquerye wrote: According to ftp://std.dkuug.dk/jtc1/sc2/WG2/docs/n2463.doc the Cyrillic Selkup OE is mapped to Latin OE: CYRILLIC SMALL LETTER SELKUP O E to U+0153 LATIN SMALL LIGATURE OE CYRILLIC CAPITAL LETTER SELKUP O E to U+0152 LATIN CAPITAL LIGATURE

Re: Combining latin small letters with diacritics

2012-03-05 Thread Benjamin M Scarborough
On Mon, Mar 5, 2012 at 19:09, Michael Everson wrote: No, because both the combining-a and the combining-diaeresis are bound to the base letter; the combining diaeresis is not bound to the combining-a. Just like the proposed U+1ABB COMBINING PARENTHESIS ABOVE will be bound to the base letter,

Re: Combining latin small letters with diacritics

2012-03-05 Thread Ken Whistler
On 3/5/2012 11:44 AM, Philippe Verdy wrote: So what do you propose? It doesn't matter what *Michael* proposes at this point. These have already been approved by both the UTC and WG2 and are currently in DAM ballot. - Encoding the new precomposed pairs as a new combining character (there may

Re: Combining latin small letters with diacritics

2012-03-05 Thread Philippe Verdy
On 5 March 2012 at 21:17, Benjamin M Scarborough benjamin.scarboro...@utdallas.edu wrote: On Mon, Mar 5, 2012 at 19:09, Michael Everson wrote: No, because both the combining-a and the combining-diaeresis are bound to the base letter; the combining diaeresis is not bound to the combining-a. Just

Re: Combining latin small letters with diacritics

2012-03-05 Thread Benjamin M Scarborough
On Mon, Mar 5, 2012 at 20:56, Philippe Verdy wrote: But the document cited by Denis gives a much more productive way that allows stacking any kind of letters with its diacritics. There won't be enough space in the BMP for such Latin supplements. Then put them in the SMP. Or is SMP still a

Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)

2012-03-05 Thread Philippe Verdy
On 5 March 2012 at 19:35, Denis Jacquerye moy...@gmail.com wrote: On Tue, Feb 28, 2012 at 4:00 AM, Philippe Verdy verd...@wanadoo.fr wrote: I am looking for the codes or assignment status of the Cyrillic letter OE/oe (ligatured) as used in Selkup (exactly similar to the Latin pair). This

Re: Combining latin small letters with diacritics

2012-03-05 Thread Ken Whistler
On 3/5/2012 11:56 AM, Philippe Verdy wrote: Note that the first alternative is the one used in the DAM for encoding a separate COMBINING LATIN SMALL LETTER A/O/U WITH DIAERESIS Correct. But the document cited by Denis gives a much more productive way that allows stacking any kind of letters

Re: Combining latin small letters with diacritics

2012-03-05 Thread Ken Whistler
On 3/5/2012 12:17 PM, Benjamin M Scarborough wrote: On Mon, Mar 5, 2012 at 19:09, Michael Everson wrote: No, because both the combining-a and the combining-diaeresis are bound to the base letter; the combining diaeresis is not bound to the combining-a. Just like the proposed U+1ABB COMBINING

Re: Combining latin small letters with diacritics

2012-03-05 Thread Philippe Verdy
You are so attached to keeping the existing encoding model unchanged that now you are going to prepare for LOTS of additions of combining Latin characters with diacritics... The BMP won't be enough, the SMP will fill up too, and there will be enormous problems for font creators (or

Re: Combining latin small letters with diacritics

2012-03-05 Thread Ken Whistler
On 3/5/2012 12:51 PM, Philippe Verdy wrote: You are so much attached to keep the existing encoding model unchanged, Yep. That's why I work on *standards*, after all. that now you are going to prepare for LOTS of additions of combining Latin characters with diacritics... The BMP won't be

Re: Combining latin small letters with diacritics

2012-03-05 Thread Philippe Verdy
On 5 March 2012 at 21:32, Ken Whistler k...@sybase.com wrote: On 3/5/2012 11:56 AM, Philippe Verdy wrote: But the document cited by Denis gives a much more productive way that allows stacking any kind of letters with its diacritics. There won't be enough space in the BMP for such Latin

Re: Combining latin small letters with diacritics

2012-03-05 Thread Michael Everson
On 5 Mar 2012, at 21:01, Ken Whistler wrote: In the meantime, if the French dialectologists wish to come to the table, as the German dialectologists did, the committees can examine the data and everybody can work out together the best means of representing it in Unicode. Indeed so. Michael

Re: Combining latin small letters with diacritics

2012-03-05 Thread Philippe Verdy
On 5 March 2012 at 21:50, Ken Whistler k...@sybase.com wrote: On 3/5/2012 12:17 PM, Benjamin M Scarborough wrote: On Mon, Mar 5, 2012 at 19:09, Michael Everson wrote: No, because both the combining-a and the combining-diaeresis are bound to the base letter; the combining diaeresis is not

Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)

2012-03-05 Thread Michael Everson
On 5 Mar 2012, at 20:13, Benjamin M Scarborough wrote: There is a clear precedent here that the unifications of N2463 are not necessarily the final fate of any of these characters. If the О Е letter for Selkup should be disunified from U+0152/U+0153, then a proposal needs to be submitted

Re: Combining latin small letters with diacritics

2012-03-05 Thread Denis Jacquerye
On Mon, Mar 5, 2012 at 9:17 PM, Ken Whistler k...@sybase.com wrote: By the way, Philippe, this horse is already long out of the barn. See U+1DD7 COMBINING LATIN SMALL LETTER C WITH CEDILLA, which is already a published part of the standard. Focusing just on the three new characters with

Re: Combining latin small letters with diacritics

2012-03-05 Thread Ken Whistler
On 3/5/2012 2:01 PM, Denis Jacquerye wrote: Wouldn't CGJ be useful in some way in cases like that of the cedilla or the light centralization stroke 1AB9? Base character + combining letter + CGJ + combining cedilla would be clear, the cedilla would not be moved. How is that simpler than Base
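The reordering behaviour behind the quoted CGJ suggestion can be checked with a small Python sketch. The base letter "u" and the use of U+0363 COMBINING LATIN SMALL LETTER A as the combining letter are assumptions chosen for illustration; the thread does not name a specific sequence.

```python
import unicodedata

BASE = "u"            # arbitrary base letter (assumption)
COMB_A = "\u0363"     # COMBINING LATIN SMALL LETTER A, combining class 230
CGJ = "\u034f"        # COMBINING GRAPHEME JOINER, combining class 0
CEDILLA = "\u0327"    # COMBINING CEDILLA, combining class 202

without_cgj = BASE + COMB_A + CEDILLA
with_cgj = BASE + COMB_A + CGJ + CEDILLA

# Canonical reordering sorts adjacent marks by combining class,
# so the cedilla (202) moves in front of the combining letter (230).
nfd_without = unicodedata.normalize("NFD", without_cgj)

# CGJ has combining class 0 and therefore blocks the reordering.
nfd_with = unicodedata.normalize("NFD", with_cgj)

print([f"U+{ord(c):04X}" for c in nfd_without])
print([f"U+{ord(c):04X}" for c in nfd_with])
```

This only shows that the cedilla stays in place after normalization when CGJ is present; it says nothing about how a font or rendering engine would position the marks.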

Re: Combining latin small letters with diacritics

2012-03-05 Thread Denis Jacquerye
On Mon, Mar 5, 2012 at 7:49 PM, Philippe Verdy verd...@wanadoo.fr wrote: On 5 March 2012 at 18:33, Denis Jacquerye moy...@gmail.com wrote: [1] pp.19-24 http://www.archive.org/stream/atlaslinguistnot00gilluoft#page/18/mode/2up I note an interesting character in your page: the « open g » used

Re: Combining latin small letters with diacritics

2012-03-05 Thread Denis Jacquerye
On Mon, Mar 5, 2012 at 11:19 PM, Ken Whistler k...@sybase.com wrote: On 3/5/2012 2:01 PM, Denis Jacquerye wrote: Wouldn't CGJ be useful in some way in cases like that of the cedilla or the light centralization stroke 1AB9? Base character + combining letter + CGJ + combining cedilla would be

Re: Combining latin small letters with diacritics

2012-03-05 Thread Ken Whistler
On 3/5/2012 2:32 PM, Denis Jacquerye wrote: I guess it's less messy than other situations. I just couldn't help wondering why combining letters with diacritics are being encoded but letters with diacritics are out of the question. Because the combining ones are *not* decomposed, and hence don't

Re: Zero-width joiner won't join

2012-03-05 Thread Asmus Freytag
On 3/5/2012 10:25 AM, Andreas Prilop wrote: I think the zero-width joiner (ZWJ, U+200D) should join regardless of typeface. But Internet Explorer 8 won't join if the ZWJ is taken from another font than surrounding text. Normally, there's a bit of a rationale for limiting the action of the ZWJ