Hi,
Could the following be decomposed instead of being encoded as single characters?
COMBINING LATIN SMALL LETTER A WITH DIAERESIS
COMBINING LATIN SMALL LETTER O WITH DIAERESIS
COMBINING LATIN SMALL LETTER U WITH DIAERESIS
The phonetic alphabet of Gilliéron and Rousselot used in the ''Atlas
I think the zero-width joiner (ZWJ, U+200D) should join
regardless of typeface. But Internet Explorer 8 won't join
if the ZWJ is taken from another font than surrounding text.
In MS Windows, the font Mangal contains the zero-width joiner
but not Arabic letters. When I specify font-family: Mangal
You can do that if you wish. This is part of the standard. Look at the
existing canonical decomposition mappings in the UCD (or just look at
them in the charts which display them). Note that this will not make
any difference to any conforming Unicode process.
For example you can freely
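
The canonical decomposition mappings mentioned above can be inspected with Python's standard `unicodedata` module. This is only a minimal sketch: it uses U+00E4 (a character that is known to have a canonical decomposition) as a stand-in, not the combining letters under discussion, and the helper name `canonically_equivalent` is an assumption made for illustration.

```python
import unicodedata

precomposed = "\u00e4"   # LATIN SMALL LETTER A WITH DIAERESIS
decomposed = "a\u0308"   # LATIN SMALL LETTER A + COMBINING DIAERESIS

# Canonical decomposition mapping recorded in the UCD for U+00E4,
# returned as a string of hexadecimal code points.
mapping = unicodedata.decomposition(precomposed)


def canonically_equivalent(s: str, t: str) -> bool:
    """Hypothetical helper: compare two strings after NFD normalization."""
    return unicodedata.normalize("NFD", s) == unicodedata.normalize("NFD", t)


# A conforming process is expected to treat both spellings alike.
same = canonically_equivalent(precomposed, decomposed)
print(mapping, same)
```

Comparing after normalization, rather than comparing raw code points, is what lets the two spellings be treated as the same text.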
On Tue, Feb 28, 2012 at 4:00 AM, Philippe Verdy verd...@wanadoo.fr wrote:
I am looking for the codes or assignment status of the Cyrillic
letter OE/oe (ligatured) as used in Selkup (exactly similar to the
Latin pair).
This character pair has been part of the registration nr. 223 (in
1998)
On Mon, Mar 5, 2012 at 7:29 PM, Philippe Verdy verd...@wanadoo.fr wrote:
You can do that if you wish. This is part of the standard. Look at the
existing canonical decomposition mappings in the UCD (or just look at
them in the charts which display them). Note that this will not make
any
On 5 March 2012 at 18:33, Denis Jacquerye moy...@gmail.com wrote:
[1] pp.19-24
http://www.archive.org/stream/atlaslinguistnot00gilluoft#page/18/mode/2up
I note an interesting character in your page : the « open g » used to
denote the « g dur français » shown in the middle of the page on the
My question really is whether they could not be seen as
<comb>a</comb><comb>diaeresis</comb>, etc. Where the shape of
<comb>diaeresis</comb> is contextual.
Sorry, I did not understand the question.
Anyway I don't see the exact problem you may find in this case. There
are other stacked diacritics in this
On 5 Mar 2012, at 18:48, Denis Jacquerye wrote:
My question really is whether they could not be seen as
<comb>a</comb><comb>diaeresis</comb>, etc. Where the shape of
<comb>diaeresis</comb> is contextual.
No, because both the combining-a and the combining-diaeresis are bound to the
base letter; the
So what do you propose ?
- Encoding the new precomposed pairs as a new combining character
(there may be a lot of candidate pairs to encode, especially in the
Latin script),
- or encoding a variation of the existing diacritic to mean that they
are bound to a first-level of diacritic (here a
Note that the first alternative is the one used in the DAM for
encoding a separate COMBINING LATIN SMALL LETTER A/O/U WITH DIAERESIS
But the document cited by Denis gives a much more productive way that
allows stacking any kind of letters with its diacritics. There won't
be enough space in the
I desperately need a browser that allows displaying the list of
characters at least using numeric character references. For now it is
always unclear what is encoded in such HTML attachment, and we need to
use an external tool like od to see this information.
It would help if you created such
On Mon, Mar 5, 2012 at 19:35, Denis Jacquerye wrote:
According to ftp://std.dkuug.dk/jtc1/sc2/WG2/docs/n2463.doc the
Cyrillic Selkup OE is mapped to Latin OE:
CYRILLIC SMALL LETTER SELKUP O E to U+0153 LATIN SMALL LIGATURE OE
CYRILLIC CAPITAL LETTER SELKUP O E to U+0152 LATIN CAPITAL LIGATURE
On Mon, Mar 5, 2012 at 19:09, Michael Everson wrote:
No, because both the combining-a and the combining-diaeresis are bound to the
base letter; the combining diaeresis is not bound to the combining-a.
Just like the proposed U+1ABB COMBINING PARENTHESIS ABOVE will be bound to the
base letter,
On 3/5/2012 11:44 AM, Philippe Verdy wrote:
So what do you propose ?
It doesn't matter what *Michael* proposes at this point. These have already
been approved by both the UTC and WG2 and are currently in DAM ballot.
- Encoding the new precomposed pairs as a new combining character
(there may
On 5 March 2012 at 21:17, Benjamin M Scarborough
benjamin.scarboro...@utdallas.edu wrote:
On Mon, Mar 5, 2012 at 19:09, Michael Everson wrote:
No, because both the combining-a and the combining-diaeresis are bound to the
base letter; the combining diaeresis is not bound to the combining-a.
Just
On Mon, Mar 5, 2012 at 20:56, Philippe Verdy wrote:
But the document cited by Denis gives a much more productive way that
allows stacking any kind of letters with its diacritics. There won't
be enough space in the BMP for such Latin supplements.
Then put them in the SMP. Or is SMP still a
On 5 March 2012 at 19:35, Denis Jacquerye moy...@gmail.com wrote:
On Tue, Feb 28, 2012 at 4:00 AM, Philippe Verdy verd...@wanadoo.fr wrote:
I am looking for the codes or assignment status of the Cyrillic
letter OE/oe (ligatured) as used in Selkup (exactly similar to the
Latin pair).
This
On 3/5/2012 11:56 AM, Philippe Verdy wrote:
Note that the first alternative is the one used in the DAM for
encoding a separate COMBINING LATIN SMALL LETTER A/O/U WITH DIAERESIS
Correct.
But the document cited by Denis gives a much more productive way that
allows stacking any kind of letters
On 3/5/2012 12:17 PM, Benjamin M Scarborough wrote:
On Mon, Mar 5, 2012 at 19:09, Michael Everson wrote:
No, because both the combining-a and the combining-diaeresis are bound to the
base letter; the combining diaeresis is not bound to the combining-a.
Just like the proposed U+1ABB COMBINING
You are so attached to keeping the existing encoding model
unchanged, that now you are going to prepare for LOTS of additions of
combining Latin characters with diacritics... The BMP won't be enough,
the SMP will fill up too, and there will be enormous problems for font
creators (or
On 3/5/2012 12:51 PM, Philippe Verdy wrote:
You are so attached to keeping the existing encoding model
unchanged,
Yep. That's why I work on *standards*, after all.
that now you are going to prepare for LOTS of additions of
combining Latin characters with diacritics... The BMP won't be
On 5 March 2012 at 21:32, Ken Whistler k...@sybase.com wrote:
On 3/5/2012 11:56 AM, Philippe Verdy wrote:
But the document cited by Denis gives a much more productive way that
allows stacking any kind of letters with its diacritics. There won't
be enough space in the BMP for such Latin
On 5 Mar 2012, at 21:01, Ken Whistler wrote:
In the meantime, if the French dialectologists wish to come to the table, as
the German dialectologists did, the committees can examine the data and
everybody can work out together the best means of representing it in Unicode.
Indeed so.
Michael
On 5 March 2012 at 21:50, Ken Whistler k...@sybase.com wrote:
On 3/5/2012 12:17 PM, Benjamin M Scarborough wrote:
On Mon, Mar 5, 2012 at 19:09, Michael Everson wrote:
No, because both the combining-a and the combining-diaeresis are bound to
the base letter; the combining diaeresis is not
On 5 Mar 2012, at 20:13, Benjamin M Scarborough wrote:
There is a clear precedent here that the unifications of N2463 are not
necessarily the final fate of any of these characters. If the О Е letter for
Selkup should be disunified from U+0152/U+0153, then a proposal needs to be
submitted
On Mon, Mar 5, 2012 at 9:17 PM, Ken Whistler k...@sybase.com wrote:
By the way, Philippe, this horse is already long out of the barn. See U+1DD7
COMBINING LATIN SMALL LETTER C WITH CEDILLA, which is already a
published part of the standard.
Focusing just on the three new characters with
On 3/5/2012 2:01 PM, Denis Jacquerye wrote:
Wouldn't CGJ be useful in some way in cases like that of the cedilla
or the light centralization stroke 1AB9 ?
Base character + combining letter + CGJ + combining cedilla would be
clear, the cedilla would not be moved.
How is that simpler than Base
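
The reordering that the CGJ suggestion above is meant to prevent can be sketched with Python's standard `unicodedata` module. This is an illustration under assumptions, not the proposal discussed in the thread: the base letter `e` and U+0363 COMBINING LATIN SMALL LETTER A are chosen here only as convenient examples.

```python
import unicodedata

BASE = "e"
COMB_A = "\u0363"    # COMBINING LATIN SMALL LETTER A (combining class 230)
CEDILLA = "\u0327"   # COMBINING CEDILLA (combining class 202)
CGJ = "\u034f"       # COMBINING GRAPHEME JOINER (combining class 0)

# Canonical ordering sorts adjacent marks by combining class,
# so the cedilla (202) is moved in front of the combining a (230).
without_cgj = unicodedata.normalize("NFD", BASE + COMB_A + CEDILLA)

# CGJ has combining class 0, so marks are not reordered across it.
with_cgj = unicodedata.normalize("NFD", BASE + COMB_A + CGJ + CEDILLA)

print([f"U+{ord(c):04X}" for c in without_cgj])
print([f"U+{ord(c):04X}" for c in with_cgj])
```

Whether keeping the original order is worth the extra character is exactly the point being debated in the replies.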
On Mon, Mar 5, 2012 at 7:49 PM, Philippe Verdy verd...@wanadoo.fr wrote:
On 5 March 2012 at 18:33, Denis Jacquerye moy...@gmail.com wrote:
[1] pp.19-24
http://www.archive.org/stream/atlaslinguistnot00gilluoft#page/18/mode/2up
I note an interesting character in your page : the « open g » used
On Mon, Mar 5, 2012 at 11:19 PM, Ken Whistler k...@sybase.com wrote:
On 3/5/2012 2:01 PM, Denis Jacquerye wrote:
Wouldn't CGJ be useful in some way in cases like that of the cedilla
or the light centralization stroke 1AB9 ?
Base character + combining letter + CGJ + combining cedilla would be
On 3/5/2012 2:32 PM, Denis Jacquerye wrote:
I guess it's less messy than other situations. I just couldn't help
wondering why combining letters with diacritics are being encoded but
letters with diacritics are out of the question.
Because the combining ones are *not* decomposed, and hence don't
On 3/5/2012 10:25 AM, Andreas Prilop wrote:
I think the zero-width joiner (ZWJ, U+200D) should join
regardless of typeface. But Internet Explorer 8 won't join
if the ZWJ is taken from another font than surrounding text.
Normally, there's a bit of a rationale for limiting the action of the
ZWJ
31 matches