On 4/23/2014 7:37 PM, Philippe Verdy wrote:
Thanks for the clear reply, now I know that my example in a prior
message would work appropriately with UBA:
This is an [«] ARABIC EXAMPLE [»] for demonstration only.
Because:
- the opening guillemet is not stripped out of the context stack when
On 23 Apr 2014, at 22:16, Mathias Bynens math...@qiwi.be wrote:
Let’s say I’m writing a program that strips combining characters and grapheme
extenders from an input string.
For combining marks, I’m looking for any non-combining marks (e.g. `a`)
followed by one or more combining marks
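
A minimal sketch of the stripping step described above, in Python. It assumes that "combining mark" means any character whose General_Category is Mn, Mc or Me, and it uses only the standard `unicodedata` module. The function name `strip_combining_marks` is hypothetical and does not come from the thread, and grapheme extenders that are not marks are not handled.

```python
import unicodedata


def strip_combining_marks(text: str) -> str:
    """Drop every character whose General_Category starts with 'M'.

    Assumption: Mn, Mc and Me together stand in for "combining marks".
    Precomposed letters are left alone unless the caller normalizes
    the input to NFD first.
    """
    return "".join(
        ch for ch in text
        if not unicodedata.category(ch).startswith("M")
    )
```

For example, `strip_combining_marks("a\u0303")` keeps the base letter `a` and drops U+0303 COMBINING TILDE.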
Date: Thu, 24 Apr 2014 00:28:50 -0700
From: Asmus Freytag asm...@ix.netcom.com
CC: k...@unicode.org, Eli Zaretskii e...@gnu.org,
James Clark j...@jclark.com,
unicode Unicode Discussion unicode@unicode.org
On 4/23/2014 7:37 PM, Philippe Verdy wrote:
Thanks for the clear reply, now I
2014-04-24 16:39 GMT+02:00 Eli Zaretskii e...@gnu.org:
In addition, assuming that by guillemets Philippe means U+00AB and
U+00BB,
guillemet is THE correct name, even in English. guillemot comes from an
old typo error. If you don't want this term in English you can still use
double angle
From: Philippe Verdy verd...@wanadoo.fr
Date: Thu, 24 Apr 2014 17:11:23 +0200
Cc: Asmus Freytag asm...@ix.netcom.com, Ilya Zakharevich
nospam-ab...@ilyaz.org, k...@unicode.org,
James Clark j...@jclark.com, unicode Unicode Discussion
unicode@unicode.org
In addition, assuming that
On 4/24/2014 8:20 AM, Eli Zaretskii wrote:
So nothing (at least not the reason of the GC which is just an intermediate
but incomplete helper) forbids the guillemets to be listed in
BidiBrackets.txt.
They don't satisfy the conditions for that. From BidiBrackets.txt:
Philippe is incorrect once
On 4/24/2014 7:39 AM, Eli Zaretskii wrote:
This is _*incorrect*_, see the text in blue/bold in the definition
copied below.
The second bullet in item 3 of the second second-level bullet of the
third top-level bullet of BD16 clearly says that all elements that are
above the matched element are
2014-04-24 17:20 GMT+02:00 Eli Zaretskii e...@gnu.org:
From: Philippe Verdy verd...@wanadoo.fr
Date: Thu, 24 Apr 2014 17:11:23 +0200
Cc: Asmus Freytag asm...@ix.netcom.com, Ilya Zakharevich
nospam-ab...@ilyaz.org, k...@unicode.org,
James Clark j...@jclark.com, unicode Unicode
Mathias Bynens mathias at qiwi dot be wrote:
Let's say I'm writing a program that strips combining characters and
grapheme extenders from an input string.
For combining marks, I'm looking for any non-combining marks (e.g.
'a') followed by one or more combining marks (e.g. ' ̃'), and then I
Re: Unclear text in the UBA (UAX#9) of Unicode 6.3
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
[...] And at least your original message
used and transliterations, not the actual characters.
No I used the «» characters exactly like here.
I absolutely never use the ASCII
Markus Scherer markus@gmail.com wrote:
|I strongly recommend you parse the derived properties rather than trying to
|follow the derivation formula, because that can change over time.
..this file includes only those core properties that have
themselves a derivation-may-change property?
(I
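
The advice quoted above, to read the derived property values instead of re-deriving them, could look roughly like this in Python. The sketch assumes the usual UCD data-file line layout (`code point or range ; property # comment`). The function name `parse_property` and the sample lines are illustrative only and are not copied from any particular version of DerivedCoreProperties.txt.

```python
def parse_property(lines, wanted):
    """Collect the code points listed for one property name."""
    code_points = set()
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue  # blank line or comment-only line
        fields = [field.strip() for field in data.split(";")]
        if len(fields) < 2 or fields[1] != wanted:
            continue
        first, _, last = fields[0].partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start  # single code point or range
        code_points.update(range(start, end + 1))
    return code_points


# Hypothetical sample in the UCD layout, not real file contents.
SAMPLE = [
    "# illustrative lines only",
    "0300..036F    ; Grapheme_Extend # Mn [112] ...",
    "200C..200D    ; Grapheme_Extend # Cf   [2] ...",
    "0041..005A    ; Alphabetic # L& [26] ...",
]
extend = parse_property(SAMPLE, "Grapheme_Extend")
```

Reading the listed ranges this way means a later change to the derivation formula only requires a newer data file, not new code.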
On Thu, Apr 24, 2014 at 12:56 PM, Steffen Nurpmeso sdao...@yandex.com wrote:
Markus Scherer markus@gmail.com wrote:
|I strongly recommend you parse the derived properties rather than trying
to
|follow the derivation formula, because that can change over time.
..this file includes only
On 24 Apr 2014, at 21:38, Whistler, Ken ken.whist...@sap.com wrote:
Grapheme_Extend characters per se do not apply to anything.
They are a mixture of different General_Category types -- mostly combining
marks, but not all. The concept of applying to a base only refers to
combining marks
Given the incredible level of interest shown on this list during
the last week, I am glad that I can finally announce the publication
of Bidi Brackets for Dummies:
http://www.unicode.org/notes/tr39/
I had wanted to publish that several weeks ago, but unfortunately,
publication was held up for
tn not tr
http://www.unicode.org/notes/tn39/
markus
___
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode
On Thu, 24 Apr 2014 19:38:54 +
Whistler, Ken ken.whist...@sap.com wrote:
Yes. Grapheme_Extend characters per se do not apply to anything.
They are a mixture of different General_Category types -- mostly
combining marks, but not all. The concept of applying to a base only
refers to
On Apr 24, 2014, at 2:16 PM, Whistler, Ken wrote:
Given the incredible level of interest shown on this list during
the last week, I am glad that I can finally announce the publication
of Bidi Brackets for Dummies:
http://www.unicode.org/notes/tn39/
Dear Dr. Ken,
Thanks ever so much for
On Thu, 24 Apr 2014 23:07:58 +0200
Mathias Bynens math...@qiwi.be wrote:
I realize reversing a string has nothing to do with text segmentation
– but ignoring grapheme extenders leads to unexpected results (since
after reversing the code points, the grapheme extender might extend
the wrong
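
The reversal problem described above can be sketched in Python as follows. This is an approximation under a stated assumption: a character is attached to the preceding one when its General_Category starts with `M`, because the standard `unicodedata` module does not expose Grapheme_Extend. The function name `reverse_keeping_marks` is hypothetical.

```python
import unicodedata


def reverse_keeping_marks(text: str) -> str:
    """Reverse text while keeping each mark after its own base character.

    Assumption: General_Category M* approximates "extends the previous
    character"; non-mark grapheme extenders are not covered.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch  # attach the mark to the current cluster
        else:
            clusters.append(ch)  # start a new cluster
    return "".join(reversed(clusters))
```

A plain code point reversal of `"man\u0303ana"` would move the tilde onto the following `a`; reversing cluster by cluster keeps it on the `n`.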
On this side show, Philippe finally is correct, because I received his
message without ASCII-i-fication; he cc'd me directly, and I never saw
the mangled text. It's a bit embarrassing for a Unicode mail list to not
even be able to let guillemets through unmolested.
But this shall not distract