On Mon, May 18, 2015 at 11:19 AM, Doug Ewell d...@ewellic.org wrote:
Is the new mechanism intended to allow flag tags that include either
subtype values or contains values?
As far as I can tell from your quotes, CLDR will say what's valid (plus
containment info), and Unicode permits you to
Philippe and I have got bogged down in a long discussion of how to
parse traces of Unicode strings under canonical equivalence against
non-regular Kleene star of regular expressions. Fortunately, such
expressions can be expected to have very little use. A seemingly simple
example is the regex
The hyphen is not redundant in ISO 3166, which defines primary codes with
variable length (even if ISO 3166 part 1 for now only uses two-letter codes).
Sometime in the future, two letters will not be enough even in ISO 3166-1, if
countries continue to split/merge (this does not happen frequently but is
L2/15-145R says:
In CLDR 28, LDML will define a unicode_subdivision_subtag which also
provides validity criteria for the codes used for regional
subdivisions (see CLDR ticket #8423). When representing regional
subdivisions using ISO 3166-2 codes, only those codes that are valid
for the LDML
On 18 May 2015 at 19:19, Doug Ewell d...@ewellic.org wrote:
Is the new mechanism intended to allow flag tags that include either
subtype values or contains values? For example:
That is my understanding.
1F3F3 E0047 E0042 E002D E0053 E0043 E0054 (GB-SCT)
for the Scottish flag
and
1F3F3
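The sequences quoted in this thread appear to follow one pattern: the base character U+1F3F3, then one tag character per ASCII character of the code, each at offset U+E0000 from its ASCII value. A minimal Python sketch under that assumption (the helper name `flag_tag_sequence` is hypothetical, and the mechanism finally specified may differ from what is quoted here):

```python
from typing import List

TAG_OFFSET = 0xE0000   # assumed offset: tag character = U+E0000 + ASCII value
FLAG_BASE = 0x1F3F3    # base character used in the sequences quoted in this thread

def flag_tag_sequence(code: str) -> List[str]:
    """Return hex code points for a flag tag sequence (hypothetical helper)."""
    if not code.isascii():
        raise ValueError("code must be ASCII, e.g. 'GB-SCT'")
    points = [FLAG_BASE] + [TAG_OFFSET + ord(ch) for ch in code.upper()]
    return [f"{cp:04X}" for cp in points]

print(" ".join(flag_tag_sequence("GB-SCT")))
# prints: 1F3F3 E0047 E0042 E002D E0053 E0043 E0054
```

This only builds the code point sequence; whether a given code is valid is a separate question that, per the discussion, CLDR data would answer.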
2015-05-18 20:35 GMT+02:00 Richard Wordingham
richard.wording...@ntlworld.com:
The algorithm itself should be tractable - Mark Davis has published
an algorithm to generate all strings canonically equivalent to a
Unicode string, and what we need might not be so complex.
Even this algorithm
On Mon, 18 May 2015 19:37:06 +0100
Andrew West andrewcw...@gmail.com wrote:
1F3F3 E0047 E0042 E002D E004E E004C E004B (GB-NLK)
for the North Lanarkshire council area flag
I don't believe that North Lanarkshire has an associated flag, which I
think is the case for most UK counties and
Date: Mon, 18 May 2015 19:35:45 +0100
From: Richard Wordingham richard.wording...@ntlworld.com
Mark Davis has published an algorithm to generate all strings
canonically equivalent to a Unicode string
Where can I find the description of that algorithm?
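For readers who want a feel for the problem, here is a naive brute-force sketch in Python. It is not the algorithm attributed to Mark Davis above. It only enumerates reorderings of the fully decomposed form that normalize back to the same NFD string, so it misses equivalents that use precomposed characters, and it runs in factorial time. The function name `equivalent_reorderings` is hypothetical.

```python
import itertools
import unicodedata

def equivalent_reorderings(s: str) -> list:
    """Naive sketch: permutations of NFD(s) that are canonically equivalent to s."""
    target = unicodedata.normalize("NFD", s)
    found = set()
    for perm in itertools.permutations(target):
        candidate = "".join(perm)
        # Two strings are canonically equivalent iff their NFD forms are equal.
        if unicodedata.normalize("NFD", candidate) == target:
            found.add(candidate)
    return sorted(found)

# 'a' + COMBINING ACUTE ACCENT (ccc 230) + COMBINING DOT BELOW (ccc 220)
for variant in equivalent_reorderings("a\u0301\u0323"):
    print(" ".join(f"{ord(ch):04X}" for ch in variant))
```

The two marks have different combining classes, so either order after the base letter is canonically equivalent, while moving a mark in front of the base letter is not.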
Markus Scherer markus dot icu at gmail dot com wrote:
As far as I can tell from your quotes, CLDR will say what's valid
(plus containment info), and Unicode permits you to show a flag for
any valid tag. North Lanarkshire seems perfectly fine.
I'm under the impression that this will be a
On Mon, 18 May 2015 21:05:49 +0200
Philippe Verdy verd...@wanadoo.fr wrote:
2015-05-18 20:35 GMT+02:00 Richard Wordingham
richard.wording...@ntlworld.com:
The algorithm itself should be tractable - Mark Davis has published
an algorithm to generate all strings canonically equivalent to a
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
If ever the country codes used in BCP47 become full (all pairs of
letters used), just some time before this happens, we could see new
prefixes added before a new range of codes. It is possible to use a
1-letter prefix for new
2015-05-18 23:38 GMT+02:00 Doug Ewell d...@ewellic.org:
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
So country codes cannot be reassigned (and we can expect many more
merges/splits or changes of regimes in the many troubled areas of the
world).
Changes of regimes don't
2015-05-18 23:55 GMT+02:00 Doug Ewell d...@ewellic.org:
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
If ever the country codes used in BCP47 become full (all pairs of
letters used), just some time before this happens, we could see new
prefixes added before a new range of
2015-05-18 22:14 GMT+02:00 Doug Ewell d...@ewellic.org:
I know I'll regret this...
You should not
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
Sometime in the future, two letters will not be enough even in ISO
3166-1, if countries continue to split/merge (this does not
On Mon, 18 May 2015 22:40:21 +0300
Eli Zaretskii e...@gnu.org wrote:
Date: Mon, 18 May 2015 19:35:45 +0100
From: Richard Wordingham richard.wording...@ntlworld.com
Mark Davis has published an algorithm to generate all strings
canonically equivalent to a Unicode string
Where can I
Isn't it possible for your basic substitution to transform \u0F73 into a
character class [\u0F71\u0F72\u0F73] that the regexp considers as a single
entity to check?
In that case, backtracking for matching \u0F73*\u0F72 is simpler:
[\u0F71\u0F72\u0F73]*\u0F72, as it just requires backtracking
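A short Python sketch may help show why a starred U+0F73 is awkward under canonical equivalence. U+0F73 decomposes canonically to U+0F71 U+0F72, and canonical reordering then sorts the marks by combining class, so n copies of U+0F73 normalize to n copies of U+0F71 followed by n copies of U+0F72. The helper name `decomposed_run` is hypothetical; the only library call is the standard `unicodedata.normalize`.

```python
import unicodedata

def decomposed_run(n: int) -> str:
    """NFD form of n copies of U+0F73, as space-separated hex code points."""
    nfd = unicodedata.normalize("NFD", "\u0F73" * n)
    return " ".join(f"{ord(ch):04X}" for ch in nfd)

for n in range(1, 4):
    print(n, decomposed_run(n))
```

Strings of the shape "n of one character, then exactly n of another" do not form a regular language, which is one way to see why a plain character-class substitution does not capture every canonically equivalent spelling.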
I know I'll regret this...
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
Sometime in the future, two letters will not be enough even in ISO
3166-1, if countries continue to split/merge (this does not happen
frequently but it occurs every few years; and it will not be possible
to
If ever the country codes used in BCP47 become full (all pairs of letters
used), just some time before this happens, we could see new prefixes added
before a new range of codes. It is possible to use a 1-letter prefix for new
country/territory code extensions, but with some maintenance of BCP47
On Tue, 19 May 2015 01:25:54 +0200
Philippe Verdy verd...@wanadoo.fr wrote:
I don't work with strings, but with what you seem to call traces,
For the concept of traces, Wikipedia suffices:
https://fr.wikipedia.org/wiki/Mono%C3%AFde_des_traces .
As far as text manipulation is concerned, the
A few notes.
A more concrete proposal will be in a PRI to be issued soon, and people
will have a chance to comment more then. (I'm not trying to discourage
discussion, just pointing out that there will be something more concrete
relatively soon to comment on—people are pretty busy getting 8.0
On Mon, 18 May 2015 22:56:47 +0200
Philippe Verdy verd...@wanadoo.fr wrote:
Isn't it possible for your basic substitution to transform \u0F73
into a character class [\u0F71\u0F72\u0F73] that the regexp considers
as a single entity to check?
In that case, backtracking for matching
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
ISO 3166-1 already defines alpha-3 and numeric code elements, as well
as alpha-2.
But how to work with the 2-letter limitation when the world wants
more stability in codes (this was an important reason why ISO 639 was
not fully
This is why I knew I would regret it.
Clearing up some errors here. No more posts from me on this non-Unicode
topic after this one.
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
This would be a major revision to BCP 47, it would have nothing to do
with reordering,
It would
I don't work with strings, but with what you seem to call traces, but
that I call sets of states (they are in fact bitsets, which may be
compacted or just stored as arrays of bytes containing just 1 useful bit,
but which may be a bit faster; byte arrays are just simpler to program),
in a stack
Many thanks, this is exactly the needed information :)
respectfully
2015-05-15 19:09 GMT+03:00 Denis Jacquerye moy...@gmail.com:
You should use ARABIC SHADDA U+0651 in all positions. The presentation
forms (isolated, medial, final forms) are for compatibility with legacy
systems.
See what is