Am I missing something? Isn't the name of the =language= irrelevant?
Unicode encodes =scripts=, not languages, yes? There's no English
block in the standard, after all.
The script we're talking about in this thread is used to write lots of
languages - Wikipedia lists Assamese, Meitei,
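To illustrate the point that the standard names scripts rather than languages, here is a minimal Python sketch. It assumes the script under discussion is the one Unicode calls Bengali; the helper script_word is a hypothetical name introduced only for this example, and the first word of a character name is just a rough hint, not the formal Script property.

```python
import unicodedata

def script_word(ch):
    """Return the first word of a character's Unicode name (a rough script hint)."""
    return unicodedata.name(ch).split()[0]

# U+09F0 and U+09F1 are letters used for Assamese, yet their names still
# carry the script name rather than a language name.
for ch in ["A", "\u0985", "\u09F0", "\u09F1"]:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The character names printed for the Assamese letters begin with the same script word as the other letters of the block, which is the sense in which there is no per-language block.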
When will Unihan be back? It has been down for quite a while now, and there are
librarians for whom checking this is part of their workflow…
Martin
Martin,
-On [20100712 16:52], Martin Heijdra (mheij...@princeton.edu) wrote:
When will Unihan be back? It has been down for quite a while now, and there are
librarians for whom checking this is part of their workflow…
Can I offer http://www.cojak.org/ and http://www.jisho.org/kanji/radicals
A few comments.
A tailoring that sorts word-by-word would certainly be possible, and is
certainly allowed by the UCA. As to whether it is necessary or not, that is
another matter. Sorting is about matching user expectations, and of all of
the French that I have ever asked, none except for
We hope to have it back in the next few days.
On Jul 12, 2010, at 8:34 AM, Martin Heijdra wrote:
When will Unihan be back? It has been down for quite a while now, and there
are librarians for whom checking this is part of their workflow…
Martin
Siôn ap-Rhisiart
John H. Jenkins
On 7/8/2010 5:09 PM, Tulasi wrote:
Ok I am correcting - Bangladeshi to Bengali.
The Government of West Bengal / Society for Natural Language Technology
Research (a member of the Consortium) has a very strong preference for
the term Bangla rather than Bengali.
Eric.
On 12 Jul 2010, at 20:32, Eric Muller wrote:
The Government of West Bengal / Society for Natural Language Technology
Research (a member of the Consortium) has a very strong preference for the
term Bangla rather than Bengali.
As a speaker of English I have a very strong preference for the
The problem in this message is probably not the specified charset
(windows-1252) but the way the MIME type is written just before
it: TEXT/PLAIN.
Traditionally, the MIME types are only given in lowercase, so if you
had written text/plain; charset=windows-1252, it would have been
correctly
Philippe Verdy said:
If we don't limit the backwards reordering, then all accents in the
full sentence will be reordered, so it is the final word that will
drive the order. Not only is this incorrect,
I understand that you think that the ordering should be done
word-by-word, with the
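The difference being argued here can be sketched in a few lines of Python. This is a toy two-level comparison under loud assumptions: it is not the UCA, it uses no DUCET weights, "primary" is just the lowercased base letter and "secondary" is the tuple of combining marks after NFD decomposition, and words are split on a single space only. The function names are hypothetical.

```python
import unicodedata

def levels(text):
    """Split text into toy primary (base letters) and secondary (accents) weights."""
    primary, secondary = [], []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) and secondary:
            secondary[-1] += (ord(ch),)   # attach the mark to the preceding base
        else:
            primary.append(ch.lower())
            secondary.append(())
    return primary, secondary

def key_forward(text):
    """Accents compared left to right (no French rule)."""
    return levels(text)

def key_backward(text):
    """Accents compared right to left across the whole string."""
    p, s = levels(text)
    return (p, s[::-1])

def key_backward_per_word(text):
    """Accents compared right to left inside each word, words taken in order."""
    p, _ = levels(text)
    s = []
    for word in text.split(" "):
        s.extend(levels(word)[1][::-1])
        s.append(())                      # placeholder weight for the space
    return (p, s)

words = ["côté", "cote", "coté", "côte"]
print(sorted(words, key=key_forward))     # ['cote', 'coté', 'côte', 'côté']
print(sorted(words, key=key_backward))    # ['cote', 'côte', 'coté', 'côté']

a, b = "cote côté", "côte cote"
print(sorted([a, b], key=key_backward))           # b first: the last word decides
print(sorted([a, b], key=key_backward_per_word))  # a first: the first word decides
```

With whole-string backward comparison the accents of the last word decide between the two phrases, whereas the per-word variant lets the first word decide. Whether the second behaviour is what French users expect is exactly the open question in the thread.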
Mark Davis ☕ m...@macchiato.com
A few comments.
A tailoring that sorts word-by-word would certainly be possible, and is
certainly allowed by the UCA. As to whether it is necessary or not, that is
another matter. Sorting is about matching user expectations, and of all of
the French that I
Kenneth Whistler k...@sybase.com wrote:
Huh? That is just preprocessing to delete portions of strings
before calculating keys. If you want to do so, be my guest,
but building in arbitrary rules of content suppression into
the UCA algorithm itself is a non-starter.
I have definitely not asked
Philippe Verdy wrote:
Kenneth Whistler k...@sybase.com wrote:
Huh? That is just preprocessing to delete portions of strings
before calculating keys. If you want to do so, be my guest,
but building in arbitrary rules of content suppression into
the UCA algorithm itself is a non-starter.
In this thread I am looking for a specific answer to:
Which of the two standards has more letters/symbols?
So I needed two lists of letters/symbols, including cascaded conjuncts:
one GOB-standard and the other WBG-standard.
Coming up with a list of names was not the theme/target,
On Mon, 12 Jul 2010, Philippe Verdy wrote:
Traditionally, the MIME types are only given in lowercase, so if you
had written text/plain; charset=windows-1252, it would have been
correctly detected.
Nonsense. Pure, unadulterated nonsense.
I helped write the MIME RFCs, and I can assure you that
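For reference on the disputed point: RFC 2045 states that media type and subtype names, and parameter names, are matched case-insensitively. The sketch below only shows how one parser, Python's standard email package, treats the two spellings quoted above. It says nothing about what the mail client in the thread actually did, and parse_content_type is a hypothetical helper name.

```python
from email import message_from_string

def parse_content_type(header_value):
    """Return (media type, charset) as parsed by the standard library."""
    msg = message_from_string(f"Content-Type: {header_value}\n\nbody\n")
    return msg.get_content_type(), msg.get_content_charset()

upper = parse_content_type("TEXT/PLAIN; charset=windows-1252")
lower = parse_content_type("text/plain; charset=windows-1252")
print(upper, lower)
```

Both spellings come back as the same pair, so for this parser the case of the media type does not affect charset detection.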
Kenneth Whistler
To: verd...@wanadoo.fr
Cc: unicode@unicode.org, k...@sybase.com
Subject: Re: UTS#10 (collation) : French backwards level 2, and word-breakers.
Philippe Verdy said:
A basic word-breaker using only the space separator would marvelously
improve the speed of French
From: Kenneth Whistler k...@sybase.com
Philippe Verdy wrote:
Kenneth Whistler k...@sybase.com wrote:
Huh? That is just preprocessing to delete portions of strings
before calculating keys. If you want to do so, be my guest,
but building in arbitrary rules of content suppression into