I will be out of the office starting 12/26/2003 and will not return until
01/01/2004.
I will respond to your message when I return.
Jim Allan wrote at 4:16 PM on Sunday, December 28, 2003:
James Kass wrote on using variation selectors for fine glyph variations:
So, that approach might meet epigraphers' needs while enabling
painless cross-variant searching, and still permit scholars to
get on with encoding their texts as
Elaine Keown
still in Texas
Dear Michael Everson and Lists:
Michael Everson wrote:
And the mother of those scripts is Phoenician. She is *not* Hebrew.
The mother script is probably the southern Sinai or
Wadi el-Hol script, written in about 1,700 B.C.E. by
Aramaeans who
On 28/12/2003 20:47, D. Starner wrote:
...
Intra-script, a difference in appearance has call for separate codings.
Inter-script, if the appearance is dissimilar enough to be a bar to
reading, and there's a disjoint population of users (so that one is
not a handwriting or cipher variant of
At 06:40 -0800 2003-12-29, Elaine Keown wrote:
Michael Everson wrote:
And the mother of those scripts is Phoenician. She is *not* Hebrew.
The mother script is probably the southern Sinai or Wadi el-Hol
script, written in about 1,700 B.C.E. by Aramaeans who worked either
in the copper mines of
From: Christopher John Fynn [EMAIL PROTECTED]
Anyone have a list of other standards, protocols, RFCs etc. which specify
Unicode (in any of its encoding formats) as the base, default or preferred
character set to be used?
For RFCs it's not difficult to get this list using the RFCeditor.org
From: Kent Karlsson [EMAIL PROTECTED]
Don't know. But there are instances of sharp s (ß) that look like a
ligated long-s (ſ) and ezh (ʒ).
We have:
02A7;LATIN SMALL LETTER TESH DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER T ESH;;;;
but no canonical or compatibility decomposition as t + esh, even though
At 07:39 AM 12/29/2003, Michael Everson wrote:
I also think that your attitude is that of a Hellenist or
Indo-Europeanist, who looks at everything from the perspective of Athens.
Think what you like.
Semitics is Praeparatio Hellenika--its other aspects are less important, and
hence not to be
For the same reason, why is the German ess-tsett (sharp S) given a
compatibility decomposition as ss instead of long-ss?
Don't know. But there are instances of sharp s (ß) that look like a
ligated long-s (ſ) and ezh (ʒ).
That is correct.
Before a consistent spelling using ß was introduced,
Philippe Verdy scripsit:
We have:
02A7;LATIN SMALL LETTER TESH DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER T ESH;;;;
but no canonical or compatibility decomposition as t + esh, even though it
is a clear ligature using the short-leg esh.
Since tesh does not mean the same thing as t followed by
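The decomposition questions in this thread can be checked against whatever version of the Unicode Character Database a given runtime ships. The following is a minimal sketch, assuming Python's standard `unicodedata` module; the helper name `describe` is hypothetical. In the database versions I am aware of, U+02A7 and U+00DF both carry an empty decomposition field, and the "ss" form of sharp s comes from case mapping rather than from a compatibility decomposition, whereas long s (U+017F) and the dz-with-caron digraph (U+01C6) do have `<compat>` mappings.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect the decomposition and case data recorded for one character."""
    return {
        "name": unicodedata.name(ch),
        # Raw decomposition field from the UCD; empty string if none is defined.
        "decomposition": unicodedata.decomposition(ch),
        # Result of full compatibility decomposition (NFKD).
        "nfkd": unicodedata.normalize("NFKD", ch),
        "upper": ch.upper(),
        "casefold": ch.casefold(),
    }


if __name__ == "__main__":
    # tesh digraph, sharp s, long s, dz-with-caron digraph
    for ch in ("\u02a7", "\u00df", "\u017f", "\u01c6"):
        info = describe(ch)
        print(f"U+{ord(ch):04X}", info)
```

Comparing the `decomposition` and `casefold` entries side by side shows which of the equivalences discussed here are normalization data and which are case-mapping data.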
Adam Twardoch scripsit:
Even today, some (rather few) users of German prefer to use the sz
compatibility decomposition rather than ss since it's far less ambiguous.
It's a minority practice, but I have seen this. sz does not occur in
normal German, while ss has orthographic differences from
Hi,
I would like to use free tools that can help me analyze Visual C/C++ code so
as to track down potential internationalization problems in the code.
Would appreciate your recommendations.
Will
Texts may use <a, combining diaeresis> as well as <a, combining small e above>
in the same text, even the same font (and there are (old) documents
that do so, even though they may use these characters interchangeably).
It is up to the author to decide which to use, not the font designer.
We had this argument
Philippe Verdy writes:
From: Christopher John Fynn [EMAIL PROTECTED]
Anyone have a list of other standards, protocols, RFCs etc. which
specify Unicode (in any of its encoding formats) as the base,
default or preferred character set to be used?
For RFCs it's not difficult to get this list
Philippe Verdy wrote:
We have:
02A7;LATIN SMALL LETTER TESH DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER T ESH;;;;
but no canonical or compatibility decomposition as t + esh, even
though it is a clear ligature using the short-leg esh.
I wonder why there's no VARIANT defined for the short leg ESH
Radovan Garabik writes:
konwert (konwert UTF8-ascii)
and unaccent.
Found them. Thanks!
--
Hallvard
It looks to me like Christopher is not after an analysis of what standards could somehow be squeezed
to use Unicode charsets, but rather a list of standards that _specify_ (actively, not potentially)
Unicode/10646.
The obvious ones are of course
HTML (at least since 4.01:
- Original Message -
From: Markus Scherer [EMAIL PROTECTED]
It looks to me like Christopher is not after an analysis of what standards
could somehow be squeezed to use Unicode charsets, but rather a list of
standards that _specify_ (actively, not potentially) Unicode/10646.
The
Philippe Verdy wrote:
I note that the UCD contains lines for PUAs like this:
...
E000;<Private Use, First>;Co;0;L;;;;;N;;;;;
F8FF;<Private Use, Last>;Co;0;L;;;;;N;;;;;
...
But why aren't there lines for the _assigned_ Private Local-Use characters in
1. No one saw a need to include them?
2. The
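For readers unfamiliar with the First/Last convention mentioned above, here is a minimal sketch of how such range pairs can be read, assuming the semicolon-separated 15-field layout of UnicodeData.txt. The sample lines are typed in that layout for illustration, and the names `Entry` and `parse` are hypothetical. A pair of `<..., First>` and `<..., Last>` lines is collapsed into one range entry; every other line is treated as a single code point.

```python
from typing import Iterator, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    first: int      # first code point covered by the entry
    last: int       # last code point covered (equal to first for single lines)
    name: str
    category: str   # General_Category, e.g. "Co" for private use


# Illustrative lines in the UnicodeData.txt layout (an assumption, not a full file).
SAMPLE = """\
02A7;LATIN SMALL LETTER TESH DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER T ESH;;;;
E000;<Private Use, First>;Co;0;L;;;;;N;;;;;
F8FF;<Private Use, Last>;Co;0;L;;;;;N;;;;;
"""


def parse(text: str) -> Iterator[Entry]:
    """Yield one Entry per character line or per First/Last range pair."""
    pending: Optional[Tuple[int, str]] = None
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        cp = int(fields[0], 16)
        name, category = fields[1], fields[2]
        if name.endswith(", First>"):
            # Remember the start of the range; strip "<" and ", First>".
            pending = (cp, name[1:-len(", First>")])
        elif name.endswith(", Last>") and pending is not None:
            first, range_name = pending
            yield Entry(first, cp, range_name, category)
            pending = None
        else:
            yield Entry(cp, cp, name, category)


if __name__ == "__main__":
    for entry in parse(SAMPLE):
        size = entry.last - entry.first + 1
        print(f"{entry.first:04X}..{entry.last:04X} {entry.name} ({entry.category}), {size} code point(s)")
```

The range form means individual private-use code points never appear as their own lines, which is one reason a consumer has to special-case these pairs.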
On 29/12/2003 09:32, Jim Allan wrote:
...
Difference of language means there isn't much use in doing
cross-searches between material written in Phoenician and material
written in Greek. The same is not true about cross-searching material
written in any northwest Semitic language. The languages
From: Patrick Andries [EMAIL PROTECTED]
From: Markus Scherer [EMAIL PROTECTED]
It looks to me like Christopher is not after an analysis of what standards
could somehow be squeezed to use Unicode charsets, but rather a list of
standards that _specify_ (actively, not potentially)
Well, they are listed in
http://www.unicode.org/Public/UNIDATA/DerivedAge.txt
If you search for noncharacter there, you will find which ones were
designated in which Unicode
version. (Only two were designated in Unicode 1.)
Thanks, I forgot to check this file, which was introduced later to
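The search for "noncharacter" described above can be sketched as a small parser. This is only an illustration, assuming the `range ; age # comment` line layout used by DerivedAge.txt; the sample lines are written in that style and are not a verbatim excerpt of any particular version of the file, and the function name `noncharacters_by_age` is hypothetical. For real use, the text of the file at the URL above would be passed in instead of `SAMPLE`.

```python
import re
from typing import Dict, List, Tuple

# Illustrative lines in the DerivedAge.txt layout (assumption, not a verbatim excerpt).
SAMPLE = """\
# Sample data for the sketch below
FDD0..FDEF    ; 3.1 #  [32] <noncharacter-FDD0>..<noncharacter-FDEF>
FFFE..FFFF    ; 1.1 #   [2] <noncharacter-FFFE>..<noncharacter-FFFF>
1FFFE..1FFFF  ; 2.0 #   [2] <noncharacter-1FFFE>..<noncharacter-1FFFF>
0041..005A    ; 1.1 #  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
"""

LINE_RE = re.compile(
    r"^(?P<first>[0-9A-Fa-f]+)(?:\.\.(?P<last>[0-9A-Fa-f]+))?"
    r"\s*;\s*(?P<age>[0-9.]+)\s*#\s*(?P<comment>.*)$"
)


def noncharacters_by_age(text: str) -> Dict[str, List[Tuple[int, int]]]:
    """Group noncharacter ranges by the Unicode version that designated them."""
    found: Dict[str, List[Tuple[int, int]]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and whole-line comments
        m = LINE_RE.match(line)
        if m is None or "noncharacter" not in m.group("comment"):
            continue
        first = int(m.group("first"), 16)
        # A line may name a single code point instead of a range.
        last = int(m.group("last") or m.group("first"), 16)
        found.setdefault(m.group("age"), []).append((first, last))
    return found


if __name__ == "__main__":
    for age, ranges in sorted(noncharacters_by_age(SAMPLE).items()):
        spans = ", ".join(f"{a:04X}..{b:04X}" for a, b in ranges)
        print(age, spans)
```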
Peter Kirk wrote:
Jim, you seem to be almost contradicting yourself here. In fact it is by
no means certain that there were separate Hebrew and Phoenician
languages at the time of the Gezer calendar (9th century BCE? - from
memory). At least they may have been no more different than British and
Do you know if there are human-readable versions of Windows and/or MacOS
keyboard layouts available somewhere?
I'm looking for a way to compile a table that could look a bit like the
following:
Platform   Language   Layout                 Unicode   Keystroke
Windows    Polish     Polish (Programmers)
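One way such a table could be held in code is as a list of small records, one per keystroke. This is a minimal sketch only: the class name `KeystrokeEntry`, its field names, and the sample row (AltGr+A producing U+0105 on the Polish (Programmers) layout) are illustrative assumptions and are not taken from any published layout data file.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class KeystrokeEntry:
    platform: str    # e.g. "Windows"
    language: str    # e.g. "Polish"
    layout: str      # e.g. "Polish (Programmers)"
    codepoint: int   # Unicode scalar value produced by the keystroke
    keystroke: str   # human-readable key combination

    def row(self) -> str:
        """Render the entry as one tab-separated table row."""
        ch = chr(self.codepoint)
        unicode_col = f"U+{self.codepoint:04X} {unicodedata.name(ch, '?')}"
        return "\t".join(
            [self.platform, self.language, self.layout, unicode_col, self.keystroke]
        )


if __name__ == "__main__":
    # Hypothetical sample row, for illustration only.
    entries = [
        KeystrokeEntry("Windows", "Polish", "Polish (Programmers)", 0x0105, "AltGr+A"),
    ]
    print("\t".join(["Platform", "Language", "Layout", "Unicode", "Keystroke"]))
    for entry in entries:
        print(entry.row())
```

Keeping the code point as an integer and deriving the character name on output avoids storing the same information twice.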
And of course:
COBOL, FORTRAN, C, C++, POSIX, 10176 Characters for identifiers in
programming languages, 14651 string ordering, 15897 registry of cultural
elements, the 8859 family, 15924 names of script, 19769 new character types
in C, and more ...
Arnold
-Original Message-
From:
At 06:55 -0800 2003-12-29, Peter Kirk wrote:
Yes, this is true at least of Azerbaijani, which mapped Cyrillic
glyphs to Latin ones one-to-one. But with Serbo-Croat we are talking
of two separate communities which prefer to use separate scripts for
what is essentially the same language; and
Such a format for Windows would be quite inadequate since it is missing many
things, such as:
1) The version of Windows in which it first shipped (there were minor
differences in what was in 9x vs. NT, and on NT some characters were added
to keyboards in later versions).
2) The fact that many
From: Michael (michka) Kaplan [EMAIL PROTECTED]
To: Unicode List [EMAIL PROTECTED]
Such a format for Windows would be quite inadequate since it is missing many
things, such as:
1) The version of Windows in which it first shipped (there were minor
differences in what was in 9x vs. NT, and on
on 2003-12-28 16:36 Gerd Schumacher wrote:
In German the supralinear e may be used as a variation of the diaeresis
above a, o, and u. Though it is old fashioned, indeed, it is still
understandable, and might be used for invitation cards and the like. I don't know a modern
font with it,
From: Philippe Verdy [EMAIL PROTECTED]
If the intent is to display in a user interface which keystroke the user
must press to create a character sequence, it can be useful to know the
character generated in the default state without modifiers (or the
character generated in CAPSLOCK mode).
Mac OSX keyboard layouts can be defined in XML which is as close as we
get to human readability - see
http://developer.apple.com/technotes/tn2002/tn2056.html
This format has been available for over a year, so there may be some
published data files from 3rd Parties.
Michael Everson is one such