and
the other variations of a single shape
regards
- Chris
Dean Snyder [EMAIL PROTECTED]
Christopher John Fynn wrote at 12:53 PM on Saturday, December 27, 2003:
Dean Snyder wrote:
So Unicode is now prepared to provide support,
in plain text, for the needs of paleographers?
What would you
Elaine Keown [EMAIL PROTECTED] wrote:
I have only heard that they had
different opinions at Harvard and at UChicago. I
don't know (sorry) how these texts are viewed at Johns
Hopkins.
How about in European and Middle Eastern Universities?
On 12/23/03 19:40, Philippe Verdy wrote:
Could you instead take the time to work on the missing Latin
letters for African languages? Why isn't there any serious
work about these living languages that don't have a lot of
university support and nearly no computer resources in
Africa to make
From: Elaine Keown [EMAIL PROTECTED] wrote:
Your arguments are very calm and rational, but it's
not that simple. I wish it were.
Some of the sets of symbols I found---which I simply
assumed could be added to Hebrew--are innately
controversial because of the Roadmap.
That's actually true
Jungshik Shin [EMAIL PROTECTED]
On Wed, 24 Dec 2003, Christopher John Fynn wrote:
BTW are the classical written languages of China & Japan more or less the
same thing?? I understand that the Chinese Buddhist canon is also used by
the Japanese without translation so I assume
Michael Everson [EMAIL PROTECTED] wrote:
Unicode List [EMAIL PROTECTED]
At 17:46 + 2003-12-26, Christopher John Fynn wrote:
(Though the Roman style & Fraktur style of Latin script are probably more
different from each other than some of the separately encoded Indic
scripts [e.g. Kannada
Dean Snyder [EMAIL PROTECTED] wrote:
To get a feel for the kinds of variations that occurred over many
centuries in the ancient Northwest Semitic script take a look at these
paleographic charts, which include glyphs for Phoenician, Moabite, Old
Hebrew, Samaritan, and Old Aramaic:
John Jenkins [EMAIL PROTECTED] wrote:
On Dec 23, 2003, at 4:23 PM, Christopher John Fynn wrote:
Remember that Unicode (not ISO 10646) was originally going to be a 16bit (plane
0 only encoding) - so I suspect CJK unification was at least partly due to
space limitations
Samaritan Bibles have fascinating marks that indicate
the emotion or dramatic interpretation to use in
reading each verse. Pretty nifty!
Sounds like these marks are akin to Vedic accents (yet to be encoded) in
Devanagari which serve a similar purpose.
- Chris
Regarding Samaritan, there is a group of modern users certainly. This
page http://www.orindalodge.org/kadoshsamaritan.php has a number of
interesting links on it. Masonic scholars apparently differentiate
between Hebrew and Samaritan.
Yes. And looking at page 5 of:
Of course, to echo the observation John Hudson made regarding the
Masonic Hebrew and Samaritan text, the text presented here
http://www.crowndiamond.org/cd/genesis.html shows that Palaeo-Hebrew
should obviously be unified with Latin.
and. . .
http://www.crowndiamond.org/cd/alpha.html
you
Remember that Unicode (not ISO 10646) was originally going to be a 16bit (plane
0 only encoding) - so I suspect CJK unification was at least partly due to
space limitations.
--
Christopher J. Fynn
- Original Message -
From: Jony Rosenne [EMAIL PROTECTED]
To: '[EMAIL PROTECTED]' [EMAIL
John Hudson wrote:
Michael Everson wrote:
...
and some documents on different approaches to unifying or not
unifying the bewildering array of early semitic writing systems,
That *is* something that is going to impact on what I have to do, and
I would really rather not be forced to give up
Elaine Keown [EMAIL PROTECTED] wrote:
Right now Jewish studies and Biblical studies people
are finally trying to convert to Unicode, fonts are
being made, proposals written, great experts are
looking over proposals.
In 18 months or so, this will all be over, and we will
go on to a
Text originally written in one script has often been published in another
related script because:
a) in the age of metal type there was no widespread availability of fonts for
many scripts and it was very time consuming and expensive to create them.
b) there may already be a large community of
Which script does the small community of native Aramaic speakers that still
exists use to write their own language?
Would they be happy if Aramaic was unified with Hebrew? I don't know but I
suspect those that live in Lebanon or Syria might not - and it could even cause
them political problems.
For me two scripts that are different enough so that a text written
in one script will have imprecise matches in another, and will be
hardly recognizable by readers is a candidate to a separate encoding,
because it starts its own family of supplementary letters specific
to some families of
However, could there be an encoding for:
LATIN CAPITAL LETTER DOTLESS J
with a lowercase mapping to the new:
LATIN SMALL LETTER DOTLESS J
Of course the former would look exactly the same as the
ASCII uppercase J, except that it would have a distinct
case mapping. This would avoid, for j/J
Philippe Verdy [EMAIL PROTECTED] wrote:
Ohhh... I admit this is hypothetical for a possible use, but the candrabindu
case is a precedent coming from romanization of non-Latin scripts: what if
there's a combining x above used to interact over a diacritic and mark its
suppression in corrected
Dean Snyder [EMAIL PROTECTED] wrote:
Recently I have had second thoughts about encoding complex signs.
Modification of base, or simple, signs was a productive process for
making new signs in the earlier periods of cuneiform usage, and included
such modifications as adding or subtracting
Jim Allan [EMAIL PROTECTED] wrote:
On the other hand, there is nothing to prevent the Unicode consortium or
any other body or any single person from creating a new *additional*
corrected set of names if the Unicode consortium or any other body or
any single person wishes to do so.
That
There seems to be at least some interest in re-establishing the UK character
encoding committee which contributed to ISO/IEC JTC1/SC2/WG2 10646.
Anyone in Britain (or British) who might be interested in participating, please
let me know ASAP.
Thanks
- Chris
==
Christopher Fynn
4 Chester Court
Doug Ewell [EMAIL PROTECTED]
The North Korean and Chinese national bodies have already
made proposals that violate both the letter and spirit of stability
policies.
Fortunately they each have only one vote in WG2.
- Chris
At 14:56 +0100 2003-12-14, Philippe Verdy wrote:
Maybe the Unicode name should not be swastika but a transliteration of an
Asian name (Tibetan, Chinese Pinyin...), and all references to swastika
(included in code charts, and the name index) removed if they ever occur
somewhere in the standard
The swastika is the main symbol of the Bonpo religion followed in Tibet and
surrounding regions. Banning the swastika to a Bonpo would be like banning the
cross symbol to a Christian, the star of David to a Jew, or the crescent moon
and star to a Muslim. It is also an important symbol in
PROTECTED]
To: Christopher John Fynn [EMAIL PROTECTED]; Michael Everson
[EMAIL PROTECTED]
Cc: Unicode List [EMAIL PROTECTED]; [EMAIL PROTECTED]; Paul Nelson
[EMAIL PROTECTED]
Sent: Monday, December 15, 2003 3:22 AM
Subject: Re: [tibex] Swastika to be banned by Microsoft?
Does any Asian rendering
Peter Kirk wrote:
Consider the following:
(1) <span class="black-text">{U+00E9}</span>
(2) <span class="black-text">e{U+0301}</span>
(3) <span class="black-text">e<span class="black-text">{U+0301}</span></span>
(4) <span class="black-text">e<span class="red-text">{U+0301}</span></span>
I would expect (1), (2) and (3) to
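The relationship between forms (1) and (2) above, precomposed U+00E9 versus e followed by U+0301, can be checked with a short sketch. It uses Python's standard `unicodedata` module; the helper name `canonically_equivalent` is made up for illustration and is not from the original message. It says nothing about how a browser should style the spans.

```python
import unicodedata

precomposed = "\u00e9"   # form (1): LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # form (2): e + COMBINING ACUTE ACCENT


def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFD normalization."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# The raw code point sequences differ...
print(precomposed == decomposed)
# ...but the strings are canonically equivalent.
print(canonically_equivalent(precomposed, decomposed))
```

Forms (3) and (4) place markup between the base and the mark, so whether they behave like (2) depends on the renderer rather than on normalization.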
Andrew West wrote:
... and similar stroke-by-stroke incremental diagrams showing how to write CJK
ideographs are even more common in (Chinese, Japanese, etc.) pedagogical texts
intended for both native children and for foreigners. I've also seen such
diagrams in Tibetan pedagogical texts, and
John Hudson [EMAIL PROTECTED] wrote:
The way to do this is to decompose bases and marks at the glyph level if
they are not already decomposed at the character level, and then to apply a
colour to the mark. In order to do this you need to know what is a mark
glyph and what is a base glyph (this
In Unicode U+0BBE, U+0BC6 and U+0BCA are all dependent vowel signs
IE is probably treating a base character and any dependent vowels as a single
unit. Since in some fonts a base character + combining vowel mark might be
displayed by a single ligature glyph, it makes sense to apply the
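As a rough check of the character properties mentioned here, the following sketch prints the name, general category and canonical decomposition of the three Tamil vowel signs using Python's standard `unicodedata` module. The variable name `signs` is only for illustration. It does not show how IE or any font actually groups the characters.

```python
import unicodedata

# The three dependent vowel signs named in the message.
signs = ["\u0bbe", "\u0bc6", "\u0bca"]

for ch in signs:
    print(
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),             # general category (spacing mark)
        unicodedata.decomposition(ch) or "-"  # canonical decomposition, if any
    )
```

U+0BCA has a canonical decomposition into U+0BC6 followed by U+0BBE, which is one reason a renderer may treat the base consonant and its vowel signs as a single unit.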
Michael
Personally I think some of the large corporations who are members of the
Unicode Consortium should support the work of the SEI since a) they will get
some benefit out of it and b) they could probably write off a large part of the
donation / contribution.
It would almost certainly be
Arcane Jill wrote:
In short, in any given locale, one should get the symbols of that locale,
out of the box. (And in my locale, that should include math and music symbols).
My apologies if that was not clear, but rest assured I absolutely am not
ethnocentric. I was merely stating what I think is
An adequate proposal for a complex script should surely include a proper
account of the script behaviour and sample glyphs of presentation forms.
And so such a proposal should include all that is needed for a
developer, and is available some time before the new script is
officially
Philippe Verdy [EMAIL PROTECTED] wrote:
Just visit the impressive resource references collected on:
http://www.nongnu.org/freefont/
I notice under the heading What do we plan to achieve, and how? on that page
there is a list Free UCS outline fonts will cover the following character
sets:
://136.142.158.105/Lasa2000/Hartch.PDF ).
But I guess this is all part of Globalisation
With best regards
- Chris
--
Christopher J. Fynn
- Original Message -
From: Peter Kirk [EMAIL PROTECTED]
To: Edward H. Trager [EMAIL PROTECTED]
Cc: Christopher John Fynn [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Wednesday, December 03, 2003 11:37 PM
Subject: Re: MS Windows and Unicode 4.0 ?
On 03/12/2003 14:46, Edward H. Trager wrote:
On Wednesday 2003.12.03 19:59:45 -, Christopher John Fynn wrote
Philippe Verdy [EMAIL PROTECTED] wrote:
As FreeType does not offer to its users such a license, it cannot implement
hinting mechanisms in its renderer. So FreeType cannot use fonts hinted with
Apple technology. This means that font authors cannot seriously sell hinted
font designs to
Eric Scace [EMAIL PROTECTED] wrote:
The set of symbols in use has been standardized for many decades
by the World Meteorological Organization.
Anywhere this standard can be found on line? or in an official publication?
--
Christopher J. Fynn
- Original Message -
From: Eric
Last time I looked TrueDoc did not work well with fonts for Unicode ranges
beyond Latin-1, and for IE it requires the installation of an ActiveX component
on client machines. You may also need to purchase software to make embeddable
fonts that work with TrueDoc.
Please see:
This may be the fault of the application not Windows. Many Windows applications
do not take advantage of the support for Unicode, OpenType layout, and font
linking which is present in Windows 2000 & XP.
It's plain silly to expect support for every Unicode character to be present on
every platform
Philippe Verdy [EMAIL PROTECTED] wrote:
That's why I think that font design providers (Adobe, Agfa MonoType, ...)
should agree on a common format to allow authors to distribute freely the
documents they create with these font designs. Then it's up to them to
cooperate with operating
Patrick Andries [EMAIL PROTECTED]
Well, some fonts would be better than none
(and they have to be made so that
the Unicode standard be printed).
In the case of complex scripts, a font sufficient to print a code chart is
nowhere near adequate to render that script properly.
If you code chart
Thanks for the link. It is good to know that MSKLC can be used for creating
Keyboard Driver for WinCE. But is it true that only TrueType fonts can be used? No
OTF?
Thanks and regards
Mustafa Jabbar
I doubt that PostScript flavour OpenType fonts can be used since that would
require some form of
Philippe Verdy [EMAIL PROTECTED] wrote:
I also think that Tibetan issues should be discussed in that list, although
its composition model is very different from Brahmic scripts of India,
unless there's a specific rapporteur group for it.
There already is a specific list for Tibetan script
Michael Everson wrote:
There are not very many conjuncts with -dda.
Remember Sanskrit can be, and sometimes is, written in almost
every Indic script; so, with all these Indic scripts, you have to allow
for all Sanskrit conjuncts as well as those used in the languages
predominantly written in
Suggest you check the Global Development pages at Microsoft
http://www.microsoft.com/globaldev/default.mspx (links on the right of the
page) and
http://www.microsoft.com/globaldev/getwr/wincei18n.mspx
to find out about Unicode Support in Windows CE, Windows CE fonts and creating
keyboard layouts
Michael Everson [EMAIL PROTECTED] wrote:
At 09:51 -0800 2003-11-25, Peter Constable wrote:
My understanding is that Word for Mac in MS Office Mac versions since
Office 98 have used the same file format as Windows versions -- Word 97
and later. That means that Word for Mac can read files
Peter Kirk [EMAIL PROTECTED] wrote:
This approach would certainly have simplified pointed Hebrew a lot, so
much so that it could well be serious. After all, Ethiopic was encoded
as a syllabary just because the vowel points happen to have become
attached to the base characters. And we
Mustafa
With complex scripts like Bangla under Mac OSX I think you have to make
AAT fonts rather than OT fonts - though it is possible to include both AAT
tables and OT tables in the same font.
For tools & specs to do this try:
http://developer.apple.com/fonts/OSXTools.html
Christopher J. Fynn
In the case of Microsoft's Mangal, which is an OpenType font, the mapping
(including contextual mapping) from Unicode characters to glyphs in the
font is contained in lookup tables built into the font.
Many glyphs in this font do not have a direct one to one correspondence
with characters but
Of course if OpenType or AAT fonts are used you often don't require an
intelligent IME for complex scripts since the smarts are in the font.
- Chris
Deepayan Sarkar [EMAIL PROTECTED] wrote:
But this would not reflect the fact that the *glyph* [CONS][ZWJ][CONS] is
actually the same thing as the *sequence of characters*
[CONS][VIRAMA][CONS],
i.e., [CONS][VIRAMA][ZWNJ][CONS] is also a perfectly legitimate
representation.
As I
- Original Message -
From: Peter Kirk [EMAIL PROTECTED]
To: Marco Cimarosti [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Wednesday, October 08, 2003 11:54 AM
Subject: Re: Bangla: [ZWJ], [VIRAMA] and CV sequences
On 08/10/2003 02:58, Marco Cimarosti wrote:
What happens with the
Gautam Sengupta [EMAIL PROTECTED]
--- Christopher John Fynn [EMAIL PROTECTED] wrote:
As I understand it, [CONS][VIRAMA][VIRAMA][CONS]
is the correct way of
forcing a virama to be displayed rather than a
ligature - not
[CONS][VIRAMA][ZWNJ][CONS]
This is certainly true
- Original Message -
From: Jony Rosenne [EMAIL PROTECTED]
Please note that Braille is used also for Hebrew. We use the same codes, but
they are assigned a different meaning. The reader has to know or guess which
language it is.
I don't remember whether Hebrew Braille is written RTL
John Cowan [EMAIL PROTECTED]
http://www.iso.org/iso/en/commcentre/pressreleases/2003/Ref871.html
They say
There is no proposal currently being considered by ISO to impose
charges for use of these codes, including on the World Wide Web and
in software applications.
This kind of leaves it
Rick McGowan [EMAIL PROTECTED] has privately suggested moving
the discussion of Combining Classes of *Tibetan* Characters
from the main Unicode list [EMAIL PROTECTED] to the TIBEX list
[EMAIL PROTECTED] - an experts list which was set up several
years ago specifically to discuss proposals for
Valeriy E. Ushakov [EMAIL PROTECTED] wrote:
A sample list of dbu can contractions from Schmidt grammar:
http://snark.ptc.spbu.ru/~uwe/tibex/contractions/contractions.html
When these combinations are written in dbu-can script, as they
are here, the problem may not look too bad. - However
Difficulties due to the present combining class values attached
to these characters most frequently occur with
abbreviations/contractions and/or with cursive scripts. With
abbreviations it is common to have two or more vowels on a
consonant stack. In cursive or semi-cursive forms of Tibetan
Michael Everson [EMAIL PROTECTED]
Regarding the last, one may note with some alarm
http://www.spiralnature.com/entertain/wheelchair.html
Seriously, it seems that the HANDICAPPED /
DISABLED/ WHEELCHAIR SIGN may be copyright in some countries.
Please see
Ken, on this list, suggested i write this myself. my get out:
some m.s.
software knowledge would be needed, which i don't have. :)
MS Office applications all have a File, Send To... option in
their menus - if people use this (and many do) it generates huge
email files in multiple formats.
-
And how about:
http://www.csaa.com/global/articledetail/0,8055,100300%257C2670,00.html
http://www.csaa.com/global/articledetail/0,8055,100300%257C2669,00.html
http://www.csaa.com/global/articledetail/0,8055,100300%257C2668,00.html
- Chris
In Unicode's UnicodeData.txt (
http://www.unicode.org/Public/UNIDATA/UnicodeData.txt )
0F7E has a Canonical Combining Class Value (CCCV) of 0;
0F71 a CCCV of 129;
0F72 0F7A 0F7B 0F7C 0F7D and 0F80 a CCCV of 130;
0F74 a CCCV of 132;
and 0F82 and 0F83 have a CCCV of 230.
By normal Tibetan
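The combining class values listed above can be read back from Python's standard `unicodedata` module, which also shows the effect they have on stored order. This is only a sketch: the sample string (KA followed by U+0F74 then U+0F71) and the variable names are chosen for illustration and are not taken from the original message.

```python
import unicodedata

# Code points quoted in the message, in the order they are listed there.
code_points = [0x0F7E, 0x0F71, 0x0F72, 0x0F7A, 0x0F7B, 0x0F7C,
               0x0F7D, 0x0F80, 0x0F74, 0x0F82, 0x0F83]

# Map each code point to its canonical combining class.
ccc = {cp: unicodedata.combining(chr(cp)) for cp in code_points}
for cp, value in ccc.items():
    print(f"{cp:04X} ccc={value}")

# Illustrative input: TIBETAN LETTER KA + VOWEL SIGN U + VOWEL SIGN AA.
typed = "\u0f40\u0f74\u0f71"
# Canonical ordering sorts adjacent marks by combining class (129 before 132).
stored = unicodedata.normalize("NFD", typed)
print([f"{ord(c):04X}" for c in stored])
```

Normalization reorders the two vowel signs regardless of the order in which they were typed, which is why the class values matter for how these characters end up stored in a string.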
Philippe
By relative ordering I did not mean relative collation weights
but the order in which these combining characters are usually
entered relative to other characters and each other - and the
order relative to each other in which they should be stored in a
string. The current CCCV weights
- Original Message -
From: Jain, Pankaj (MED, TCS) [EMAIL PROTECTED]
To: 'Edward H Trager' [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Thursday, June 19, 2003 6:37 PM
Subject: RE: Problem with Arial Unicode MS font for BOLD/ITALICS
in PDF
Edward,
thanks for the response. Is it
Carl W. Brown [EMAIL PROTECTED] wrote:
To: Michael (michka) Kaplan [EMAIL PROTECTED];
[EMAIL PROTECTED]
MichKa,
This is an equal opportunity forum intended for discussion of issues
relative to Unicode, an industrial consortium that includes (among many
others) the companies you are
Philippe Verdy [EMAIL PROTECTED] wrote:
removing the unnecessary
features that Microsoft wants to promote, such as proprietary
web fonts for CSS2,
Philippe
I presume you are talking about embedded web fonts - since if
CSS is simply used to get the best match, using fonts already on
Stefan Persson [EMAIL PROTECTED] wrote
I believe that we won't see any MS Office 20003 for another
18000 years.
Whooops... typo.
- Chris
Barry Caplan wrote:
I am asking about v2 for a selfish reason, but everything
above might as well be about v1 also .
I don't think it would be a good idea to promote Unicode v1
conformance in any way since some characters in that version
were removed or encoded at a different location
Abdij Bhat [EMAIL PROTECTED] wrote:
Hi,
I want one font to be used across all languages. Is it possible? For
example, I want Tahoma to be used for all languages for all OS.
Does Tahoma support this?
It may be possible but it doesn't sound like a good idea - a
font with proper support for
Pim Blokland [EMAIL PROTECTED]
Hello all,
I have got a stupid question - that is, the question was asked of me
and I didn't know what to say.
What is ISO 10646?
Usually I can answer questions like this by doing an Internet
search, but in this case, I get varying answers:
it is a code
Rick McGowan [EMAIL PROTECTED]
James Do wrote:
Outlook Express is superb with Unicode
Hmmm, so it's different from MS Outlook? I'm not sure I get it. I thought
Express was just a watered-down version of Outlook itself.
No, Outlook and Outlook Express, though they look similar, are
quite
William Overington [EMAIL PROTECTED] wrote:
I am wondering whether the range from U+F200 through to
U+F2FF is being used by anyone for anything.
By its very nature anyone can use PUA code points for anything and
I'm sure by now someone is already using those code points for something* -
and
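That the range U+F200 through U+F2FF lies inside the Private Use Area can be confirmed from the general category of each code point. The sketch below uses Python's standard `unicodedata` module; the helper name `is_private_use` is made up for illustration. It cannot, of course, tell you who else is using those code points.

```python
import unicodedata


def is_private_use(cp: int) -> bool:
    """Hypothetical helper: True if the code point has general category Co."""
    return unicodedata.category(chr(cp)) == "Co"


# Check every code point in the range asked about.
whole_range_is_pua = all(is_private_use(cp) for cp in range(0xF200, 0xF300))
print(whole_range_is_pua)
```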
William Overington [EMAIL PROTECTED] wrote:
1. I tried out the validation procedure on the following page.
http://www.users.globalnet.co.uk/~ngo/font7007.htm
This is a not too lengthy web page with just Basic Latin letters. It will
not validate. It is not clear to me what I need to add
Carl W. Brown [EMAIL PROTECTED] wrote:
I think that a Klingon web site that uses UTF-8 and the PUA with
your own font is very Unicode savvy.
Carl
It's certainly a lot more savvy than using Latin-1 characters to
encode Klingon.
- Chris
Carl W. Brown [EMAIL PROTECTED] wrote:
If nothing else we need to discourage people from using the Latin-1 code
page and a special font to create a code page hack.
Yes, I think that sort of thing should be *explicitly forbidden*
on pages where the Unicode Savvy logo is present (unless they
And how about some non-latin script, non-English versions for
web sites where the main content is in other scripts and
languages.
(What is the ideograph for savvy ?)
- Chris
William Overington wrote:
However, it might indeed be that there is no interest in my code point
allocations, yet that is the chance which I, as an inventor, need to take
when trying to follow the publication option to get an invention
implemented. It worked for my telesoftware invention
J Do [EMAIL PROTECTED] wrote:
Instead of that, how about just plain OK, which has already
become quite universal.
No need for words like savvy, compliant or OK - just
having the check mark symbol as in Edward's design says enough
and that way it's not favouring one language or another.
-
John Cowan [EMAIL PROTECTED] wrote:
Kent Karlsson scripsit:
E.g., it is quite legitimate to render, e.g. LIGATURE FI as an f followed
by an i, no ligation, whereas that is not allowed for the ae
ligature/letter, nor for the oe ligature.
How do you know that? Either Caesar or Cæsar
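One relevant data point is how the character database itself treats these characters: LIGATURE FI carries a compatibility decomposition to f + i, while the ae and oe characters have no decomposition at all. The sketch below reads this from Python's standard `unicodedata` module; the variable names are illustrative only. It does not settle what a renderer may legitimately display.

```python
import unicodedata

fi_ligature = "\ufb01"  # LATIN SMALL LIGATURE FI
ae_letter = "\u00e6"    # LATIN SMALL LETTER AE
oe_ligature = "\u0153"  # LATIN SMALL LIGATURE OE

for ch in (fi_ligature, ae_letter, oe_ligature):
    print(
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        repr(unicodedata.decomposition(ch)),     # empty string means none
        repr(unicodedata.normalize("NFKC", ch))  # compatibility-normalized form
    )
```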
Michael Everson wrote
At 16:48 -0500 2003-03-03, John Cowan wrote:
Mijan scripsit:
Let's consider the ra+virama+ya case. For the most part the ra+virama+ya is
displayed as ya+reph. This obviously seems to be an
instance of ambiguous interpretation because ra+virama+ya could
also
Michael Everson wrote
No. Yes. What I see is an extension of an existing system, and YES
the virama does more than just kill the vowel. It creates conjuncts.
It acts like a ZWJ. How the cluster is pronounced is a matter of the
reading rules.
I think the dual purpose of the virama
Michael Everson wrote:
At 02:13 -0800 2003-01-29, Keyur Shroff wrote:
I beg to differ with you on this point. Merely having some provision for
composing a character doesn't mean that the character is not a candidate
for inclusion as separate code point.
Yes, it does.
India is a big
Adarsh
Do you mean the physical keyboard (hardware) or the keyboard driver (software)???
Physical keyboards are pretty much the same whatever the script of the glyphs printed
on the keys. It's the software that interprets the key presses and sends characters on
that matters.
- Chris
--
Richard Cook wrote:
--A: They are compositionally formed from the 8 trigrams.
Rebuttal: By this reasoning, the 8 trigrams themselves ought not to have
been encoded, since the 8 trigrams can be generated from simple broken
and unbroken lines. This alone is not a reason to encode them, but
Simon Law wrote:
In Oracle9i our next Database Release shipping this summer, we have introduced
support for two new Unicode character sets. ...
New character *sets* ???
Asmus Freytag wrote:
At 05:53 PM 3/21/01 +0100, you wrote:
I see that the list software now appends [unicode] to all subject
lines. This is very annoying, and not very useful, since those who
wish to filter their mail and put posts from this list in a folder of
its own etc. etc. can now
Mike Lischke wrote
Dear Sarasvati,
...
Just out of curiosity, why do you use your own mailing list server if
you can use a free one (Yahoo Groups)? The Unicode list is mirrored
there anyway, so why not make the "backup list" being the actual
list. You will get not only the list, but
John H. Jenkins [mailto:[EMAIL PROTECTED]]
Some of the characters in Extension B are required for JIS X 0213
support, which is going to be a sine qua non in Japan within a few
years. There was a push a little while ago to put these characters
on the BMP for precisely this
Joel Rees [mailto:[EMAIL PROTECTED]] wrote:
...
Maybe I'm a crackpot, but the need is there and people will use and abuse
UNICODE in ways that you probably don't want to imagine. What I'm trying to
push is building the mechanism now for dodging most of the abuse.
...
Well the PUA
Mark Davis wrote:
BTW, someone on this thread made this topic out to be even more complex than
it is: that Devanagari and Korean are written without spaces. While that may
have been the case historically, I believe that the modern text does use
spaces. Chinese, Japanese and Thai are the
You might have to apply different rules dependent on the script. In Indic scripts
there are often no explicit word boundary markers and you may have to look for
grammatical particles. In Tibetan, a string of letters and vowels between two tsheg
[0F0B / 0F0C] characters (or other "punctuation")
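A minimal sketch of the Tibetan case, splitting a string on the two tsheg characters U+0F0B and U+0F0C. The function name `split_on_tsheg` and the sample string are assumptions made for illustration; the sketch ignores the "other punctuation" mentioned above and is not a full segmentation algorithm.

```python
import re

# The two tsheg characters named in the message.
TSHEG = "\u0f0b\u0f0c"


def split_on_tsheg(text: str) -> list[str]:
    """Hypothetical helper: return the runs of characters between tshegs."""
    return [run for run in re.split(f"[{TSHEG}]+", text) if run]


# Illustrative sample: two letter runs, each followed by U+0F0B.
sample = "\u0f56\u0f40\u0fb2\u0f0b\u0f64\u0f72\u0f66\u0f0b"
for run in split_on_tsheg(sample):
    print([f"{ord(c):04X}" for c in run])
```

A real implementation would need a per-script rule set, since the same approach does not carry over to scripts without an explicit boundary mark.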
From: [EMAIL PROTECTED] wrote:
Well, if a script had such behaviour, one possibility could be to propose a
combining CONSONANT SIGN L for what we would be choosing to think of as a
dependent form of the consonant. I.e. it may not be in an existing model,
but for a new script one could
Mark Davis [mailto:[EMAIL PROTECTED]] wrote:
"Marco Cimarosti" [EMAIL PROTECTED] wrote:
I wonder what "directly from Latin" may mean in the case of English. Because
of some timing problems, I would say it means: "through direct knowledge of
*written* Latin".
There was a period well
Michael Everson [mailto:[EMAIL PROTECTED]]wrote:
What does fictionality have to do with it? The criteria for encoding rest
primarily in the area of information interchange. Now it seems perhaps not
very likely that most users of Klingon (which is a language people learn
and use whether
Peter Constable wrote:
This is a good example of why an enumeration of "languages"
based only on written forms (as found in ISO 639) is
insufficient for all user needs.
Of course ISO 639 is insufficient for *all* user needs
- no standard is. And is there actually a remit for
ISO 639 to
Bjorn Stabell [EMAIL PROTECTED] wrote:
According to this news item (in Chinese), China rejected HK's
application to use Unicode, and instead says they have to use
ISO 10646-1:2000 or GB18030. Apparently they don't like to
standardize on a standard controlled by an organization of