On 2001-02-20 at 03:47 UTC, Krishna Desikachary wrote:
There is an internationally accepted set of extra chars that are
included in Roman (Latin) script to transcribe Sanskrit texts
in Roman script.
Is there a list of these characters available, online?
If so, where (URL)? If not
On 2001-02-20 at 9:18 UTC, Valeriy E. Ushakov wrote:
That's why I made and posted the CSX mapping. There is a LOT of old
CSX-encoded material. With this mapping I can use existing software
(like the mentioned perl module) to convert it to Unicode and use
emacs to view/edit it.
This
On Tue, Feb 20, 2001 at 12:32:01 +, Otto Stolz wrote:
That's why I made and posted the CSX mapping. There is a LOT of old
CSX-encoded material. With this mapping I can use existing software
(like the mentioned perl module) to convert it to Unicode and use
emacs to view/edit it.
Doug Ewell wrote:
A few days ago I said there was a "widespread belief" that Unicode is a
16-bit-only character set that ends at U+FFFF. A corollary is that the
supplementary characters ranging from U+10000 to U+10FFFF are either
little-known or perceived to belong to ISO/IEC 10646 only,
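To illustrate the point about code points beyond the 16-bit range, here is a minimal Python sketch. The helper name `utf16_code_units` is made up for this example; it relies only on Python's built-in `utf-16-be` codec. It shows that a BMP character fits in one 16-bit code unit, while a supplementary character needs a surrogate pair.

```python
def utf16_code_units(ch):
    """Return the UTF-16 code units (as integers) that encode the string ch."""
    data = ch.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


bmp_char = "\u0915"            # DEVANAGARI LETTER KA, inside the BMP
first_supp = "\U00010000"      # first code point beyond U+FFFF
last_supp = "\U0010FFFF"       # last code point in the Unicode code space

for ch in (bmp_char, first_supp, last_supp):
    units = utf16_code_units(ch)
    print("U+%04X -> %s" % (ord(ch), " ".join("%04X" % u for u in units)))
```

The same characters each occupy exactly one 32-bit unit in UTF-32, which is why a 32-bit internal representation is sometimes suggested as the simpler mental model.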
Otto Stolz wrote:
On 2001-02-20 at 03:47 UTC, Krishna Desikachary wrote:
There is an internationally accepted set of extra chars that are
included in Roman (Latin) script to transcribe Sanskrit texts
in Roman script.
Is there a list of these characters available, online?
Marco Cimarosti wrote:
Doug Ewell wrote:
"A 16-bit character encoding standard [...]
By contrast, 8-bit ASCII [...]
These two statements are regularly found together, but it is the second one
that makes me despair.
If nearly half a century was not enough time for people to learn
On 02/20/2001 03:34:28 AM Marco Cimarosti wrote:
How about considering UTF-32 as the default Unicode form, in order to be
able to provide a short answer of this kind:
"Unicode is now a 32-bit character encoding standard, although only
about one million of codes actually exist, and there
The error may arise from a misunderstanding of the reference on the first
page of chapter 1 of the book to a 16-bit form and an 8-bit form and to
"using a 16-bit encoding." It's also hard to get one's head wrapped around
the idea that Unicode isn't just an encoding until one does extensive
"Charlie Jolly" [EMAIL PROTECTED] wrote:
Does anybody know if there is a chart or table showing what OS's,
Applications, Programming Languages support Unicode and in particular
what scripts?
You'll find some of this on
http://www.unicode.org/unicode/onlinedat/products.html.
Should an open
On 02/20/2001 06:21:09 AM "Charlie Jolly" wrote:
Should an open source script processing engine be part of the standard? As I
understand it, if you want to develop Unicode solutions for complex scripts
then you either have to do it yourself or rely upon Uniscribe or ATSUI.
Whether or not the
At 4:21 AM -0800 2/20/01, Charlie Jolly wrote:
Do fonts have to tie themselves to a script engine? Will an OpenType font
for, let's say, Hindi such as MS Mangal work on an Apple OS or Linux? Or is
this font tied to Uniscribe? If this is correct then shouldn't there be a
better solution?
Both OT and
In a message dated 2001-02-20 06:18:34 Pacific Standard Time,
[EMAIL PROTECTED] writes:
With the Unicode-related functions in Prague growing out of size, I moved
them into a new library called 'Babylon'. It will provide all the functionality
defined in the Unicode standard (it is not
In a message dated 2001-02-20 04:21:49 Pacific Standard Time,
[EMAIL PROTECTED] writes:
A little out of date, but describing correctly the state of art in 1991
before the merger.
Agreed, but the example was from Windows 2000. It should at least be current
through Unicode 2.1.
Even
At 04:21 AM 2/20/2001 -0800, Charlie Jolly wrote:
Do fonts have to tie themselves to a script engine? Will an OpenType font
for, let's say, Hindi such as MS Mangal work on an Apple OS or Linux? Or is
this font tied to Uniscribe? If this is correct then shouldn't there be a
better solution?
Mangal
Ok, I was wrong:
On pp 969-71 of the 3.0 book, it *does* mention the BMP and UCS.
They're not in the index, that's what fooled me.
1. I take it that these 2 terms are more popular with the 10646 folks?
2. P. 971, what "additional semantics" are being alluded to here?
3. 971, what
The people who are responsible for this text have been made aware of the
problem. This will be updated for WindowsXP.
Cathy
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, February 20, 2001 8:04 AM
To: Unicode List
Subject: Re: Perception that
I am unsure if "8-bit ASCII" is a well-defined term. "ASCII" implies
X3.4-1986 and the 7-bit ASCII code. It was my intention for ISO/IEC 8859-1
to be the 8-bit ASCII standard. When the US adopted ISO 8859-1 as a US
standard (ANSI/ISO 8859-1), as editor I asked ANSI to add "(8-bit ASCII)" to
Doug Ewell wrote:
In a message dated 2001-02-20 04:21:49 Pacific Standard Time,
Funnily, the message I got is stamped 03:36:27 PST...
[EMAIL PROTECTED] writes:
A nit to pick: It's the latin alphabet, not roman. Roman is a kind of
typeface, contrasting to sans serif aka grotesque.
Thanks for the comments thus far.
They have helped clarify a lot of ambiguities.
As for AAT, could Apple not supply template fonts so that font designers can
concentrate on the glyphs. I.e. replace master glyphs with their own.
Charlie Jolly
[EMAIL PROTECTED] wrote:
On 02/19/2001 08:05:49 PM David Starner wrote:
It will provide all the functionality
defined in the Unicode standard (it is not Unicode but ISO 10646 compliant
as it uses 32-bit wide characters internally) and is written in C++.
Eh? Unicode has no aversion to
Charlie Jolly wrote:
Fonts:
Do fonts have to tie themselves to a script engine?
Yes. Font technologies do not allow things like Nagari or Sinhala rendering
to operate by themselves, they need some assistance from the underlying
platform.
This is the current state of art, one may hope it
On 02/20/2001 10:19:37 AM John Hudson wrote:
The Apple AAT and SIL Graphite approach work a little differently. I'm not
familiar enough with Graphite to know how they handle stuff like character
reordering, or how difficult it is to achieve such things in their
Graphite Description Language,
At 09:05 AM 2/20/2001 -0800, Antoine Leca wrote:
(In French, sans serif is normally named "antique"
Which must be very confusing to Germans and others who use 'antiqua' to
distinguish seriffed humanist types from blackletter.
John Hudson
Tiro Typeworks |
Vancouver, BC | All
At 09:17 AM 2/20/2001 -0800, Charlie Jolly wrote:
As for AAT, could Apple not supply template fonts so that font designers can
concentrate on the glyphs. I.e. replace master glyphs with their own.
The folk at Apple are certainly aware that they have a problem with AAT's
current level of
On Tuesday 20 February 2001 17:03, you wrote:
In a message dated 2001-02-20 06:18:34 Pacific Standard Time,
into a new library called 'Babylon'. It will provide all the
functionality defined in the Unicode standard (it is not Unicode but
On 02/20/2001 10:03:35 AM DougEwell2 wrote:
A nit to pick: It's the latin alphabet, not roman. Roman is a kind of
typeface, contrasting to sans serif aka grotesque.
True. I have also heard "roman" used to mean the opposite of italic.
An alphabet is a type of writing system, something
On 02/20/2001 11:19:48 AM Antoine Leca wrote:
Yes. Font technologies do not allow things like Nagari or Sinhala rendering
to operate by themselves, they need some assistance from the underlying
platform.
This is the current state of art, one may hope it will change in the future.
I don't
The following statements have been made by participants in this thread.
1.
A few days ago I said there was a "widespread belief" that Unicode is a
16-bit-only character set that ends at U+FFFF. A corollary is that the
supplementary characters ranging from U+10000 to U+10FFFF are either
At 9:17 AM -0800 2/20/01, Charlie Jolly wrote:
Thanks for the comments thus far.
They have helped clarify a lot of ambiguities.
As for AAT, could Apple not supply template fonts so that font designers can
concentrate on the glyphs. I.e. replace master glyphs with their own.
It's on our to-do
On 02/20/2001 12:33:04 PM John Hudson wrote:
The only thing that I insist on is that we maintain the distinction
between Roman and roman.
Which is?
I wonder though, Peter, about your suggestion that '"Latin script" is less
acceptable since "Latin" suggests something constrained to the
From: "John Hudson" [EMAIL PROTECTED]
I wonder though, Peter, about your suggestion that '"Latin script" is less
acceptable since "Latin" suggests something constrained to the language
Latin'. Couldn't the same thing be said about 'Arabic script'?
I think everyone here can likely agree that
Re: "Uniscribe is just an implementation of these specifications, and I hope
sincerely Microsoft will not hide some "features" into USP10.DLL in order to
kill any concurrence."
The process of adding new feature support to Uniscribe is not unlike adding
newer "features / capabilities" to other
[EMAIL PROTECTED] wrote:
Even 8-bit ASCII is a correct term meaning ISO-8859-1.
I would question that. Understandable, yes, but not really correct.
No, it *is* correct. ANSI X.3 (which has a new name these days) in fact
did define an 8-bit American Standard Code for Information
At 09:09 AM 2/20/2001 -0800, [EMAIL PROTECTED] wrote:
If we are talking about the full collection of characters that are
historically related to the Latin alphabet, however, i.e. the entire
script, then I would need to see better argumentation and references than
this to convince me that it's
On 02/20/2001 11:18:40 AM Tobias Hunger wrote:
Looks like David was quoting me. I am working on Babylon and wanted to make
clear that it is not Unicode conformant as its API uses 32-bit wide characters,
which violates clause 1 of Section 3.1.
This is something that UTC should clean up because C1
Marco asked:
Kenneth Whistler wrote:
No. Unicode 3.1 has already been approved, and is in the
last stages of publication. After that, Unicode 3.2 will
appear, adding over 1000 more characters to the BMP.
Can you anticipate what these new BMP characters will be? Entire scripts or just
No, the 8-bit ANSI standard (ANSI/ISO 8859-1-1987) does not include "ASCII"
as part of its title. It is listed by ANSI as
"8-Bit Single Byte Coded Graphic Character Sets - Part 1: Latin Alphabet No. 1"
So, no, there is no such thing as 8-bit ASCII, though Latin 1 is frequently
referred to as
From what I remember about these collations, Czech and Slovak are very
similar, if not identical; Croat(ian) is very different than the other two
(it also has compressions/ligatures that sort as unique letters).
Cathy
-----Original Message-----
From: Tex Texin [mailto:[EMAIL PROTECTED]]
Sent:
Cathy thanks. Yes, I remember now I was down this path before.
thanks
tex
Cathy Wissink wrote:
From what I remember about these collations, Czech and Slovak are very
similar, if not identical; Croat(ian) is very different than the other two
(it also has compressions/ligatures that sort as
At 12:07 20-02-2001 -0800, Tex Texin wrote:
Hi,
I am updating my information on Slovak collation.
See http://www.whizkidtech.net/ISO-8859-2/sk.html . Then email me
with any questions you might have.
Adam
Paul Keinänen said:
[86-M8] Motion: Amend Unicode 3.1 to change the Chapter 3, C1 conformance
clause to read "A process shall interpret Unicode code units (values) in
accordance with the Unicode transformation format used." (passed)
While this wording makes it possible to handle any 32 bit
On Tue, 20 Feb 2001 [EMAIL PROTECTED] wrote:
Even 8-bit ASCII is a correct term meaning ISO-8859-1.
I would question that. Understandable, yes, but not really correct.
In the computer culture I grew up in, 8-bit ASCII meant CP437. Every author
called the CP437 table that was available at
In a message dated 2001-02-20 09:53:50 Pacific Standard Time,
[EMAIL PROTECTED] writes:
An alphabet is a type of writing system, something that is implemented for
a particular language. Certainly Latin is the name of a language while
Roman is not, and so "Latin alphabet" is correct while
I wrote:
Even 8-bit ASCII is a correct term meaning ISO-8859-1.
I would question that. Understandable, yes, but not really correct.
[EMAIL PROTECTED] wrote:
No, it *is* correct. ANSI X.3 (which has a new name these days) in fact
did define an 8-bit American Standard Code for
128 wrongs don't make a right...
;-)
I see books and documents all the time that refer to writing out
ASCII files when they really mean plaintext. Usually they don't
know which code page they are generating.
ASCII is a very ambiguous term these days...
tex
Roozbeh Pournader wrote:
On Tue, 20
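The ambiguity of "8-bit ASCII" discussed in this thread can be made concrete with a small Python sketch. It is only an illustration; the helper `try_decode` and the sample bytes are invented for this example. The same byte value above 0x7F is invalid in ASCII proper and maps to different characters under ISO 8859-1 and CP437.

```python
def try_decode(data, encoding):
    """Decode data with the given codec, or return None if the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# One 7-bit byte ('A') followed by one byte above 0x7F.
raw = bytes([0x41, 0xE9])

for name in ("ascii", "latin-1", "cp437"):
    # Show the code points so the result does not depend on terminal fonts.
    text = try_decode(raw, name)
    shown = None if text is None else ["U+%04X" % ord(c) for c in text]
    print(name, shown)
```

A file described only as "ASCII" therefore does not say how bytes 0x80 to 0xFF should be read; the code page has to be known separately.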