Seán,
these are "perispomeni"s. Not uncommon to see them printed like that.
Encode as U+0342.
Best wishes,
Lukas
Dear Unicoders,
again, I have inadvertently sent a contribution to a member rather than
to the whole list, because the Unicode list sets the Reply-to header in
a most inconvenient and unexpected manner.
Here is a copy for the list. I hope I will not mistype the address.
I really wish that I
Hi, William,
I have to admit that I really haven't looked carefully at your
transformation techniques and their intended purpose. But it strikes me that
you might be re-inventing the wheel. A number of schemes exist for squeezing
wide bit patterns into narrow bit streams. UTF-8 has been adopted
From: Marco Cimarosti [mailto:[EMAIL PROTECTED]]
This also casts some light on the fact that some fonts (notably JIS fonts)
have a big black box glyph at position 0x7F: it is probably for overwriting
a character already printed on paper, so that it cannot be read anymore.
Erratum: the combining perispomeni is at U+0342; I first dug out
the non-combining one.
--J"org Knappen
- Original Message -
From: "Thomas Chan" [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Sent: Thursday, February 22, 2001 3:58 PM
Subject: fictional scripts revisited
Hi all,
Between January 30-31, there was a thread here entitled "ConScript
registry?", in which I
Mike Ayers wrote:
This also casts some light on the fact that some fonts (notably JIS fonts)
have a big black box glyph at position 0x7F: [...]
Probably not. A big black box (big hollow boxes are also used for this)
is a polite way to represent a character which has no glyph.
When thinking about using surrogate pairs of 16 bit unicode characters to
express a 21 bit unicode character I like to think in terms of an analogy of
a Medieval Great Field divided into strips for cultivation. A road runs
along one edge of the field, perpendicular to the strips, so that someone
Having been advised recently about accessing 21 bit unicode characters using
an example from musical notation, following up on that advice I have found
the document that details characters in the range U+1d100 to U+1d1ff,
entitled Musical Symbols.
I began wondering about how one would use
Am I right in thinking that in the days when hand set metal type on printing
presses was the only method of printing that there were fonts of musical
type? I have never seen any font of such type myself, though I have seen
fonts for such non-text matters as chess sets and crossword
William Overington imagined:
When thinking about using surrogate pairs of 16 bit unicode
characters to express a 21 bit unicode character I like to
think in terms of an analogy of a Medieval Great Field
divided into strips for cultivation.
That's what freedom of thought is for: allowing
On Thu, 22 Feb 2001, Marco Cimarosti wrote:
Frank da Cruz wrote:
DEL does indeed have a use in plain text files that are encoded with
Shift-In / Shift-Out to switch between left and right halves of (say)
ISO 8859-1 without having to actually put 8-bit characters in the
file.
This
On Thu, 22 Feb 2001, Marco Cimarosti wrote:
Could I find ISO-2022 on-line (or an unofficial explanation of it)?
Yes. ISO-2022 = ECMA-35
search in www.ecma.ch for ecma-35.pdf
BTW, here what it says about delete:
6.2.1 Character DELETE
Name: DELETE Acronym: DEL Coded
At 03:52 AM 2/22/2001 -0800, William Overington wrote:
Am I right in thinking that in the days when hand set metal type on printing
presses was the only method of printing that there were fonts of musical
type? I have never seen any font of such type myself, though I have seen
fonts for such
- Original Message -
From: "Patrick T. Rourke" [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Sent: Thursday, February 22, 2001 7:10 AM
Subject: Re: Inverted breve in Greek?
In fact, this is the (barely) preferred form of the circumflex, or
perispomeni, for ancient Greek. Of
Sun's Java compiler accepts Unicode, but it expects Latin1 characters only
in source files, with other Unicode characters encoded using \udddd
notation. They provide a tool, curiously named "native2ascii" which will
translate source from native encoding (non-Latin 1 and non-Unicode) to this
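The escaping described above can be sketched in a few lines of Python. This is only an illustration of the \udddd notation, not a reimplementation of native2ascii: the function name `to_java_escapes` is hypothetical, and the choice to escape everything outside ASCII is an assumption made for simplicity.

```python
def to_java_escapes(text: str) -> str:
    """Replace each non-ASCII character with Java-style \\udddd escapes.

    Hypothetical helper for illustration; it escapes everything
    outside ASCII, which is a simplifying assumption.
    """
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)
            continue
        # The escapes name 16-bit code units, so a character beyond
        # U+FFFF turns into two escapes (a surrogate pair).
        units = ch.encode("utf-16-be")
        for i in range(0, len(units), 2):
            out.append("\\u%04x" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)


if __name__ == "__main__":
    print(to_java_escapes("String s = \"\u03c0\";"))
```

Because the notation has exactly four hex digits, a supplementary character cannot be written as a single escape; it has to be spelled as its two surrogates.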
In a message dated 2001-02-22 04:28:10 Pacific Standard Time,
[EMAIL PROTECTED] writes:
Suppose that one has a document, say a chapter from a novel, that consists
of a sequence of unicode characters that are each more than 16 bits in
significance and one wishes to represent them using a
In a message dated 2001-02-22 04:30:20 Pacific Standard Time,
[EMAIL PROTECTED] writes:
So, I am left wondering as to how unicode will be used to set music.
Unicode only provides the symbols -- the building blocks -- needed to set
music. The process of taking these building blocks and
At 07:58 -0800 2001-02-22, [EMAIL PROTECTED] wrote:
Unicode only provides the symbols -- the building blocks -- needed to set
music. The process of taking these building blocks and creating a full
Wagner score (or folk tune) is a matter of three-dimensional layout, which is
outside the scope of
From: [EMAIL PROTECTED]
Yes. As Marco Cimarosti has indicated, each supplementary character is
represented in UTF-16 by a surrogate *pair*. Both surrogates need to be
specified each time. Consequently, a stream of Deseret text (for example)
will contain a lot of U+D801's.
It is also
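To see why Deseret text is full of U+D801, here is a small sketch of the standard UTF-16 surrogate arithmetic in Python. The function name `to_surrogates` is made up for this example; the Deseret range used in the check (U+10400 to U+1044F) is the block as encoded in Unicode 3.1.

```python
def to_surrogates(cp: int) -> tuple:
    """Split a supplementary code point into (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point: %#x" % cp)
    offset = cp - 0x10000              # 20 significant bits remain
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low


if __name__ == "__main__":
    # Every Deseret letter shares the same high surrogate.
    highs = {to_surrogates(cp)[0] for cp in range(0x10400, 0x10450)}
    print(["%04X" % h for h in highs])
    print(["%04X" % u for u in to_surrogates(0x1D11E)])
```

Each high surrogate covers 1024 consecutive code points, so a script that fits in one such run repeats the same high surrogate for every character.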
About thirty years ago, I was involved in the production of a song book. At
that time, the notes were engraved directly onto copper plates by artisans
who specialized in music engraving. Repro proofs were made from the plates,
and then the words were pasted onto the proofs.
Don
//
-Original
On Thu, 22 Feb 2001, Joel Rees wrote:
What I want is to be able to send a piece of text with one or more
characters that I know the recipient will not have in his collection of
fonts, and have some hope that he or she will be able to see the glyph in a
meaningful manner.
The IDC's can help
In a (private) message dated 2001-02-22 08:47:26 Pacific Standard Time,
[EMAIL PROTECTED] writes:
Wagner score (or folk tune) is a matter of three-dimensional
layout, which is outside the scope of Unicode.
You probably meant *bi*-dimensional layout, right?
Of course I did. Duh.
What exactly _would_ be wrong with calling UNICODE a
thirty-two bit encoding
In part, it's the ambiguity or lack of clarity involved when we say "an
encoding". What's an encoding? I think most people (I certainly used to)
think of a character encoding as a collection of characters each
I'm offering just a slight correction in the details of ResourceBundle:
The Java 2 platform provides two concrete implementations of
ResourceBundle...a ListResourceBundle and a PropertyResourceBundle. The text
you can provide in them has different constraints.
A ListResourceBundle becomes a
On 02/21/2001 11:15:45 PM Tom Lord wrote:
Absurdly Brief Introduction to Unicode
[snip]
Some Special Code Points
[snip]
Two code points represent non-characters. These are U+FFFE and
U+FFFF. Programs are free to give these values special meaning
internally.
There are,
On 02/21/2001 10:55:09 PM "Joel Rees" wrote:
Now I happen to be of the opinion that the attempt to proclaim the set
closed at 17 planes is a little premature.
For better or worse, not only is it not premature, it's a done deal!
- Peter
On Wed, Feb 21, 2001 at 10:58:06PM -0800, Thomas Chan wrote:
First, there are the 4000 new[4] "CJK Ideographs" that he created solely
for a work called _Tianshu_ (A Book from the Sky)[5] (1987-1991), which Xu
spent three years carving movable wooden type for. There is no doubt that
these are
On 02/22/2001 12:11:49 PM "P. T. Rourke" wrote:
What about saying that it's "an encoding standard which can currently be
represented by 8 bit, 16 bit, and 32 bit encodings?" As I revise (and
revise, and revise) my page, that's the answer I'm leaning toward.
Yes, that should work. I'd make
For those interested in collation, we have a new version of the ICU
collation design document on
http://oss.software.ibm.com/icu/develop/collation/. Feedback is welcome.
Mark
___
Mark Davis, IBM GCoC, Cupertino
(408) 777-5850 [fax: 5891], [EMAIL PROTECTED], [EMAIL PROTECTED]
Joel,
Your comment about Microsoft having pie in its face is a bit puzzling. They
based NT on Unicode 1.0 and Windows 2000 which was sent to manufacturing 15
months ago has surrogate support. For all its faults MS has been a big
promoter of Unicode.
What burns me up is Sun implementing a
[EMAIL PROTECTED] wrote:
"Unicode is a character set encoding standard which currently provides for
its entire character repertoire to be represented using 8-bit, 16-bit or
32-bit encodings."
Please say "encoding forms".
There are three distinct terms, that sound similar, and
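As a rough illustration of the three encoding forms, the following Python sketch counts how many code units one character needs in each. It relies only on Python's built-in codecs; the helper name `code_units` is an assumption of this example.

```python
def code_units(ch: str) -> dict:
    """Count the code units one character needs in each encoding form."""
    return {
        "UTF-8": len(ch.encode("utf-8")),            # 8-bit units
        "UTF-16": len(ch.encode("utf-16-be")) // 2,  # 16-bit units
        "UTF-32": len(ch.encode("utf-32-be")) // 4,  # 32-bit units
    }


if __name__ == "__main__":
    for ch in ("A", "\u0342", "\U0001D11E"):
        print("U+%04X" % ord(ch), code_units(ch))
```

The same character repertoire is reachable through all three forms; only the size and number of code units per character differ.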
Tom Lord wrote:
Two code points represent non-characters. These are U+FFFE and
U+FFFF. Programs are free to give these values special meaning
internally.
Unicode (2.0 and up?) has 34 non-characters at U+xxFFFE and U+xxFFFF where xx is 00,
01, .., 0F, 10.
Unicode 3.1 is adding another 32
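The counts above can be checked with a short Python sketch. The function name `is_noncharacter` is hypothetical, and the range U+FDD0 to U+FDEF for the 32 additional non-characters comes from the Unicode 3.1 specification rather than from this message.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the Unicode non-characters."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("outside the Unicode code space: %#x" % cp)
    # The 32 non-characters added in Unicode 3.1 (assumed range).
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of each of the 17 planes:
    # low 16 bits equal to FFFE or FFFF.
    return (cp & 0xFFFE) == 0xFFFE


if __name__ == "__main__":
    per_plane = sum(1 for cp in range(0x110000)
                    if is_noncharacter(cp) and not 0xFDD0 <= cp <= 0xFDEF)
    print(per_plane, "plane-final non-characters")
```

With 17 planes and two code points per plane, the plane-final set has 34 members; adding the 32 contiguous ones gives 66 in total.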
On Thu, 22 Feb 2001, David Starner wrote:
On Wed, Feb 21, 2001 at 10:58:06PM -0800, Thomas Chan wrote:
First, there are the 4000 new[4] "CJK Ideographs" that he created solely
for a work called _Tianshu_ (A Book from the Sky)[5] (1987-1991), which Xu
spent three years carving movable
Peter, good points.
What's clear from this discussion, is when somebody asks about
the encoding of Unicode, the right response is "Why do you want to
know?" not this elaboration of terminology etc.
If they want to know maximum character count, tell them 1M+.
If they want to know whether it's
On 02/22/2001 04:15:02 PM "Tex Texin" wrote:
What's clear from this discussion, is when somebody asks about
the encoding of Unicode, the right response is "Why do you want to
know?"
Yes, that's probably the best response.
- Peter
Thomas Chan noted:
At the inception of various other fictional scripts, no one could foresee
the growth of scholarly and/or amateur interest in them;
True. That's why we wait until there is, before we consider encoding
a script.
Yes, I agree. It is harder to find historical
Hi, Carl,
Joel,
Your comment about Microsoft having pie in its face is a bit puzzling. They
based NT on Unicode 1.0 and Windows 2000 which was sent to manufacturing 15
months ago has surrogate support. For all its faults MS has been a big
promoter of Unicode.
Sometimes I run off at the
Joel Rees responded:
Idiosyncratic and personal characters are not encoded in Unicode.
I find this a fault in UNICODE. When we go through the set algebrae in the
introductory algebra courses for computer science, it is usually pointed out
that a set of characters can only be
Kenneth Whistler explained:
Joel Rees responded:
Idiosyncratic and personal characters are not encoded in Unicode.
I find this a fault in UNICODE. When we go through the set algebrae in the
introductory algebra courses for computer science, it is usually pointed out
that a set of
Joel Rees responded:
If UNICODE can never attempt to address the issue of non-closure,
You've got it completely backwards. The Unicode Standard is the one
with the open repertoire, which is why it keeps expanding year to year.
I know you can't foresee breaking past 17 planes, but
Ken,
Thanks for the consideration. I threw my ego away years ago.
Joel,
Note that I am just sending a response to you, not to the list.
I wouldn't mind this being on the list. I was making bad assumptions about
Sun's and others' reasons for wanting to do perverse things with
At 04:44 AM 2/22/01, Lukas Pietsch wrote:
As far as I know, music printing with mobile letters of this kind was
indeed done, mostly back in the 16th/17th century. There were "letters"
which each represented one fragment of a stave with one or several
noteheads on them. It tended to look pretty
On 2001.02.23 15:06, Curtis Clark wrote:
At 07:04 PM 2/22/01, Joel Rees wrote:
I'm telling you that 17 planes is not enough, and it _will_ become a painful
constraint in your lifetime.
So Plane 9, say, can be nothing but surrogates-of-surrogates, to some 64-
or 128-bit code space.
You do
On Thu, 22 Feb 2001 11:51:31 -0800 (GMT-0800), Markus Scherer
[EMAIL PROTECTED] wrote:
Tom Lord wrote:
Two code points represent non-characters. These are U+FFFE and
U+FFFF. Programs are free to give these values special meaning
internally.
Unicode (2.0 and up?) has 34 non-characters at