Re: Inverted breve in Greek?

2001-02-22 Thread Lukas Pietsch
Seán, these are "perispomeni"s. Not uncommon to see them printed like that. Encode as u+0342. Best wishes, Lukas

Re: [OT] What is DEL for?

2001-02-22 Thread Otto Stolz
Dear Unicoders, again, I have inadvertently sent a contribution to a member rather than to the whole list, because the Unicode list sets the Reply-to header in a most inconvenient and unexpected manner. Here is a copy for the list. I hope I will not mistype the address. I really wish that I

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space in Unicode)

2001-02-22 Thread Joel Rees
Hi, William, I have to admit that I really haven't looked carefully at your transformation techniques and their intended purpose. But it strikes me that you might be re-inventing the wheel. A number of schemes exist for squeezing wide bit patterns into narrow bit streams. UTF-8 has been adopted

RE: [OT] What is DEL for?

2001-02-22 Thread Ayers, Mike
From: Marco Cimarosti [mailto:[EMAIL PROTECTED]] This also casts some light on the fact that some fonts (notably JIS fonts) have a big black box glyph at position 0x7F: it is probably for overwriting a character already printed on paper, so that it cannot be read anymore.

Re: Inverted breve in Greek?

2001-02-22 Thread Jörg Knappen
Erratum: the combining perispomeni is at U+0342; I first dug out the non-combining one. --J"org Knappen
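The distinction in this erratum can be checked with Python's standard unicodedata module. This is only a minimal sketch: the spacing code point U+1FC0 is not named in the message and is an assumption about which character was first found, and the helper name describe is hypothetical.

```python
import unicodedata

COMBINING_PERISPOMENI = "\u0342"  # the code point given in the thread
SPACING_PERISPOMENI = "\u1fc0"    # assumption: the non-combining counterpart

def describe(ch):
    """Return (character name, canonical combining class) for one character."""
    return unicodedata.name(ch), unicodedata.combining(ch)

for ch in (COMBINING_PERISPOMENI, SPACING_PERISPOMENI):
    name, ccc = describe(ch)
    print("U+%04X %s (combining class %d)" % (ord(ch), name, ccc))

# A precomposed alpha with perispomeni decomposes to alpha + U+0342.
decomposed = unicodedata.normalize("NFD", "\u1fb6")
print(["U+%04X" % ord(c) for c in decomposed])
```

A non-zero combining class marks the character that attaches to the preceding base letter, which is the one to use when encoding accented Greek text as base plus mark.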

Re: fictional scripts revisited

2001-02-22 Thread Joel Rees
- Original Message - From: "Thomas Chan" [EMAIL PROTECTED] To: "Unicode List" [EMAIL PROTECTED] Sent: Thursday, February 22, 2001 3:58 PM Subject: fictional scripts revisited Hi all, Between January 30-31, there was a thread here entitled "ConScript registry?", in which I

RE: [OT] What is DEL for?

2001-02-22 Thread Marco Cimarosti
Mike Ayers wrote: This also casts some light on the fact that some fonts (notably JIS fonts) have a big black box glyph at position 0x7F: [...] Probably not. A big black box (big hollow boxes are also used for this) is a polite way to represent a character which has no glyph.

Do 16 bit surrogate high bits indicating characters have a persisting meaning please?

2001-02-22 Thread William Overington
When thinking about using surrogate pairs of 16 bit unicode characters to express a 21 bit unicode character I like to think in terms of an analogy of a Medieval Great Field divided into strips for cultivation. A road runs along one edge of the field, perpendicular to the strips, so that someone

Re: What about musical notation?

2001-02-22 Thread William Overington
Having been advised recently about accessing 21 bit unicode characters using an example from musical notation, following up on that advice I have found the document that details characters in the range U+1D100 to U+1D1FF, entitled Musical Symbols. I began wondering about how one would use

Re: What about musical notation?

2001-02-22 Thread Lukas Pietsch
Am I right in thinking that in the days when hand set metal type on printing presses was the only method of printing that there were fonts of musical type? I have never seen any font of such type myself, though I have seen fonts for such non-text matters as chess sets and crossword

RE: Do 16 bit surrogate high bits indicating characters have a pe

2001-02-22 Thread Marco Cimarosti
William Overington imagined: When thinking about using surrogate pairs of 16 bit unicode characters to express a 21 bit unicode character I like to think in terms of an analogy of a Medieval Great Field divided into strips for cultivation. That's what freedom of thought is for: allowing

RE: [OT] What is DEL for?

2001-02-22 Thread Jungshik Shin
On Thu, 22 Feb 2001, Marco Cimarosti wrote: Frank da Cruz wrote: DEL does indeed have a use in plain text files that are encoded with Shift-In / Shift-Out to switch between left and right halves of (say) ISO 8859-1 without having to actually put 8-bit characters in the file. This

RE: [OT] What is DEL for?

2001-02-22 Thread Pierpaolo BERNARDI
On Thu, 22 Feb 2001, Marco Cimarosti wrote: Could I find ISO-2022 on-line (or an unofficial explanation of it)? Yes. ISO-2022 = ECMA-35; search www.ecma.ch for ecma-35.pdf. BTW, here is what it says about delete: 6.2.1 Character DELETE Name: DELETE Acronym: DEL Coded

Re: What about musical notation?

2001-02-22 Thread John Hudson
At 03:52 AM 2/22/2001 -0800, William Overington wrote: Am I right in thinking that in the days when hand set metal type on printing presses was the only method of printing that there were fonts of musical type? I have never seen any font of such type myself, though I have seen fonts for such

Re: Inverted breve in Greek?

2001-02-22 Thread David J. Perry
- Original Message - From: "Patrick T. Rourke" [EMAIL PROTECTED] To: "Unicode List" [EMAIL PROTECTED] Sent: Thursday, February 22, 2001 7:10 AM Subject: Re: Inverted breve in Greek? In fact, this is the (barely) preferred form of the circumflex, or perispomeni, for ancient Greek. Of

Re: Plain text in Java ResourceBundle

2001-02-22 Thread David Gallardo
Sun's Java compiler accepts Unicode, but it expects Latin1 characters only in source files, with other Unicode characters encoded using \udddd notation. They provide a tool, curiously named "native2ascii", which will translate source from native encoding (non-Latin 1 and non-Unicode) to this
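To illustrate the \udddd notation, here is a rough Python sketch of the kind of conversion such a tool performs. It is not native2ascii itself: the function name to_java_escapes is hypothetical, and the sketch assumes that every character outside ASCII gets escaped, with supplementary characters written as two escapes because Java escapes denote UTF-16 code units.

```python
def to_java_escapes(text):
    """Rewrite non-ASCII characters as Java-style \\udddd escapes."""
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)  # ASCII passes through unchanged
            continue
        # Encode to UTF-16 so characters above U+FFFF become two code units.
        units = ch.encode("utf-16-be")
        for i in range(0, len(units), 2):
            unit = (units[i] << 8) | units[i + 1]
            out.append("\\u%04x" % unit)
    return "".join(out)

print(to_java_escapes("caf\u00e9"))
print(to_java_escapes("\U00010400"))
```

The first call yields caf\u00e9 and the second yields \ud801\udc00, so the result is safe to paste into a source or properties file that must stay in a 7-bit encoding.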

Re: Do 16 bit surrogate high bits indicating characters have a persisting mea...

2001-02-22 Thread DougEwell2
In a message dated 2001-02-22 04:28:10 Pacific Standard Time, [EMAIL PROTECTED] writes: Suppose that one has a document, say a chapter from a novel, that consists of a sequence of unicode characters that are each more than 16 bits in significance and one wishes to represent them using a

Re: What about musical notation?

2001-02-22 Thread DougEwell2
In a message dated 2001-02-22 04:30:20 Pacific Standard Time, [EMAIL PROTECTED] writes: So, I am left wondering as to how unicode will be used to set music. Unicode only provides the symbols -- the building blocks -- needed to set music. The process of taking these building blocks and

Re: What about musical notation?

2001-02-22 Thread Michael Everson
At 07:58 -0800 2001-02-22, [EMAIL PROTECTED] wrote: Unicode only provides the symbols -- the building blocks -- needed to set music. The process of taking these building blocks and creating a full Wagner score (or folk tune) is a matter of three-dimensional layout, which is outside the scope of

Re: Do 16 bit surrogate high bits indicating characters have a persisting mea...

2001-02-22 Thread Michael \(michka\) Kaplan
From: [EMAIL PROTECTED] Yes. As Marco Cimarosti has indicated, each supplementary character is represented in UTF-16 by a surrogate *pair*. Both surrogates need to be specified each time. Consequently, a stream of Deseret text (for example) will contain a lot of U+D801's. It is also
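The remark about Deseret text can be checked with a small Python sketch of the UTF-16 surrogate arithmetic. The function name to_surrogate_pair is hypothetical, and the Deseret block range U+10400 to U+1044F is taken from the Unicode 3.1 code charts, not from the message.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point: U+%04X" % code_point)
    offset = code_point - 0x10000        # 20 significant bits remain
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low

# Every character in U+10400..U+1044F shares the same high surrogate.
high_surrogates = {to_surrogate_pair(cp)[0] for cp in range(0x10400, 0x10450)}
print(["U+%04X" % h for h in sorted(high_surrogates)])
```

Because the high surrogate carries only the top 10 bits, any run of text from one 1024-character stretch repeats the same high surrogate before every low surrogate, which is why U+D801 appears so often.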

RE: What about musical notation?

2001-02-22 Thread Figge, Donald
About thirty years ago, I was involved in the production of a song book. At that time, the notes were engraved directly onto copper plates by artisans who specialized in music engraving. Repro proofs were made from the plates, and then the words were pasted onto the proofs. Don // -Original

Re: More rambling about Han

2001-02-22 Thread Thomas Chan
On Thu, 22 Feb 2001, Joel Rees wrote: What I want is to be able to send a piece of text with one or more characters that I know the recipient will not have in his collection of fonts, and have some hope that he or she will be able to see the glyph in a meaningful manner. The IDC's can help

Re: What about musical notation?

2001-02-22 Thread DougEwell2
In a (private) message dated 2001-02-22 08:47:26 Pacific Standard Time, [EMAIL PROTECTED] writes: Wagner score (or folk tune) is a matter of three-dimensional layout, which is outside the scope of Unicode. You probably meant *bi*-dimensional layout, right? Of course I did. Duh.

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-22 Thread Peter_Constable
What exactly _would_ be wrong with calling UNICODE a thirty-two bit encoding In part, it's the ambiguity or lack of clarity involved when we say "an encoding". What's an encoding? I think most people (I certainly used to) think of a character encoding as a collection of characters each

Re: Plain text in Java ResourceBundle

2001-02-22 Thread John O'Conner
I'm offering just a slight correction in the details of ResourceBundle: The Java 2 platform provides two concrete implementations of ResourceBundle...a ListResourceBundle and a PropertyResourceBundle. The text you can provide in them has different constraints. A ListResourceBundle becomes a

Re: An Aburdly Brief Introduction to Unicode (was Re: Perception ...)

2001-02-22 Thread Peter_Constable
On 02/21/2001 11:15:45 PM Tom Lord wrote: Absurdly Brief Introduction to Unicode [snip] Some Special Code Points [snip] Two code points represent non-characters. These are U+FFFE and U+FFFF. Programs are free to give these values special meaning internally. There are,

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-22 Thread Peter_Constable
On 02/21/2001 10:55:09 PM "Joel Rees" wrote: Now I happen to be of the opinion that the attempt to proclaim the set closed at 17 planes is a little premature. For better or worse, not only is it not premature, it's a done deal! - Peter

Re: fictional scripts revisited

2001-02-22 Thread David Starner
On Wed, Feb 21, 2001 at 10:58:06PM -0800, Thomas Chan wrote: First, there are the 4000 new[4] "CJK Ideographs" that he created solely for a work called _Tianshu_ (A Book from the Sky)[5] (1987-1991), which Xu spent three years carving movable wooden type for. There is no doubt that these are

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-22 Thread Peter_Constable
On 02/22/2001 12:11:49 PM "P. T. Rourke" wrote: What about saying that it's "an encoding standard which can currently be represented by 8 bit, 16 bit, and 32 bit encodings?" As I revise (and revise, and revise) my page, that's the answer I'm leaning toward. Yes, that should work. I'd make

Collation

2001-02-22 Thread Mark Davis
For those interested in collation, we have a new version of the ICU collation design document on http://oss.software.ibm.com/icu/develop/collation/. Feedback is welcome. Mark ___ Mark Davis, IBM GCoC, Cupertino (408) 777-5850 [fax: 5891], [EMAIL PROTECTED], [EMAIL PROTECTED]

RE: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-22 Thread Carl W. Brown
Joel, Your comment about Microsoft having pie in its face is a bit puzzling. They based NT on Unicode 1.0, and Windows 2000, which was sent to manufacturing 15 months ago, has surrogate support. For all its faults MS has been a big promoter of Unicode. What burns me up is Sun implementing a

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-22 Thread Tom Lord
[EMAIL PROTECTED] wrote: "Unicode is a character set encoding standard which currently provides for its entire character repertoire to be represented using 8-bit, 16-bit or 32-bit encodings." Please say "encoding forms". There are three distinct terms, that sound similar, and
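As a small illustration of the three encoding forms, the Python sketch below counts how many code units each form needs for one character. The helper name code_units and the sample characters are choices made for this example only.

```python
def code_units(ch):
    """Count the code units one character needs in each encoding form."""
    return {
        "UTF-8": len(ch.encode("utf-8")),            # 8-bit code units
        "UTF-16": len(ch.encode("utf-16-be")) // 2,  # 16-bit code units
        "UTF-32": len(ch.encode("utf-32-be")) // 4,  # 32-bit code units
    }

for ch in ("A", "\u0342", "\U0001D11E"):
    print("U+%04X" % ord(ch), code_units(ch))
```

The code unit width names the form, but only UTF-32 uses exactly one unit per character; the other two forms use a variable number of units.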

Re: An Aburdly Brief Introduction to Unicode (was Re: Perception ...)

2001-02-22 Thread Markus Scherer
Tom Lord wrote: Two code points represent non-characters. These are U+FFFE and U+FFFF. Programs are free to give these values special meaning internally. Unicode (2.0 and up?) has 34 non-characters at U+xxFFFE and U+xxFFFF where xx is 00, 01, .., 0F, 10. Unicode 3.1 is adding another 32
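The count can be reproduced with a short Python sketch. The function name is_noncharacter is hypothetical, and the range U+FDD0 to U+FDEF for the 32 additional noncharacters is not given in the snippet; it is filled in here from the Unicode 3.1 specification.

```python
def is_noncharacter(code_point):
    """Return True for the code points Unicode reserves as noncharacters."""
    if not 0 <= code_point <= 0x10FFFF:
        return False
    # The last two code points of each of the 17 planes: U+xxFFFE, U+xxFFFF.
    if (code_point & 0xFFFE) == 0xFFFE:
        return True
    # The contiguous range added in Unicode 3.1 (assumption noted above).
    return 0xFDD0 <= code_point <= 0xFDEF

plane_enders = sum(1 for cp in range(0x110000) if (cp & 0xFFFE) == 0xFFFE)
total = sum(1 for cp in range(0x110000) if is_noncharacter(cp))
print(plane_enders, total)
```

This gives 34 plane-ending noncharacters (two in each of 17 planes) and 66 in total once the extra 32 are included.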

Re: fictional scripts revisited

2001-02-22 Thread Thomas Chan
On Thu, 22 Feb 2001, David Starner wrote: On Wed, Feb 21, 2001 at 10:58:06PM -0800, Thomas Chan wrote: First, there are the 4000 new[4] "CJK Ideographs" that he created solely for a work called _Tianshu_ (A Book from the Sky)[5] (1987-1991), which Xu spent three years carving movable

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-22 Thread Tex Texin
Peter, good points. What's clear from this discussion, is when somebody asks about the encoding of Unicode, the right response is "Why do you want to know?" not this elaboration of terminology etc. If they want to know maximum character count, tell them 1M+. If they want to know whether it's

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-22 Thread Peter_Constable
On 02/22/2001 04:15:02 PM "Tex Texin" wrote: What's clear from this discussion, is when somebody asks about the encoding of Unicode, the right response is "Why do you want to know?" Yes, that's probably the best response. - Peter

Re: fictional scripts revisited

2001-02-22 Thread Kenneth Whistler
Thomas Chan noted: At the inception of various other fictional scripts, no one could foresee the growth of scholarly and/or amateur interest in them; True. That's why we wait until there is, before we consider encoding a script. Yes, I agree. It is harder to find historical

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-22 Thread Joel Rees
Hi, Carl, Joel, Your comment about Microsoft having pie in its face is a bit puzzling. They based NT on Unicode 1.0, and Windows 2000, which was sent to manufacturing 15 months ago, has surrogate support. For all its faults MS has been a big promoter of Unicode. Sometimes I run off at the

Re: fictional scripts revisited

2001-02-22 Thread Kenneth Whistler
Joel Rees responded: Idiosyncratic and personal characters are not encoded in Unicode. I find this a fault in UNICODE. When we go through the set algebrae in the introductory algebra courses for computer science, it is usually pointed out that a set of characters can only be

Re: fictional scripts revisited

2001-02-22 Thread Joel Rees
Kenneth Whistler explained: Joel Rees responded: Idiosyncratic and personal characters are not encoded in Unicode. I find this a fault in UNICODE. When we go through the set algebrae in the introductory algebra courses for computer science, it is usually pointed out that a set of

Re: fictional scripts revisited

2001-02-22 Thread Kenneth Whistler
Joel Rees responded: If UNICODE can never attempt to address the issue of non-closure, You've got it completely backwards. The Unicode Standard is the one with the open repertoire, which is why it keeps expanding year to year. I know you can't foresee breaking past 17 planes, but

Re: Perception that Unicode is 16-bit (was: Re: Surrogate space i

2001-02-22 Thread Joel Rees
Ken, Thanks for the consideration. I threw my ego away years ago. Joel, Note that I am just sending a response to you, not to the list. I wouldn't mind this being on the list. I was making bad assumptions about Sun's and others' reasons for wanting to do perverse things with

Re: What about musical notation?

2001-02-22 Thread Curtis Clark
At 04:44 AM 2/22/01, Lukas Pietsch wrote: As far as I know, music printing with mobile letters of this kind was indeed done, mostly back in the 16th/17th century. There were "letters" which each represented one fragment of a stave with one or several noteheads on them. It tended to look pretty

Re: fictional scripts revisited

2001-02-22 Thread Joel Rees
On 2001.02.23 15:06, Curtis Clark wrote: At 07:04 PM 2/22/01, Joel Rees wrote: I'm telling you that 17 planes is not enough, and it _will_ become a painful constraint in your lifetime. So Plane 9, say, can be nothing but surrogates-of-surrogates, to some 64- or 128-bit code space. You do

Re: An Aburdly Brief Introduction to Unicode (was Re: Perception ...)

2001-02-22 Thread Paul Keinanen
On Thu, 22 Feb 2001 11:51:31 -0800 (GMT-0800), Markus Scherer [EMAIL PROTECTED] wrote: Tom Lord wrote: Two code points represent non-characters. These are U+FFFE and U+FFFF. Programs are free to give these values special meaning internally. Unicode (2.0 and up?) has 34 non-characters at