Re: Timetables and conventions

2000-06-16 Thread Doug Ewell
Kenneth Whistler <[EMAIL PROTECTED]> wrote: > Actually, it is currently under discussion for the Unicode 3.0.1 > update version release, which is imminent. (No new characters, just > some minor fixes for some data files, etc.) > > UnicodeData.txt, which currently contains entries like: > > E000;;

Characters for Programming Languages

2000-06-16 Thread From Net Link
I am interested in using additional characters for programming languages. C++ and other languages cause all sorts of problems and mistakes by reusing (overloading) the small set of ASCII characters. The set of math characters contains most of what would be nice, but I do not see additional pa

RE: Linguistic precedence

2000-06-16 Thread brendan_murray
[EMAIL PROTECTED] wrote: >> not to be capitalized. Writing POBLACHT NA HIODÁILE would in fact be an error. > > Cool. Or horrible, if you have to write software to handle this :-) I was taken away to a little room in Dusseldorf airport when an alert official spotted what he assumed was pretty am

How to distinguish UTF-8 from Latin-* ?

2000-06-16 Thread Vinod Balakrishnan
Hi, How can we distinguish a UTF-8 character sequence from Latin-1/Latin-? characters? In most Internet applications, UTF-16 characters are prefixed by "0xu", whereas for UTF-8 characters there is no prefix to identify them. Do we HAVE/NEED a standard to represent UTF-8? For
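One common heuristic for the question above, sketched in Python: bytes that decode as strict UTF-8 and contain at least one non-ASCII byte are very likely UTF-8, because ordinary Latin-1 text rarely forms valid multi-byte sequences by accident. The function name `guess_encoding` is hypothetical, and the answer is a guess rather than a proof.

```python
def guess_encoding(data: bytes) -> str:
    """Heuristically classify bytes as 'ascii', 'utf-8', or 'latin-1'."""
    # Pure 7-bit data is identical in ASCII, UTF-8, and Latin-1.
    if all(b < 0x80 for b in data):
        return "ascii"
    try:
        # Strict decoding rejects stray lead bytes, truncated sequences,
        # and overlong forms, which Latin-1 text usually produces.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        # Fall back to Latin-1 (an assumption: any 8-bit charset fits here).
        return "latin-1"
    return "utf-8"


print(guess_encoding("café".encode("utf-8")))
print(guess_encoding("café".encode("latin-1")))
```

The heuristic can be fooled: the Latin-1 text "Ã©" is byte-for-byte the same as "é" in UTF-8, so it is reported as UTF-8.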

Re: Unicode and multilingual support in Macintosh Web browsers

2000-06-16 Thread Robert A. Rosenberg
At 10:07 AM 06/16/2000 -0800, Deborah Goldsmith wrote: >on 6/16/2000 10:22 AM, Robert A. Rosenberg <[EMAIL PROTECTED]> >wrote: > > > Adobe Indesign has Unicode Support. So does Outlook Express (just > > send/receive a message in UTF-7/UTF-8). > >Outlook Express only supports the subset of Unicode

RE: Linguistic precedence

2000-06-16 Thread Robert A. Rosenberg
At 02:37 AM 06/16/2000 -0800, Michael Everson wrote: >software that insists ... that all letters be capitalized is utterly evil. >:-) It sure makes it hard to tell the difference between polish and Polish (as well as how to pronounce the word "POLISH" since you first must figure o

Re: UTF-8 vs UTF-16 as processing code

2000-06-16 Thread J . Schneider
We currently have an application that is distributed world-wide. Our Australian office needed CJK capability so they chose to make the application use UTF-8. At the last possible point in the code before display, it was converted to UCS2. The application is client-server with the RDBMS engine

UTF-8 and UTF-16 issues

2000-06-16 Thread OLeary, Sean (NJ)
The following is from a document I had put together following the last San José Unicode conference. I would be interested in writing a more complete document with more issues added. Please send me any recommendations you might have. Sean =

RE: UTF-8 vs UTF-16 as processing code

2000-06-16 Thread Michael Kaplan (Trigeminal Inc.)
To Windows 2000 (and Windows NT circa SP4 as well), UTF-8 is another multibyte encoding, which you can get to via "code page 65001" and MultiByteToWideChar and get from via WideCharToMultiByte. So the only difference between it and any other code page, be it iso-8859-1 or windows-1252 is that happ

Re: The mother of all collation schemes

2000-06-16 Thread Keld Jørn Simonsen
AFAIK the Dutch use y, not ÿ, for the ij, e.g. in names. The ÿ is used in some French names, where ¨ is the diaeresis signifying that the y is pronounced as a separate vowel. iso-8859-15 has an uppercase Y: Keld On Fri, Jun 16, 2000 at 04:39:31AM -0800, [EMAIL PROTECTED] wrote: > [EMAIL PROTECT
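A small Python check of which charsets can carry the uppercase form, assuming the character meant here is Ÿ (U+0178, LATIN CAPITAL LETTER Y WITH DIAERESIS). The helper `encodable` is a hypothetical name used only for this sketch.

```python
def encodable(ch: str, codec: str) -> bool:
    """Return True if the character can be encoded in the given charset."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


lower = "\u00ff"          # ÿ, present in Latin-1 at 0xFF
upper = lower.upper()     # Ÿ, U+0178, outside the Latin-1 range

for codec in ("latin-1", "iso-8859-15", "cp1252"):
    print(codec, encodable(lower, codec), encodable(upper, codec))
```

Under that assumption, Latin-1 has only the lowercase letter, while iso-8859-15 and Windows-1252 have both.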

RE: UTF-8 vs UTF-16 as processing code

2000-06-16 Thread Jones, Bob
I have the same question. And, if you do go UTF-8 for processing, how does that work with Windows NT/2000? Is it even possible to have input come in as UTF-8? If you compile with Unicode turned on, it seems to automatically be UCS-2. Bob -Original Message- From: [EMAIL PROTECTED] [mai

FW: quick question about Wireless Application Protocol (WAP)

2000-06-16 Thread Magda Danish (Unicode)
-Original Message- From: Drzewicki, Robert [mailto:[EMAIL PROTECTED]] Sent: Friday, June 16, 2000 11:59 AM To: '[EMAIL PROTECTED]' Subject: quick question I have been trying to track down the following answer. Possibly you can help "Is the Wireless Application Protocol (WAP) unicode c

Re: The mother of all collation schemes

2000-06-16 Thread Robert A. Rosenberg
At 12:11 PM 06/15/2000 -0800, [EMAIL PROTECTED] wrote: >2) My alphabetical order: (digits are treated as letters): >[sp] [other punc.] 0 1 2 3 4 5 6 7 8 9 A Á Ä À B C Ç D E É Ë È F G H Í Ï Ì J K >L M N Ñ O Ó Ö Ò P Q R S T U Ú Ü Ù V W X Y ÿ(why couldn't I find this in >uppercase?) Ÿ=Alt+0159 (on

Re: Unicode and multilingual support in Macintosh Web browsers

2000-06-16 Thread John H. Jenkins
At 12:29 PM -0400 6/16/00, Robert A. Rosenberg wrote: >At 01:26 PM 06/15/2000 -0800, John Jenkins wrote: >>on 6/15/00 6:00 AM, Alan Wood at [EMAIL PROTECTED] wrote: >>Apple has provided support for direct Unicode rendering since Mac OS 8.5. >>This includes the ability to use large, data-fork Tr

Re: Timetables and conventions (was RE: Chapter on character sets)

2000-06-16 Thread Kenneth Whistler
Antoine asked: > > Kenneth Whistler wrote: > > > > The same conventions will be used for citation of characters in Planes > > above Plane 0 in Unicode Technical Reports and in the eventual republication > > of the standard itself. In textual citations, the normal usage will > > include the "U+"

Re: Unicode and multilingual support in Macintosh Web browsers

2000-06-16 Thread Deborah Goldsmith
on 6/16/2000 10:22 AM, Robert A. Rosenberg <[EMAIL PROTECTED]> wrote: > Adobe Indesign has Unicode Support. So does Outlook Express (just > send/receive a message in UTF-7/UTF-8). Outlook Express only supports the subset of Unicode which can be displayed using Mac OS legacy character sets. It do

Re: [RE: A better method than the German one Otto Stolz gave?]

2000-06-16 Thread John Cowan
[EMAIL PROTECTED] wrote: > Saint Michael provided: > http://www.egt.ie/standards/iso10646/wynnyogh/thorn.html Michael is a fine fellow, but no archangel he. -- Schlingt dreifach einen Kreis um dies! || John Cowan <[EMAIL PROTECTED]> Schliesst euer Aug vor heiliger Schau, || http://www.reuter

RE: Linguistic precedence

2000-06-16 Thread jarkko . hietaniemi
> It's worse than that, the month name must be inflected...but > luckily the inflection is really simple, just a prefix: "16. kesäkuuta s/prefix/suffix/; # Furiously sipping his coffee. > 2000", or in numbers, "16.6.2000". Note the ".", none of that st/nd/rd/th mess. > > And I do not know of
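The date format described in this message can be sketched in Python as follows. The month list and the function names are introduced here for illustration; the only rule taken from the thread is that the month name is lowercase, takes the suffix "ta", and the day number is followed by a full stop.

```python
import datetime

# Finnish month names, lowercase as the thread insists.
FINNISH_MONTHS = [
    "tammikuu", "helmikuu", "maaliskuu", "huhtikuu", "toukokuu", "kesäkuu",
    "heinäkuu", "elokuu", "syyskuu", "lokakuu", "marraskuu", "joulukuu",
]


def finnish_long_date(d: datetime.date) -> str:
    """Format like '16. kesäkuuta 2000': ordinal dot, inflected month."""
    return f"{d.day}. {FINNISH_MONTHS[d.month - 1]}ta {d.year}"


def finnish_numeric_date(d: datetime.date) -> str:
    """Format like '16.6.2000', with no leading zeros."""
    return f"{d.day}.{d.month}.{d.year}"


day = datetime.date(2000, 6, 16)
print(finnish_long_date(day))
print(finnish_numeric_date(day))
```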

Re: Timetables and conventions (was RE: Chapter on character sets)

2000-06-16 Thread Antoine Leca
Kenneth Whistler wrote: > > The same conventions will be used for citation of characters in Planes > above Plane 0 in Unicode Technical Reports and in the eventual republication > of the standard itself. In textual citations, the normal usage will > include the "U+" prefix: U+1D141, etc. Ah, tha
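To illustrate the "U+" citation convention for characters beyond Plane 0, here is a short Python sketch that formats a code point and derives its UTF-16 surrogate pair. The function names are hypothetical; the minimum of four hex digits follows the usual U+XXXX style.

```python
def u_plus(cp: int) -> str:
    """Format a code point as U+XXXX, using at least four hex digits."""
    return f"U+{cp:04X}"


def utf16_units(cp: int) -> tuple:
    """Return the UTF-16 code unit(s) for a code point."""
    if cp < 0x10000:
        return (cp,)
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return (high, low)


print(u_plus(0x1D141), [f"{u:04X}" for u in utf16_units(0x1D141)])
```

A Plane 0 code point such as U+E000 yields a single code unit, while U+1D141 needs two.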

UTF-8 vs UTF-16 as processing code

2000-06-16 Thread Erik van der Poel
Hi everybody, I'm wondering if there are any analyses comparing UTF-8 with UTF-16 for use as a processing code. UCS-2 has often been considered a good representation to use internally inside a program because of its "fixed width" properties (assuming that you can somehow deal with combining marks
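One concrete input to such a comparison is storage size, which can be measured with a few lines of Python. This is only a sketch of one factor (the sample strings and the name `encoded_sizes` are chosen here), not an analysis of processing cost.

```python
def encoded_sizes(text: str) -> tuple:
    """Return (UTF-8 byte count, UTF-16 byte count) for a string."""
    return (len(text.encode("utf-8")), len(text.encode("utf-16-le")))


samples = {
    "ASCII": "abc",
    "CJK": "\u65e5\u672c\u8a9e",        # three BMP ideographs
    "Plane 1": "\U0001D141",            # one supplementary character
}
for label, text in samples.items():
    print(label, encoded_sizes(text))
```

The last sample takes two 16-bit code units, which is where the "fixed width" assumption stops holding for UTF-16.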

Re: Unicode and multilingual support in Macintosh Web browsers

2000-06-16 Thread Robert A. Rosenberg
At 01:26 PM 06/15/2000 -0800, John Jenkins wrote: >on 6/15/00 6:00 AM, Alan Wood at [EMAIL PROTECTED] wrote: > > > I have tried without success to find information on how to view > multilingual > > Web pages with a Macintosh and which multilingual fonts are available, so I > > have documented the

RE: Linguistic precedence

2000-06-16 Thread Michael Kaplan (Trigeminal Inc.)
One of the things I like about Windows: it's so easy to look at different date formats. See http://www.trigeminal.com/samples/setlocalesample.asp It's a US NT4 server so I could do everything I wanted to like Japan, Korea, TamilNadu, etc. But I tried for a little variety, and stuck a few RTL langs

Timetables and conventions (was RE: Chapter on character sets)

2000-06-16 Thread Kenneth Whistler
Doug Ewell asked: > Two questions: > > 1. What is the projected timetable for the first version of Unicode that > contains character assignments beyond Plane 0? I'm just wondering, > not trying to seem impatient. (Really.) Unicode 3.1 is still tentatively scheduled to appear as a tec

RE: [RE: A better method than the German one Otto Stolz gave?]

2000-06-16 Thread Marco . Cimarosti
Rampshot wrote: > I forgot to say: I'd put Þ (thorn) at the end. > Anybody out there who knows which comes first, Ä or Þ? Saint Michael provided: http://www.egt.ie/standards/iso10646/wynnyogh/thorn.html _ Marco

Re: A better method than the German one Otto Stolz gave?

2000-06-16 Thread Otto Stolz
Am 2000-06-16 um 18:20 h hat [EMAIL PROTECTED] geschrieben: > Treat Ä (A-umlaut) as a separate letter, between A and B. That would put "Ärger" (="trouble") and its consorts into a place where no German ever would look for them. (Of course, you are free to sort your data in any way convenient to y
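The difference between the two placements can be seen with a small Python sort. The key below treats umlauts as their base letters and "ß" as "ss", which is an assumption about the dictionary-style ordering the message alludes to (the snippet is truncated); `german_key` is a hypothetical helper, not a full collation algorithm.

```python
# Map lowercase umlauts to their base letters (assumed dictionary-style rule).
_UMLAUTS = str.maketrans({"ä": "a", "ö": "o", "ü": "u"})


def german_key(word: str) -> str:
    """Sort key: case-insensitive, umlauts as base letters, ß as ss."""
    # casefold() lowercases and also turns "ß" into "ss".
    return word.casefold().translate(_UMLAUTS)


words = ["Zebra", "Ärger", "Apfel", "Arzt", "Bär"]
print(sorted(words))                  # raw code point order
print(sorted(words, key=german_key))  # umlaut-aware order
```

Raw code point order puts "Ärger" after "Zebra"; with the key it falls between "Apfel" and "Arzt".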

Re: Linguistic precedence [was: (TC304.2313)

2000-06-16 Thread Mark Leisher
Séamas> In case anyone is (further) confused by this thread, I can only Séamas> reaffirm that the normal name of our language in English, as every Séamas> Irish person will confirm, is "Irish". (Interestingly, in Séamas> Michael's previous response, which arrived before mine, he w

Re: Linguistic precedence [off topic]

2000-06-16 Thread Juliusz Chroboczek
This is really getting off topic, but it's fun, and we're all rather chatty today. So I'll keep going. I wasn't trying to prove that languages can be ordered in a PC way, which they clearly can't. What I was trying to point out is that it is possible to educate people not to mind this sort of d

FW: UNICODE versus Shift-JIS

2000-06-16 Thread Magda Danish (Unicode)
Got this request by phone and email at the unicode home office. Could anyone respond directly to the list and cc to [EMAIL PROTECTED] Thanks. Magda. -Original Message- From: Ken Buis [mailto:[EMAIL PROTECTED]] Sent: Friday, June 16, 2000 9:11 AM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTE

Re: [RE: A better method than the German one Otto Stolz gave?]

2000-06-16 Thread rampshot
[EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: > Treat Ä (A-umlaut) as a separate letter, between A and B. It may be better or not, but certainly not quite German! And, anyway, why between A and B? You could place it after Z (that's the actual place of Ä in Swedish). _ Marco I forgot to say:

A better method than the German one Otto Stolz gave?

2000-06-16 Thread rampshot
Treat Ä (A-umlaut) as a separate letter, between A and B.

Re: Linguistic precedence [was: (TC304.2313)

2000-06-16 Thread Séamas Ó Brógáin
In case anyone is (further) confused by this thread, I can only reaffirm that the normal name of our language in English, as every Irish person will confirm, is "Irish". (Interestingly, in Michael's previous response, which arrived before mine, he writes, "the rule in Irish is ..." which shows his

Re: French encoding [Was: Chapter on character sets]

2000-06-16 Thread Alain LaBonté 
À 12:37 2000-06-15 -0800, [EMAIL PROTECTED] a écrit: The requirement for Euro was definitely very important, but as I remember the discussions, it was only with very great difficulty that any examples of Finnish text was produced. Much of the impetus for this was also a desire by some people to re

Re: Linguistic precedence

2000-06-16 Thread Antoine Leca
[EMAIL PROTECTED] wrote: > > > So in other words, are we on "16 june 2000" in Finland? > > It's worse than that, the month name must be inflected...but luckily the > inflection is really simple, just a prefix: "16. kesäkuuta 2000", Oh shame on me: I forgot that day numbers are really ordinal num

Re: Technique to use UNICODE to get Oriya fonts.

2000-06-16 Thread Antoine Leca
Prof. K.C.Mahapatra wrote: > > We are working on a project (in C - language) to use the conventional key > boards to write in 'Oriya' fonts. Please give us the technique to use > UNICODE for this purpose. I do not see the connection between C and Oriya. Anyway. Since Unicode is closely relate

RE: Linguistic precedence

2000-06-16 Thread jarkko . hietaniemi
> > This reminds me of one pet peeve of mine: you can spot > i18n/l10n piece of > > software written > > by an English *) speaker pretty quick by checking whether > the names of > > weekdays/months/languages are Capitalized... saying > > "Maanantai/Tammikuu/Suomi" is very wrong, it should be > >

Re: Linguistic precedence

2000-06-16 Thread Antoine Leca
[EMAIL PROTECTED] wrote: > > This reminds me of one pet peeve of mine: you can spot i18n/l10n piece of > software written > by an English *) speaker pretty quick by checking whether the names of > weekdays/months/languages are Capitalized... saying > "Maanantai/Tammikuu/Suomi" is very wrong, it sh

Re: German Sharp-S, again

2000-06-16 Thread Otto Stolz
Am 2000-06-16 um 16:09 h hat Torsten Mohrin geschrieben: > The Duden also allows to uppercase "ß" as "SZ" in ambiguous cases > (e.g. "MASSE" (Masse) vs. "MASZE" (Maße)). This is an (almost) obsolete rule, dropped in the 1996 spelling reform. The only current uppercasing rule is that "ß" becomes "S
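The ambiguity in the quoted example is easy to reproduce in Python, whose `str.upper()` applies the Unicode full case mapping of "ß" to "SS". This is a sketch of the behaviour discussed in the thread, not of any particular spelling rule beyond that mapping.

```python
masse = "Masse"   # "mass"
masze = "Maße"    # "measures"

# Both words collapse to the same uppercase string.
print(masse.upper(), masze.upper())

# Uppercasing changes the length, and lowercasing cannot restore the ß.
print(len(masze), len(masze.upper()), masze.upper().lower())
```

Software that assumes uppercasing preserves string length, or that case conversion round-trips, will mishandle such words.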

RE: Linguistic precedence

2000-06-16 Thread jarkko . hietaniemi
> I suppose it says POBLACHT NA hIODÁILE, which would be correct, as h- is a > mutation (the nominative is IODÁIL) and the rule in Irish is that this and > other mutations (mB-, gC-, nD-, bhF-, nG-, bP-, tS-, dT-) are > not to be capitalized. Writing POBLACHT NA HIODÁILE would in fact be an error

OFFTOPIC: CHAT: "to speak X with someone"

2000-06-16 Thread John Cowan
I am collecting cross-cultural/cross-linguistic examples of the form "to speak [name of language] with someone", in either of two metaphorical senses: 1) "to speak obscurely, unintelligibly" (English: "to speak Greek") 2) "to speak clearly, to the point, without circumlocution" (Englis

RE: French encoding [Was: Chapter on character sets]

2000-06-16 Thread jarkko . hietaniemi
> Ar 12:37 -0800 2000-06-15, scríobh [EMAIL PROTECTED]: > > >The requirement for Euro was definitely very important, but as I remember > >the discussions, it was only with very great difficulty that any examples > >of Finnish text was produced. > > "s^ekki" ('cheque' or 'check') is an extremely

Re: German Sharp-S, again (was: The mother of all collation schemes)

2000-06-16 Thread Torsten Mohrin
The Duden also allows to uppercase "ß" as "SZ" in ambiguous cases (e.g. "MASSE" (Masse) vs. "MASZE" (Maße)). Moreover, in the German Federal Armed Forces it is common to always uppercase "ß" as "SZ". --Torsten

Re: Technique to use UNICODE to get Oriya fonts.

2000-06-16 Thread brendan_murray
"Prof. K.C.Mahapatra" <[EMAIL PROTECTED]> wrote > We are working on a project (in C - language) to use the conventional key > boards to write in 'Oriya' fonts. Please give us the technique to use > UNICODE for this purpose. My experience has been that the simplest solution is to use W

Re: Linguistic precedence [was: (TC304.2313) AND/OR:

2000-06-16 Thread Otto Stolz
Am 2000-06-16 um 14:50 h hat Michael Kaplan geschrieben: > Well, "Gre" does not appear between "Deu" and "Esp" on any European > language, but "Gre" does appear between "Ger" and "Spa" so I am assuming > English names were being used here? The order of the list was by language names, expressed in

Technique to use UNICODE to get Oriya fonts.

2000-06-16 Thread Prof. K.C.Mahapatra
om Dt.16.6.2000 Dear Sir, We are working on a project (in C - language) to use the conventional key boards to write in 'Oriya' fonts. Please give us the technique to use UNICODE for this purpose. Thanki

German Sharp-S, again (was: The mother of all collation schemes)

2000-06-16 Thread Otto Stolz
Am 2000-06-16 um 14:39 h hat [EMAIL PROTECTED] geschrieben: > the German ligature "ß" [...] can also be spelled "ss" > (this is just an alternate spelling in Germany, but it is mandatory in > Switzerland), and is uppercased as "SS". Almost correct ;-) "ss" is an alternate spelling for "ß". - In

Re: Linguistic precedence [was: (TC304.2313)

2000-06-16 Thread Marion Gunn
Arsa Séamas Ó Brógáin: > Marco Cimarosti wrote: > > ... the Irish Gaelic version of "REPUBLIC OF ITALY" has a > lowercase "h" although it is all capitals. > > The name of this language is "Irish"; there is no such thing as "Irish Gaelic". Of course there is. It is fine to use the name "Ir

RE: Linguistic precedence

2000-06-16 Thread Michael Everson
Ar 03:55 -0800 2000-06-16, scríobh Séamas Ó Brógáin: >Marco Cimarosti wrote: > > ... the Irish Gaelic version of "REPUBLIC OF ITALY" has a > lowercase "h" although it is all capitals. > >The name of this language is "Irish"; there is no such thing as "Irish >Gaelic". Ní hea, a Shéamais.

RE: Linguistic precedence [was: (TC304.2313) AND/OR:

2000-06-16 Thread Michael Kaplan (Trigeminal Inc.)
> Well, "Gre" does not appear between "Deu" and "Esp" on any European > language, but "Gre" does appear between "Ger" and "Spa" so I am assuming > English names were being used here? > Michael > -- > From: Robert A. Rosenberg[SMTP:[EMAIL PROTECTED]] > Sent: Thursday, June

RE: The mother of all collation schemes

2000-06-16 Thread Marco . Cimarosti
[EMAIL PROTECTED] wrote: > [...] ÿ(why couldn't I find this in uppercase?) [...] Because the corresponding uppercase is not a character, it is two: "IJ". In fact, "ÿ" is a ligature, optionally used in Dutch to represent the sequence "ij". E.g. "ijs" (= ice, ice-cream) is also spelled "ÿs", and bo

German casing rules (was: Linguistic precedence)

2000-06-16 Thread Otto Stolz
Am 2000-06-16 um 11:42 h hat Antoine Leca geschrieben: > should it be ,,deutsch'', or ,,Deutsch'', in such a context? The context, if I remember correctly, was a list of countries, so it should rather be "Deutschland" (with a capital "D", cf.

RE: Linguistic precedence [was: (TC304.2313)

2000-06-16 Thread Séamas Ó Brógáin
Marco Cimarosti wrote: ... the Irish Gaelic version of "REPUBLIC OF ITALY" has a lowercase "h" although it is all capitals. The name of this language is "Irish"; there is no such thing as "Irish Gaelic". I haven't seen the document you refer to, but I presume the term used is "POBLAC

Re: Chapter on character sets

2000-06-16 Thread Keld Jørn Simonsen
On Fri, Jun 16, 2000 at 12:11:13PM +0200, Antoine Leca wrote: > Keld Jørn Simonsen wrote: > > > > About the subset, this is not true. There are charsets in use today, > > like the national 646 variants, that differ (in the 12 unassigned > > positions). Not much used, but I get some emails in thes

RE: Collation curiosities (was: RE: Linguistic precedence [was: (

2000-06-16 Thread Marco . Cimarosti
Jarkko Hietaniemi wrote: > I think somebody just mentioned that many Italians like "i" > and "j" to be "equal". It was me. I mentioned this in a very sketchy and misinformed posting about the origin of "j" and "u". Thank you for this opportunity of going back on that topic to add a few correct

Re: Collation curiosities

2000-06-16 Thread Otto Stolz
Hello, somebody has asked for a survey of national sorting rules. (I have already deleted that note.) Now it has occurred to me that the IBM National Language Technical Centre has produced a "National Language Information and Design Guide" comprising such a list (in Volume 2, Chapter 3). I have

Re: Chapter on character sets

2000-06-16 Thread Antoine Leca
Keld Jørn Simonsen wrote: > > On Thu, Jun 15, 2000 at 09:49:14AM -0800, Mike Brown wrote: > > > > The character set defined by the ISO 646-US standard is now known as > > "US-ASCII" due to its IANA registration for use on the Internet. It defines > > hex position 23 to be # and 24 to be $. It is

RE: Linguistic precedence [was: (TC304.2313) AND/OR:

2000-06-16 Thread Marco . Cimarosti
Touché! I was misled by a fictional character by V. Montalbán: Pepe Carva*lh*o, a Catalan detective of Galician origins... Ciao. Marco > -Original Message- > From: Antoine Leca [mailto:[EMAIL PROTECTED]] > Sent: Friday, 16 June, 2000 12.44 > To: Unicode List > Cc: [EMAIL PROTECTED] > Su

Re: Linguistic precedence [was: (TC304.2313) AND/OR:

2000-06-16 Thread Antoine Leca
[EMAIL PROTECTED] wrote: > > > A Coruña [...] > > I thought it was "La Coruña" (in Castilian) or "A Corunha" (in Galician). In fact, I never went to Galicia, so I do not know. In the rest of Spain, practice is to spell it A Coruña, particularly on road atlases, even if the proper Castilian spe

RE: Linguistic precedence

2000-06-16 Thread Michael Everson
Ar 02:04 -0800 2000-06-16, scríobh [EMAIL PROTECTED]: >On such documents (driving licenses, passports, etc.), the matter is >normally settled solomonically by using all capitals. > >BTW, I see from my passport that this does not fix all problems anyway: the >Irish Gaelic version of "REPUBLIC OF I

Re: French encoding [Was: Chapter on character sets]

2000-06-16 Thread Michael Everson
Ar 12:37 -0800 2000-06-15, scríobh [EMAIL PROTECTED]: >The requirement for Euro was definitely very important, but as I remember >the discussions, it was only with very great difficulty that any examples >of Finnish text was produced. "s^ekki" ('cheque' or 'check') is an extremely common word an

RE: Linguistic precedence [was: (TC304.2313) AND/OR:

2000-06-16 Thread Marco . Cimarosti
Antoine Leca wrote: > It is "español": without upper-case initial, and with a eñe. [...] > Back question: should it be ,,deutsch'', or ,,Deutsch'', in > such a context? On such documents (driving licenses, passports, etc.), the matter is normally settled solomonically by using all capitals. BT

Re: Linguistic precedence [was: (TC304.2313) AND/OR:

2000-06-16 Thread Antoine Leca
Robert A. Rosenberg wrote: > > At 07:53 AM 06/15/2000 -0800, Michael Kaplan (Trigeminal Inc.) wrote: > >Eventually someone will have a language name that does not fit > >or a language like German will insist on sorting sooner, under Deutsch rather > >than under German, etc. (which I personally

Re: Unicode and multilingual support in Macintosh Web browsers

2000-06-16 Thread Andreas Prilop
Alan Wood wrote: > I have tried without success to find information on how to view multilingual > Web pages with a Macintosh and which multilingual fonts are available, > I will appreciate being advised of any errors or of further sources of > information. I have collected some links at