Re: ü
At 20:09 -0500 2002-01-21, Patrick Andries wrote: Kenneth Whistler wrote: Patrick Andries wrote: I must say that I have already seen horrors such as geüpdated (the u is presumably approximated), again English messing with languages' spelling and pronunciation... Languages don't mess with languages. People mess with languages. It isn't as if French hasn't been polluting English for a thousand years or anything, is it?! No, no, no. French has enriched English, not polluted it, by bestowing on it a wealth of new words. I wonder if we could start the millennium celebration of this wonderful hybridization before 2066? Yes. French has given us things like fin de siècle. And English has given you le weekend. And nowadays, the Europeans are getting their revenge by exporting all their accents back onto English letters. Well, the Americans are putting up a pretty good fight. Can't see the light behind façade, cañon and coöperate. Tsk tsk. Coöperate isn't very common, but naïve is. -- Michael Everson *** Everson Typography *** http://www.evertype.com
Re: Norwegian sorting
On Mon, Jan 21, 2002 at 06:14:55PM +0100, Stefan Persson wrote: - Original Message - From: Lars Marius Garshol [EMAIL PROTECTED] To: Unicoders [EMAIL PROTECTED] Sent: 21 January 2002 15:16 Subject: Re: Norwegian sorting I doubt that there is an official standard for this, but I would expect to find Ü sorted with Y, given that Norwegian Y is pronounced just like Swedish/German/Dutch Ü. Many reference works sort V and W together, for example, according to the same principle. Swedish: Ü only used in German loan words. German: Ü pronounced as a Swedish y. Dutch: Ü pronounced completely differently. In Swedish we sort the German ü as y, and the Dutch ü as u. I have no official record of Dutch ü being sorted as u in Swedish. Where do you get this rule from? Have you got examples of this? How do you accomplish it? Kind regards Keld Simonsen
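For readers wondering how a "file ü under y" rule could be accomplished in practice, here is a minimal sketch in Python. It is not taken from NS 4103 or any other standard: the alphabet string, the FILE_AS table and the sort_key function are all hypothetical names chosen for illustration, and the only rules encoded are the two mentioned in the thread (ü filed as y, w filed together with v).

```python
# Hypothetical sketch: Norwegian-style alphabet with two "file as" rules.
# Nothing here is taken from an official collation standard.
ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}
FILE_AS = {"ü": "y", "w": "v"}  # assumption: the two rules discussed in the thread


def sort_key(word):
    """Return a key that ranks letters by ALPHABET after applying FILE_AS."""
    lowered = word.lower()
    primary = tuple(
        # Letters outside ALPHABET sort after it, in code point order.
        RANK.get(FILE_AS.get(ch, ch), len(ALPHABET) + ord(ch))
        for ch in lowered
    )
    # The original spelling breaks ties between words that file identically.
    return (primary, lowered)


words = ["mynt", "müller", "mus", "mzz"]
by_code_point = sorted(words)            # ü (U+00FC) lands after z
by_rule = sorted(words, key=sort_key)    # ü is filed with y

print(by_code_point)
print(by_rule)
```

A per-language variant (German ü as y, Dutch ü as u) would need the language of each word as extra input, which is exactly the difficulty Keld's question points at.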
Re: Unicode 3.2 Beta Period Finishing
Regarding http://www.unicode.org/Public/BETA/Unicode3.2/Scripts-3.2.0d7.txt (Scripts-3.2.0d7.txt, 21-Jan-2002 13:57, 39k) It says: 03D0..03F5; GREEK # L [38] GREEK BETA SYMBOL..GREEK LUNATE EPSILON SYMBOL In the first place, 03E2 through 03EF are COPTIC letters, not Greek. In the second, some of those letters are technical symbols, not letters, so if you are including some of them why not include 03F6? -- Michael Everson *** Everson Typography *** http://www.evertype.com
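The range in question can be inspected with Python's standard unicodedata module. This is only a sketch: the describe helper is a hypothetical name, and the module reports the Unicode version bundled with the Python interpreter, not the 3.2 beta data file discussed here, so names and categories may differ from the draft.

```python
import unicodedata


def describe(first, last):
    """List (code, general category, name) for each code point in a range."""
    rows = []
    for cp in range(first, last + 1):
        ch = chr(cp)
        name = unicodedata.name(ch, "<unassigned>")
        rows.append((f"U+{cp:04X}", unicodedata.category(ch), name))
    return rows


# 03D0..03F5 is the range from the beta file; 03F6 is the code point
# the message asks about, so the listing goes one past the range.
rows = describe(0x03D0, 0x03F6)
coptic = [code for code, _, name in rows if name.startswith("COPTIC")]

for code, category, name in rows:
    print(code, category, name)
print("Coptic-named code points:", coptic[0], "to", coptic[-1])
```

The general category column (L* for letters, S* for symbols) gives one rough way to separate letters from symbols, although it does not settle which characters are used only as technical symbols.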
Devanagari on MacOS 9.2 and IE 5.1
I spoke too fast. Upon taking a closer look at the file, the font was not set properly. MacOS 9.2, Indian Language Kit, Mac IE 5.1 and Devanagari MT as font face seem to display UTF-8 encoded Hindi just fine. Etienne Date: Mon, 21 Jan 2002 10:24:16 -0800 [EMAIL PROTECTED] RE: Devanagari On this subject, Win2K and IE5+ seem to do a nice job displaying UTF-8 encoded Hindi. On the Mac, the Indian Language Kit provides OS support and fonts (with MacOS 9.2 and above), but I have not been able to display Hindi (UTF-8 encoded) with Mac's IE 5.1. Am I correct in assuming that the Mac version of IE does not support Hindi without a hack? Etienne Reply-To: [EMAIL PROTECTED] Christopher J Fynn [EMAIL PROTECTED] Cc: Aman Chawla [EMAIL PROTECTED] RE: Devanagari Date: Mon, 21 Jan 2002 23:59:38 +0600 Aman, Here in Bhutan the Internet connection is still much worse than in most places I've visited in India and Nepal (and the cost per minute is several times higher) - believe me, even then UTF-8 (or UTF-16) encoded pages do not display noticeably slower than ASCII, ISCII or 8-bit font encoded pages - and I don't need to download any special plug-ins or fonts. - Chris -- Christopher J Fynn Thimphu, Bhutan [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] On Behalf Of Aman Chawla Sent: 21 January 2002 10:57 To: James Kass; Unicode Subject: Re: Devanagari - Original Message - From: James Kass [EMAIL PROTECTED] To: Aman Chawla [EMAIL PROTECTED]; Unicode [EMAIL PROTECTED] Sent: Monday, January 21, 2002 12:46 AM Subject: Re: Devanagari 25% may not be 300%, but it isn't insignificant. As you note, if the mark-up were removed from both of those files, the percentage of increase would be slightly higher. But, as connection speeds continue to improve, these differences are becoming almost minuscule. 
With regard to South Asia, where the most widely used modems are approx. 14 kbps, maybe some 36 kbps and rarely 56 kbps, where broadband/DSL is mostly unheard of, efficiency in data transmission is of paramount importance... how can we convince the South Asian user to create websites in an encoding that would make his client's 14 kbps modem as effective (rather, ineffective) as a 4.6 kbps modem?
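The arithmetic behind the "14 kbps becomes 4.6 kbps" figure can be checked with a short Python sketch. The sample word and the one-byte-per-character baseline are assumptions made for illustration (Python ships no ISCII codec, so the 8-bit size is simply counted as one byte per code point); real pages will differ.

```python
# Sample Devanagari word (namaste), written with escapes so the source stays ASCII.
text = "\u0928\u092e\u0938\u094d\u0924\u0947"

code_points = len(text)
utf8_bytes = len(text.encode("utf-8"))       # 3 bytes per Devanagari code point
utf16_bytes = len(text.encode("utf-16-be"))  # 2 bytes per code point, no BOM

# Worst case: a page that is pure Devanagari with no mark-up at all.
modem_kbps = 14
effective_kbps = modem_kbps * code_points / utf8_bytes

# With ASCII mark-up around the text the overhead shrinks, because
# mark-up costs one byte per character in either encoding.
markup = "<p></p>"
eight_bit_page = len(markup) + code_points   # assumed one byte per character
utf8_page = len(markup) + utf8_bytes
page_ratio = utf8_page / eight_bit_page

print(code_points, utf8_bytes, utf16_bytes)
print(round(effective_kbps, 2), round(page_ratio, 2))
```

The threefold figure only holds for text with no mark-up, spaces, digits or punctuation; the more ASCII a page contains, the closer the ratio moves toward the 25% measured on the real files mentioned in the thread.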
RE: The benefit of a symbol for 2 pi
On Sat, 19 Jan 2002, Murray Sargent wrote: Capital pi is to product as capital sigma is to summation. But if I'm not mistaken, Unicode already has a separate character for n-ary products and summation (U+220F, U+2211), distinct from the capital Greek letters *and* the variant forms in the mathematical alphanumeric block. If capital pi is the way to go, why not use U+1D6F1 MATHEMATICAL ITALIC CAPITAL PI or U+1D72B MATHEMATICAL BOLD ITALIC CAPITAL PI, for instance? Sampo Syreeni, aka decoy - mailto:[EMAIL PROTECTED], tel:+358-50-5756111 student/math+cs/helsinki university, http://www.iki.fi/~decoy/front openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
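The distinction drawn above can be seen with Python's standard unicodedata module. This is a sketch for illustration only; the names reported come from the Unicode version bundled with the interpreter, and the loop variable names are arbitrary.

```python
import unicodedata

# Greek capitals, the n-ary operators, and two mathematical alphanumeric pis.
candidates = (0x03A0, 0x220F, 0x03A3, 0x2211, 0x1D6F1, 0x1D72B)

for cp in candidates:
    ch = chr(cp)
    # NFKC folds compatibility variants (such as styled math letters)
    # back to the plain letter; characters without a decomposition stay put.
    folded = unicodedata.normalize("NFKC", ch)
    print(
        f"U+{cp:04X}",
        unicodedata.category(ch),
        unicodedata.name(ch),
        "-> NFKC",
        f"U+{ord(folded):04X}",
    )
```

The mathematical italic and bold italic pis fold to U+03A0 under NFKC, while U+220F and U+2211 have no such mapping, which matches the point that the operators are separate characters rather than styled letters.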
Re: Norwegian sorting
On Mon, Jan 21, 2002 at 11:11:43AM -0500, Tex Texin wrote: Thanks Keld, that was one of the sources I checked first. I saw that it was based on a Norwegian standard, but it didn't say what the standard was used for. So I didn't know if this was a collation that dictionaries or phone books used, or who used it. NS 4103 is normal sorting and filing rules, in the style that was standardized 30 years ago. NS 4103 is available so you can check it yourself. Kind regards Keld
RE: Devanagari
David Starner wrote: On Mon, Jan 21, 2002 at 02:20:17PM +0100, Marco Cimarosti wrote: What this means in practice for website developers is: 1) SCSU text can only be edited with a text editor which properly decodes the *whole* file on load and re-encodes it on save. On the other hand, UTF-8 text can also be edited using an encoding-unaware editor, although non-ASCII text is unreadable. True for users of Latin-based writing systems. Probably of little comfort to users of Indic or Chinese-based writing systems. I was referring to the task of editing *source* files in HTML, XML, or other computer languages and formats. Most of the time, programmers and webmasters are interested in changing the ASCII part of the file (mark-up, instructions), which is the part most likely to contain bugs to be fixed, or to need changes unrelated to the linguistic content. Of course, the people in charge of writing the *content* need tools that can display the actual characters. And this is true for users of Latin-based writing systems as well: imagine writing in French or German with all occurrences of é, è, ä, ö, ü, etc. transformed into pairs of funny bytes. Better to stick with editors that are aware of your encoding. Of course. Provided that one exists on your platform, and that you are not bound to development tools which don't support it. 2) SCSU text cannot be built by assembling binary pieces coming from external sources. It's not really designed for that. If you're assembling things, just run the output through a UTF-8 to SCSU converter. Which translates to: SCSU is not appropriate for dynamic HTML pages, or for encoding text inside any other kind of application. More generally, SCSU is not appropriate as a text encoding, but just as a compression method for documents in their final form. Ciao. _ Marco
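The two UTF-8 properties relied on in points 1) and 2) can be shown with a small Python sketch. It demonstrates UTF-8 only; SCSU is not shown, and the sample strings and variable names are assumptions chosen for illustration.

```python
# Pieces that might come from different sources: ASCII mark-up and Hindi text.
pieces = ["<p>", "\u0939\u093f\u0928\u094d\u0926\u0940", "</p>"]

# Point 2: UTF-8 is stateless, so encoding the pieces separately and
# concatenating the bytes gives the same result as encoding the whole text.
assembled = b"".join(piece.encode("utf-8") for piece in pieces)
whole = "".join(pieces).encode("utf-8")
same_bytes = assembled == whole

# Point 1: bytes below 0x80 never occur inside a multi-byte UTF-8 sequence,
# so a byte-level edit of the ASCII mark-up cannot damage the Hindi text.
edited = assembled.replace(b"<p>", b"<div>").replace(b"</p>", b"</div>")
decoded = edited.decode("utf-8")

print(same_bytes)
print(decoded)
```

An SCSU stream carries window and mode state from earlier bytes, which is why neither shortcut can be assumed to work there and why the thread suggests converting only the finished output.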
Re: RE: ü
Marco Cimarosti wrote: Patrick Andries wrote: Funny: I have just read a similar but opposite opinion on an Italian newsgroup. Somebody said: if we really must accept English terms such as "file" or "window", we should at least make the effort of pronouncing them according to Italian spelling: /'file/ and /vin'dOv/, rather than /'fail/ or /'windo:/. It is an alternative way of doing it. In fact, I believe in a middle way: spell the words as they are pronounced in your language (which is usually not the same as the original; very few Germans pronounce English loan-words in German as native English speakers would, even allowing for the wealth of English pronunciations). A way to say welcome. Uhmm... I hope such a way of saying welcome will never be applied to humans. In case I move to China, I would not like to have my hair dyed black and the shape of my eyes modified with surgery. :-) Remember the old adage: when in Rome... Patrick
Re: Unicode 3.2 Beta Period Finishing
Doug Ewell reported: Many of the embedded images in the Standardized Variants document are missing. The missing images have been fixed. Rick
Re: Unicode 3.2 Beta Period Finishing
Currently, the Coptic characters are treated as extensions to the Greek script, much as the Urdu characters are treated as extensions to the Arabic script. So for now, at least, they should be marked as Greek. If the UTC and SC2 ever disunify the scripts, then the Script property value would need to change. As to the technical symbols, would anyone take a stab at listing those characters that are only ever used as technical symbols, and never as letters? Mark — Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο πάντα — Ὁμήρου Μαργίτῃ [For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr] http://www.macchiato.com - Original Message - From: Michael Everson [EMAIL PROTECTED] To: [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Tuesday, January 22, 2002 05:10 Subject: Re: Unicode 3.2 Beta Period Finishing Regarding http://www.unicode.org/Public/BETA/Unicode3.2/Scripts-3.2.0d7.txt (Scripts-3.2.0d7.txt, 21-Jan-2002 13:57, 39k) It says: 03D0..03F5; GREEK # L [38] GREEK BETA SYMBOL..GREEK LUNATE EPSILON SYMBOL In the first place, 03E2 through 03EF are COPTIC letters, not Greek. In the second, some of those letters are technical symbols, not letters, so if you are including some of them why not include 03F6? -- Michael Everson *** Everson Typography *** http://www.evertype.com
Re: Devanagari on MacOS 9.2 and IE 5.1
It should be fine also on Netscape 6.2. [EMAIL PROTECTED] wrote: I spoke too fast. Upon taking a closer look at the file, the font was not set properly. MacOS 9.2, Indian Language Kit, Mac IE 5.1 and Devanagari MT as font face seem to display UTF-8 encoded Hindi just fine. Etienne Date: Mon, 21 Jan 2002 10:24:16 -0800 [EMAIL PROTECTED] RE: Devanagari On this subject, Win2K and IE5+ seem to do a nice job displaying UTF-8 encoded Hindi. On the Mac, the Indian Language Kit provides OS support and fonts (with MacOS 9.2 and above), but I have not been able to display Hindi (UTF-8 encoded) with Mac's IE 5.1. Am I correct in assuming that the Mac version of IE does not support Hindi without a hack? Etienne Reply-To: [EMAIL PROTECTED] "Christopher J Fynn" [EMAIL PROTECTED] Cc: "Aman Chawla" [EMAIL PROTECTED] RE: Devanagari Date: Mon, 21 Jan 2002 23:59:38 +0600 Aman, Here in Bhutan the Internet connection is still much worse than in most places I've visited in India and Nepal (and the cost per minute is several times higher) - believe me, even then UTF-8 (or UTF-16) encoded pages do not display noticeably slower than ASCII, ISCII or 8-bit font encoded pages - and I don't need to download any special plug-ins or fonts. - Chris -- Christopher J Fynn Thimphu, Bhutan [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] On Behalf Of Aman Chawla Sent: 21 January 2002 10:57 To: James Kass; Unicode Subject: Re: Devanagari - Original Message - From: "James Kass" [EMAIL PROTECTED] To: "Aman Chawla" [EMAIL PROTECTED]; "Unicode" [EMAIL PROTECTED] Sent: Monday, January 21, 2002 12:46 AM Subject: Re: Devanagari 25% may not be 300%, but it isn't insignificant. As you note, if the mark-up were removed from both of those files, the percentage of increase would be slightly higher. 
But, as connection speeds continue to improve, these differences are becoming almost minuscule. With regard to South Asia, where the most widely used modems are approx. 14 kbps, maybe some 36 kbps and rarely 56 kbps, where broadband/DSL is mostly unheard of, efficiency in data transmission is of paramount importance... how can we convince the South Asian user to create websites in an encoding that would make his client's 14 kbps modem as effective (rather, ineffective) as a 4.6 kbps modem?