Re: CP1256 and Persian YEH?

2001-10-11 Thread Michael \(michka\) Kaplan
Depends on what version of Windows you are on. Farsi is not officially supported in all code points for cp1256. This one is supported in WinME, Win2000, and WinXP. It maps to 0xED on cp1256 when it does map? MichKa Michael Kaplan Trigeminal Software, Inc. http://www.trigeminal.com/ -

Re: Gujarati IME for win2k?

2001-10-04 Thread Michael \(michka\) Kaplan
Gujarati does not require an IME at all; it is not a script with that huge a repertoire! It does require a keyboard, though; have you looked into Windows XP for this? It provides not only a keyboard but also an OpenType font and collation data, as well. MichKa Michael Kaplan Trigeminal

Re: Unicode locale id

2001-10-04 Thread Michael \(michka\) Kaplan
The locale choice covers all of Unicode; the choice of 1033 just means that the standard collation table is going to be used, with no specific exceptions that many other languages require. More info on collation in SQL Server can be found in the following white paper (it discusses 7.0 as well):

Re: Deseret keyboard (was:Re: Special Type Sorts Tray 2001)

2001-10-03 Thread Michael \(michka\) Kaplan
From: [EMAIL PROTECTED] Well, careful now. The language is English. You mean someone who uses the script. Yes, that is what I meant... I was referring to users of the script. Though I suppose if they were going to try to tackle the original inner and outer plates it would not be English but

Re: Special Type Sorts Tray 2001 (derives from Egyptian Transliteration Characters)

2001-10-02 Thread Michael \(michka\) Kaplan
From: William Overington [EMAIL PROTECTED] Is there an official Unicode Consortium statement that states, for the record, that the Unicode Consortium refuses to encode more ligatures and precomposed characters please? I think it is quite clearly stated that the ones that ARE present are

Re: Special Type Sorts Tray 2001

2001-10-02 Thread Michael \(michka\) Kaplan
From: John H. Jenkins [EMAIL PROTECTED] At 5:28 PM +0100 10/2/01, Michael Everson wrote: The CSUR is maintained to support scripts of various kinds. Some of those (Shavian, Deseret, Tengwar, Cirth) are expected to graduate into Unicode. And one of them already has! And I am sure Apple

Re: Special Type Sorts Tray 2001

2001-10-02 Thread Michael \(michka\) Kaplan
From: [EMAIL PROTECTED] I still live in hopes that someone, John or someone else, will one day send me a Deseret keyboard layout that is at least SLIGHTLY standard (meaning more than one person has ever used it). I need something I can download and read on a Windows machine. Text or a GIF

Re: GB18030

2001-09-27 Thread Michael \(michka\) Kaplan
From: Yung-Fong Tang [EMAIL PROTECTED] Can anyone tell me where can I find a online version of the GB18030 standard (yes, I want the STANDARD itself. Not someone's paper talk about the standard) . Or anyone could tell me where to get a copy of the standard. You mean the original Chinese?

Re: GB18030

2001-09-27 Thread Michael \(michka\) Kaplan
From: Yung-Fong Tang Case mapping ? You have no way to generate mapping table for case mapping with knowing the character unless you already define those character have no case or only one case. Um, Unicode defines a behavior and even properties for unassigned code points. If you choose not

Re: GB18030

2001-09-26 Thread Michael \(michka\) Kaplan
From: Geoffrey Waigh [EMAIL PROTECTED] It shouldn't require honest-to-goodness we-weren't-kidding see-here's-one-defined-now characters In many cases, it did. for developers to slap themselves on the head They did -- and they are slapping others around them, too. and start developing

Re: a joke

2001-09-24 Thread Michael \(michka\) Kaplan
From: Suzanne M. Topping [EMAIL PROTECTED] From: Michael Everson [mailto:[EMAIL PROTECTED]] Three fonts walk into a bar. The barman, wiping a glass, shakes his head and says to them: I'll have none of your type in here. Gee, and I thought he was going to say: Why the long face?

Re: UTF-8 UCS-2/UTF-16 conversion for library use

2001-09-24 Thread Michael \(michka\) Kaplan
From: Ayers, Mike [EMAIL PROTECTED] Analyze problem. Pick solution. In that order. Wiser advice was ne'er spoken, on *this* topic at least. I wonder if there is some way that a policy decision can be made to declare a moratorium on the whole *My* UTF is better than *your* UTF for a while?

Re: 3rd-party cross-platform UTF-8 support

2001-09-24 Thread Michael \(michka\) Kaplan
From: Tom Emerson [EMAIL PROTECTED] But if I have a text string, and that string is encoded in UTF-16, and I want to access Unicode character values, then I cannot index that string in constant time. To find character n I have to walk all of the 16-bit values in that string accounting for
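The point quoted above, that finding character n in a UTF-16 string means walking the 16-bit values, can be sketched roughly as follows. This is only an illustration: the helper names utf16_units and code_point_at are made up for this example and do not come from the thread, and unpaired surrogates are simply returned as-is.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def code_point_at(units, n):
    """Return the n-th code point (0-based) by walking the code units.

    The walk is linear in the number of code units because a code point
    may occupy one unit (BMP) or two units (a surrogate pair).
    """
    i = 0
    count = 0
    while i < len(units):
        unit = units[i]
        is_pair = (
            0xD800 <= unit <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        if is_pair:
            # Combine high and low surrogate into one supplementary code point.
            cp = 0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            step = 2
        else:
            # BMP code unit (or an unpaired surrogate, passed through unchanged).
            cp = unit
            step = 1
        if count == n:
            return cp
        count += 1
        i += step
    raise IndexError("code point index out of range")
```

For example, with the string "a", U+10400 (a Deseret letter), "b", index 1 covers two code units, so index 2 is found at unit offset 3 rather than 2.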

Re: 3rd-party cross-platform UTF-8 support

2001-09-22 Thread Michael \(michka\) Kaplan
From: "Marcin 'Qrczak' Kowalczyk" [EMAIL PROTECTED] Why would UTF-16 be easier for internal processing than UTF-8? Both are variable-length encodings. Good straw man! Working with UTF-16 is immensely easier than working with UTF-8. As I am sure you know! :-) MichKa Michael Kaplan

Re: PDUTR #26 posted

2001-09-19 Thread Michael \(michka\) Kaplan
From: Ayers, Mike [EMAIL PROTECTED] From: John Cowan [mailto:[EMAIL PROTECTED]] [EMAIL PROTECTED] scripsit: Oops! One of two Unicode 101 mistakes I made in the same day. Where was my brain? Unicode Ate Your Brain, of course! (See my tutorial at Orlando this year.) Nah,

Re: discontent about Indic scripts and Unicode

2001-09-19 Thread Michael \(michka\) Kaplan
From: Carl W. Brown [EMAIL PROTECTED] However, I do not understand the TSCII for Tamil. Unicode provides the script separation that they want. TSCII is mostly out of favor now (tamil.net being the main exception, and that only because its webmaster hates all established standards for doing

Re: discontent about Indic scripts and Unicode

2001-09-18 Thread Michael \(michka\) Kaplan
This is the same problem that was discussed extensively for Tamil at TI2001 in Kuala Lumpur last month. Basically, it boils down to three problems: 1) Most of the people involved do not understand Unicode or how it works. 2) Most of the people involved expect natural language processing to be a

Re: PDUTR #26 posted

2001-09-17 Thread Michael \(michka\) Kaplan
From: Marco Cimarosti [EMAIL PROTECTED] Does renaming UTF-8S to CESU-8 fix all the issues that were discussed on this mailing list at the beginning of last spring? In my opinion (and the opinion of some others), no. But they do represent the *attempt* to answer them. Specifically: - How

Re: PDUTR #26 posted

2001-09-17 Thread Michael \(michka\) Kaplan
From: [EMAIL PROTECTED] If Michka is referring to non-compliant CESU-8 parsers, I really wouldn't care much because CESU-8 is supposed to live in its own little private world. But if people start compromising their UTF-8 parsers to accommodate CESU-8 adaptively, it would be a great blow to

Re: PDUTR #26 posted

2001-09-17 Thread Michael \(michka\) Kaplan
From: Mark Davis [EMAIL PROTECTED] - A significant reason for CESU-8 garnering enough support was that its introduction allows the definition of UTF-8 itself to be tightened, to formally exclude the 3-byte surrogates both in reading and writing. I do not see this as a valid argument at all

Re: PDUTR #26 posted

2001-09-17 Thread Michael \(michka\) Kaplan
From: Carl W. Brown [EMAIL PROTECTED] In actuality it would be difficult for IANA to deny a character set for any official character set so the decision is actually up to the Unicode committee. I concur. I don't believe that the idea of registering CESU-8 with IANA came from the Unicode

Re: PDUTR #26 posted

2001-09-17 Thread Michael \(michka\) Kaplan
From: John Cowan [EMAIL PROTECTED] False. IANA's registry is merely de facto: what they register is not in fact encodings, but *names* of encodings. The charset name ISO646-DE is legal as an XML encoding, but it would astonish me if any extant XML parser supports it. (This is one of

Re: PDUTR #26 posted

2001-09-17 Thread Michael \(michka\) Kaplan
From: Carl W. Brown [EMAIL PROTECTED] It would seem to be that if you either have to change the UTF-8 code to support CESU-8 or change the UTF-16 compare logic then changing the UTF-16 logic to do code point order compares is a much more containable change with a much lower processing
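The idea quoted above, changing UTF-16 compare logic to yield code point order, is commonly done with a small fix-up on each code unit before comparing. The following is a rough sketch of that fix-up, not anything posted in the thread; the names fixup and compare_code_point_order are hypothetical.

```python
def fixup(unit):
    """Remap a UTF-16 code unit so surrogates sort above all BMP units."""
    if unit >= 0xE000:
        # Move U+E000..U+FFFF down into the gap left by the surrogates.
        return unit - 0x800
    if unit >= 0xD800:
        # Move surrogates D800..DFFF up to the top of the 16-bit range.
        return unit + 0x2000
    return unit


def compare_code_point_order(a, b):
    """Compare two lists of UTF-16 code units in code point order.

    Returns -1, 0, or 1, in the manner of a C-style comparison function.
    """
    ka = [fixup(u) for u in a]
    kb = [fixup(u) for u in b]
    return (ka > kb) - (ka < kb)
```

With plain binary comparison of code units, U+10000 (D800 DC00) sorts before U+FF5E; with the fix-up applied, U+FF5E sorts first, which matches code point order.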

Re: CESU-8 vs UTF-8

2001-09-15 Thread Michael \(michka\) Kaplan
Carl, Doug, The issues you and Doug brought up were vigorously discussed. For the decision, all I can say is that not everyone voted for it (which will be a matter of public record once the preliminary minutes are posted). D This section of the TR amazed me. In the Summary and D elsewhere,

Re: PDUTR #26 posted

2001-09-14 Thread Michael \(michka\) Kaplan
From: "Marcin 'Qrczak' Kowalczyk" [EMAIL PROTECTED] Thu, 13 Sep 2001 12:52:04 -0700, Asmus Freytag [EMAIL PROTECTED] utf-8 cannot as readily be used as internal format. It's as easy as UTF-16. Unless you want a broken implementation which treats surrogates as pairs of characters. It's as

Re: PDUTR #26 posted

2001-09-14 Thread Michael \(michka\) Kaplan
From: Ayers, Mike [EMAIL PROTECTED] Not in the best mood, am I? Well, you did forget the all important My encoding is better than your encoding! at the end. :-) MichKa Michael Kaplan Trigeminal Software, Inc. http://www.trigeminal.com/

Re: What code point is assigned for the Newton unit?

2001-09-12 Thread Michael \(michka\) Kaplan
Actually, you are mistaken. The decision to encode the Angstrom sign had more to do with the fact that it was encoded in many legacy encoding sets. There is no specific rule that every unit sign must also be encoded. If you can use Unicode to properly store and render what you need, then there is

Re: Terroists attacks the status of Unicode Conference

2001-09-11 Thread Michael \(michka\) Kaplan
More importantly, many speakers coming in later in the week are NOT YET in San Jose -- not sure what effect this will have on things. MichKa Michael Kaplan Trigeminal Software, Inc. http://www.trigeminal.com/ - Original Message - From: Carl W. Brown To: Mark Davis ; Unicode Sent:

Re: [OT] o-circumflex

2001-09-10 Thread Michael \(michka\) Kaplan
From: Keld Jørn Simonsen [EMAIL PROTECTED] Real-life sorts, like MS Windows sorting or Linux sorting, actually adheres to these Danish rules, once you have set up your machine for Danish. And this is the *true* answer to the whole mess of attempting *multilingual* sorts -- once the user

Re: [OT] o-circumflex

2001-09-10 Thread Michael \(michka\) Kaplan
From: Mark Davis [EMAIL PROTECTED] Michael, that isn't the point. There is a problem even when you stick to one language. That is, there are situations where two letters in a language, e.g. ch in Slovak, are normally sorted as one. However, in some exceptional circumstances those letters

Re: [OT] o-circumflex

2001-09-07 Thread Michael \(michka\) Kaplan
From: David Gallardo [EMAIL PROTECTED] As a practical matter, you need to take the diacritics into account when sorting, even in English where they (may or may not) have linguistic significance, otherwise you'll get nondeterministic behaviour. In other words, résumé and resume should fall
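The determinism concern quoted above can be illustrated with a two-level sort key: compare first on the string with combining marks stripped, then break ties on the original string. This is a simplified sketch and not a real collation algorithm (it ignores case, locale tailoring, and everything else a proper collator handles); strip_marks and sort_key are hypothetical names.

```python
import unicodedata


def strip_marks(s):
    """Return s with combining marks removed after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def sort_key(s):
    """Primary key: base letters only. Secondary key: the original string."""
    return (strip_marks(s), s)
```

Because the original string acts as the tie-breaker, "resume" and "résumé" sort next to each other and always in the same relative order, whatever order the input arrives in.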

Re: Using Unicode fonts for plaintext display on windows 2000

2001-09-05 Thread Michael \(michka\) Kaplan
The script setting will not really end up being used in this case. It is present because it is a fundamental member of the LOGFONT structure, but it is only used in cases where a device context will not be using Unicode and needs an intelligent guess as to what code page to use for rendering.

Re: Using Unicode fonts for plaintext display on windows 2000

2001-09-05 Thread Michael \(michka\) Kaplan
- Original Message - From: Tex Texin [EMAIL PROTECTED] To: Michael (michka) Kaplan [EMAIL PROTECTED] Cc: Unicoders [EMAIL PROTECTED]; Gary Clink [EMAIL PROTECTED] Sent: Wednesday, September 05, 2001 12:06 PM Subject: Re: Using Unicode fonts for plaintext display on windows 2000 Michael, thanks

Re: Using Unicode fonts for plaintext display on windows 2000

2001-09-05 Thread Michael \(michka\) Kaplan
Well, my big guesses: 1) not using the right function to get the text in (using WM_SETTEXT via SendMessageA, TextOutA, ExtTextOutA) or 2) not creating the window via the right function (using CreateWindowExA) Those are the only two ways that the script should affect things on Win9x. Note that VB

Re: japanese xml

2001-09-04 Thread Michael \(michka\) Kaplan
From: David Starner [EMAIL PROTECTED] Frankly, the attitude of Forget all the stuff that you have working; just throw it all away and move to Unicode is not one that wins many converts. Backward compatibility and the ability to interface with other systems running different stuff is always

Re: japanese xml

2001-09-04 Thread Michael \(michka\) Kaplan
From: KUSANO Takayuki [EMAIL PROTECTED] This is only a problem for people who do not want to use Unicode. But, most people can't live without 'legacy' encodings, because there are many documents, data in 'legacy' encodings and there are still many applications/terminals that cannot

Re: MSLU

2001-09-04 Thread Michael \(michka\) Kaplan
MSLU is documented in the Platform SDK. BUT you are not going to get Unicode *functionality* from MSLU, from VB or elsewhere; MSLU only gives you a wrapper layer (and it converts after that), so the work you would do to make it callable from VB would not actually be beneficial? MichKa Michael

Re: UTF-8 on NT

2001-09-04 Thread Michael \(michka\) Kaplan
This is not an NT issue so much as a Visual C++ CRT issue (the setlocale function is implemented there, for what you are probably using). At present, there is no support for this (take a look at the code if you need to know why, it makes all kinds of assumptions like one byte per character that

Re: japanese xml

2001-09-03 Thread Michael \(michka\) Kaplan
This is only a problem for people who do not want to use Unicode. It is certainly not Unicode's fault that the various [vendor-provided] versions of standards are incomplete or that they conflict with each other. Well, I suppose you could also blame Misha, for thinking that EUC-JP + NCRs would

Re: Anyone see this?

2001-09-01 Thread Michael \(michka\) Kaplan
Well, clearly it's a hoax. The assimilated press has always been this way. Kind of amusing, in its own way. But no, there is no Klingon Freedom League, and speakers/attendees do not have to fear problems with protests in San Jose surrounding the conference. :-) MichKa Michael Kaplan Trigeminal

Re: win95

2001-08-31 Thread Michael \(michka\) Kaplan
From: Carl W. Brown [EMAIL PROTECTED] Microsoft now has a solution for you. You can add Unicode support to Win95/98/Me http://www.microsoft.com/globaldev/Articles/mslu_announce.asp Well, as wonderful as I think MSLU is (not that I am biased or anything) it is not going to add Unicode support

Re: win95

2001-08-31 Thread Michael \(michka\) Kaplan
From: Carl W. Brown [EMAIL PROTECTED] I had presumed that he was able to get Unicode support on NT but not 95. I did something like this for a VB 3.0 application by writing controls to extend the language. Indeed, you can get Unicode support on NT -- but that is *real* Unicode support. The

Re: exchanging Arabic data in utf-8

2001-08-22 Thread Michael \(michka\) Kaplan
From: Iman Saad [EMAIL PROTECTED] I tried adding the following header in the section of the cgi script that includes the html code, but that did not change anything: META HTTP-EQUIV=Content-Type CONTENT=text/html; CHARSET=UTF-8 If you look at the following link (all on one line)

Re: Big question: CJK font support in systems and applications

2001-08-08 Thread Michael \(michka\) Kaplan
From: "Adam Twardoch" [EMAIL PROTECTED] I have just finished reading Ken Lunde's "CJKV Information Processing" (O'Reilly, 1999). While I found much of the information contained in that book highly helpful, I can't help the feeling that its structure might need a slightly more systematic

Re: Codepage

2001-08-01 Thread Michael \(michka\) Kaplan
Microsoft's Euro story can be seen at: http://www.microsoft.com/europe/euro/ Specifically, the Windows info is at http://microsoft.com/windows/euro.asp There is no way to arbitrarily add code points to a Windows code page, though. Either you have the patch or the newest file, or you do not. If

Re: some kind of virus?

2001-07-24 Thread Michael \(michka\) Kaplan
I have gotten roughly 100 of them, from various email addresses on my web site. michka - Original Message - From: Carl W. Brown [EMAIL PROTECTED] To: Michael Everson [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Tuesday, July 24, 2001 8:47 AM Subject: RE: some kind of virus? Michael,

Re: RTF language codes

2001-07-23 Thread Michael \(michka\) Kaplan
From: jgo [EMAIL PROTECTED] The following table defines the standard languages used by Microsoft. This table was generated by the Unicode group for use with TrueType and Unicode. I don't see such a table via search from the Unicode site. Is this just another M$ non-standard standard

Re: RTF language codes

2001-07-23 Thread Michael \(michka\) Kaplan
From: Marc Durdin [EMAIL PROTECTED] I must disagree with this statement. I know of quite a few changes to the LCID list, some of which have caused me considerable pain in the past. Any of them in winnt.h? So, there are significant issues with Microsoft's LCIDs: 1. The tables are not

Re: RTF language codes

2001-07-23 Thread Michael \(michka\) Kaplan
From: Marc Durdin [EMAIL PROTECTED] I must disagree with this statement. I know of quite a few changes to the LCID list, some of which have caused me considerable pain in the past. Any of them in winnt.h? Try Serbo-Croatian. Documents created with the old Cyrillic LCID definitely would

Re: Unicode and windows menus

2001-07-18 Thread Michael \(michka\) Kaplan
From: Dennis L. Goyette Sr. [EMAIL PROTECTED] Anybody have any idea of how to display chinese characters in windows menus bars? All I get is parallel bars. thanks This would mean that the font choice for menus is not one that will accept Chinese characters and you need to change the

Re: Is there Unicode mail out there?

2001-07-14 Thread Michael \(michka\) Kaplan
michka the only book on internationalization in VB at http://www.i18nWithVB.com/ - Original Message - From: Michael Everson [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Saturday, July 14, 2001 9:56 AM Subject: Re: Is there Unicode mail out there? At 09:49 -0700 2001-07-14, Mark

Re: Is there Unicode mail out there?

2001-07-14 Thread Michael \(michka\) Kaplan
From: Michael Everson [EMAIL PROTECTED] Then it's not standard and can't be relied upon. Pity. Actually, it is a standard, as of HTML 4.0. All you need is a compliant browser. MichKa Michael Kaplan Trigeminal Software, Inc. http://www.trigeminal.com/

Re: Is there Unicode mail out there?

2001-07-11 Thread Michael \(michka\) Kaplan
From: [EMAIL PROTECTED] Can you read this? This is coming from Lotus Notes. Yes, it looks like you are confused (all those question marks!) MichKa Michael Kaplan Trigeminal Software, Inc. http://www.trigeminal.com/

Re: regarding unicode input

2001-07-10 Thread Michael \(michka\) Kaplan
If you mean under Windows, then the answer is that they return Unicode in Unicode applications. Perhaps more details on the platform you are using? MichKa Michael Kaplan Trigeminal Software, Inc. http://www.trigeminal.com/ - Original Message - From: Adarsh [EMAIL PROTECTED] To: [EMAIL

Re: Erratum in Unicode book

2001-07-10 Thread Michael \(michka\) Kaplan
From: James Kass [EMAIL PROTECTED] So, can Internet Explorer now display non-BMP characters ?! interrobang Still not having any luck on Windows M.E. with Marco Cimarosti's java charts. Any suggestions? Thus far, I can make it work on Windows 2000 and Windows XP (with IE5.0, 5.5, and 6.0)

Re: Erratum in Unicode book

2001-07-09 Thread Michael \(michka\) Kaplan
From: John H. Jenkins [EMAIL PROTECTED] Has the UNIHAN.TXT file been updated to include radical-stroke data for Plane Two characters? Yes. Ever since Unicode 3.1 was released. (We still don't have an Extension B font, however.) There is one in Office XP's CHS and CHP language packs

Re: Erratum in Unicode book

2001-07-09 Thread Michael \(michka\) Kaplan
From: Richard Cook [EMAIL PROTECTED] This must be the Beijing Zhong Yi Electronics font ... I heard that Microsoft was licensing it, but didn't imagine they'd release it so soon ... The font vendor is listed as BDFX, and the copyright is for the Founder Corporation. Further respondent sayeth

Re: Re: Erratum in Unicode book

2001-07-08 Thread Michael \(michka\) Kaplan
From: てんどうりゅうじ [EMAIL PROTECTED] I mean. You take the radical of 水 (water) and add 7 strokes a certain way to get 酒 (sake). It was not there, alas. Actually, you are mistaken; U+9152 does indeed represent the character you wanted, else this (UTF-8 encoded!) message would not be able to

Re: Re: Erratum in Unicode book

2001-07-08 Thread Michael \(michka\) Kaplan
From: James Kass [EMAIL PROTECTED] Perhaps he (てんどうりゅうじ) was lamenting the character's absence in the Han Radical Index section under radical # 85. If all the characters made from the water radical were listed under that radical in the Han Radical Index (and so forth), where would the

Re: Re: Erratum in Unicode book

2001-07-08 Thread Michael \(michka\) Kaplan
From: Michael Everson [EMAIL PROTECTED] At 09:47 -0700 2001-07-08, Michael \(michka\) Kaplan wrote: Perhaps a rule needs to be imposed about the amount of sake that should be consumed before submitting a character proposal? I've never had any trouble with beer. Ah, but that would indicate

How far afield can we go? (was Re: Re: Erratum in Unicode book)

2001-07-08 Thread Michael \(michka\) Kaplan
From: てんどうりゅうじ [EMAIL PROTECTED] Perhaps he (縺ヲ繧薙←縺・j繧・≧縺・ was lamenting the character's absence in the Han Radical Index section under radical # 85. Yes. It belongs there. Its so sad that you do not have a UTF-8 compatible e-mail client. :-( Come on. What ワープロばか (which probably most

Re: Arial Unicode MS and Code2000

2001-07-06 Thread Michael \(michka\) Kaplan
Well, I cannot speak for PowerBuilder (my knowledge of it is very out of date), but for both Netscape and MS SQL Server you may or may not be able to support Indic scripts -- the deciding factor will be based on what version of each product you are using. Beyond that, I do not think that any one

Re: Shavian

2001-07-06 Thread Michael \(michka\) Kaplan
From: [EMAIL PROTECTED] What's bad is that work seems to get done on fictional scripts while there are still millions of real people (some of whom even have access to computers) who can't express texts of their natively-used languages with Unicode because we don't have their scripts encoded.

Re: Shavian

2001-07-06 Thread Michael \(michka\) Kaplan
From: John Cowan [EMAIL PROTECTED] Just so, which means that the energy spent on invented scripts is nowise taken away from the energy that could be spent on obscure-but-real scripts. Would that it were otherwise. No one is arguing the FACTUAL basis for the above, but it is quite reasonable

Re: Shavian

2001-07-06 Thread Michael \(michka\) Kaplan
From: Michael Everson [EMAIL PROTECTED] The editorial response to comments from national groups, in the public archive of ISO 10646 stuff that you linked to at the start of this message, included a complaint about Deseret from the German Standards body, in that it was inappropriate for being

Re: Shavian

2001-07-06 Thread Michael \(michka\) Kaplan
From: Kenneth Whistler [EMAIL PROTECTED] I've been lurking on this discussion, but have to chime in here. I do appreciate it, for what it's worth. The chime was very much in tune. While fully recognizing the importance of Middle Earth to some people it is difficult for me to get past the fact

Re: Shavian

2001-07-06 Thread Michael \(michka\) Kaplan
From: Kenneth Whistler [EMAIL PROTECTED] You can just call me a conscientious objector to having anyone who subscribes to Vinyar Tengwar considering themselves to be among the Númenoreans (a.k.a. the Dúnedain), who alone of all the races of Men knew Elvish tongues. :-) Aha! I see you

Re: Unicode transliterations (and other operations)

2001-07-05 Thread Michael \(michka\) Kaplan
Hee hee - unless you're packing a guide to anime, you'll never find 'em anyway. らんま is Ranma, as in Ranma Saotome, and あかね is Akane, as in Akane Tendo, the two main stars of Rumiko Takahashi's bizarre (if monothematic) sex comedy Ranma 1/2. Seeing this wonderful use of Unicode text in

Re: Shavian (was: Re: UTF-17)

2001-07-04 Thread Michael \(michka\) Kaplan
From: Richard Cook [EMAIL PROTECTED] now, I know of other phonemic alphabets for English ... e.g., I think Ben Franklin invented one, ... and I have one of my own. Are any of these slated for encoding too? Fictional scripts have been, are, and will likely continue to be a constant source of

Re: Shavian (was: Re: UTF-17)

2001-07-04 Thread Michael \(michka\) Kaplan
From: John H. Jenkins [EMAIL PROTECTED] FWIW, there is a small but non-zero Shavian user community, and a number of fonts are available, some of them very pretty. Of this I have no doubt -- but this was true of Klingon, also. g I was expressing doubt that the majority of the community are:

Re: Shavian (was: Re: UTF-17)

2001-07-04 Thread Michael \(michka\) Kaplan
From: John Cowan [EMAIL PROTECTED] As for whether your script would be encoded, where it ends up vis-a-vis the potential roadmap is more a side effect of who you know than anything else. Smiley or not, someone might actually believe that, and it isn't true. Michael Everson is more than

Re: validity of lone surrogates (was Re: Unicode surroga tes: just say no!)

2001-07-03 Thread Michael \(michka\) Kaplan
From: "Marcin 'Qrczak' Kowalczyk" [EMAIL PROTECTED] It's a pity that UTF-16 doesn't encode characters up to U+F, such that code points corresponding to lone surrogates can be encoded as pairs of surrogates. Unfortunately, we would then be stuck with what happens when two such surrogate

Re: UTF-17

2001-06-23 Thread Michael \(michka\) Kaplan
From: [EMAIL PROTECTED] Oh yeah, well, I can be more tongue-in-cheek than all of you. I've already implemented it. Doug, this is one of those things one should be ashamed of, like believing in the April Fool's Day message about self serve encodings enough to have put together a proposal to

Re: Playing with Unicode (was: Re: UTF-17)

2001-06-23 Thread Michael \(michka\) Kaplan
From: [EMAIL PROTECTED] I'm never ashamed of perfectly good code I've written to fulfill a humorous requirement. I'm only ashamed of badly written code, or code that implements a bad idea that someone else thinks is a good idea. The latter is kind of the worry I had -- a long time ago I

Re: UTF8 encoding - What should I tell my customers?

2001-06-20 Thread Michael \(michka\) Kaplan
From: Jianping Yang [EMAIL PROTECTED] Carl W. Brown wrote: If there are no surrogates in the database, is there any reason that I can not change the database from UTF8 to AL32UTF8? You can change the database from UTF8 to AL32UTF8 in this case. Also you can use Oracle database scanner to

Re: FSS-UTF, UTF-2, UTF-8, and UTF-16

2001-06-19 Thread Michael \(michka\) Kaplan
From: [EMAIL PROTECTED] Waiting until characters were assigned outside the BMP to start working on the UCS-2 problem is like waiting until 2000-01-01 to start working on the Y2K problem. It's actually a bit worse than this -- it's coming up with a solution to Y2K problems that requires other

Re: First of many newbie questions

2001-06-15 Thread Michael \(michka\) Kaplan
From: Youtie Effaight [EMAIL PROTECTED] Well, Mister Constable. What's new about that? Looks to me like e-Leven Digit Grrl just forgot to turn off her microphone again... We're witnessing the spacey under-mumble of a quickly crumbling mind. Maybe we'll get lucky and she'll burn up on

Re: informative due to variation across langauges

2001-06-15 Thread Michael \(michka\) Kaplan
From: [EMAIL PROTECTED] Can anyone give me a specific example of why Line Breaking or East Asian Width properties aren't normative? Why be more specific then there are a lot of people who think they might possibly have made TOO MUCH normative and do not want to make things unchangeable that

Re: informative due to variation across langauges

2001-06-15 Thread Michael \(michka\) Kaplan
From: [EMAIL PROTECTED] On 06/15/2001 06:29:51 PM Michael \(michka\) Kaplan wrote: Why be more specific then there are a lot of people who think they might possibly have made TOO MUCH normative and do not want to make things unchangeable that might be in error or might need to change later

Re: U+2011 and U+2010

2001-06-12 Thread Michael \(michka\) Kaplan
From: [EMAIL PROTECTED] Out of curiosity, is there documentation on XCCS available anywhere? Check out google.com: it will get about 120+ hits on the words XCCS standard and several of them seem vaguely relevant. :-) MichKa Michael Kaplan Trigeminal Software, Inc. http://www.trigeminal.com/

Re: UTF-16 problems

2001-06-11 Thread Michael \(michka\) Kaplan
From: Carl W. Brown [EMAIL PROTECTED] I think that UTF-16x would be a better approach than UTF-8s. I am sure that I have missed some issues feel free to comment. In any case UTF-16s would naturally be in Unicode code point order. It would be easy to transform to UCS-2 for applications

Re: UTF-16 problems

2001-06-11 Thread Michael \(michka\) Kaplan
From: Carl W. Brown [EMAIL PROTECTED] At first I thought the same thing but I have changed my mind. There are problems but the problems are with UTF-16 not UTF-8. I don't think that I am the only one who thinks that UTF-8s will create more problems than it fixes. Worse yet they will also

Re: UTF-16 problems

2001-06-11 Thread Michael \(michka\) Kaplan
From: Rick McGowan [EMAIL PROTECTED] ... asking for a lavicious license to be lecherously lazy Parse error at lavicious. No such word appears in any English dictionary I own, not even the OED. Sorry, that was to be lascivious. Glad someone is still parsing in this thread. michka

Re: UTF-16 problems

2001-06-11 Thread Michael \(michka\) Kaplan
(whoops, sent too soon!) From: Carl W. Brown [EMAIL PROTECTED] I am proposing that we fix UTF-16. Are you formally proposing this? For the next UTC meeting? Without an actual customer that is wanting it for an implementation I am pretty sure this will be voted down pretty loudly. michka

Re: UTF-16 problems

2001-06-11 Thread Michael \(michka\) Kaplan
From: Carl W. Brown [EMAIL PROTECTED] I am proposing that we fix UTF-16. Are you formally proposing this? For the next UTC meeting? michka

Re: UTF-16 problems

2001-06-11 Thread Michael \(michka\) Kaplan
From: Jianping Yang [EMAIL PROTECTED] If UTF-8S were to by some miracle be accepted by the UTC, implementers will be put out and offended for most of the next decade. If it is, that is rule of law from UTC. Very true. devil's advocate And if they vote against it, will you do the

Re: UTF-16 problems

2001-06-11 Thread Michael \(michka\) Kaplan
From: Jianping Yang [EMAIL PROTECTED] Oracle is promoting and following the standard. Same as most other database vendors, our database does not fully support supplementary character in Oracle 8i and Oracle 7. But as we see the need to support it, we extend this support in Oracle 9i. So far,

Re: UTF8 vs AL32UTF8

2001-06-11 Thread Michael \(michka\) Kaplan
From: Mark Davis [EMAIL PROTECTED] UTF-8 was defined before UTF-16. At the time it was first defined, there were no surrogates, so there was no special handling of the D800..DFFF code points. In other words, Oracle has an alternate solution here for 9i -- they can simply explain that the old

Re: Lenient search engine

2001-06-10 Thread Michael \(michka\) Kaplan
From: "てんどうりゅうじ" [EMAIL PROTECTED] A search engine regards the words "stone" and "STONE" as identical. So why isn't いし treated the same as イシ? The difference can be quite marked, such as レイプ versus れいぷ or such. Well, there is nothing to stop

Re: 16 bit character sets

2001-06-07 Thread Michael (michka) Kaplan
We don't have Paul Clayton's e-mail address, but I assume you can forward on, Magda? SQL Server, ASP, and VB are all able to support UTF-16, which is an encoding form built on 16-bit code units. The term "16 bit character set" is a bit unclear in its meaning; what exactly Paul is looking for here would
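As a rough illustration of what "16 bit" means for UTF-16: the 16-bit quantity is the code unit, and a code point outside the BMP takes two of them (a surrogate pair). The sketch below uses only Python's standard codecs; the helper name `utf16_units` is made up for this example.

```python
def utf16_units(text: str) -> int:
    """Return how many 16-bit code units `text` occupies in UTF-16."""
    # Encode without a BOM; every code unit is exactly two bytes.
    return len(text.encode("utf-16-le")) // 2
```

For instance, `utf16_units("A")` is 1, while a single Deseret letter such as `"\U00010400"` needs 2 units.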

Re: UTF-8S (was: Re: ISO vs Unicode UTF-8)

2001-06-04 Thread Michael (michka) Kaplan
From: Mark Davis [EMAIL PROTECTED] 2. Auto-detection does not particularly favor one side or the other. UTF-8 and UTF-8s are strictly non-overlapping. If you ever encounter a supplementary character expressed with two 3-byte values, you know you do not have pure UTF-8. If you ever encounter
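The auto-detection point quoted above can be sketched as a byte scan. This is a minimal sketch, assuming the input is otherwise well-formed: a 4-byte lead byte (0xF0..0xF4) signals a supplementary character in standard UTF-8, while 0xED followed by 0xA0..0xBF is a surrogate code point written as a 3-byte sequence, which is how UTF-8S spells supplementary characters. The function name `classify` and its return labels are invented for the example.

```python
def classify(data: bytes) -> str:
    """Guess whether `data` is UTF-8 or UTF-8S from its supplementary characters.

    Assumes the bytes are otherwise well-formed; no full validation is done.
    """
    # 4-byte sequences only occur in standard UTF-8.
    has_four_byte = any(0xF0 <= b <= 0xF4 for b in data)
    # ED A0..BF encodes U+D800..U+DFFF, i.e. a surrogate as a 3-byte value.
    has_surrogate = any(
        data[i] == 0xED and 0xA0 <= data[i + 1] <= 0xBF
        for i in range(len(data) - 1)
    )
    if has_four_byte and has_surrogate:
        return "mixed"
    if has_four_byte:
        return "utf-8"
    if has_surrogate:
        return "utf-8s"
    # BMP-only text is byte-for-byte identical in both forms.
    return "either"
```

Text that contains no supplementary characters comes back as `"either"`, which is the case where the two forms cannot be told apart and do not need to be.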

Re: UTF-8S (was: Re: ISO vs Unicode UTF-8)

2001-06-04 Thread Michael (michka) Kaplan
From: [EMAIL PROTECTED] On 06/04/2001 02:10:35 AM Doug Ewell wrote: While we are at it, here's another argument against the existence of both UTF-8 and this new UTF-8s. Recently there was a discussion about the use of the U+FEFF signature in UTF-8 files, with a fair number of Unicode

Re: UTF-8S (was: Re: ISO vs Unicode UTF-8)

2001-06-04 Thread Michael (michka) Kaplan
From: Misha Wolf [EMAIL PROTECTED] Let's be careful with the word legal. The strange (per-)version of UTF-8 which re-encodes UTF-16 is legal input as far as The Unicode Standard is concerned. It is, however, totally illegal as far as the IETF, the Internet, the W3C, the WWW, XML, and HTML

Re: UTF-8S (was: Re: ISO vs Unicode UTF-8)

2001-06-04 Thread Michael (michka) Kaplan
From: Marco Cimarosti [EMAIL PROTECTED] No, please, let's not make waters more muddied than they already are. Let's keep on calling Oracle's proposal UTF-8S, as there is no point in finding a cuter name for it. Fair enough. Wrong point! Perhaps it will not hurt applications which read text

Re: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

2001-05-30 Thread Michael (michka) Kaplan
Simon, Would you care to answer (officially) why exactly Oracle needs for anything to be done here? Per the spec, it is not illegal for a process to interpret 5/6-byte supplementary characters; it is only illegal to emit them. It seems that Oracle and everyone else is well covered with the

Re: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

2001-05-28 Thread Michael (michka) Kaplan
From: Jianping Yang [EMAIL PROTECTED] As a matter of fact, the surrogate or supplementary character was not defined in the past, so we could live without Premise B in the past. But now the supplementary character is defined and will soon be supported, we have to bother with it. Poor

Re: Single Unicode Font

2001-05-24 Thread Michael (michka) Kaplan
From: G. Adam Stanislav [EMAIL PROTECTED] At 13:11 22-05-2001 -0700, Carl W. Brown wrote: There is no easy solution. Yes, there is, though it is probably beyond the scope of this list. Nevertheless, there is a very simple solution. It needs to be done on the OS level: Create metafonts.

Re: Pan UniCode fonts

2001-05-24 Thread Michael (michka) Kaplan
From: 11 digit boy [EMAIL PROTECTED] I have worked with many terminal emulator systems that use mono-spaced fonts. The first place you start having problems is with script fonts like Arabic. With Indic languages you often have to reorder characters before rendering Um. How about having all

Re: Single Unicode Font

2001-05-23 Thread Michael (michka) Kaplan
From: Graham Asher [EMAIL PROTECTED] But I guess this is obvious. I just wanted to chime in with the view that a single Unicode Font would be useful, and a whole lot better than some people suggest. As an implementer of rasterizers and text layout systems I can also state that the problem
