Why isn't the posting address on the list?

2002-05-29 Thread Theodore H. Smith
the correct Reply-To: field, and have that point to [EMAIL PROTECTED] -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

How are UTF-8, UTF-16 and UTF-32 encoded?

2002-05-29 Thread Theodore H. Smith
it easier to understand would help. Or perhaps I'm just reacting to the confusion of the Unicode website and it's not that hard to understand and a simple definition would do? But the first idea certainly wouldn't hurt. -- Theodore H. Smith - Macintosh Consultant / Contractor. My website
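As a rough illustration of the "simple definition" asked for here, the sketch below prints the same code points in the three encoding forms. It relies only on Python's built-in codecs; the helper name `encoded_forms` and the sample characters are assumptions made for the example, not anything from the thread.

```python
def encoded_forms(ch: str) -> dict:
    """Return the hex bytes of one character in three Unicode encoding forms."""
    return {
        # UTF-8: 1 to 4 bytes per code point, ASCII stays as single bytes.
        "UTF-8": ch.encode("utf-8").hex(" "),
        # UTF-16: one 16-bit unit, or a surrogate pair above U+FFFF.
        "UTF-16BE": ch.encode("utf-16-be").hex(" "),
        # UTF-32: always one 32-bit unit per code point.
        "UTF-32BE": ch.encode("utf-32-be").hex(" "),
    }


if __name__ == "__main__":
    for ch in ("A", "\u00e9", "\u20ac", "\U0001d11e"):
        print(f"U+{ord(ch):04X}", encoded_forms(ch))
```

Big-endian variants are used so that no byte order mark is emitted and the bytes read in the same order as the code point.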

Unicode website is confusing

2002-05-29 Thread Theodore H. Smith
standard definition. -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

roundtrip on UTF8 value 1114048 ?

2002-06-07 Thread Theodore H. Smith
not a major problem though until then, because that's above what almost anyone will be using. I don't know if it's allocated yet, anyhow. It's below 10 though. -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/
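For reference, 1114048 is 0x10FFC0, which lies just under the Unicode upper limit of 0x10FFFF (1114111). The sketch below round-trips that value through Python's built-in UTF-8 codec; it is not the code discussed in the thread, and the helper name `roundtrip_utf8` is an assumption for illustration.

```python
def roundtrip_utf8(value: int):
    """Encode a code point as UTF-8, decode it again, and return both results."""
    encoded = chr(value).encode("utf-8")
    decoded = ord(encoded.decode("utf-8"))
    return encoded, decoded


if __name__ == "__main__":
    value = 1114048  # 0x10FFC0
    encoded, decoded = roundtrip_utf8(value)
    print(hex(value), encoded.hex(" "), decoded == value)
```

A conforming codec should give back the same value for anything up to 0x10FFFF, excluding surrogates, which Python's strict encoder refuses.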

Multiple encodings for 1 character

2002-07-08 Thread Theodore H. Smith
What is going to be done about the confusion generated from having multiple ways to encode the same character? For example, for filenames, OSX will encode an accented Roman letter one way, while for filenames Windows will encode it the other way. These kinds of confusion are totally expected,
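The usual answer to this question is Unicode normalization: compare strings only after bringing both to the same normalization form. The sketch below shows a precomposed and a decomposed spelling of the same accented letter comparing unequal as raw strings and equal after normalization. The helper name `same_text` and the choice of NFC for the comparison are assumptions for the example.

```python
import unicodedata

composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT, two code points


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(composed == decomposed)           # raw comparison of code points
    print(same_text(composed, decomposed))  # comparison after normalization
```

Normalizing to NFD on both sides would work equally well; what matters is that both strings go through the same form.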

Re: Multiple encodings for 1 character

2002-07-08 Thread Theodore H. Smith
appreciate them. ;-) That's a shame. Simplicity is wonderful. -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

What's the difference between a composite and a combining sequence?

2002-07-08 Thread Theodore H. Smith
http://www.unicode.org/unicode/reports/tr15/ mentions both composites and combining sequences. But it doesn't tell us the difference. I know what a combining sequence is. If I didn't know what a composite was, I'd guess it was the same thing as a combining sequence. However, the two are

Problem with ConvertUTF.c?

2002-07-16 Thread Theodore H. Smith
was tightened up. Perhaps this code should be tightened up along with the standard now? -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

Re: Problem with ConvertUTF.c?

2002-07-16 Thread Theodore H. Smith
Seems like I missed the isLegalUTF8 function calls that verified if the UTF was valid UTF8, never mind then, it's all OK. On Wednesday, July 17, 2002, at 01:57 , Theodore H. Smith wrote: The file ConvertUTF.c contains this array: static const char trailingBytesForUTF8[256
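The point of the retraction is that a lead-byte lookup alone does not validate anything; a separate legality check has to reject overlong or truncated sequences. The sketch below illustrates that split in Python. It is not a port of ConvertUTF.c: `trailing_bytes` follows the current UTF-8 definition rather than the contents of the `trailingBytesForUTF8` table, and `is_legal` is a hypothetical stand-in that simply delegates to Python's strict decoder.

```python
def trailing_bytes(lead: int):
    """Number of continuation bytes implied by a lead byte, or None if invalid."""
    if lead <= 0x7F:
        return 0            # ASCII
    if 0xC2 <= lead <= 0xDF:
        return 1            # two-byte sequence
    if 0xE0 <= lead <= 0xEF:
        return 2            # three-byte sequence
    if 0xF0 <= lead <= 0xF4:
        return 3            # four-byte sequence, up to U+10FFFF
    return None             # continuation byte, or a lead byte never used in UTF-8


def is_legal(data: bytes) -> bool:
    """Hypothetical legality check: defer to Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(trailing_bytes(0xE2), is_legal(b"\xe2\x82\xac"))  # well-formed euro sign
    print(trailing_bytes(0xC0), is_legal(b"\xc0\x80"))      # overlong encoding of NUL
```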

Re: Scripts in Unicode 4.0

2002-08-14 Thread Theodore H. Smith
Ugaritic Cuneiform Shavian Osmanya Cypriot Syllabary What's the point of having more Latin characters? Do they look like normal Roman characters? I think we have a few versions (3 or more?) of them, already. I thought once was enough. -- Theodore H. Smith - Macintosh Consultant / Contractor

ATSUI for MacOS9

2002-11-19 Thread Theodore H. Smith
not? Is this a bug in the demo, or a bug in ATSUI for OS9? Does ATSUI for Carbon on OS9 work if ATSUI for Classic OS9 doesn't? If anyone knows ATSUI well, could you please contact me so I can ask a few more questions? Thanks a lot. -- Theodore H. Smith - Macintosh Consultant / Contractor

ATSUI text length parameters

2002-11-19 Thread Theodore H. Smith
, and not a char count like it claims. -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

Re: ATSUI for MacOS9

2002-11-19 Thread Theodore H. Smith
://developer.apple.com/techpubs/macosx/Carbon/text/ATSUI/atsui.html -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

Re: ATSUI for MacOS9

2002-11-19 Thread Theodore H. Smith
Thank you for the mail list address. I tried out the demo on OS9, and they work! Apparently, the OS9 version won't hit test, emulated on OSX. And the Carbon version won't run on OS9 emulated, because all my attempts to set Run in Classic Mode in the info window failed. The check box wouldn't

Quick ATSUI question

2002-11-20 Thread Theodore H. Smith
, but if the whim takes you then do so. -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

Why isn't my character displaying

2002-11-29 Thread Theodore H. Smith
Hi list, I'm directly calling ATSUI, for a framework I am writing. I have a character of value 987, Stigma. This is part of my UTF16 string. The rest of the string displays just fine. But my Stigma doesn't, it shows up as the Rectangle. What is wrong? Is it something to do with font
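Before suspecting the string itself, it can help to confirm what the value 987 actually is and how it looks in UTF-16. The sketch below does that with Python's `unicodedata` module; the helper name `describe` is an assumption for illustration and nothing here touches ATSUI.

```python
import unicodedata


def describe(value: int):
    """Return the U+ notation, the Unicode character name and the UTF-16BE bytes."""
    ch = chr(value)
    return f"U+{value:04X}", unicodedata.name(ch), ch.encode("utf-16-be").hex(" ")


if __name__ == "__main__":
    print(describe(987))
```

If the code point and its UTF-16 unit are what was intended, a rectangle in the output most likely points at font coverage rather than at the string data.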

Re: Why isn't my character displaying

2002-11-29 Thread Theodore H. Smith
Stigma is not a common character. Can you see it in any applications? Which fonts do you have that contain Greek characters? On a standard OS X install, I think this character is only present in the Japanese Hiragino Pro fonts. Also in Code2000, if you add this. I don't know what fonts

Not snazzy (was: New Unicode Savvy Logo)

2003-05-27 Thread Theodore H. Smith
My first reaction is that the logos don't look like they compare to other logos in terms of style. For example Mac OSX logos, XML logos, and that generally do look more snazzy. My second reaction is that I hope I haven't annoyed anyone. My third was that I probably ought to say it anyhow.

Re: Not snazzy (was: New Unicode Savvy Logo)

2003-05-29 Thread Theodore H. Smith
a tiff of the Unicode word (in its large original format) which is the part that I actually did like, I could re-do the rest for you in PhotoShop v6 format, and submit as a suggestion. -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

Re: Not snazzy (was: New Unicode Savvy Logo)

2003-05-29 Thread Theodore H. Smith
use it myself. I don't think I can be breaking a copyright by accepting a tiff emailed to me from Unicode.org staff. -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

Emailing logos to the list

2003-05-30 Thread Theodore H. Smith
I'm not sure what other people experience, but I see a note saying the attachment was (quite correctly I think) removed from the email, and instead just lists the name and format of the attachment. I'm on the digest format.

Re: Not snazzy (was: New Unicode Savvy Logo)

2003-05-30 Thread Theodore H. Smith
, but in a different sense. -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

[OT] OSX's bad Arabic support (was RE: [OT] No more IE for Mac)

2003-06-16 Thread Theodore H. Smith
of text via the mouse, occasional crashing with Arabic, etc. Actually Arabic displays in Safari OK. It just doesn't select OK. Entering one line of Arabic into Safari is OK, but multi lines give some of those bugs I mentioned. -- Theodore H. Smith - Macintosh Consultant / Contractor. My

Re: Arabic script web site hosting solution for all platforms

2003-06-18 Thread Theodore H. Smith
for your abuse of its AUP in this webmail. -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

Arabic/Hebrew coding for the Mac

2003-07-06 Thread Theodore H. Smith
a different set of headaches, not really less. Such is computing for real-world problems! Reply directly to me if you can please? At [EMAIL PROTECTED] -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

Questions on ZWNBS

2003-08-02 Thread Theodore H. Smith
ZWNBS, I think that char is discouraged. Where is the rule that discourages it? CC me directly please? -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

[off] XML. And RAM

2003-08-14 Thread Theodore H. Smith
if no one answers, already writing this, in the aim for people to understand, this helps me a lot get my thoughts straight! -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

Re: Non-ascii string processing?

2003-10-05 Thread Theodore H. Smith
Hi Doug, here are some things I think. If you really aren't processing anything but the ASCII characters within your strings, like and in your example, you can probably get away with keeping your existing byte-oriented code. At least you won't get false matches on the ASCII characters (this was
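The "no false matches" claim follows from the design of UTF-8: every byte of a multi-byte sequence is 0x80 or above, so an ASCII byte can only ever be an ASCII character. The sketch below searches UTF-8 bytes for an ASCII delimiter with plain byte-oriented code. The helper name `find_ascii` and the sample text with `<` as the delimiter are assumptions for illustration.

```python
def find_ascii(haystack: bytes, needle: bytes) -> list:
    """Return every byte offset at which the ASCII needle occurs."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions


# Sample text mixing ASCII markup with two-byte UTF-8 characters.
data = "caf\u00e9 <b>na\u00efve</b>".encode("utf-8")

if __name__ == "__main__":
    print(find_ascii(data, b"<"))
    # No byte of a non-ASCII character falls in the ASCII range.
    print(all(b >= 0x80 for b in "\u00e9\u00ef\u20ac".encode("utf-8")))
```

The offsets returned are byte offsets, not character counts, which is fine as long as the surrounding code slices the same byte string.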

A sensible answer (was Re: UTF-9)

2003-11-01 Thread Theodore H. Smith
From: Michael Everson [EMAIL PROTECTED] Please drop this thread. That's one of the most sensible answers I've heard to the nonsensical propositions that tend to fill this list ;oD!! (Including new hex characters, and other madnesses).

Please help knock my FAQ into shape

2003-11-10 Thread Theodore H. Smith
, and replacing them with urls to Unicode.org ;o) . Also suggestions like putting urls to the reference of where I got my data from! Thanks to all who answer. -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

Unicode dictionary coding? UTF8, UTF32, etc

2003-11-14 Thread Theodore H. Smith
Can someone give me some advice? If I was to write a dictionary class for Unicode, would I be better off writing it using a b-tree, or hash-bin system? Or maybe an array of pointers to arrays system? I suppose, that if I wanted an array of pointers to arrays, that I couldn't use UTF32, I could

What does i18n mean?

2003-11-14 Thread Theodore H. Smith
What does i18n mean? I see it bandied about a lot. My guess is internationalisation, but actually when you pronounce eye won ayht en it doesn't sound anything like that word. -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

Re: Ternary search trees for Unicode dictionaries

2003-11-17 Thread Theodore H. Smith
I've looked into the TST thing. I'm not sure that it is optimal, despite how popular they are! Look at this, if I add 1, 2, 3, 4, 5, 6, 7, 8, 9 to a TST, they will all be in a line, in the tree. All will be referenced via the high node. So, to find 9, I have to read through 9 items! Now, I'm
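The behaviour described here can be reproduced with a very small ternary search tree. The sketch below is a minimal, unbalanced TST written for this illustration (the names `Node`, `insert`, `search` and `build` are assumptions, and keys must be non-empty). It counts the nodes visited when looking up "9" after inserting the keys in sorted order, and again after inserting them median-first.

```python
class Node:
    """One TST node: a character plus low, equal and high child links."""
    __slots__ = ("ch", "lo", "eq", "hi", "end")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.end = False  # True if a key ends at this node


def insert(node, key, i=0):
    """Insert a non-empty key and return the (possibly new) subtree root."""
    ch = key[i]
    if node is None:
        node = Node(ch)
    if ch < node.ch:
        node.lo = insert(node.lo, key, i)
    elif ch > node.ch:
        node.hi = insert(node.hi, key, i)
    elif i + 1 < len(key):
        node.eq = insert(node.eq, key, i + 1)
    else:
        node.end = True
    return node


def search(node, key):
    """Return (found, nodes_visited) for a non-empty key."""
    i = 0
    visited = 0
    while node is not None:
        visited += 1
        ch = key[i]
        if ch < node.ch:
            node = node.lo
        elif ch > node.ch:
            node = node.hi
        elif i + 1 < len(key):
            node = node.eq
            i += 1
        else:
            return node.end, visited
    return False, visited


def build(keys):
    """Build a TST by inserting keys in the order given."""
    root = None
    for key in keys:
        root = insert(root, key)
    return root


sorted_root = build("123456789")    # sorted insertion: one long chain of high links
balanced_root = build("528137469")  # median-first insertion: a shallow tree

if __name__ == "__main__":
    print(search(sorted_root, "9"))
    print(search(balanced_root, "9"))
```

The chain is a property of the insertion order rather than of the structure itself, in the same way that an unbalanced binary search tree degrades when fed sorted input.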

Re: Ternary search trees for Unicode dictionaries

2003-11-18 Thread Theodore H. Smith
Hi Mark, Your tries are nice, however they are being used for single unicode characters, not a whole string of them, right? Well, sure some of them are being used for whole strings, but for me, ALL of mine will be used for whole strings. Yours are quite rare. Does this make the advantage of

Re: Ternary search trees for Unicode dictionaries

2003-11-18 Thread Theodore H. Smith
We tend to use tries, which have very good performance characteristics. See bits of unicode on my site: www.macchiato.com. You'll find many references of books, and papers available in PDF format on: http://citeseer.nj.nec.com/147026.html Thanks so much. This is actually the best article I've

Digest doesn't display unicode properly?

2003-11-21 Thread Theodore H. Smith
I see a lot of garbage characters in the unicode digest. The same emails, however, display fine when emailed to me directly, (although I can't understand them sometimes ;o) but someone who speaks the correct language would). Is this a problem with my mailer, or the unicode digest program? I

Thanks for the answers on dictionaries

2003-11-21 Thread Theodore H. Smith
. -- Theodore H. Smith - Macintosh Consultant / Contractor. My website: www.elfdata.com/

Exclamation mark comma

2003-11-26 Thread Theodore H. Smith
I've often wanted to type a symbol, that's like an exclamation mark, and a comma at the same time. That is, instead of the . on the bottom of a !, it has a , instead. Is there such a Unicode code point? Just out of curiosity! Or I suppose, is there a way to compose such a character.

Re: UTF-8 stress test file?

2004-10-10 Thread Theodore H. Smith
UTF8 into different files! That way I can use readline type code to do my UTF-8 verification. It would be nice if someone had an automated, test-ready UTF-8 file. If not, I'll modify this one and then put the results up on my website, someday. (week or so). -- Theodore H. Smith - Software

Re: UTF-8 stress test file?

2004-10-11 Thread Theodore H. Smith
that simply aren't in the file? - the file mixes UTF-8 and UTF-16 Does this file mix UTF-8 and UTF-16? I thought it just had surrogates encoded into UTF-8? Of course a surrogate should never exist in UTF-8. -- Theodore H. Smith - Software Developer. http://www.elfdata.com
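The distinction drawn here can be checked against a strict decoder. The sketch below feeds Python's UTF-8 codec a surrogate pair that has been encoded as two three-byte sequences, then the proper four-byte encoding of the same character. The helper name `decodes_as_utf8` and the choice of U+1D11E as the sample are assumptions for illustration.

```python
def decodes_as_utf8(data: bytes) -> bool:
    """True if the bytes are accepted by Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Surrogates D834 and DD1E, each written out as a three-byte sequence.
surrogates_in_utf8 = b"\xed\xa0\xb4\xed\xb4\x9e"
# The same character, U+1D11E, in well-formed UTF-8.
proper = "\U0001d11e".encode("utf-8")

if __name__ == "__main__":
    print(decodes_as_utf8(surrogates_in_utf8))
    print(decodes_as_utf8(proper), proper.hex(" "))
```

Such bytes are still byte-for-byte UTF-8-shaped, which is why a file containing them is not "mixing UTF-8 and UTF-16" in the sense of containing 16-bit units.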

Windows Latin1?

2004-11-06 Thread Theodore H. Smith
Are the Unicode Code Points from U+0000 to U+00FF equivalent to the Windows Latin 1 code points? Or are they equivalent to ISO-Latin-1? Sorry for asking this, I know it's answered somewhere, but I can't seem to find the answer on your website. -- Theodore H. Smith - Software Developer. http
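The question can be answered empirically with Python's built-in codecs: decode each of the 256 byte values as ISO-8859-1 and as Windows-1252 and see where the result differs from the byte value. The names `latin1_matches` and `cp1252_differences` are assumptions for this sketch, and the treatment of undefined bytes reflects Python's `cp1252` table.

```python
# Every ISO-8859-1 byte decodes to the code point with the same number.
latin1_matches = all(ord(bytes([b]).decode("latin-1")) == b for b in range(256))


def cp1252_differences() -> dict:
    """Map each byte whose cp1252 meaning differs from the same-numbered code point."""
    diffs = {}
    for b in range(256):
        try:
            ch = bytes([b]).decode("cp1252")
        except UnicodeDecodeError:
            diffs[b] = None  # byte left undefined in Python's cp1252 table
            continue
        if ord(ch) != b:
            diffs[b] = ch
    return diffs


if __name__ == "__main__":
    diffs = cp1252_differences()
    print(latin1_matches)
    print(f"{min(diffs):#04x}..{max(diffs):#04x}", len(diffs))
```

So the first 256 code points line up with ISO-8859-1, while Windows-1252 departs from them only in the 0x80 to 0x9F block, where it places printable characters such as the euro sign.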

Opinions on this Java URL?

2004-11-12 Thread Theodore H. Smith
http://java.sun.com/j2se/1.5.0/docs/api/java/io/DataInput.html#modified-utf-8 If only people could sue for suggesting bad coding practices ;o) -- Theodore H. Smith - Software Developer. http://www.elfdata.com
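For readers who have not followed the link: the format documented there differs from standard UTF-8 in two ways. U+0000 is written as the two bytes C0 80, and characters above U+FFFF are written as two three-byte sequences, one per UTF-16 surrogate. The sketch below imitates that in Python for comparison with the standard encoding. The function name `modified_utf8` is an assumption, and the two-byte length prefix that Java's `writeUTF` emits is deliberately left out.

```python
import struct


def modified_utf8(text: str) -> bytes:
    """Encode text per UTF-16 code unit, the way Java's modified UTF-8 does."""
    utf16 = text.encode("utf-16-be", "surrogatepass")
    out = bytearray()
    for (unit,) in struct.iter_unpack(">H", utf16):
        if 0x0001 <= unit <= 0x007F:
            out.append(unit)                     # plain ASCII, except NUL
        elif unit <= 0x07FF:
            out += bytes((0xC0 | (unit >> 6),    # includes NUL as C0 80
                          0x80 | (unit & 0x3F)))
        else:
            out += bytes((0xE0 | (unit >> 12),   # includes lone surrogates
                          0x80 | ((unit >> 6) & 0x3F),
                          0x80 | (unit & 0x3F)))
    return bytes(out)


if __name__ == "__main__":
    for sample in ("\x00", "A\u20ac", "\U0001d11e"):
        print(modified_utf8(sample).hex(" "), "vs", sample.encode("utf-8").hex(" "))
```

Text restricted to U+0001 through U+FFFF comes out identical in both encodings; only NUL and supplementary characters differ.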

Re: Opinions on this Java URL?

2004-11-12 Thread Theodore H. Smith
strive to avoid. Theodore H. Smith wrote: http://java.sun.com/j2se/1.5.0/docs/api/java/io/DataInput.html#modified-utf-8 If only people could sue for suggesting bad coding practices ;o) -- Theodore H. Smith - Software Developer. http://www.elfdata.com

Nicest UTF

2004-12-01 Thread Theodore H. Smith
as markup (XML for example) with no decompression, so it's quite handy. Anyhow, that's why I think UTF-8 is really the way to go. It's too bad Microsoft and Apple didn't realise the same, before they made their silly UCS-2 APIs. -- Theodore H. Smith - Software Developer - www.elfdata.com/plugin

If only MS Word was coded this well (was Re: Nicest UTF)

2004-12-07 Thread Theodore H. Smith
, for you. (The words would all be spelt correctly though, so as to not require expensive RAM copying when doing the replacements.) Yes, I do know how to code ;o) Too bad so few others do. -- Theodore H. Smith - Software Developer - www.elfdata.com/plugin/ Industrial strength string

Re: Software support costs (was: Nicest UTF)

2004-12-10 Thread Theodore H. Smith
. -- Theodore H. Smith - Software Developer - www.elfdata.com/plugin/ Industrial strength string processing code, made easy. (If you believe that's an oxymoron, see for yourself.)