the correct
Reply-To: field,
and have that point to [EMAIL PROTECTED]
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
it easier to understand would help.
Or perhaps I'm just reacting to the confusion of the Unicode
website and it's not that hard to understand and a simple definition
would do? But the first idea certainly wouldn't hurt.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website
standard
definition.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
not a major problem though until then, because that's above
what almost anyone will be using. I don't know if it's allocated
yet, anyhow. It's below 10 though.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
What is going to be done about the confusion generated from
having multiple ways to encode the same character?
For example, for filenames, OS X will encode an accented Roman
letter one way, while for filenames Windows will encode it the
other way. These kinds of confusion are totally expected,
appreciate
them. ;-)
That's a shame. Simplicity is wonderful.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
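A minimal sketch of the "two encodings, one character" problem raised above, using Python's standard unicodedata module. The helper name same_text is hypothetical, and comparing after NFC normalization is only one possible policy, not something the thread prescribes.

```python
import unicodedata

composed = "\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(composed == decomposed)                                # False
print(same_text(composed, decomposed))                       # True
print(unicodedata.normalize("NFD", composed) == decomposed)  # True
```

The raw comparison fails because the code point sequences differ, even though both render as the same accented letter.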
http://www.unicode.org/unicode/reports/tr15/ mentions both
composites and combining sequences.
But it doesn't tell us the difference. I know what a combining
sequence is. If I didn't know what a composite was, I'd guess it
was the same thing as a combining sequence.
However, the two are
was tightened up.
Perhaps this code should be tightened up along with the standard
now?
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
It seems I missed the isLegalUTF8 function calls that verify
whether the input is valid UTF-8. Never mind then, it's all OK.
On Wednesday, July 17, 2002, at 01:57 , Theodore H. Smith wrote:
The file ConvertUTF.c contains this array:
static const char trailingBytesForUTF8[256
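The point of the reply above is that a trailing-byte count alone does not validate anything; a separate legality check is still needed. A rough sketch of that distinction follows. It is not the table or the isLegalUTF8 routine from ConvertUTF.c: trailing_bytes and is_valid_utf8 are hypothetical names, the ranges follow the current four-byte definition of UTF-8, and the validity check simply leans on Python's strict decoder.

```python
def trailing_bytes(lead: int) -> int:
    """Continuation bytes implied by a UTF-8 lead byte, or -1 if it cannot start a sequence."""
    if lead < 0x80:
        return 0            # ASCII
    if 0xC2 <= lead <= 0xDF:
        return 1            # two-byte sequence
    if 0xE0 <= lead <= 0xEF:
        return 2            # three-byte sequence
    if 0xF0 <= lead <= 0xF4:
        return 3            # four-byte sequence
    return -1               # continuation byte or disallowed lead byte


def is_valid_utf8(data: bytes) -> bool:
    """Full validity check, delegated to Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


overlong = b"\xe0\x80\x80"           # lead byte looks fine, sequence is overlong
print(trailing_bytes(overlong[0]))   # 2
print(is_valid_utf8(overlong))       # False
```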
Ugaritic Cuneiform
Shavian
Osmanya
Cypriot Syllabary
What's the point of having more Latin characters? Do they look
like normal Roman characters? I think we have a few versions (3
or more?) of them, already. I thought once was enough.
--
Theodore H. Smith - Macintosh Consultant / Contractor
not? Is this a bug in the demo, or a bug in ATSUI for OS9? Does
ATSUI for Carbon on OS9 work if ATSUI for Classic OS9 doesn't?
If anyone knows ATSUI well, could you please contact me so I can ask a
few more questions? Thanks a lot.
--
Theodore H. Smith - Macintosh Consultant / Contractor
, and not a char
count like it claims.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
://developer.apple.com/techpubs/macosx/Carbon/text/ATSUI/atsui.html
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
Thank you for the mail list address.
I tried out the demos on OS9, and they work! Apparently, the OS9 version
won't hit test, emulated on OSX. And the Carbon version won't run on
OS9 emulated, because all my attempts to set Run in Classic Mode in
the info window failed. The check box wouldn't
, but if
the whim takes you then do so.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
Hi list,
I'm directly calling ATSUI, for a framework I am writing.
I have a character of value 987, Stigma. This is part of my UTF-16
string. The rest of the string displays just fine. But my Stigma
doesn't, it shows up as the Rectangle.
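A quick sanity check of the value itself, sketched in Python rather than ATSUI: it only confirms which code point 987 is and how it looks as a UTF-16 code unit. It says nothing about font coverage, which has to be checked separately.

```python
import unicodedata

stigma = chr(987)

print(f"U+{ord(stigma):04X}")            # U+03DB
print(unicodedata.name(stigma))          # GREEK SMALL LETTER STIGMA
print(stigma.encode("utf-16-be").hex())  # 03db
```

Since the value is a single, ordinary BMP code unit, a rectangle on screen points toward a missing glyph rather than a malformed string.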
What is wrong? Is it something to do with font
Stigma is not a common character. Can you see it in any applications?
Which fonts do you have that contain Greek characters?
On a standard OS X install, I think this character is only present in
the Japanese Hiragino Pro fonts. Also in Code2000, if you add this.
I don't know what fonts
My first reaction is that the logos don't compare well with
other logos in terms of style. Mac OS X logos and XML logos, for
example, generally do look more snazzy.
My second reaction is that I hope I haven't annoyed anyone.
My third was that I probably ought to say it anyhow.
a tiff of the Unicode word (in its large
original format) which is the part that I actually did like, I could
re-do the rest for you in PhotoShop v6 format, and submit as a
suggestion.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
use it myself. I don't think I can be breaking a copyright by
accepting a tiff emailed to me from Unicode.org staff.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
I'm not sure what other people experience, but I see a note saying the
attachment was (quite correctly I think) removed from the email, and
instead just lists the name and format of the attachment.
I'm on the digest format.
, but in a different sense.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
of text via the mouse,
occasional crashing with Arabic, etc.
Actually Arabic displays in Safari OK. It just doesn't select OK.
Entering one line of Arabic into Safari is OK, but multi lines give
some of those bugs I mentioned.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My
for your abuse of its AUP in this webmail.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
a different set of headaches, not really less.
Such is computing for real-world problems!
Reply directly to me if you can please? At [EMAIL PROTECTED]
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
ZWNBSP, I think that char is discouraged. Where
is the rule that discourages it?
CC me directly please?
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
if no one
answers, just writing this so that people can understand it
helps me a lot to get my thoughts straight!
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
Hi Doug,
Here are some things I think.
If you really aren't processing anything but the ASCII characters
within
your strings, like and in your example, you can probably get
away with keeping your existing byte-oriented code. At least you won't
get false matches on the ASCII characters (this was
From: Michael Everson [EMAIL PROTECTED]
Please drop this thread.
That's one of the most sensible answers I've heard to the nonsensical
propositions that tend to fill this list ;oD!! (Including new hex
characters, and other madnesses).
, and replacing them with urls to Unicode.org
;o) . Also suggestions like putting urls to the reference of where I
got my data from!
Thanks to all who answer.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
Can someone give me some advice? If I were to write a dictionary class
for Unicode, would I be better off writing it using a b-tree, or
hash-bin system? Or maybe an array of pointers to arrays system?
I suppose, that if I wanted an array of pointers to arrays, that I
couldn't use UTF32, I could
What does i18n mean? I see it bandied about a lot.
My guess is internationalisation, but actually when you pronounce
eye won ayht en it doesn't sound anything like that word.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
I've looked into the TST thing.
I'm not sure that it is optimal, despite how popular they are!
Look at this: if I add 1, 2, 3, 4, 5, 6, 7, 8, 9 to a
TST, they will all be in a line in the tree. Each will be referenced via
the high node.
So, to find 9, I have to read through 9 items!
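The degenerate case described above can be reproduced with a small ternary search tree. This is only a sketch under my own assumptions (the class and method names are hypothetical, keys are non-empty strings, and no balancing is attempted); it counts the nodes a lookup touches for two insertion orders.

```python
class Node:
    __slots__ = ("ch", "lo", "eq", "hi", "is_end")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.is_end = False


class TernarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, key):
        if not key:
            raise ValueError("empty keys are not supported in this sketch")
        self.root = self._insert(self.root, key, 0)

    def _insert(self, node, key, i):
        ch = key[i]
        if node is None:
            node = Node(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, key, i)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, key, i)
        elif i + 1 < len(key):
            node.eq = self._insert(node.eq, key, i + 1)
        else:
            node.is_end = True
        return node

    def visits(self, key):
        """Return (found, nodes_visited) for a lookup of key."""
        if not key:
            return False, 0
        node, i, count = self.root, 0, 0
        while node is not None:
            count += 1
            ch = key[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            elif i + 1 < len(key):
                node, i = node.eq, i + 1
            else:
                return node.is_end, count
        return False, count


in_order = TernarySearchTree()
for key in "123456789":
    in_order.insert(key)
print(in_order.visits("9"))   # (True, 9)

middle_first = TernarySearchTree()
for key in "537246819":
    middle_first.insert(key)
print(middle_first.visits("9"))   # (True, 4)
```

With sorted insertion every key hangs off the previous node's high link, so the structure behaves like a linked list. Inserting the median first gives a much shallower tree, which suggests the cost comes from insertion order rather than from the TST idea itself.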
Now, I'm
Hi Mark,
Your tries are nice, however they are being used for single Unicode
characters, not a whole string of them, right? Well, sure some of them
are being used for whole strings, but for me, ALL of mine will be used
for whole strings. Yours are quite rare.
Does this make the advantage of
We tend to use tries, which have very good performance
characteristics.
See
bits of unicode on my site: www.macchiato.com.
You'll find many references of books, and papers available in PDF
format on:
http://citeseer.nj.nec.com/147026.html
Thanks so much. This is actually the best article I've
I see a lot of garbage characters in the Unicode digest.
The same emails, however, display fine when emailed to me directly,
(although I can't understand them sometimes ;o) but someone who speaks
the correct language would).
Is this a problem with my mailer, or the Unicode digest program? I
.
--
Theodore H. Smith - Macintosh Consultant / Contractor.
My website: www.elfdata.com/
I've often wanted to type a symbol, that's like an exclamation mark,
and a comma at the same time. That is, instead of the . on the bottom
of a !, it has a , instead.
Is there such a Unicode code point? Just out of curiosity! Or, I
suppose, is there a way to compose such a character?
UTF8 into different files! That way I
can use readline type code to do my UTF-8 verification.
It would be nice if someone had an automated-test-ready UTF-8 file.
If not, I'll modify this one and then put the results up on my website,
someday. (week or so).
--
Theodore H. Smith - Software
that simply aren't in the file?
- the file mixes UTF-8 and UTF-16
Does this file mix UTF-8 and UTF-16? I thought it just had surrogates
encoded into UTF-8? Of course a surrogate should never exist in UTF-8.
--
Theodore H. Smith - Software Developer.
http://www.elfdata.com
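A sketch of the line-by-line verification idea mentioned above, assuming the test data is available as bytes. The function name check_lines and the sample data are made up for illustration; the sample's third line is a surrogate (U+D800) encoded as three bytes, which a strict UTF-8 decoder rejects.

```python
def check_lines(data: bytes):
    """Return the 1-based numbers of lines that are not valid UTF-8."""
    bad = []
    # Splitting on 0x0A is safe: that byte never occurs inside a multi-byte sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad


sample = (
    b"plain ASCII\n"
    + "caf\u00e9".encode("utf-8") + b"\n"
    + b"\xed\xa0\x80\n"          # surrogate U+D800 forced into UTF-8 form
)
print(check_lines(sample))   # [3]
```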
Are the Unicode code points from U+0000 to U+00FF equivalent to the Windows
Latin 1 code points?
Or are they equivalent to ISO-Latin-1?
Sorry for asking this, I know it's answered somewhere, but I can't seem
to find the answer on your website.
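One way to check this empirically, sketched with Python's built-in codecs (using "latin-1" for ISO-8859-1 and "cp1252" for the Windows code page): decoding each byte with the Latin-1 codec yields the code point of the same value, while the Windows code page differs in the 0x80 to 0x9F range.

```python
all_bytes = bytes(range(256))

# Latin-1: byte value N decodes to code point U+00NN for every byte.
print(all_bytes.decode("latin-1") == "".join(map(chr, range(256))))  # True

# Windows-1252 assigns printable characters in 0x80..0x9F instead.
print(hex(ord(b"\x80".decode("cp1252"))))   # 0x20ac (EURO SIGN)
print(hex(ord(b"\x80".decode("latin-1"))))  # 0x80
```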
--
Theodore H. Smith - Software Developer.
http
http://java.sun.com/j2se/1.5.0/docs/api/java/io/DataInput.html#modified-utf-8
If only people could sue for suggesting bad coding practices ;o)
--
Theodore H. Smith - Software Developer.
http://www.elfdata.com
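For readers who have not met it, the "modified UTF-8" described at that link differs from standard UTF-8 in two ways: U+0000 is written as the two bytes C0 80, and supplementary characters are written as two three-byte sequences, one per UTF-16 surrogate. A rough Python sketch of such an encoder follows; the function name modified_utf8 is hypothetical, and this is an illustration, not Java's implementation.

```python
def modified_utf8(text: str) -> bytes:
    """Encode text the way Java's modified UTF-8 is documented to (illustrative sketch)."""
    out = bytearray()
    utf16 = text.encode("utf-16-be")          # work on UTF-16 code units
    for i in range(0, len(utf16), 2):
        unit = int.from_bytes(utf16[i:i + 2], "big")
        if unit == 0:
            out += b"\xc0\x80"                # NUL gets a two-byte form
        elif unit < 0x80:
            out.append(unit)
        elif unit < 0x800:
            out += bytes([0xC0 | (unit >> 6), 0x80 | (unit & 0x3F)])
        else:                                 # includes surrogates, encoded one by one
            out += bytes([
                0xE0 | (unit >> 12),
                0x80 | ((unit >> 6) & 0x3F),
                0x80 | (unit & 0x3F),
            ])
    return bytes(out)


print(modified_utf8("\x00").hex())        # c080
print(modified_utf8("\U0001F600").hex())  # eda0bdedb880
print("\U0001F600".encode("utf-8").hex()) # f09f9880
```

The six-byte result for U+1F600 is exactly the "surrogates encoded into UTF-8" pattern that a strict UTF-8 decoder rejects.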
strive to avoid.
Theodore H. Smith wrote:
http://java.sun.com/j2se/1.5.0/docs/api/java/io/DataInput.html#modified-utf-8
If only people could sue for suggesting bad coding practices ;o)
--
Theodore H. Smith - Software Developer.
http://www.elfdata.com
as
markup (XML for example) with no decompression, so it's quite handy.
Anyhow, that's why I think UTF-8 is really the way to go.
It's too bad Microsoft and Apple didn't realise the same, before they
made their silly UCS-2 APIs.
--
Theodore H. Smith - Software Developer - www.elfdata.com/plugin
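The markup argument above rests on a property of UTF-8: every byte of a multi-byte sequence is 0x80 or higher, so a byte-oriented search for ASCII delimiters cannot produce false matches inside non-ASCII text. A small sketch, with a hypothetical helper name find_ascii and made-up sample text:

```python
def find_ascii(haystack: bytes, needle: bytes):
    """Return every byte offset where an ASCII needle occurs in UTF-8 data."""
    if not needle or not needle.isascii():
        raise ValueError("needle must be non-empty ASCII")
    offsets = []
    start = haystack.find(needle)
    while start != -1:
        offsets.append(start)
        start = haystack.find(needle, start + 1)
    return offsets


doc = "<p>caf\u00e9 \u65e5\u672c\u8a9e</p>".encode("utf-8")

print(find_ascii(doc, b"<"))   # [0, 18]
# Every byte belonging to a non-ASCII character has its high bit set.
print(all(b >= 0x80 for b in "\u00e9\u65e5\u672c\u8a9e".encode("utf-8")))  # True
```

No decoding step is needed to locate the tags, which is the convenience being described.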
, for you. (The
words would all be spelt correctly though, so as to not require
expensive RAM copying when doing the replacements.)
Yes, I do know how to code ;o)
Too bad so few others do.
--
Theodore H. Smith - Software Developer - www.elfdata.com/plugin/
Industrial strength string
.
--
Theodore H. Smith - Software Developer - www.elfdata.com/plugin/
Industrial strength string processing code, made easy.
(If you believe that's an oxymoron, see for yourself.)