Take a look at:


The Absolute Minimum Every Software Developer Absolutely, Positively
Must Know About Unicode and Character Sets (No Excuses!)

First Paragraphs:

Ever wonder about that mysterious Content-Type tag? You know, the one
you're supposed to put in HTML and you never quite know what it should
be?

Did you ever get an email from your friends in Bulgaria with the subject
line "???? ?????? ??? ????"?

I've been dismayed to discover just how many software developers aren't
really completely up to speed on the mysterious world of character sets,
encodings, Unicode, all that stuff. A couple of years ago, a beta tester
for FogBUGZ was wondering whether it could handle incoming email in
Japanese. Japanese? They have email in Japanese? I had no idea. When I
looked closely at the commercial ActiveX control we were using to parse
MIME email messages, we discovered it was doing exactly the wrong thing
with character sets, so we actually had to write heroic code to undo the
wrong conversion it had done and redo it correctly. When I looked into
another commercial library, it, too, had a completely broken character
code implementation. I corresponded with the developer of that package
and he sort of thought they "couldn't do anything about it." Like many
programmers, he just wished it would all blow over somehow.

FarsiWeb mailing list
