At 12:00 am +0900 27/12/05, Joel Rees wrote:

I'll have to tell you a war story or two, sometime.

Unicode is a kludge. It's one of the better kludges, and evidence that kludges make the world go round...

Just as well it doesn't rely on iso-2022-jp or us-ascii. It's not Unicode that is the kludge; Unicode is simply the assignment of a unique character to each of a large range of numbers, rather than the assignment of an arbitrary number of characters to a range any American president can conceive of. The present temporary problems with Unicode arise only from a long anarchic heritage of monumental kludges.

...The frustrating thing about this is that I've been here before, about three years back when the perl implementation wasn't quite as complete, but I can't remember what I did, and I don't have access to the code I built then anymore.

I have the same problem again and again with a mere hour's interval!

The script below reduces the problem to its simplest form. Note the deadly caveats. In my experience (and I have war stories too), the harder you try with Perl and Unicode, the worse the mess you get into. You can probably forget about locale (try “use encoding (":locale")” in the script below and see what you get!) and lots of other things. It's certainly a jungle, and it's growing, but it's getting tidier.

#!/usr/bin/perl
#
#  In BBEdit/TextWrangler set this document's
#  encoding to Japanese (Shift JIS); always open/reopen
#  as Japanese (Shift JIS).
#
#  In BBEdit/TextWrangler Preferences/Unix Scripting
#  check “use UTF-8” for Unix Script I/O.
#
#  When running in Terminal set Window Settings...
# [Display] [Character Set Encoding] to “Unicode (UTF-8)”.
#
### use utf8; # NO !!
# no encoding; # OK, optional
# binmode STDOUT, "UTF-8"; # OK, optional
### binmode STDOUT, ":utf8"; ### NO !! Quite different !!
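use Encode qw(from_to);
while (<DATA>) {
        next if /^#/;                       # skip the comment lines in DATA
        from_to($_, "Shift_JIS", "utf8");   # convert the bytes in place
        print;                              # STDOUT has no layer: bytes go out as-is
}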
__DATA__
# Must not contain non-Shift_JIS characters
空欄を埋めたり、完全な文書で質問に答えたり、
一番適切に思う解答を〇で記したりする。
##################################################
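
To see why the ":utf8" layer is "quite different" here, a minimal sketch may help. It assumes the explanation is double encoding: from_to leaves a string of UTF-8 octets, and a ":utf8" layer would then encode each of those octets again as if it were a Latin-1 character. The sub names (via_from_to, via_decode, double_encoded) are mine, for illustration only, and the sample is hiragana "a" written as byte escapes so the file's own encoding cannot interfere.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode from_to);

# Hiragana "a" (U+3042) as Shift_JIS bytes.
my $sjis = "\x82\xA0";

# Route 1: from_to converts octets to octets, in place.
# The result is a byte string; print it with NO layer on STDOUT.
sub via_from_to {
    my ($bytes) = @_;
    from_to($bytes, "Shift_JIS", "utf8");
    return $bytes;
}

# Route 2: decode to a character string.
# This is the form that an encoding layer on STDOUT expects.
sub via_decode {
    my ($bytes) = @_;
    return decode("Shift_JIS", $bytes);
}

# What encoding the already-encoded octets a second time produces:
# each octet is treated as a Latin-1 character and encoded again.
sub double_encoded {
    my ($octets) = @_;
    return encode("UTF-8", $octets);
}

my $octets = via_from_to($sjis);
my $chars  = via_decode($sjis);
my $twice  = double_encoded($octets);

printf "from_to: %d octets (%s)\n", length($octets), unpack("H*", $octets);
printf "decode:  %d character (U+%04X)\n", length($chars), ord($chars);
printf "twice:   %d octets (%s)\n", length($twice), unpack("H*", $twice);
```

So the two routes should not be mixed: either from_to and a bare STDOUT, as in the script above, or decode and an encoding layer on STDOUT.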
