Re: Unicode. Perl does the right thing?

H. Merijn Brand Sat, 26 Oct 2002 05:17:18 -0700

[EMAIL PROTECTED] (Dan Kogai) wrote in
news:1E456D1E-E7DE-11D6-BF8B-0003939A104C@;dan.co.jp:


> On Friday, Oct 25, 2002, at 14:10 Asia/Tokyo, Philip Newton wrote:
>> Well, partially because there's no "good" names for many of the
>> characters. What do you call "生"? "CJK UNIFIED IDEOGRAPH-751F"?
>> (That's the current Unicode "name", but it's not particularly useful.)
>> "CJK shou"? "CJK sei"? "CJK sheng1"? "CJK saeng"? "CJK ikiru"? ikasu,
>> ikeru, umareru, umu, ou, haeru, hayasu, ki, nama, naru, nasu, musu,
>> .... which one do you pick?
> 
> If we are stuck with the de jure, ex officio names from the Unicode
> Consortium, we are out of luck. But this is Perl: if there is more than
> one way to do it, why not more than one way to name it? I am wondering
> about a charnames extension that goes like
> 
> use charnames ":ja"; # Japanese
> print "\N{sei-ikiru}";
> #
> use charnames ":ko";
> print "\N{saeng}";
> #
> use charnames ":zh";
> print "\N{sheng1}";

All ideal for the new aliasing module!

use charnames ":full", ":alias" => "ja";

which is the same as

use charnames ":alias" => ":ja";

Maybe we should supply such alias files, which place no restriction on the
number of aliases for the same long name.

Assuming sei-ikiru's real name is "CHINESE BLUBBER POND WITH FROGS", there
is no problem with

use charnames ":full", ":alias" => {
        "sei-ikiru" => "CHINESE BLUBBER POND WITH FROGS",
        "saeng"     => "CHINESE BLUBBER POND WITH FROGS",
        "sheng1"    => "CHINESE BLUBBER POND WITH FROGS",
        "frog-pond" => "CHINESE BLUBBER POND WITH FROGS",
      };
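
For what it's worth, here is a minimal sketch of the same idea using the
name Philip quoted, "CJK UNIFIED IDEOGRAPH-751F", instead of my made-up one.
It assumes a perl whose charnames accepts a hash reference after ":alias"
and resolves the algorithmic CJK names (assumed here: 5.16 or later). The
three readings are only the examples from this thread, not a vetted list.

```perl
use strict;
use warnings;

# Assumption: this charnames supports ":alias" => { ... } and knows the
# algorithmically generated "CJK UNIFIED IDEOGRAPH-XXXX" names.
use charnames ":full", ":alias" => {
        "sei-ikiru" => "CJK UNIFIED IDEOGRAPH-751F",   # Japanese reading
        "saeng"     => "CJK UNIFIED IDEOGRAPH-751F",   # Korean reading
        "sheng1"    => "CJK UNIFIED IDEOGRAPH-751F",   # Chinese reading
      };

# All three aliases are expected to resolve to the same character.
my @chars = ("\N{sei-ikiru}", "\N{saeng}", "\N{sheng1}");

# Print the code point behind each alias.
printf "U+%04X\n", ord $_ for @chars;
```

Any number of aliases can point at the same long name; the hash simply maps
each alias to it.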

Forgive my ignorance of Korean, Japanese, Chinese and CJK codings in
general. Just pointing out the new wealth of possibilities.

Now we can support

unicore/ko_alias.pl
unicore/ja_alias.pl
unicore/zh_alias.pl
...
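
Such a file could be as small as a plain Perl list of alias/name pairs. The
sketch below is only an assumption about its shape: it supposes the file is
found as unicore/ja_alias.pl somewhere in @INC and that the list is the
file's last expression. The two entries are illustrative, taken from the
readings mentioned earlier in this thread.

```perl
# unicore/ja_alias.pl (hypothetical contents)
# The file evaluates to a list of alias => full-name pairs.
(
        "sei-ikiru" => "CJK UNIFIED IDEOGRAPH-751F",
        "nama"      => "CJK UNIFIED IDEOGRAPH-751F",
);
```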

> Since the pragmatic approach is rather inflexible, I would prefer an OO
> approach, like
> 
> use Char::Name;
> 
> my $char = Char::Name->new;
> 
> print $char->jp("sei-ikiru");
> 
> I know Japanese is the biggest nightmare to name characters because in
> Japanese we give too many "names" to each character; it's really hard
> to disambiguate these....
> 
> I may come up with something as I look through Unihan DB, now accessible
> via CPAN (Unicode::Unihan)....
> 
>> Cheers,
>> Philip Newton (不衣律不入豚)
> 
> \x{5c0f}\x{98fc} \x{5f3e}
> 
> 



-- 
H.Merijn Brand    Amsterdam Perl Mongers (http://www.amsterdam.pm.org/)
using perl-5.6.1, 5.7.2 & 630 on HP-UX 10.20 & 11.00, AIX 4.2, AIX 4.3,
     WinNT 4, Win2K pro & WinCE 2.11 often with Tk800.022 &/| DBD-Unify
ftp://ftp.funet.fi/pub/languages/perl/CPAN/authors/id/H/HM/HMBRAND/
