Re: [FYI] use encoding 'non-utf8-encoding'; use CGI;

2002-10-02 Thread Jarkko Hietaniemi
On Wed, Oct 02, 2002 at 10:44:06PM +0900, Dan Kogai wrote: > On Wednesday, Oct 2, 2002, at 22:34 Asia/Tokyo, Jarkko Hietaniemi wrote: > >>Yes. that's where hiragana -> katakana conversion is attempted; > >>English equivalent of tr/A-Z/a-z/. > > > >Okay... What are the {begin,end} codepoints of th

Re: [FYI] use encoding 'non-utf8-encoding'; use CGI;

2002-10-02 Thread Dan Kogai
On Wednesday, Oct 2, 2002, at 22:34 Asia/Tokyo, Jarkko Hietaniemi wrote: >> Yes. that's where hiragana -> katakana conversion is attempted; >> English equivalent of tr/A-Z/a-z/. > > Okay... What are the {begin,end} codepoints of those ranges, > both LHS and RHS of tr, both in EUC-JP and in Unicod

Re: Parsing JIS X 0208 & Shift JIS with 5.8.0 +++++Success

2002-10-02 Thread Robin
I'm cross posting this to the perl unicode list because the pods say they might be interested in my dopey luser feedback, well actually not with those words they don't :-). The process by which I arrived at the solution might seem painful to some, but I'm listing it here in case anyone else is/wi

Re: [FYI] use encoding 'non-utf8-encoding'; use CGI;

2002-10-02 Thread Jarkko Hietaniemi
> I can explain that. "\x{3af}bc\x{3af}de" is a string literal so > it gets encoded. However, my example in escaped form is: > > $kana =~ tr/\xA4\xA1-\xA4\xF3/\xA5\xA1-\xA5\xF3/ > > which does not get encoded. The intention was: > > $kana =~ tr/\x{3041}-\x{3093}/\x{30a1}-\x{30f3}/
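
A minimal sketch of the intended hiragana -> katakana conversion, assuming the EUC-JP bytes are decoded to Perl characters first (with the core Encode module) instead of relying on 'use encoding'. The helper name hira_to_kata and the sample bytes are hypothetical; the code-point ranges are the ones quoted in the message.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: shift hiragana U+3041..U+3093 onto katakana
# U+30A1..U+30F3. Expects decoded characters, not raw EUC-JP bytes.
sub hira_to_kata {
    my ($text) = @_;
    (my $kana = $text) =~ tr/\x{3041}-\x{3093}/\x{30a1}-\x{30f3}/;
    return $kana;
}

# Two hiragana characters as EUC-JP bytes (lead byte 0xA4).
my $euc_in = "\xA4\xD2\xA4\xE9";

# Decode to characters, translate, then encode back to EUC-JP.
my $euc_out = encode('euc-jp', hira_to_kata(decode('euc-jp', $euc_in)));
```

Because the ranges in tr/// are written as code points, the range arithmetic happens on characters and the "-" is never applied to individual bytes of a multibyte sequence.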

Re[4]: Encode::compat 0.01 says "Unsupported conversion"

2002-10-02 Thread Robert Allerstorfer
Hi, On Tue, 1 Oct 2002, 10:07 GMT+08 (04:07 local time) Autrijus Tang wrote: > On Wed, Sep 25, 2002 at 07:44:47PM +0200, Robert Allerstorfer wrote: >> my ($from, $to) = map { s/^utf8$/utf-8/i; lc($_) } ($_[1], $_[2]); >> But this fails due to the attempt to change $_[1]. I fixed this by >>
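
One way to avoid modifying $_[1] is to work on a copy inside the map block. This is only a sketch and not necessarily the fix described in the truncated message; the helper name normalize_names and the sample arguments are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase encoding names and spell "utf8" as "utf-8".
# Each element is copied into $n first, so the caller's arguments
# (which may be read-only literals) are never modified through the alias.
sub normalize_names {
    return map { my $n = lc $_; $n =~ s/^utf8$/utf-8/; $n } @_;
}

my ($from, $to) = normalize_names('UTF8', 'big5');
```

In the original form, s/// operates on $_, which aliases the element of @_, so passing a string literal triggers a "Modification of a read-only value attempted" error.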

Re: [FYI] use encoding 'non-utf8-encoding'; use CGI;

2002-10-02 Thread Jarkko Hietaniemi
> >Are you doing character ranges in the tr/// under 'use encoding'? > >(I'm asking because I see a "-" in the middle of what I assume is > >mangled EUC-JP) > > Yes. that's where hiragana -> katakana conversion is attempted; > English equivalent of tr/A-Z/a-z/. Okay... What are the {begin,end

Re: [FYI] use encoding 'non-utf8-encoding'; use CGI;

2002-10-02 Thread Dan Kogai
On Wednesday, Oct 2, 2002, at 21:51 Asia/Tokyo, Jarkko Hietaniemi wrote: > However, I will need to stare at your example some more, since > for simpler cases I think tr/// *is* obeying the 'use encoding': > > use encoding 'greek'; > ($a = "\x{3af}bc\x{3af}de") =~ tr/\xdf/a/; > print $a, "\n"; > >

Re: [FYI] use encoding 'non-utf8-encoding'; use CGI;

2002-10-02 Thread Dan Kogai
On Wednesday, Oct 2, 2002, at 22:15 Asia/Tokyo, Jarkko Hietaniemi wrote: > (Hi, it's me again...) > > Are you doing character ranges in the tr/// under 'use encoding'? > (I'm asking because I see a "-" in the middle of what I assume is > mangled EUC-JP) Yes. that's where hiragana -> katakana conv

Re: [FYI] use encoding 'non-utf8-encoding'; use CGI;

2002-10-02 Thread Jarkko Hietaniemi
(Hi, it's me again...) Are you doing character ranges in the tr/// under 'use encoding'? (I'm asking because I see a "-" in the middle of what I assume is mangled EUC-JP)

Re: [FYI] use encoding 'non-utf8-encoding'; use CGI;

2002-10-02 Thread Jarkko Hietaniemi
(Not that I understand any Japanese but) could you resend your script as an attachment? I'm afraid it might get mangled otherwise. In the headers I see the following: Content-Type: text/plain; charset=ISO-2022-JP; format=flowed ... Content-Transfer-Encoding: 7bit and when I save the mess

Re: [FYI] use encoding 'non-utf8-encoding'; use CGI;

2002-10-02 Thread Jarkko Hietaniemi
However, I will need to stare at your example some more, since for simpler cases I think tr/// *is* obeying the 'use encoding': use encoding 'greek'; ($a = "\x{3af}bc\x{3af}de") =~ tr/\xdf/a/; print $a, "\n"; This does print "abcade\n", and it also works when I replace the \xdf with the literal
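
For comparison, a sketch of the same Greek example written without 'use encoding', assuming the byte is decoded explicitly with the core Encode module. The variable names are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Byte 0xDF in ISO-8859-7 decodes to U+03AF, the character used in the string.
my $iota = decode('iso-8859-7', "\xdf");

# Spell the tr/// search list as a code point rather than an encoded byte.
(my $s = "\x{3af}bc\x{3af}de") =~ tr/\x{3af}/a/;
```

Here $s ends up as "abcade", matching the output reported for the 'use encoding' version.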

Re: [FYI] use encoding 'non-utf8-encoding'; use CGI;

2002-10-02 Thread Jarkko Hietaniemi
> As you see, tr/// is not subject to the magic of 'use encoding'. > jhi, have we made it so deliberately? I am beginning to think tr/// Not deliberately, no. I agree that making tr/// understand 'use encoding' would be good. > is happier to embrace the power thereof. > > Still, it can b

[FYI] use encoding 'non-utf8-encoding'; use CGI;

2002-10-02 Thread Dan Kogai
I am currently writing yet another CGI book. It is for the Japanese market and written in Japanese, so it inevitably has to face the labyrinth of character encodings. Before perl 5.8.0, most books teaching how to handle Japanese in CGI went as follows: * stick with EUC-JP. it do