Re: Encode-JIS2K-0.02 problem
Hello Joel,

On Jan 4, 2007, at 4:09 PM, Joel Rees wrote:

> On 2007/01/03, at 23:52, Nobumi Iyanaga wrote:
>
>> $_ = decode ("shiftjisx0123", $_);
>> print;
>>
>> I get this error message:
>>
>> untitled text 4:21: Unknown encoding 'shiftjisx0123'
>
> Is that a typo?
>
>> What am I doing wrong...??
>
> Maybe 0123 should be 2013? (I've never seen the version number for
> jis tagged on the end, but ...)

Ah! thank you! That's right. It is "shiftjisx0213"! My excuse, if there is any, is that I copied "shiftjisx0123" from search.cpan.org/~dankogai/Encode-JIS2K-0.02/JIS2K.pm, under ABSTRACT ("Canonical").

>> ---
>> And -- if I can solve this problem, I would like to find out from
>> text files in shiftjisx0123 characters which belong only to JIS X
>> 0213, not to JIS X 0212. Is this possible...??
>
> I'm sure it's possible, either by making something like an isprint
> boolean table for each entire character set, or by slurping the file
> and scanning it in parallel from memory.
>
> I think it should even be possible to open two read-only streams on
> the same file, read characters out, and throw some message when the
> one doesn't match the other.
>
> Don't know if there are any shortcut tools for it.

Thank you. I will try to study a little more on this problem.

Best regards,

Nobumi Iyanaga
Tokyo, Japan
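One way to try the "table per character set" idea from the reply above is to decode the text first and then ask, character by character, whether another encoding can represent it. The following is only a sketch under assumptions: `chars_not_in` is a hypothetical helper name, and the choice of "euc-jp" as the comparison encoding in the usage comment is an assumption (Encode's euc-jp covers JIS X 0208 and JIS X 0212), not something stated in the thread.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return the distinct characters of an already
# decoded string that cannot be represented in $encoding.
sub chars_not_in {
    my ($text, $encoding) = @_;

    # Fail early on a misspelled encoding name; otherwise every
    # character would be reported as missing.
    Encode::find_encoding($encoding)
        or die "Unknown encoding '$encoding'\n";

    my (%seen, @missing);
    for my $ch (split //, $text) {
        next if $seen{$ch}++;    # report each character only once
        # FB_CROAK makes encode() die on an unmappable character;
        # LEAVE_SRC keeps encode() from modifying $ch.
        my $ok = eval {
            encode($encoding, $ch,
                   Encode::FB_CROAK() | Encode::LEAVE_SRC());
            1;
        };
        push @missing, $ch unless $ok;
    }
    return @missing;
}

# Possible usage (assumes Encode::JIS2K is installed and $bytes holds
# the raw file contents):
#   use Encode::JIS2K;
#   my $text  = Encode::decode("shiftjisx0213", $bytes);
#   my @only  = chars_not_in($text, "euc-jp");
```

The helper works on decoded characters rather than raw bytes, so the same function can be pointed at any pair of encodings that Encode knows about.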
Encode-JIS2K-0.02 problem
Hello,

I downloaded and installed Encode-JIS2K-0.02. Install log says that all tests were successful. But when I do this:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Encode::JIS2K;
    use Encode qw/encode decode/;

    my $infile = "some_shiftjisx0123.txt";
    undef $/;
    open (IN, $infile);
    $_ = <IN>;
    close (IN);

    binmode (STDOUT, ":utf8");
    $_ = decode ("shiftjisx0123", $_);
    print;

I get this error message:

    untitled text 4:21: Unknown encoding 'shiftjisx0123'

What am I doing wrong...??

---
And -- if I can solve this problem, I would like to find out from text files in shiftjisx0123 characters which belong only to JIS X 0213, not to JIS X 0212. Is this possible...??

Thank you very much in advance.

Best regards,

Nobumi Iyanaga
Tokyo, Japan
Re: encode qp a Unicode string
Hello Gisle,

On Oct 7, 2006, at 11:04 PM, Gisle Aas wrote:

> Nobumi Iyanaga <[EMAIL PROTECTED]> writes:
>
>> What am I doing wrong?
>
> You did not read 'perldoc MIME::QuotedPrint' to the end :)
>
> |Perl v5.8 and better allow extended Unicode characters in strings. Such strings
> |cannot be encoded directly, as the quoted-printable encoding is only defined for
> |single-byte characters. The solution is to use the Encode module to select the byte
> |encoding you want. For example:
> |
> |use MIME::QuotedPrint qw(encode_qp);
> |use Encode qw(encode);
> |
> |$encoded = encode_qp(encode("UTF-8", "\x{FFFF}\n"));
> |print $encoded;

Ah, thank you very much indeed!

Best regards,

Nobumi Iyanaga
Tokyo, Japan
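Applied to a script that keeps "use utf8", the perldoc advice amounts to turning the character string into bytes before calling encode_qp, and reversing both steps when reading the result back. A minimal sketch, assuming UTF-8 as the byte encoding; `qp_encode_text` and `qp_decode_text` are hypothetical helper names, not part of MIME::QuotedPrint:

```perl
use strict;
use warnings;
use utf8;                                   # string literals below are characters
use MIME::QuotedPrint qw(encode_qp decode_qp);
use Encode qw(encode decode);

# Hypothetical helper: characters -> bytes in $charset -> quoted-printable.
sub qp_encode_text {
    my ($text, $charset) = @_;
    return encode_qp(encode($charset, $text));
}

# Hypothetical helper: quoted-printable -> bytes -> characters.
sub qp_decode_text {
    my ($qp, $charset) = @_;
    return decode($charset, decode_qp($qp));
}

my $unicode_string = "日本語\n";
my $encoded = qp_encode_text($unicode_string, "UTF-8");
print "qp: $encoded";
```

Because encode_qp only ever sees bytes here, the "Wide character in subroutine entry" error should not occur, whatever characters the string contains.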
encode qp a Unicode string
Hello,

I have a Unicode string that I would like to convert into quoted-printable encoding, but if I do:

    #!/usr/bin/perl
    use utf8;
    use MIME::QuotedPrint;

    my $unicode_string = "xxx";    # where I have real Unicode string, for example Japanese characters...
    $encoded = encode_qp($unicode_string);
    print "qp: $encoded\n";

I get the error message:

    "Wide character in subroutine entry"

If I comment out the "use utf8", I get the right result, but I need it for my script.

I tried also to convert the Unicode string to data using the code

    $native_string = pack("C*", unpack("U*", $Unicode_string));

that I found in perluniintro, but I get the error message:

    "Character in 'C' format wrapped in pack"

What am I doing wrong?

Thank you very much in advance for any help.

Best regards,

Nobumi Iyanaga
Tokyo, Japan
How to know if a module is installed
Hello,

This is a newbie question: how can I determine if a specific module is installed on a client machine? I would like to do something like this:

    if (MacPerl installed is true) {
        do this...;
    }
    else {
        do nothing...;
    }

Thank you in advance for any help.

Best regards,

Nobumi Iyanaga
Tokyo, Japan
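A common Perl idiom for this kind of check is to attempt a `require` inside `eval` and see whether it succeeded. The sketch below is one possible way to write the pseudocode above; `module_installed` is a hypothetical helper name, and note that a successful check also loads the module as a side effect.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $module can be loaded on this machine.
sub module_installed {
    my ($module) = @_;

    # Accept only plain names such as "MacPerl" or "Foo::Bar".
    return 0 unless $module =~ /\A\w+(?:::\w+)*\z/;

    # require() with a string wants a file path: Foo::Bar -> Foo/Bar.pm
    (my $file = "$module.pm") =~ s{::}{/}g;

    # eval traps the exception that require throws when loading fails.
    return eval { require $file; 1 } ? 1 : 0;
}

if (module_installed('MacPerl')) {
    print "MacPerl is available\n";
}
else {
    print "MacPerl is not available\n";
}
```

This only tells you that the module can be loaded by the perl running the script; it says nothing about which version is installed.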
Re: Enconding, locate, etc.
Hello Ende,

On Apr 19, 2006, at 8:58 PM, ende wrote:

> Wow! It is near the full solution! It is a pity it fails when you do
> not use accented chars!!

I tried again, and with the following script, it *seems* that you can use either non-accented characters or accented characters:

    #!/usr/bin/perl
    use utf8;
    use Encode;
    use Unicode::Normalize;

    binmode (STDOUT, ":utf8");

    my $re = join("|", @ARGV);
    $re = decode ("utf8", $re);

    my $listin = "/Users/me/Documents/documentos/Familia/Casa/Telistin.txt";
    open my $f, "<:encoding(MacRoman)", "$listin" or die "$listin no abre: $!";

    while (<$f>) {
        chomp;
        if (/$re/i) {
            print $_, "\n";
        }
        else {
            my $temp = NFD($_);
            $temp =~ s/[\x{0300}-\x{036F}\x{0081}]+//g;
            print $_, "\n" if $temp =~ /$re/i;
        }
        if ($re !~ /^[\x{0000}-\x{007F}]+$/) {
            my $temp = NFD($re);
            $temp =~ s/[\x{0300}-\x{036F}\x{0081}]+//g;
            print $_, "\n" if /$temp/i;
        }
    }
    close $f;

You would call this script either:

    perl Ende_test.pl angeles

or

    perl Ende_test.pl ángeles

to get:

    Ángeles
    Angeles
    ángeles
    angeles

Is this what you would want...? Note that I am not sure at all if this will work for all cases.

Best regards,

Nobumi Iyanaga
Tokyo, Japan
Re: Enconding, locate, etc.
Dear Ende,

On Apr 19, 2006, at 5:22 PM, ende wrote:

> Thanks Nobumi,
>
> Your solution is not only shorter but also more precise and correct
> than my first attempt. But, anyway, although it works better it
> doesn't find words with different accented capitalization. That is,
> if you look for "Ángeles" it doesn't find nor "Angeles" nor "angeles"
> nor "ángeles"...

Well, on my machine, if I call that script with:

    perl Ende_test.pl Ángeles

it does find "Ángeles" AND "ángeles" (because it has the "i" option in the regex). But you seem to want to do a kind of "accent insensitive search"...? That should not be simple. One possible -- and rather simple -- solution would be to use "Unicode::Normalize". I just tried this script:

    #!/usr/bin/perl
    use utf8;
    use Encode;
    use Unicode::Normalize;

    binmode (STDOUT, ":utf8");

    my $re = join("|", @ARGV);
    $re = decode ("utf8", $re);

    my $listin = "/Users/me/Documents/documentos/Familia/Casa/Telistin.txt";
    open my $f, "<:encoding(MacRoman)", "$listin" or die "$listin no abre: $!";

    while (<$f>) {
        chomp;
        if (/$re/i) {
            print $_, "\n";
        }
        else {
            my $temp = NFD($re);
            $temp =~ s/[\x{0300}-\x{036F}\x{0081}]+//g;
            print $_, "\n" if /$temp/i;
        }
    }
    close $f;

I can call this script from Terminal like this:

    perl Ende_test.pl Ángeles

or

    perl Ende_test.pl ángeles

and get the reply:

    Ángeles
    Angeles
    ángeles
    angeles

-- But you have to use the accented character to match non-accented characters -- that is, you will find only

    Angeles
    angeles

if you invoke the script with:

    perl Ende_test.pl Angeles

or

    perl Ende_test.pl angeles

Best regards,

Nobumi Iyanaga
Tokyo, Japan
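The asymmetry described above (an accented query matches plain text, but not the reverse) comes from stripping accents on only one side of the comparison. A possible variation, offered only as a sketch, strips them from both the search word and each line before matching. The helper names `strip_accents` and `accent_insensitive_match` are hypothetical, and the sketch uses the `\p{Mn}` property (nonspacing marks) instead of the explicit `\x{0300}-\x{036F}` range in the script, and treats the search word as literal text rather than as a regex.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Hypothetical helper: decompose the text, then drop the combining
# marks, so that "Á" becomes "A".
sub strip_accents {
    my ($text) = @_;
    my $decomposed = NFD($text);
    $decomposed =~ s/\p{Mn}+//g;
    return $decomposed;
}

# Hypothetical helper: true if $needle occurs in $line, ignoring both
# case and accents. \Q...\E makes the needle match literally.
sub accent_insensitive_match {
    my ($needle, $line) = @_;
    my $plain_needle = strip_accents($needle);
    my $plain_line   = strip_accents($line);
    return $plain_line =~ /\Q$plain_needle\E/i ? 1 : 0;
}
```

Inside the `while (<$f>)` loop of the script, the test would then be something like `print $_, "\n" if accent_insensitive_match($word, $_);` for each decoded command-line word, which should behave the same way whether or not the word was typed with accents.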