[Encode] Charset-0.01 released

Dan Kogai Fri, 29 Mar 2002 11:21:29 -0800

Encode hackers,

   With Encode-1.00 released, I felt like some more casual programming, so I
came up with Charset-0.01.  I'll just post a pod2text-rendered doc to
show you what it is like.  My favorite is the one-liner from the shell.


Dan the Yet Another Perl Hacker -- Now in Various Languages
====
NAME
     Charset - write Perl code in any encoding you like

SYNOPSIS
       use Charset "euc-jp"; # Jperl!
       #...
       sub tricky_part{
          no Charset;
          #...
       }
       use Charset "euc-jp"; # restore the state; Filter::Simple bug.
       # Handy for EUC-JP => UTF-8 converter
       # when your text editor only supports Shift_JIS !
       use Charset "shiftjis", IN => "euc-jp", OUT => "utf8";
       # If your shell supports EUC-JP, you can even do this!
       perl -MCharset=euc-jp -e 'print "Nihongo\n" x 4'

ABSTRACT
     This module lets you write Perl code not only in ASCII (or EBCDIC
     where your environment allows) or UTF-8 but in any character
     encoding that the Encode module supports.

USAGE
     The first argument to the "use" line must be the name of the encoding
     that matches your script. It croaks if none is specified or if the
     one specified is unsupported by the Encode module.

     You can optionally pass further arguments as a hash. The following
     options are supported.

     STDIN => *enc_name*
         Sets the discipline of STDIN to ":encoding(*enc_name*)". By
         default, the same encoding as the caller script is used.

     STDOUT => *enc_name*
         Sets the discipline of STDOUT to ":encoding(*enc_name*)". By
         default, the same encoding as the caller script is used.

     IN => *enc_name*
         Internally does "use open IN => ":encoding(*enc_name*)"". No
         default is set. See open.

     OUT => *enc_name*
         Internally does "use open OUT => ":encoding(*enc_name*)"". No
         default is set. See open.

     IO => *enc_name*
         Internally does "use open IO => ":encoding(*enc_name*)"". No
         default is set. IN or OUT overrides this setting.
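
     As a rough illustration of what these options amount to, the sketch
     below attaches the ":encoding(euc-jp)" layer by hand, without Charset.
     The in-memory filehandles are stand-ins for STDOUT and STDIN so that
     the bytes can be inspected; that setup is an assumption made for the
     example, not something Charset itself does.

```perl
use strict;
use warnings;

my $text = "\x{65E5}\x{672C}\x{8A9E}";   # three kanji, written as code points

# Output side: an in-memory handle stands in for STDOUT.
my $bytes = "";
open(my $out, ">:encoding(euc-jp)", \$bytes) or die "open for write failed: $!";
print {$out} $text;
close($out) or die "close failed: $!";

# Input side: the same layer decodes the bytes back into characters.
open(my $in, "<:encoding(euc-jp)", \$bytes) or die "open for read failed: $!";
my $round_trip = <$in>;
close($in);

printf "%d characters, %d bytes written\n", length($round_trip), length($bytes);
```

     The program sees characters on both sides; only the handle deals in
     EUC-JP bytes.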

DESCRIPTION
     This is a technology demonstrator of Perl 5.8.0. It uses Encode and
     Filter::Util::Call, both of which will be included in the perl
     distribution.

     Before perl 5.6.0, a character meant a byte. Though it was possible
     to include literals in multibyte characters in certain encodings
     (such as EUC-JP), you needed to handle them with care. Some encodings
     (such as Shift_JIS) didn't even allow this, and you needed things
     like Jperl to do that. If your multibyte encoding was not Japanese,
     you were out of luck.
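
     One well-known reason Shift_JIS literals are troublesome is that the
     second byte of a character can fall in the ASCII range. The sketch
     below uses Encode to show this for U+8868, a common kanji; the choice
     of character is only an example.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Encode U+8868 as Shift_JIS and show the raw bytes in hex.
my $bytes = encode("shiftjis", "\x{8868}");
my $hex   = uc unpack("H*", $bytes);
print "$hex\n";                           # prints 955C

# The trailing byte 0x5C is the same value as an ASCII backslash, so a
# byte-oriented parser would read it as the start of an escape sequence.
my $trail_is_backslash = substr($bytes, -1) eq "\\" ? 1 : 0;
print "trail byte is backslash: $trail_is_backslash\n";
```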

     As of Perl 5.6.0, you could use UTF-8 strings internally, so you could
     do everything you wanted with multilingual strings, including
     regexes. You could even use UTF-8 strings for identifiers, so you
     could write

       my $Ren++; #   "Ren" is really a U+4EBA

     to make a child :) But there was one precondition: your source file
     had to be in UTF-8. Since decent text editors and environments that
     could handle UTF-8 were rare (and still are to some extent), you
     still needed character encoding converters like Jcode.pm.

     With perl 5.8.0 and this module, this will all change. Your old
     script in your regional character encoding suddenly starts working
     just by adding

       use Charset qw(your-encoding);
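
     The module's internals are not shown in this post. As a rough sketch
     only, assuming the source filter re-encodes each line of the script
     from the declared encoding into UTF-8 before perl parses it, the core
     step could look like the following. The helper name recode_line is
     hypothetical and is not part of Charset.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of Charset): re-encode one line of source
# text from the script's encoding into UTF-8 bytes.
sub recode_line {
    my ($enc, $line) = @_;
    return encode("utf8", decode($enc, $line));
}

# Build one line of EUC-JP source: print "<three kanji>\n";
my $euc_line  = 'print "' . encode("euc-jp", "\x{65E5}\x{672C}\x{8A9E}") . '\n";';
my $utf8_line = recode_line("euc-jp", $euc_line);

printf "EUC-JP: %d bytes, UTF-8: %d bytes\n",
    length($euc_line), length($utf8_line);
```

     The ASCII parts of the line pass through unchanged; only the
     multibyte literal changes size.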

BUGS
     This module uses Filter::Simple, so it is subject to the limitations
     of Filter::Simple. Filter::Simple and Text::Balanced, which
     Filter::Simple uses, do a pretty good job of block detection.

SEE ALSO
     Encode, Filter::Simple, open, PerlIO

AUTHOR
     Dan Kogai <[EMAIL PROTECTED]>

COPYRIGHT AND LICENSE
     Copyright 2002 by Dan Kogai, all rights reserved.

     This library is free software; you can redistribute it and/or modify
     it under the same terms as Perl itself.
