Change 30872: Integrate:

Nicholas Clark Sun, 08 Apr 2007 04:15:30 -0700

Change 30872 by [EMAIL PROTECTED] on 2007/04/08 11:06:16

        Integrate:
        [ 28568]
        Subject: [PATCH] z/OS: CPAN-ized ext/ and lib/
        From: Jarkko Hietaniemi <[EMAIL PROTECTED]>
        Date: Thu, 13 Jul 2006 23:10:27 +0300
        Message-ID: <[EMAIL PROTECTED]>
        
        [ 28569]
        Version bumps for z/OS fixes.
        
        [ 28846]
        Subject: [PATCH] C++ Encode
        From: Jarkko Hietaniemi <[EMAIL PROTECTED]>
        Date: Thu, 14 Sep 2006 09:05:10 +0300
        Message-ID: <[EMAIL PROTECTED]>
        
        [ 28849]
        Avoid warnings when $Config{d_cplusplus} is undefined.
        
        [ 28974]
        Subject: [PATCH] Encode.xs: add an explicit cast to make g++ happier
        From: [EMAIL PROTECTED] (Jarkko Hietaniemi)
        Date: Mon,  9 Oct 2006 16:54:12 +0300 (EEST)
        Message-Id: <[EMAIL PROTECTED]>
        
        [ 28980]
        Subject: [PATCH] enc2xs and C++: add extern "C" to  data
        From: Jarkko Hietaniemi <[EMAIL PROTECTED]>
        Date: Tue, 10 Oct 2006 13:52:57 +0300
        Message-ID: <[EMAIL PROTECTED]>
        
        [ 29121]
        Spelling nits from Debian bug list...
        
        Subject: Bug#395426: perl: spelling errors
        From: Matt Taggart <[EMAIL PROTECTED]>
        Date: Thu, 26 Oct 2006 15:23:29 -0700
        Message-Id: <[EMAIL PROTECTED]>
        
        [ 29151]
        Delete Encode's MANIFEST (or else the make process complains
        about the missing Encode's META.yml file)
        
        [ 30357]
        Revert change #28980 per Jarkko's suggestion
        (it was actually breaking g++ builds)
        
        [ 30493]
        Subject: Re: [PATCH] (Re: [PATCH] unicode/utf8 pod)
        From: Juerd Waalboer <[EMAIL PROTECTED]>
        Date: Sun, 4 Mar 2007 16:00:19 +0100
        Message-ID: <[EMAIL PROTECTED]>
        
        [ 30693]
        Subject: [PATCH] Re: [perl #32687] Encode::is_utf8 on tainted UTF8 string
        From: Rafael Garcia-Suarez <[EMAIL PROTECTED]>
        Date: Thu, 16 Nov 2006 17:36:44 +0100
        Message-ID: <[EMAIL PROTECTED]>
        
        [ 30836]
        C++ compilation patch by Jarkko
        
        [ 30866]
        Upgrade to Encode 2.19


Affected files ...

... //depot/maint-5.8/perl/MANIFEST#364 integrate
... //depot/maint-5.8/perl/ext/Encode/AUTHORS#18 integrate
... //depot/maint-5.8/perl/ext/Encode/Changes#32 integrate
... //depot/maint-5.8/perl/ext/Encode/Encode.pm#33 integrate
... //depot/maint-5.8/perl/ext/Encode/Encode.xs#19 integrate
... //depot/maint-5.8/perl/ext/Encode/MANIFEST#16 delete
... //depot/maint-5.8/perl/ext/Encode/bin/enc2xs#9 integrate
... //depot/maint-5.8/perl/ext/Encode/bin/piconv#10 integrate
... //depot/maint-5.8/perl/ext/Encode/encoding.pm#23 integrate
... //depot/maint-5.8/perl/ext/Encode/lib/Encode/Alias.pm#19 integrate
... //depot/maint-5.8/perl/ext/Encode/lib/Encode/CJKConstants.pm#8 integrate
... //depot/maint-5.8/perl/ext/Encode/lib/Encode/JP/H2Z.pm#5 integrate
... //depot/maint-5.8/perl/ext/Encode/lib/Encode/JP/JIS7.pm#9 integrate
... //depot/maint-5.8/perl/ext/Encode/lib/Encode/MIME/Header.pm#9 integrate
... //depot/maint-5.8/perl/ext/Encode/lib/Encode/Unicode/UTF7.pm#7 integrate
... //depot/maint-5.8/perl/ext/Encode/t/Aliases.t#5 integrate
... //depot/maint-5.8/perl/ext/Encode/t/mime-header.t#8 integrate
... //depot/maint-5.8/perl/ext/Encode/t/utf8strict.t#3 integrate

Differences ...

==== //depot/maint-5.8/perl/MANIFEST#364 (text) ====
Index: perl/MANIFEST
--- perl/MANIFEST#363~30810~    2007-03-31 06:10:12.000000000 -0700
+++ perl/MANIFEST       2007-04-08 04:06:16.000000000 -0700
@@ -429,7 +429,6 @@
 ext/Encode/lib/Encode/Supported.pod    Documents for supported encodings
 ext/Encode/lib/Encode/Unicode/UTF7.pm  Encode extension
 ext/Encode/Makefile.PL         Encode extension makefile writer
-ext/Encode/MANIFEST            Encode extension
 ext/Encode/README              Encode extension
 ext/Encode/Symbol/Makefile.PL  Encode extension
 ext/Encode/Symbol/Symbol.pm    Encode extension

==== //depot/maint-5.8/perl/ext/Encode/AUTHORS#18 (text) ====
Index: perl/ext/Encode/AUTHORS
--- perl/ext/Encode/AUTHORS#17~28165~   2006-05-11 09:01:23.000000000 -0700
+++ perl/ext/Encode/AUTHORS     2007-04-08 04:06:16.000000000 -0700
@@ -50,6 +50,7 @@
 SUGAWARA Hajime                        <[EMAIL PROTECTED]>
 SUZUKI Norio                   <[EMAIL PROTECTED]>
 Simon Cozens                   <[EMAIL PROTECTED]>
+Slaven Rezic                   <[EMAIL PROTECTED]>
 Spider Boardman                        <[EMAIL PROTECTED]>
 Steve Hay                      <[EMAIL PROTECTED]>
 Steve Peters                   <[EMAIL PROTECTED]>

==== //depot/maint-5.8/perl/ext/Encode/Changes#32 (text) ====
Index: perl/ext/Encode/Changes
--- perl/ext/Encode/Changes#31~30047~   2007-01-27 15:49:02.000000000 -0800
+++ perl/ext/Encode/Changes     2007-04-08 04:06:16.000000000 -0700
@@ -1,8 +1,33 @@
 # Revision history for Perl extension Encode.
 #
-# $Id: Changes,v 2.17 2006/06/03 20:28:48 dankogai Exp dankogai $
+# $Id: Changes,v 2.19 2007/04/06 12:53:41 dankogai Exp dankogai $
 #
-$Revision: 2.17 $ $Date: 2006/06/03 20:28:48 $
+$Revision: 2.19 $ $Date: 2007/04/06 12:53:41 $
+! lib/Encode/JP/JIS7.pm
++ t/jis7-fallback.t
+  encode('iso-2022-jp') fallback support added by MIYAGAWA++
+  decode()'s fallback remains unchanged (FB_PERLQQ) since UTF-8
+  contains all characters in iso-2022-jp so there's no need for fancy stuff.
+  Message-Id: <[EMAIL PROTECTED]>
+! Encode.pm
+  #25216 ([PATCH] Encode.pm: postpone the load of Encode::Encoding)
+  http://rt.cpan.org/NoAuth/Bug.html?id=#25216
+! lib/Encode/MIME/Header.pm t/mime-header.t
+  #24418 (Encode::MIME::Header: wrong encoding with latin1 characters)
+  http://rt.cpan.org/NoAuth/Bug.html?id=#24418
+! Encode.pm
+  #23876 (Add documentation for LEAVE_SRC)
+  http://rt.cpan.org/NoAuth/Bug.html?id=#23876
+! lib/Encode/Alias.pm t/Aliases.t
+  #20781: Thai encoding needs alias for tis-620
+  http://rt.cpan.org/NoAuth/Bug.html?id=#20781
+! bin/piconv AUTHORS
+  #20344: piconv: wrong conversion of utf-16le encoded files (with PATCH)
+  http://rt.cpan.org/NoAuth/Bug.html?id=#20344
+! Encode.pm Encode.xs bin/enc2xs encoding.pm t/Aliases.t t/utf8strict.t
+  Imported from bleedperl's 2.18_01
+
+2.18 2006/06/03 20:28:48
 ! bin/enc2xs
   overhauled the -C option
   - added ascii-ctrl', 'null', 'utf-8-strict' to core

==== //depot/maint-5.8/perl/ext/Encode/Encode.pm#33 (text) ====
Index: perl/ext/Encode/Encode.pm
--- perl/ext/Encode/Encode.pm#32~30306~ 2007-02-14 14:38:24.000000000 -0800
+++ perl/ext/Encode/Encode.pm   2007-04-08 04:06:16.000000000 -0700
@@ -1,10 +1,10 @@
 #
-# $Id: Encode.pm,v 2.18 2006/06/03 20:28:48 dankogai Exp dankogai $
+# $Id: Encode.pm,v 2.19 2007/04/06 12:53:41 dankogai Exp dankogai $
 #
 package Encode;
 use strict;
 use warnings;
-our $VERSION = sprintf "%d.%02d", q$Revision: 2.18 $ =~ /(\d+)/g;
+our $VERSION = sprintf "%d.%02d", q$Revision: 2.19 $ =~ /(\d+)/g;
 sub DEBUG () { 0 }
 use XSLoader ();
 XSLoader::load( __PACKAGE__, $VERSION );
@@ -210,7 +210,7 @@
 #
 
 sub predefine_encodings {
-    use Encode::Encoding;
+    require Encode::Encoding;
     no warnings 'redefine';
     my $use_xs = shift;
     if ($ON_EBCDIC) {
@@ -406,10 +406,10 @@
   $octets = encode("iso-8859-1", $string);
 
 B<CAVEAT>: When you run C<$octets = encode("utf8", $string)>, then $octets
-B<may not be equal to> $string.  Though they both contain the same data, the utf8 flag
-for $octets is B<always> off.  When you encode anything, utf8 flag of
+B<may not be equal to> $string.  Though they both contain the same data, the UTF8 flag
+for $octets is B<always> off.  When you encode anything, UTF8 flag of
 the result is always off, even when it contains completely valid utf8
-string. See L</"The UTF-8 flag"> below.
+string. See L</"The UTF8 flag"> below.
 
 If the $string is C<undef> then C<undef> is returned.
 
@@ -427,8 +427,8 @@
 
 B<CAVEAT>: When you run C<$string = decode("utf8", $octets)>, then $string
 B<may not be equal to> $octets.  Though they both contain the same data,
-the utf8 flag for $string is on unless $octets entirely consists of
-ASCII data (or EBCDIC on EBCDIC machines).  See L</"The UTF-8 flag">
+the UTF8 flag for $string is on unless $octets entirely consists of
+ASCII data (or EBCDIC on EBCDIC machines).  See L</"The UTF8 flag">
 below.
 
 If the $string is C<undef> then C<undef> is returned.
@@ -458,11 +458,11 @@
   $data = decode("iso-8859-1", $data);  #2
 
 Both #1 and #2 make $data consist of a completely valid UTF-8 string
-but only #2 turns utf8 flag on.  #1 is equivalent to
+but only #2 turns UTF8 flag on.  #1 is equivalent to
 
   $data = encode("utf8", decode("iso-8859-1", $data));
 
-See L</"The UTF-8 flag"> below.
+See L</"The UTF8 flag"> below.
 
 =item $octets = encode_utf8($string);
 
@@ -659,6 +659,12 @@
 
 =back
 
+=item Encode::LEAVE_SRC
+
+If the C<Encode::LEAVE_SRC> bit is not set, but I<CHECK> is, then the second
+argument to C<encode()> or C<decode()> may be assigned to by the functions. If
+you're not interested in this, then bitwise-or the bitmask with it.
+
 =head2 coderef for CHECK
 
 As of Encode 2.12 CHECK can also be a code reference which takes the
@@ -684,13 +690,13 @@
 
 See L<Encode::Encoding> for more details.
 
-=head1 The UTF-8 flag
+=head1 The UTF8 flag
 
-Before the introduction of utf8 support in perl, The C<eq> operator
+Before the introduction of Unicode support in perl, The C<eq> operator
 just compared the strings represented by two scalars. Beginning with
-perl 5.8, C<eq> compares two strings with simultaneous consideration
-of I<the utf8 flag>. To explain why we made it so, I will quote page
-402 of C<Programming Perl, 3rd ed.>
+perl 5.8, C<eq> compares two strings with simultaneous consideration of
+I<the UTF8 flag>. To explain why we made it so, I will quote page 402 of
+C<Programming Perl, 3rd ed.>
 
 =over 2
 
@@ -719,27 +725,27 @@
 Back when C<Programming Perl, 3rd ed.> was written, not even Perl 5.6.0
 was born and many features documented in the book remained
 unimplemented for a long time.  Perl 5.8 corrected this and the introduction
-of the UTF-8 flag is one of them.  You can think of this perl notion as of a
-byte-oriented mode (utf8 flag off) and a character-oriented mode (utf8
+of the UTF8 flag is one of them.  You can think of this perl notion as of a
+byte-oriented mode (UTF8 flag off) and a character-oriented mode (UTF8
 flag on).
 
-Here is how Encode takes care of the utf8 flag.
+Here is how Encode takes care of the UTF8 flag.
 
 =over 2
 
 =item *
 
-When you encode, the resulting utf8 flag is always off.
+When you encode, the resulting UTF8 flag is always off.
 
 =item *
 
-When you decode, the resulting utf8 flag is on unless you can
+When you decode, the resulting UTF8 flag is on unless you can
 unambiguously represent data.  Here is the definition of
 dis-ambiguity.
 
 After C<$utf8 = decode('foo', $octet);>,
 
-  When $octet is...   The utf8 flag in $utf8 is
+  When $octet is...   The UTF8 flag in $utf8 is
   ---------------------------------------------
   In ASCII only (or EBCDIC only)            OFF
   In ISO-8859-1                              ON
@@ -750,7 +756,7 @@
 Goal #1.  And with Encode Goal #2 is assumed but you still have to be
 careful in such cases mentioned in B<CAVEAT> paragraphs.
 
-This utf8 flag is not visible in perl scripts, exactly for the same
+This UTF8 flag is not visible in perl scripts, exactly for the same
 reason you cannot (or you I<don't have to>) see if a scalar contains a
 string, integer, or floating point number.   But you can still peek
 and poke these if you will.  See the section below.
@@ -766,7 +772,7 @@
 
 =item is_utf8(STRING [, CHECK])
 
-[INTERNAL] Tests whether the UTF-8 flag is turned on in the STRING.
+[INTERNAL] Tests whether the UTF8 flag is turned on in the STRING.
 If CHECK is true, also checks the data in STRING for being well-formed
 UTF-8.  Returns true if successful, false otherwise.
 
@@ -774,22 +780,22 @@
 
 =item _utf8_on(STRING)
 
-[INTERNAL] Turns on the UTF-8 flag in STRING.  The data in STRING is
+[INTERNAL] Turns on the UTF8 flag in STRING.  The data in STRING is
 B<not> checked for being well-formed UTF-8.  Do not use unless you
 B<know> that the STRING is well-formed UTF-8.  Returns the previous
-state of the UTF-8 flag (so please don't treat the return value as
+state of the UTF8 flag (so please don't treat the return value as
 indicating success or failure), or C<undef> if STRING is not a string.
 
 =item _utf8_off(STRING)
 
-[INTERNAL] Turns off the UTF-8 flag in STRING.  Do not use frivolously.
-Returns the previous state of the UTF-8 flag (so please don't treat the
+[INTERNAL] Turns off the UTF8 flag in STRING.  Do not use frivolously.
+Returns the previous state of the UTF8 flag (so please don't treat the
 return value as indicating success or failure), or C<undef> if STRING is
 not a string.
 
 =back
 
-=head1 UTF-8 vs. utf8
+=head1 UTF-8 vs. utf8 vs. UTF8
 
   ....We now view strings not as sequences of bytes, but as sequences
   of numbers in the range 0 .. 2**32-1 (or in the case of 64-bit
@@ -836,6 +842,8 @@
   find_encoding("utf_8")->name  # ditto. "_" are treated as "-"
   find_encoding("UTF8")->name  # is 'utf8'.
 
+The UTF8 flag is internally called UTF8, without a hyphen. It indicates
+whether a string is internally encoded as utf8, also without a hypen.
 
 =head1 SEE ALSO
 

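Note on the Encode::LEAVE_SRC paragraph added above: the effect is easiest to see by encoding the same string twice, once with the bit and once without. This is a minimal sketch that assumes a reasonably recent Encode; the variable names are illustrative and are not part of the patch.

```perl
use strict;
use warnings;
use Encode qw(encode FB_QUIET LEAVE_SRC);

# U+263A has no ASCII representation, so encoding stops at that character.
my $modified = "abc\x{263A}";
my $out1     = encode( "ascii", $modified, FB_QUIET );

# Same input, but LEAVE_SRC asks encode() not to assign to its argument.
my $kept = "abc\x{263A}";
my $out2 = encode( "ascii", $kept, FB_QUIET | LEAVE_SRC );

printf "without LEAVE_SRC: output=%s, source length is now %d\n",
    $out1, length $modified;
printf "with LEAVE_SRC:    output=%s, source length is still %d\n",
    $out2, length $kept;
```

Without the bit, the source variable is left holding only the part that could not be encoded, which is what the new pod text means by "may be assigned to".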
==== //depot/maint-5.8/perl/ext/Encode/Encode.xs#19 (text) ====
Index: perl/ext/Encode/Encode.xs
--- perl/ext/Encode/Encode.xs#18~30047~ 2007-01-27 15:49:02.000000000 -0800
+++ perl/ext/Encode/Encode.xs   2007-04-08 04:06:16.000000000 -0700
@@ -1,5 +1,5 @@
 /*
- $Id: Encode.xs,v 2.10 2006/06/03 20:28:48 dankogai Exp dankogai $
+ $Id: Encode.xs,v 2.11 2007/04/06 12:53:41 dankogai Exp dankogai $
  */
 
 #define PERL_NO_GET_CONTEXT
@@ -333,7 +333,7 @@
                                );
 #if 1 /* perl-5.8.6 and older do not check UTF8_ALLOW_LONG */
         if (strict && uv > PERL_UNICODE_MAX)
-        ulen = -1;
+        ulen = (STRLEN) -1;
 #endif
             if (ulen == -1) {
                 if (strict) {
@@ -481,7 +481,8 @@
        /* Native bytes - can always encode */
     U8 *d = (U8 *) SvGROW(dst, 2*slen+1); /* +1 or assertion will botch */
        while (s < e) {
-           UV uv = NATIVE_TO_UNI((UV) *s++);
+           UV uv = NATIVE_TO_UNI((UV) *s);
+           s++; /* Above expansion of NATIVE_TO_UNI() is safer this way. */
             if (UNI_IS_INVARIANT(uv))
                *d++ = (U8)UTF_TO_NATIVE(uv);
             else {
@@ -756,15 +757,11 @@
 {
     if (SvGMAGICAL(sv)) /* it could be $1, for example */
     sv = newSVsv(sv); /* GMAGIG will be done */
-    if (SvPOK(sv)) {
     RETVAL = SvUTF8(sv) ? TRUE : FALSE;
     if (RETVAL &&
         check  &&
         !is_utf8_string((U8*)SvPVX(sv), SvCUR(sv)))
         RETVAL = FALSE;
-    } else {
-    RETVAL = FALSE;
-    }
     if (sv != ST(0))
     SvREFCNT_dec(sv); /* it was a temp copy */
 }
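
Note on is_utf8(): the XS routine above backs Encode::is_utf8, which reports the UTF8 flag and, when CHECK is true, also validates the bytes. The following is a small sketch of that behaviour from the Perl side, assuming a modern Encode; it is not a test from the patch, and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode is_utf8);

my $string = "\x{263A}";                    # one character, U+263A
my $octets = encode( "UTF-8", $string );    # three bytes, UTF8 flag off
my $round  = decode( "UTF-8", $octets );    # one character, UTF8 flag on

printf "octets: length=%d flag=%s\n", length $octets,
    is_utf8($octets) ? "on" : "off";
printf "round : length=%d flag=%s\n", length $round,
    is_utf8($round) ? "on" : "off";

# Forcing the flag on over a byte that is not well-formed UTF-8:
# the plain flag test passes, but the CHECK variant rejects the data.
my $bogus = "\xFF";
Encode::_utf8_on($bogus);
my $flag_only  = is_utf8($bogus)      ? 1 : 0;
my $with_check = is_utf8( $bogus, 1 ) ? 1 : 0;
printf "bogus : flag=%d well-formed=%d\n", $flag_only, $with_check;
```

The last case is why the pod warns against using _utf8_on unless the data is known to be well-formed.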

==== //depot/maint-5.8/perl/ext/Encode/bin/enc2xs#9 (text) ====
Index: perl/ext/Encode/bin/enc2xs
--- perl/ext/Encode/bin/enc2xs#8~30047~ 2007-01-27 15:49:02.000000000 -0800
+++ perl/ext/Encode/bin/enc2xs  2007-04-08 04:06:16.000000000 -0700
@@ -8,8 +8,9 @@
 use strict;
 use warnings;
 use Getopt::Std;
+use Config;
 my @orig_ARGV = @ARGV;
-our $VERSION  = do { my @r = (q$Revision: 2.4 $ =~ /\d+/g); sprintf "%d."."%02d" x $#r, @r };
+our $VERSION  = do { my @r = (q$Revision: 2.5 $ =~ /\d+/g); sprintf "%d."."%02d" x $#r, @r };
 
 # These may get re-ordered.
 # RAW is a do_now as inserted by &enter
@@ -176,6 +177,7 @@
  !!!!!!!   DO NOT EDIT THIS FILE   !!!!!!!
  This file was autogenerated by:
  $^X $0 @orig_ARGV
+ enc2xs VERSION $VERSION
 */
 END
   }
@@ -269,6 +271,9 @@
 
     # push(@{$encoding{$name}},outstring(\*C,$e2u->{Cname}.'_def',$erep));
    }
+  my $cpp = ($Config{d_cplusplus} || '') eq 'define';
+  my $exta = $cpp ? 'extern "C" ' : "static";
+  my $extb = $cpp ? 'extern "C" ' : "";
   foreach my $enc (sort cmp_name keys %encoding)
    {
     # my ($e2u,$u2e,$rep,$min_el,$max_el,$rsym) = @{$encoding{$enc}};
@@ -280,9 +285,9 @@
     $sym =~ s/\W+/_/g;
     my @info = ($e2u->{Cname},$u2e->{Cname},"${sym}_rep_character",$replen,
         $min_el,$max_el);
-    print C "static const U8 ${sym}_rep_character[] = \"$rep\";\n";
-    print C "static const char ${sym}_enc_name[] = \"$enc\";\n\n";
-    print C "const encode_t $sym = \n";
+    print C "${exta} const U8 ${sym}_rep_character[] = \"$rep\";\n";
+    print C "${exta} const char ${sym}_enc_name[] = \"$enc\";\n\n";
+    print C "${extb} const encode_t $sym = \n";
     # This is to make null encoding work -- dankogai
     for (my $i = (scalar @info) - 1;  $i >= 0; --$i){
     $info[$i] ||= 1;
@@ -687,8 +692,10 @@
   }
  if ($a->{'Forward'})
   {
-   my $var = $^O eq 'MacOS' ? 'extern' : 'static';
-   print $fh "$var const encpage_t $name\[",scalar(@{$a->{'Entries'}}),"];\n";
+   my $cpp = ($Config{d_cplusplus} || '') eq 'define';
+   my $var = $^O eq 'MacOS' || $cpp ? 'extern' : 'static';
+   my $const = $cpp ? '' : 'const';
+   print $fh "$var $const encpage_t $name\[",scalar(@{$a->{'Entries'}}),"];\n";
   }
  $a->{'DoneStrings'} = 1;
  foreach my $b (@{$a->{'Entries'}})
@@ -751,7 +758,9 @@
   }
 
   $strings = length $string_acc;
-  my $definition = "\nstatic const U8 $name\[$strings] = { " .
+  my $cpp = ($Config{d_cplusplus} || '') eq 'define';
+  my $var = $cpp ? '' : 'static';
+  my $definition = "\n$var const U8 $name\[$strings] = { " .
     join(',',unpack "C*",$string_acc);
   # We have a single long line. Split it at convenient commas.
   print $fh $1, "\n" while $definition =~ /\G(.{74,77},)/gcs;
@@ -776,7 +785,10 @@
    my ($s,$e,$out,$t,$end,$l) = @$b;
    outtable($fh,$t,$bigname) unless $t->{'Done'};
   }
- print $fh "\nstatic const encpage_t $name\[",
+ my $cpp = ($Config{d_cplusplus} || '') eq 'define';
+ my $var = $cpp ? '' : 'static';
+ my $const = $cpp ? '' : 'const';
+ print $fh "\n$var $const encpage_t $name\[",
    scalar(@{$a->{'Entries'}}), "] = {\n";
  foreach my $b (@{$a->{'Entries'}})
   {
@@ -1181,7 +1193,7 @@
 mappings.  This format is used by IBM's ICU package and was adopted
 by Nick Ing-Simmons for use with the Encode module.  Since UCM is
 more flexible than Tcl's Encoding Map and far more user-friendly,
-this is the recommended formet for Encode now.
+this is the recommended format for Encode now.
 
 A UCM file looks like this.
 

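Note on the d_cplusplus test used throughout the enc2xs hunks: the "|| ''" guard is what change 28849 refers to, since comparing an undefined $Config{d_cplusplus} with eq would warn. Below is a minimal sketch of the same guard; storage_class is a hypothetical helper written for this note, not a function in enc2xs.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: map the raw d_cplusplus value to a storage class.
sub storage_class {
    my ($d_cplusplus) = @_;

    # "|| ''" turns undef into the empty string, so eq never sees undef.
    my $cpp = ( $d_cplusplus || '' ) eq 'define';
    return $cpp ? 'extern' : 'static';
}

printf "this perl would emit: %s const encpage_t ...\n",
    storage_class( $Config{d_cplusplus} );
```

Any value other than the literal string 'define', including undef and 'undef', selects the C branch.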
==== //depot/maint-5.8/perl/ext/Encode/bin/piconv#10 (text) ====
Index: perl/ext/Encode/bin/piconv
--- perl/ext/Encode/bin/piconv#9~28165~ 2006-05-11 09:01:23.000000000 -0700
+++ perl/ext/Encode/bin/piconv  2007-04-08 04:06:16.000000000 -0700
@@ -1,5 +1,5 @@
 #!./perl
-# $Id: piconv,v 2.2 2006/05/03 18:24:10 dankogai Exp $
+# $Id: piconv,v 2.3 2007/04/06 12:53:41 dankogai Exp dankogai $
 #
 use 5.8.0;
 use strict;
@@ -40,7 +40,7 @@
 my $from = $Opt{from} || $locale or help("from_encoding unspecified");
 my $to   = $Opt{to}   || $locale or help("to_encoding unspecified");
 $Opt{string} and Encode::from_to($Opt{string}, $from, $to) and print $Opt{string} and exit;
-my $scheme = exists $Scheme{$Opt{Scheme}} ? $Opt{Scheme} :  'from_to';
+my $scheme = exists $Scheme{$Opt{scheme}} ? $Opt{scheme} :  'from_to';
 $Opt{check} ||= $Opt{c};
 $Opt{perlqq}   and $Opt{check} = Encode::PERLQQ;
 $Opt{htmlcref} and $Opt{check} = Encode::HTMLCREF;
@@ -246,6 +246,9 @@
 
 The new perlIO layer is used.  NI-S' favorite.
 
+You should use this option if you are using UTF-16 and others which
+linefeed is not $/.
+
 =back
 
 Like the I<-D> option, this is also for Encode hackers.
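
Note on the $Opt{Scheme} fix in the first piconv hunk: Perl hash keys are case-sensitive, so the old lookup never saw the value stored under the lowercase key and always fell back to 'from_to'. A minimal sketch of the effect follows; %Scheme and pick_scheme here are stand-ins written for this note and only approximate what piconv does.

```perl
use strict;
use warnings;

# Stand-in for piconv's table of known schemes (an assumption).
my %Scheme = ( from_to => 1, decode_encode => 1, perlio => 1 );

# Hypothetical helper: pick the requested scheme or fall back to 'from_to'.
sub pick_scheme {
    my (%opt) = @_;
    my $wanted = defined $opt{scheme} ? $opt{scheme} : '';
    return exists $Scheme{$wanted} ? $wanted : 'from_to';
}

print pick_scheme( scheme => 'perlio' ), "\n";    # key matches
print pick_scheme( Scheme => 'perlio' ), "\n";    # wrong case, falls back
```

The second call shows the pre-patch symptom: a capitalised key is simply a different key, so the requested scheme is ignored.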

==== //depot/maint-5.8/perl/ext/Encode/encoding.pm#23 (text) ====
Index: perl/ext/Encode/encoding.pm
--- perl/ext/Encode/encoding.pm#22~30306~       2007-02-14 14:38:24.000000000 -0800
+++ perl/ext/Encode/encoding.pm 2007-04-08 04:06:16.000000000 -0700
@@ -1,6 +1,6 @@
-# $Id: encoding.pm,v 2.4 2006/06/03 20:28:48 dankogai Exp dankogai $
+# $Id: encoding.pm,v 2.5 2007/04/06 12:53:41 dankogai Exp dankogai $
 package encoding;
-our $VERSION = do { my @r = ( q$Revision: 2.4 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };
+our $VERSION = do { my @r = ( q$Revision: 2.5 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };
 
 use Encode;
 use strict;
@@ -307,6 +307,14 @@
 C<use encoding 'utf8';>, it will print C<4> instead, since C<$string>
 is three octets when interpreted as Latin-1.
 
+=head2 Side effects
+
+If the C<encoding> pragma is in scope then the lengths returned are
+calculated from the length of C<$/> in Unicode characters, which is not
+always the same as the length of C<$/> in the native encoding.
+
+This pragma affects utf8::upgrade, but not utf8::downgrade.
+
 =head1 FEATURES THAT REQUIRE 5.8.1
 
 Some of the features offered by this pragma requires perl 5.8.1.  Most

==== //depot/maint-5.8/perl/ext/Encode/lib/Encode/Alias.pm#19 (text) ====
Index: perl/ext/Encode/lib/Encode/Alias.pm
--- perl/ext/Encode/lib/Encode/Alias.pm#18~30306~       2007-02-14 14:38:24.000000000 -0800
+++ perl/ext/Encode/lib/Encode/Alias.pm 2007-04-08 04:06:16.000000000 -0700
@@ -3,7 +3,7 @@
 use warnings;
 no warnings 'redefine';
 use Encode;
-our $VERSION = do { my @r = ( q$Revision: 2.6 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };
+our $VERSION = do { my @r = ( q$Revision: 2.7 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };
 sub DEBUG () { 0 }
 
 use base qw(Exporter);
@@ -189,8 +189,9 @@
         'greek'    => 'iso-8859-7',
         'hebrew'   => 'iso-8859-8',
         'thai'     => 'iso-8859-11',
-        'tis620'   => 'iso-8859-11',
     );
+    # RT #20781
+    define_alias(qr/\btis-?620\b/i  => '"iso-8859-11"');
 
     # At least AIX has IBM-NNN (surprisingly...) instead of cpNNN.
     # And Microsoft has their own naming (again, surprisingly).
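
Note on the tis-620 alias: the patch replaces the fixed 'tis620' key with a pattern, so both spellings resolve. Here is a quick sketch of what that pattern accepts. The list of names is chosen for illustration, and the find_encoding result depends on the installed Encode, so no particular output is claimed for it.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# The pattern added by the patch: optional hyphen, case-insensitive.
my $tis620 = qr/\btis-?620\b/i;

for my $name (qw(tis620 tis-620 TIS-620 tis6200)) {
    my $matches = $name =~ $tis620 ? "matches" : "does not match";
    my $enc     = find_encoding($name);
    printf "%-8s %-14s resolves to %s\n", $name, $matches,
        $enc ? $enc->name : "(nothing)";
}
```

The trailing \b keeps a longer name such as tis6200 from matching by accident.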

==== //depot/maint-5.8/perl/ext/Encode/lib/Encode/CJKConstants.pm#8 (text) ====
Index: perl/ext/Encode/lib/Encode/CJKConstants.pm
--- perl/ext/Encode/lib/Encode/CJKConstants.pm#7~30047~ 2007-01-27 15:49:02.000000000 -0800
+++ perl/ext/Encode/lib/Encode/CJKConstants.pm  2007-04-08 04:06:16.000000000 -0700
@@ -1,12 +1,12 @@
 #
-# $Id: CJKConstants.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp dankogai $
+# $Id: CJKConstants.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp $
 #
 
 package Encode::CJKConstants;
 
 use strict;
 use warnings;
-our $RCSID = q$Id: CJKConstants.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp dankogai $;
+our $RCSID = q$Id: CJKConstants.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp $;
 our $VERSION = do { my @r = ( q$Revision: 2.2 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };
 
 use Carp;

==== //depot/maint-5.8/perl/ext/Encode/lib/Encode/JP/H2Z.pm#5 (text) ====
Index: perl/ext/Encode/lib/Encode/JP/H2Z.pm
--- perl/ext/Encode/lib/Encode/JP/H2Z.pm#4~30047~       2007-01-27 15:49:02.000000000 -0800
+++ perl/ext/Encode/lib/Encode/JP/H2Z.pm        2007-04-08 04:06:16.000000000 -0700
@@ -1,5 +1,5 @@
 #
-# $Id: H2Z.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp dankogai $
+# $Id: H2Z.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp $
 #
 
 package Encode::JP::H2Z;
@@ -7,7 +7,7 @@
 use strict;
 use warnings;
 
-our $RCSID = q$Id: H2Z.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp dankogai $;
+our $RCSID = q$Id: H2Z.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp $;
 our $VERSION = do { my @r = ( q$Revision: 2.2 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };
 
 use Encode::CJKConstants qw(:all);

==== //depot/maint-5.8/perl/ext/Encode/lib/Encode/JP/JIS7.pm#9 (text) ====
Index: perl/ext/Encode/lib/Encode/JP/JIS7.pm
--- perl/ext/Encode/lib/Encode/JP/JIS7.pm#8~30047~      2007-01-27 15:49:02.000000000 -0800
+++ perl/ext/Encode/lib/Encode/JP/JIS7.pm       2007-04-08 04:06:16.000000000 -0700
@@ -1,7 +1,7 @@
 package Encode::JP::JIS7;
 use strict;
 use warnings;
-our $VERSION = do { my @r = ( q$Revision: 2.2 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };
+our $VERSION = do { my @r = ( q$Revision: 2.3 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };
 
 use Encode qw(:fallbacks);
 
@@ -49,7 +49,7 @@
     # empty the input string in the stack so perlio is ok
     $_[1] = '' if $chk;
     my ( $h2z, $jis0212 ) = @$obj{qw(h2z jis0212)};
-    my $octet = Encode::encode( 'euc-jp', $utf8, FB_PERLQQ );
+    my $octet = Encode::encode( 'euc-jp', $utf8, $chk );
     $h2z and &Encode::JP::H2Z::h2z( \$octet );
     euc_jis( \$octet, $jis0212 );
     return $octet;

==== //depot/maint-5.8/perl/ext/Encode/lib/Encode/MIME/Header.pm#9 (text) ====
Index: perl/ext/Encode/lib/Encode/MIME/Header.pm
--- perl/ext/Encode/lib/Encode/MIME/Header.pm#8~30047~  2007-01-27 15:49:02.000000000 -0800
+++ perl/ext/Encode/lib/Encode/MIME/Header.pm   2007-04-08 04:06:16.000000000 -0700
@@ -3,7 +3,7 @@
 use warnings;
 no warnings 'redefine';
 
-our $VERSION = do { my @r = ( q$Revision: 2.4 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };
+our $VERSION = do { my @r = ( q$Revision: 2.5 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };
 use Encode qw(find_encoding encode_utf8 decode_utf8);
 use MIME::Base64;
 use Carp;
@@ -174,12 +174,13 @@
 
 sub _encode_q {
     my $chunk = shift;
+    $chunk = encode_utf8($chunk);
     $chunk =~ s{
         ([^0-9A-Za-z])
            }{
            join("" => map {sprintf "=%02X", $_} unpack("C*", $1))
            }egox;
-    return decode_utf8( HEAD . 'Q?' . $chunk . TAIL );
+    return HEAD . 'Q?' . $chunk . TAIL;
 }
 
 1;
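
Note on the _encode_q fix: the substitution escapes octets, so the chunk has to be turned into UTF-8 bytes before it runs. Otherwise a Latin-1 character such as U+00FC is escaped as a single byte under a UTF-8 label (RT #24418). The following is a self-contained sketch of the corrected order of operations; q_encode_chunk and the literal header strings are written for this note and only approximate the module's HEAD and TAIL constants.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical re-implementation of the patched logic, for illustration.
sub q_encode_chunk {
    my ($chunk) = @_;

    # Step 1: characters to UTF-8 octets (the line the patch adds).
    my $octets = encode_utf8($chunk);

    # Step 2: escape every octet that is not an ASCII letter or digit.
    $octets =~ s{([^0-9A-Za-z])}{ sprintf "=%02X", ord $1 }ge;

    return '=?UTF-8?Q?' . $octets . '?=';
}

print q_encode_chunk("\x{fc}"),     "\n";
print q_encode_chunk("\x{a3}1024"), "\n";
```

The two inputs mirror the cases exercised in t/mime-header.t: the new Latin-1 test and the existing "pound 1024" test.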

==== //depot/maint-5.8/perl/ext/Encode/lib/Encode/Unicode/UTF7.pm#7 (text) ====
Index: perl/ext/Encode/lib/Encode/Unicode/UTF7.pm
--- perl/ext/Encode/lib/Encode/Unicode/UTF7.pm#6~30047~ 2007-01-27 15:49:02.000000000 -0800
+++ perl/ext/Encode/lib/Encode/Unicode/UTF7.pm  2007-04-08 04:06:16.000000000 -0700
@@ -1,5 +1,5 @@
 #
-# $Id: UTF7.pm,v 2.4 2006/06/03 20:28:48 dankogai Exp dankogai $
+# $Id: UTF7.pm,v 2.4 2006/06/03 20:28:48 dankogai Exp $
 #
 package Encode::Unicode::UTF7;
 use strict;

==== //depot/maint-5.8/perl/ext/Encode/t/Aliases.t#5 (text) ====
Index: perl/ext/Encode/t/Aliases.t
--- perl/ext/Encode/t/Aliases.t#4~28165~        2006-05-11 09:01:23.000000000 -0700
+++ perl/ext/Encode/t/Aliases.t 2007-04-08 04:06:16.000000000 -0700
@@ -42,6 +42,7 @@
         'hebrew'   => 'iso-8859-8',
         'thai'     => 'iso-8859-11',
         'tis620'   => 'iso-8859-11',
+        'tis-620'   => 'iso-8859-11',
         'WinLatin1'     => 'cp1252',
         'WinLatin2'     => 'cp1250',
         'WinCyrillic'   => 'cp1251',
@@ -140,6 +141,7 @@
 print "# alias test with alias overrides\n";
 
 foreach my $a (keys %a2c){     
+    print "# $a => $a2c{$a}\n";
     my $e = Encode::find_encoding($a);
     is((defined($e) and $e->name), $a2c{$a}, "Override $a")
     or warn "alias was $a";

==== //depot/maint-5.8/perl/ext/Encode/t/mime-header.t#8 (text) ====
Index: perl/ext/Encode/t/mime-header.t
--- perl/ext/Encode/t/mime-header.t#7~28165~    2006-05-11 09:01:23.000000000 -0700
+++ perl/ext/Encode/t/mime-header.t     2007-04-08 04:06:16.000000000 -0700
@@ -1,5 +1,5 @@
 #
-# $Id: mime-header.t,v 2.2 2006/05/03 18:24:10 dankogai Exp $
+# $Id: mime-header.t,v 2.3 2007/04/06 12:53:41 dankogai Exp dankogai $
 # This script is written in utf8
 #
 BEGIN {
@@ -23,7 +23,7 @@
 
 use strict;
 #use Test::More qw(no_plan);
-use Test::More tests => 11;
+use Test::More tests => 12;
 use_ok("Encode::MIME::Header");
 
 my $eheader =<<'EOS';
@@ -116,4 +116,7 @@
     is(Encode::encode('MIME-Q' => $pound_1024), '=?UTF-8?Q?=C2=A31024?=',
        'pound 1024');
 }
+
+is(Encode::encode('MIME-Q', "\x{fc}"), '=?UTF-8?Q?=C3=BC?=', 'Encode latin1 characters');
+
 __END__;

==== //depot/maint-5.8/perl/ext/Encode/t/utf8strict.t#3 (text) ====
Index: perl/ext/Encode/t/utf8strict.t
--- perl/ext/Encode/t/utf8strict.t#2~28165~     2006-05-11 09:01:23.000000000 -0700
+++ perl/ext/Encode/t/utf8strict.t      2007-04-08 04:06:16.000000000 -0700
@@ -40,14 +40,25 @@
          0x0000FFFF => 1, # 5.3.1
         );
      $NTESTS +=  scalar keys %ORD;
-     %SEQ = (
-         qq/ed 9f bf/    => 0, # 2.3.1
-         qq/ee 80 80/    => 0, # 2.3.2
-         qq/f4 8f bf bf/ => 0, # 2.3.3
-         qq/f4 90 80 80/ => 1, # 2.3.4 -- out of range so NG
-         # "3 Malformed sequences" are checked by perl.
-         # "4 Overlong sequences"  are checked by perl.
-        );
+     if (ord('A') == 193) {
+        %SEQ = (
+                qq/dd 64 73 73/    => 0, # 2.3.1
+                qq/dd 67 41 41/    => 0, # 2.3.2
+                qq/ee 42 73 73 73/ => 0, # 2.3.3
+                qq/f4 90 80 80/ => 1, # 2.3.4 -- out of range so NG
+                # "3 Malformed sequences" are checked by perl.
+                # "4 Overlong sequences"  are checked by perl.
+                );
+     } else {
+        %SEQ = (
+                qq/ed 9f bf/    => 0, # 2.3.1
+                qq/ee 80 80/    => 0, # 2.3.2
+                qq/f4 8f bf bf/ => 0, # 2.3.3
+                qq/f4 90 80 80/ => 1, # 2.3.4 -- out of range so NG
+                # "3 Malformed sequences" are checked by perl.
+                # "4 Overlong sequences"  are checked by perl.
+                );
+     }
      $NTESTS +=  scalar keys %SEQ;
 }
 use strict;
End of Patch.
