Change 30872 by [EMAIL PROTECTED] on 2007/04/08 11:06:16

    Integrate:
    [ 28568]
    Subject: [PATCH] z/OS: CPAN-ized ext/ and lib/
    From: Jarkko Hietaniemi <[EMAIL PROTECTED]>
    Date: Thu, 13 Jul 2006 23:10:27 +0300
    Message-ID: <[EMAIL PROTECTED]>

    [ 28569]
    Version bumps for z/OS fixes.

    [ 28846]
    Subject: [PATCH] C++ Encode
    From: Jarkko Hietaniemi <[EMAIL PROTECTED]>
    Date: Thu, 14 Sep 2006 09:05:10 +0300
    Message-ID: <[EMAIL PROTECTED]>

    [ 28849]
    Avoid warnings when $Config{d_cplusplus} is undefined.

    [ 28974]
    Subject: [PATCH] Encode.xs: add an explicit cast to make g++ happier
    From: [EMAIL PROTECTED] (Jarkko Hietaniemi)
    Date: Mon, 9 Oct 2006 16:54:12 +0300 (EEST)
    Message-Id: <[EMAIL PROTECTED]>

    [ 28980]
    Subject: [PATCH] enc2xs and C++: add extern "C" to data
    From: Jarkko Hietaniemi <[EMAIL PROTECTED]>
    Date: Tue, 10 Oct 2006 13:52:57 +0300
    Message-ID: <[EMAIL PROTECTED]>

    [ 29121]
    Spelling nits from Debian bug list...

    Subject: Bug#395426: perl: spelling errors
    From: Matt Taggart <[EMAIL PROTECTED]>
    Date: Thu, 26 Oct 2006 15:23:29 -0700
    Message-Id: <[EMAIL PROTECTED]>

    [ 29151]
    Delete Encode's MANIFEST (or else the make process complains about
    the missing Encode's META.yml file)

    [ 30357]
    Revert change #28980 per Jarkko's suggestion
    (it was actually breaking g++ builds)

    [ 30493]
    Subject: Re: [PATCH] (Re: [PATCH] unicode/utf8 pod)
    From: Juerd Waalboer <[EMAIL PROTECTED]>
    Date: Sun, 4 Mar 2007 16:00:19 +0100
    Message-ID: <[EMAIL PROTECTED]>

    [ 30693]
    Subject: [PATCH] Re: [perl #32687] Encode::is_utf8 on tainted UTF8 string
    From: Rafael Garcia-Suarez <[EMAIL PROTECTED]>
    Date: Thu, 16 Nov 2006 17:36:44 +0100
    Message-ID: <[EMAIL PROTECTED]>

    [ 30836]
    C++ compilation patch by Jarkko

    [ 30866]
    Upgrade to Encode 2.19

Affected files ...

... //depot/maint-5.8/perl/MANIFEST#364 integrate
... //depot/maint-5.8/perl/ext/Encode/AUTHORS#18 integrate
... //depot/maint-5.8/perl/ext/Encode/Changes#32 integrate
... //depot/maint-5.8/perl/ext/Encode/Encode.pm#33 integrate
... //depot/maint-5.8/perl/ext/Encode/Encode.xs#19 integrate
... //depot/maint-5.8/perl/ext/Encode/MANIFEST#16 delete
... //depot/maint-5.8/perl/ext/Encode/bin/enc2xs#9 integrate
... //depot/maint-5.8/perl/ext/Encode/bin/piconv#10 integrate
... //depot/maint-5.8/perl/ext/Encode/encoding.pm#23 integrate
... //depot/maint-5.8/perl/ext/Encode/lib/Encode/Alias.pm#19 integrate
... //depot/maint-5.8/perl/ext/Encode/lib/Encode/CJKConstants.pm#8 integrate
... //depot/maint-5.8/perl/ext/Encode/lib/Encode/JP/H2Z.pm#5 integrate
... //depot/maint-5.8/perl/ext/Encode/lib/Encode/JP/JIS7.pm#9 integrate
... //depot/maint-5.8/perl/ext/Encode/lib/Encode/MIME/Header.pm#9 integrate
... //depot/maint-5.8/perl/ext/Encode/lib/Encode/Unicode/UTF7.pm#7 integrate
... //depot/maint-5.8/perl/ext/Encode/t/Aliases.t#5 integrate
... //depot/maint-5.8/perl/ext/Encode/t/mime-header.t#8 integrate
... //depot/maint-5.8/perl/ext/Encode/t/utf8strict.t#3 integrate

Differences ...

==== //depot/maint-5.8/perl/MANIFEST#364 (text) ==== Index: perl/MANIFEST --- perl/MANIFEST#363~30810~ 2007-03-31 06:10:12.000000000 -0700 +++ perl/MANIFEST 2007-04-08 04:06:16.000000000 -0700 @@ -429,7 +429,6 @@ ext/Encode/lib/Encode/Supported.pod Documents for supported encodings ext/Encode/lib/Encode/Unicode/UTF7.pm Encode extension ext/Encode/Makefile.PL Encode extension makefile writer -ext/Encode/MANIFEST Encode extension ext/Encode/README Encode extension ext/Encode/Symbol/Makefile.PL Encode extension ext/Encode/Symbol/Symbol.pm Encode extension ==== //depot/maint-5.8/perl/ext/Encode/AUTHORS#18 (text) ==== Index: perl/ext/Encode/AUTHORS --- perl/ext/Encode/AUTHORS#17~28165~ 2006-05-11 09:01:23.000000000 -0700 +++ perl/ext/Encode/AUTHORS 2007-04-08 04:06:16.000000000 -0700 @@ -50,6 +50,7 @@ SUGAWARA Hajime <[EMAIL PROTECTED]> SUZUKI Norio <[EMAIL PROTECTED]> Simon Cozens <[EMAIL PROTECTED]> +Slaven Rezic <[EMAIL PROTECTED]> Spider Boardman <[EMAIL PROTECTED]> Steve Hay <[EMAIL PROTECTED]> Steve Peters <[EMAIL PROTECTED]> ==== //depot/maint-5.8/perl/ext/Encode/Changes#32 (text) ==== Index: perl/ext/Encode/Changes --- perl/ext/Encode/Changes#31~30047~ 2007-01-27 15:49:02.000000000 -0800 +++ perl/ext/Encode/Changes 2007-04-08 04:06:16.000000000 -0700 @@ -1,8 +1,33 @@ # Revision history for Perl extension Encode. # -# $Id: Changes,v 2.17 2006/06/03 20:28:48 dankogai Exp dankogai $ +# $Id: Changes,v 2.19 2007/04/06 12:53:41 dankogai Exp dankogai $ # -$Revision: 2.17 $ $Date: 2006/06/03 20:28:48 $ +$Revision: 2.19 $ $Date: 2007/04/06 12:53:41 $ +! lib/Encode/JP/JIS7.pm ++ t/jis7-fallback.t + encode('iso-2022-jp') fallback support added by MIYAGAWA++ + decode()'s fallback remains unchanged (FB_PERLQQ) since UTF-8 + contains all characters in iso-2022-jp so there's no need for fancy stuff. + Message-Id: <[EMAIL PROTECTED]> +! Encode.pm + #25216 ([PATCH] Encode.pm: postpone the load of Encode::Encoding) + http://rt.cpan.org/NoAuth/Bug.html?id=#25216 +! 
lib/Encode/MIME/Header.pm t/mime-header.t + #24418 (Encode::MIME::Header: wrong encoding with latin1 characters) + http://rt.cpan.org/NoAuth/Bug.html?id=#24418 +! Encode.pm + #23876 (Add documentation for LEAVE_SRC) + http://rt.cpan.org/NoAuth/Bug.html?id=#23876 +! lib/Encode/Alias.pm t/Aliases.t + #20781: Thai encoding needs alias for tis-620 + http://rt.cpan.org/NoAuth/Bug.html?id=#20781 +! bin/piconv AUTHORS + #20344: piconv: wrong conversion of utf-16le encoded files (with PATCH) + http://rt.cpan.org/NoAuth/Bug.html?id=#20344 +! Encode.pm Encode.xs bin/enc2xs encoding.pm t/Aliases.t t/utf8strict.t + Imported from bleedperl's 2.18_01 + +2.18 2006/06/03 20:28:48 ! bin/enc2xs overhauled the -C option - added ascii-ctrl', 'null', 'utf-8-strict' to core ==== //depot/maint-5.8/perl/ext/Encode/Encode.pm#33 (text) ==== Index: perl/ext/Encode/Encode.pm --- perl/ext/Encode/Encode.pm#32~30306~ 2007-02-14 14:38:24.000000000 -0800 +++ perl/ext/Encode/Encode.pm 2007-04-08 04:06:16.000000000 -0700 @@ -1,10 +1,10 @@ # -# $Id: Encode.pm,v 2.18 2006/06/03 20:28:48 dankogai Exp dankogai $ +# $Id: Encode.pm,v 2.19 2007/04/06 12:53:41 dankogai Exp dankogai $ # package Encode; use strict; use warnings; -our $VERSION = sprintf "%d.%02d", q$Revision: 2.18 $ =~ /(\d+)/g; +our $VERSION = sprintf "%d.%02d", q$Revision: 2.19 $ =~ /(\d+)/g; sub DEBUG () { 0 } use XSLoader (); XSLoader::load( __PACKAGE__, $VERSION ); @@ -210,7 +210,7 @@ # sub predefine_encodings { - use Encode::Encoding; + require Encode::Encoding; no warnings 'redefine'; my $use_xs = shift; if ($ON_EBCDIC) { @@ -406,10 +406,10 @@ $octets = encode("iso-8859-1", $string); B<CAVEAT>: When you run C<$octets = encode("utf8", $string)>, then $octets -B<may not be equal to> $string. Though they both contain the same data, the utf8 flag -for $octets is B<always> off. When you encode anything, utf8 flag of +B<may not be equal to> $string. Though they both contain the same data, the UTF8 flag +for $octets is B<always> off. 
When you encode anything, UTF8 flag of the result is always off, even when it contains completely valid utf8 -string. See L</"The UTF-8 flag"> below. +string. See L</"The UTF8 flag"> below. If the $string is C<undef> then C<undef> is returned. @@ -427,8 +427,8 @@ B<CAVEAT>: When you run C<$string = decode("utf8", $octets)>, then $string B<may not be equal to> $octets. Though they both contain the same data, -the utf8 flag for $string is on unless $octets entirely consists of -ASCII data (or EBCDIC on EBCDIC machines). See L</"The UTF-8 flag"> +the UTF8 flag for $string is on unless $octets entirely consists of +ASCII data (or EBCDIC on EBCDIC machines). See L</"The UTF8 flag"> below. If the $string is C<undef> then C<undef> is returned. @@ -458,11 +458,11 @@ $data = decode("iso-8859-1", $data); #2 Both #1 and #2 make $data consist of a completely valid UTF-8 string -but only #2 turns utf8 flag on. #1 is equivalent to +but only #2 turns UTF8 flag on. #1 is equivalent to $data = encode("utf8", decode("iso-8859-1", $data)); -See L</"The UTF-8 flag"> below. +See L</"The UTF8 flag"> below. =item $octets = encode_utf8($string); @@ -659,6 +659,12 @@ =back +=item Encode::LEAVE_SRC + +If the C<Encode::LEAVE_SRC> bit is not set, but I<CHECK> is, then the second +argument to C<encode()> or C<decode()> may be assigned to by the functions. If +you're not interested in this, then bitwise-or the bitmask with it. + =head2 coderef for CHECK As of Encode 2.12 CHECK can also be a code reference which takes the @@ -684,13 +690,13 @@ See L<Encode::Encoding> for more details. -=head1 The UTF-8 flag +=head1 The UTF8 flag -Before the introduction of utf8 support in perl, The C<eq> operator +Before the introduction of Unicode support in perl, The C<eq> operator just compared the strings represented by two scalars. Beginning with -perl 5.8, C<eq> compares two strings with simultaneous consideration -of I<the utf8 flag>. 
To explain why we made it so, I will quote page -402 of C<Programming Perl, 3rd ed.> +perl 5.8, C<eq> compares two strings with simultaneous consideration of +I<the UTF8 flag>. To explain why we made it so, I will quote page 402 of +C<Programming Perl, 3rd ed.> =over 2 @@ -719,27 +725,27 @@ Back when C<Programming Perl, 3rd ed.> was written, not even Perl 5.6.0 was born and many features documented in the book remained unimplemented for a long time. Perl 5.8 corrected this and the introduction -of the UTF-8 flag is one of them. You can think of this perl notion as of a -byte-oriented mode (utf8 flag off) and a character-oriented mode (utf8 +of the UTF8 flag is one of them. You can think of this perl notion as of a +byte-oriented mode (UTF8 flag off) and a character-oriented mode (UTF8 flag on). -Here is how Encode takes care of the utf8 flag. +Here is how Encode takes care of the UTF8 flag. =over 2 =item * -When you encode, the resulting utf8 flag is always off. +When you encode, the resulting UTF8 flag is always off. =item * -When you decode, the resulting utf8 flag is on unless you can +When you decode, the resulting UTF8 flag is on unless you can unambiguously represent data. Here is the definition of dis-ambiguity. After C<$utf8 = decode('foo', $octet);>, - When $octet is... The utf8 flag in $utf8 is + When $octet is... The UTF8 flag in $utf8 is --------------------------------------------- In ASCII only (or EBCDIC only) OFF In ISO-8859-1 ON @@ -750,7 +756,7 @@ Goal #1. And with Encode Goal #2 is assumed but you still have to be careful in such cases mentioned in B<CAVEAT> paragraphs. -This utf8 flag is not visible in perl scripts, exactly for the same +This UTF8 flag is not visible in perl scripts, exactly for the same reason you cannot (or you I<don't have to>) see if a scalar contains a string, integer, or floating point number. But you can still peek and poke these if you will. See the section below. 
@@ -766,7 +772,7 @@ =item is_utf8(STRING [, CHECK]) -[INTERNAL] Tests whether the UTF-8 flag is turned on in the STRING. +[INTERNAL] Tests whether the UTF8 flag is turned on in the STRING. If CHECK is true, also checks the data in STRING for being well-formed UTF-8. Returns true if successful, false otherwise. @@ -774,22 +780,22 @@ =item _utf8_on(STRING) -[INTERNAL] Turns on the UTF-8 flag in STRING. The data in STRING is +[INTERNAL] Turns on the UTF8 flag in STRING. The data in STRING is B<not> checked for being well-formed UTF-8. Do not use unless you B<know> that the STRING is well-formed UTF-8. Returns the previous -state of the UTF-8 flag (so please don't treat the return value as +state of the UTF8 flag (so please don't treat the return value as indicating success or failure), or C<undef> if STRING is not a string. =item _utf8_off(STRING) -[INTERNAL] Turns off the UTF-8 flag in STRING. Do not use frivolously. -Returns the previous state of the UTF-8 flag (so please don't treat the +[INTERNAL] Turns off the UTF8 flag in STRING. Do not use frivolously. +Returns the previous state of the UTF8 flag (so please don't treat the return value as indicating success or failure), or C<undef> if STRING is not a string. =back -=head1 UTF-8 vs. utf8 +=head1 UTF-8 vs. utf8 vs. UTF8 ....We now view strings not as sequences of bytes, but as sequences of numbers in the range 0 .. 2**32-1 (or in the case of 64-bit @@ -836,6 +842,8 @@ find_encoding("utf_8")->name # ditto. "_" are treated as "-" find_encoding("UTF8")->name # is 'utf8'. +The UTF8 flag is internally called UTF8, without a hyphen. It indicates +whether a string is internally encoded as utf8, also without a hypen. 
=head1 SEE ALSO ==== //depot/maint-5.8/perl/ext/Encode/Encode.xs#19 (text) ==== Index: perl/ext/Encode/Encode.xs --- perl/ext/Encode/Encode.xs#18~30047~ 2007-01-27 15:49:02.000000000 -0800 +++ perl/ext/Encode/Encode.xs 2007-04-08 04:06:16.000000000 -0700 @@ -1,5 +1,5 @@ /* - $Id: Encode.xs,v 2.10 2006/06/03 20:28:48 dankogai Exp dankogai $ + $Id: Encode.xs,v 2.11 2007/04/06 12:53:41 dankogai Exp dankogai $ */ #define PERL_NO_GET_CONTEXT @@ -333,7 +333,7 @@ ); #if 1 /* perl-5.8.6 and older do not check UTF8_ALLOW_LONG */ if (strict && uv > PERL_UNICODE_MAX) - ulen = -1; + ulen = (STRLEN) -1; #endif if (ulen == -1) { if (strict) { @@ -481,7 +481,8 @@ /* Native bytes - can always encode */ U8 *d = (U8 *) SvGROW(dst, 2*slen+1); /* +1 or assertion will botch */ while (s < e) { - UV uv = NATIVE_TO_UNI((UV) *s++); + UV uv = NATIVE_TO_UNI((UV) *s); + s++; /* Above expansion of NATIVE_TO_UNI() is safer this way. */ if (UNI_IS_INVARIANT(uv)) *d++ = (U8)UTF_TO_NATIVE(uv); else { @@ -756,15 +757,11 @@ { if (SvGMAGICAL(sv)) /* it could be $1, for example */ sv = newSVsv(sv); /* GMAGIG will be done */ - if (SvPOK(sv)) { RETVAL = SvUTF8(sv) ? TRUE : FALSE; if (RETVAL && check && !is_utf8_string((U8*)SvPVX(sv), SvCUR(sv))) RETVAL = FALSE; - } else { - RETVAL = FALSE; - } if (sv != ST(0)) SvREFCNT_dec(sv); /* it was a temp copy */ } ==== //depot/maint-5.8/perl/ext/Encode/bin/enc2xs#9 (text) ==== Index: perl/ext/Encode/bin/enc2xs --- perl/ext/Encode/bin/enc2xs#8~30047~ 2007-01-27 15:49:02.000000000 -0800 +++ perl/ext/Encode/bin/enc2xs 2007-04-08 04:06:16.000000000 -0700 @@ -8,8 +8,9 @@ use strict; use warnings; use Getopt::Std; +use Config; my @orig_ARGV = @ARGV; -our $VERSION = do { my @r = (q$Revision: 2.4 $ =~ /\d+/g); sprintf "%d."."%02d" x $#r, @r }; +our $VERSION = do { my @r = (q$Revision: 2.5 $ =~ /\d+/g); sprintf "%d."."%02d" x $#r, @r }; # These may get re-ordered. # RAW is a do_now as inserted by &enter @@ -176,6 +177,7 @@ !!!!!!! DO NOT EDIT THIS FILE !!!!!!! 
This file was autogenerated by: $^X $0 @orig_ARGV + enc2xs VERSION $VERSION */ END } @@ -269,6 +271,9 @@ # push(@{$encoding{$name}},outstring(\*C,$e2u->{Cname}.'_def',$erep)); } + my $cpp = ($Config{d_cplusplus} || '') eq 'define'; + my $exta = $cpp ? 'extern "C" ' : "static"; + my $extb = $cpp ? 'extern "C" ' : ""; foreach my $enc (sort cmp_name keys %encoding) { # my ($e2u,$u2e,$rep,$min_el,$max_el,$rsym) = @{$encoding{$enc}}; @@ -280,9 +285,9 @@ $sym =~ s/\W+/_/g; my @info = ($e2u->{Cname},$u2e->{Cname},"${sym}_rep_character",$replen, $min_el,$max_el); - print C "static const U8 ${sym}_rep_character[] = \"$rep\";\n"; - print C "static const char ${sym}_enc_name[] = \"$enc\";\n\n"; - print C "const encode_t $sym = \n"; + print C "${exta} const U8 ${sym}_rep_character[] = \"$rep\";\n"; + print C "${exta} const char ${sym}_enc_name[] = \"$enc\";\n\n"; + print C "${extb} const encode_t $sym = \n"; # This is to make null encoding work -- dankogai for (my $i = (scalar @info) - 1; $i >= 0; --$i){ $info[$i] ||= 1; @@ -687,8 +692,10 @@ } if ($a->{'Forward'}) { - my $var = $^O eq 'MacOS' ? 'extern' : 'static'; - print $fh "$var const encpage_t $name\[",scalar(@{$a->{'Entries'}}),"];\n"; + my $cpp = ($Config{d_cplusplus} || '') eq 'define'; + my $var = $^O eq 'MacOS' || $cpp ? 'extern' : 'static'; + my $const = $cpp ? '' : 'const'; + print $fh "$var $const encpage_t $name\[",scalar(@{$a->{'Entries'}}),"];\n"; } $a->{'DoneStrings'} = 1; foreach my $b (@{$a->{'Entries'}}) @@ -751,7 +758,9 @@ } $strings = length $string_acc; - my $definition = "\nstatic const U8 $name\[$strings] = { " . + my $cpp = ($Config{d_cplusplus} || '') eq 'define'; + my $var = $cpp ? '' : 'static'; + my $definition = "\n$var const U8 $name\[$strings] = { " . join(',',unpack "C*",$string_acc); # We have a single long line. Split it at convenient commas. 
print $fh $1, "\n" while $definition =~ /\G(.{74,77},)/gcs; @@ -776,7 +785,10 @@ my ($s,$e,$out,$t,$end,$l) = @$b; outtable($fh,$t,$bigname) unless $t->{'Done'}; } - print $fh "\nstatic const encpage_t $name\[", + my $cpp = ($Config{d_cplusplus} || '') eq 'define'; + my $var = $cpp ? '' : 'static'; + my $const = $cpp ? '' : 'const'; + print $fh "\n$var $const encpage_t $name\[", scalar(@{$a->{'Entries'}}), "] = {\n"; foreach my $b (@{$a->{'Entries'}}) { @@ -1181,7 +1193,7 @@ mappings. This format is used by IBM's ICU package and was adopted by Nick Ing-Simmons for use with the Encode module. Since UCM is more flexible than Tcl's Encoding Map and far more user-friendly, -this is the recommended formet for Encode now. +this is the recommended format for Encode now. A UCM file looks like this. ==== //depot/maint-5.8/perl/ext/Encode/bin/piconv#10 (text) ==== Index: perl/ext/Encode/bin/piconv --- perl/ext/Encode/bin/piconv#9~28165~ 2006-05-11 09:01:23.000000000 -0700 +++ perl/ext/Encode/bin/piconv 2007-04-08 04:06:16.000000000 -0700 @@ -1,5 +1,5 @@ #!./perl -# $Id: piconv,v 2.2 2006/05/03 18:24:10 dankogai Exp $ +# $Id: piconv,v 2.3 2007/04/06 12:53:41 dankogai Exp dankogai $ # use 5.8.0; use strict; @@ -40,7 +40,7 @@ my $from = $Opt{from} || $locale or help("from_encoding unspecified"); my $to = $Opt{to} || $locale or help("to_encoding unspecified"); $Opt{string} and Encode::from_to($Opt{string}, $from, $to) and print $Opt{string} and exit; -my $scheme = exists $Scheme{$Opt{Scheme}} ? $Opt{Scheme} : 'from_to'; +my $scheme = exists $Scheme{$Opt{scheme}} ? $Opt{scheme} : 'from_to'; $Opt{check} ||= $Opt{c}; $Opt{perlqq} and $Opt{check} = Encode::PERLQQ; $Opt{htmlcref} and $Opt{check} = Encode::HTMLCREF; @@ -246,6 +246,9 @@ The new perlIO layer is used. NI-S' favorite. +You should use this option if you are using UTF-16 and others which +linefeed is not $/. + =back Like the I<-D> option, this is also for Encode hackers. 
==== //depot/maint-5.8/perl/ext/Encode/encoding.pm#23 (text) ==== Index: perl/ext/Encode/encoding.pm --- perl/ext/Encode/encoding.pm#22~30306~ 2007-02-14 14:38:24.000000000 -0800 +++ perl/ext/Encode/encoding.pm 2007-04-08 04:06:16.000000000 -0700 @@ -1,6 +1,6 @@ -# $Id: encoding.pm,v 2.4 2006/06/03 20:28:48 dankogai Exp dankogai $ +# $Id: encoding.pm,v 2.5 2007/04/06 12:53:41 dankogai Exp dankogai $ package encoding; -our $VERSION = do { my @r = ( q$Revision: 2.4 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r }; +our $VERSION = do { my @r = ( q$Revision: 2.5 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r }; use Encode; use strict; @@ -307,6 +307,14 @@ C<use encoding 'utf8';>, it will print C<4> instead, since C<$string> is three octets when interpreted as Latin-1. +=head2 Side effects + +If the C<encoding> pragma is in scope then the lengths returned are +calculated from the length of C<$/> in Unicode characters, which is not +always the same as the length of C<$/> in the native encoding. + +This pragma affects utf8::upgrade, but not utf8::downgrade. + =head1 FEATURES THAT REQUIRE 5.8.1 Some of the features offered by this pragma requires perl 5.8.1. Most ==== //depot/maint-5.8/perl/ext/Encode/lib/Encode/Alias.pm#19 (text) ==== Index: perl/ext/Encode/lib/Encode/Alias.pm --- perl/ext/Encode/lib/Encode/Alias.pm#18~30306~ 2007-02-14 14:38:24.000000000 -0800 +++ perl/ext/Encode/lib/Encode/Alias.pm 2007-04-08 04:06:16.000000000 -0700 @@ -3,7 +3,7 @@ use warnings; no warnings 'redefine'; use Encode; -our $VERSION = do { my @r = ( q$Revision: 2.6 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r }; +our $VERSION = do { my @r = ( q$Revision: 2.7 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r }; sub DEBUG () { 0 } use base qw(Exporter); @@ -189,8 +189,9 @@ 'greek' => 'iso-8859-7', 'hebrew' => 'iso-8859-8', 'thai' => 'iso-8859-11', - 'tis620' => 'iso-8859-11', ); + # RT #20781 + define_alias(qr/\btis-?620\b/i => '"iso-8859-11"'); # At least AIX has IBM-NNN (surprisingly...) 
instead of cpNNN. # And Microsoft has their own naming (again, surprisingly). ==== //depot/maint-5.8/perl/ext/Encode/lib/Encode/CJKConstants.pm#8 (text) ==== Index: perl/ext/Encode/lib/Encode/CJKConstants.pm --- perl/ext/Encode/lib/Encode/CJKConstants.pm#7~30047~ 2007-01-27 15:49:02.000000000 -0800 +++ perl/ext/Encode/lib/Encode/CJKConstants.pm 2007-04-08 04:06:16.000000000 -0700 @@ -1,12 +1,12 @@ # -# $Id: CJKConstants.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp dankogai $ +# $Id: CJKConstants.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp $ # package Encode::CJKConstants; use strict; use warnings; -our $RCSID = q$Id: CJKConstants.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp dankogai $; +our $RCSID = q$Id: CJKConstants.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp $; our $VERSION = do { my @r = ( q$Revision: 2.2 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r }; use Carp; ==== //depot/maint-5.8/perl/ext/Encode/lib/Encode/JP/H2Z.pm#5 (text) ==== Index: perl/ext/Encode/lib/Encode/JP/H2Z.pm --- perl/ext/Encode/lib/Encode/JP/H2Z.pm#4~30047~ 2007-01-27 15:49:02.000000000 -0800 +++ perl/ext/Encode/lib/Encode/JP/H2Z.pm 2007-04-08 04:06:16.000000000 -0700 @@ -1,5 +1,5 @@ # -# $Id: H2Z.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp dankogai $ +# $Id: H2Z.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp $ # package Encode::JP::H2Z; @@ -7,7 +7,7 @@ use strict; use warnings; -our $RCSID = q$Id: H2Z.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp dankogai $; +our $RCSID = q$Id: H2Z.pm,v 2.2 2006/06/03 20:28:48 dankogai Exp $; our $VERSION = do { my @r = ( q$Revision: 2.2 $ =~ /\d+/g ); sprintf "%d." . 
"%02d" x $#r, @r }; use Encode::CJKConstants qw(:all); ==== //depot/maint-5.8/perl/ext/Encode/lib/Encode/JP/JIS7.pm#9 (text) ==== Index: perl/ext/Encode/lib/Encode/JP/JIS7.pm --- perl/ext/Encode/lib/Encode/JP/JIS7.pm#8~30047~ 2007-01-27 15:49:02.000000000 -0800 +++ perl/ext/Encode/lib/Encode/JP/JIS7.pm 2007-04-08 04:06:16.000000000 -0700 @@ -1,7 +1,7 @@ package Encode::JP::JIS7; use strict; use warnings; -our $VERSION = do { my @r = ( q$Revision: 2.2 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r }; +our $VERSION = do { my @r = ( q$Revision: 2.3 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r }; use Encode qw(:fallbacks); @@ -49,7 +49,7 @@ # empty the input string in the stack so perlio is ok $_[1] = '' if $chk; my ( $h2z, $jis0212 ) = @$obj{qw(h2z jis0212)}; - my $octet = Encode::encode( 'euc-jp', $utf8, FB_PERLQQ ); + my $octet = Encode::encode( 'euc-jp', $utf8, $chk ); $h2z and &Encode::JP::H2Z::h2z( \$octet ); euc_jis( \$octet, $jis0212 ); return $octet; ==== //depot/maint-5.8/perl/ext/Encode/lib/Encode/MIME/Header.pm#9 (text) ==== Index: perl/ext/Encode/lib/Encode/MIME/Header.pm --- perl/ext/Encode/lib/Encode/MIME/Header.pm#8~30047~ 2007-01-27 15:49:02.000000000 -0800 +++ perl/ext/Encode/lib/Encode/MIME/Header.pm 2007-04-08 04:06:16.000000000 -0700 @@ -3,7 +3,7 @@ use warnings; no warnings 'redefine'; -our $VERSION = do { my @r = ( q$Revision: 2.4 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r }; +our $VERSION = do { my @r = ( q$Revision: 2.5 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r }; use Encode qw(find_encoding encode_utf8 decode_utf8); use MIME::Base64; use Carp; @@ -174,12 +174,13 @@ sub _encode_q { my $chunk = shift; + $chunk = encode_utf8($chunk); $chunk =~ s{ ([^0-9A-Za-z]) }{ join("" => map {sprintf "=%02X", $_} unpack("C*", $1)) }egox; - return decode_utf8( HEAD . 'Q?' . $chunk . TAIL ); + return HEAD . 'Q?' . $chunk . 
TAIL; } 1; ==== //depot/maint-5.8/perl/ext/Encode/lib/Encode/Unicode/UTF7.pm#7 (text) ==== Index: perl/ext/Encode/lib/Encode/Unicode/UTF7.pm --- perl/ext/Encode/lib/Encode/Unicode/UTF7.pm#6~30047~ 2007-01-27 15:49:02.000000000 -0800 +++ perl/ext/Encode/lib/Encode/Unicode/UTF7.pm 2007-04-08 04:06:16.000000000 -0700 @@ -1,5 +1,5 @@ # -# $Id: UTF7.pm,v 2.4 2006/06/03 20:28:48 dankogai Exp dankogai $ +# $Id: UTF7.pm,v 2.4 2006/06/03 20:28:48 dankogai Exp $ # package Encode::Unicode::UTF7; use strict; ==== //depot/maint-5.8/perl/ext/Encode/t/Aliases.t#5 (text) ==== Index: perl/ext/Encode/t/Aliases.t --- perl/ext/Encode/t/Aliases.t#4~28165~ 2006-05-11 09:01:23.000000000 -0700 +++ perl/ext/Encode/t/Aliases.t 2007-04-08 04:06:16.000000000 -0700 @@ -42,6 +42,7 @@ 'hebrew' => 'iso-8859-8', 'thai' => 'iso-8859-11', 'tis620' => 'iso-8859-11', + 'tis-620' => 'iso-8859-11', 'WinLatin1' => 'cp1252', 'WinLatin2' => 'cp1250', 'WinCyrillic' => 'cp1251', @@ -140,6 +141,7 @@ print "# alias test with alias overrides\n"; foreach my $a (keys %a2c){ + print "# $a => $a2c{$a}\n"; my $e = Encode::find_encoding($a); is((defined($e) and $e->name), $a2c{$a}, "Override $a") or warn "alias was $a"; ==== //depot/maint-5.8/perl/ext/Encode/t/mime-header.t#8 (text) ==== Index: perl/ext/Encode/t/mime-header.t --- perl/ext/Encode/t/mime-header.t#7~28165~ 2006-05-11 09:01:23.000000000 -0700 +++ perl/ext/Encode/t/mime-header.t 2007-04-08 04:06:16.000000000 -0700 @@ -1,5 +1,5 @@ # -# $Id: mime-header.t,v 2.2 2006/05/03 18:24:10 dankogai Exp $ +# $Id: mime-header.t,v 2.3 2007/04/06 12:53:41 dankogai Exp dankogai $ # This script is written in utf8 # BEGIN { @@ -23,7 +23,7 @@ use strict; #use Test::More qw(no_plan); -use Test::More tests => 11; +use Test::More tests => 12; use_ok("Encode::MIME::Header"); my $eheader =<<'EOS'; @@ -116,4 +116,7 @@ is(Encode::encode('MIME-Q' => $pound_1024), '=?UTF-8?Q?=C2=A31024?=', 'pound 1024'); } + +is(Encode::encode('MIME-Q', "\x{fc}"), '=?UTF-8?Q?=C3=BC?=', 'Encode 
latin1 characters'); + __END__; ==== //depot/maint-5.8/perl/ext/Encode/t/utf8strict.t#3 (text) ==== Index: perl/ext/Encode/t/utf8strict.t --- perl/ext/Encode/t/utf8strict.t#2~28165~ 2006-05-11 09:01:23.000000000 -0700 +++ perl/ext/Encode/t/utf8strict.t 2007-04-08 04:06:16.000000000 -0700 @@ -40,14 +40,25 @@ 0x0000FFFF => 1, # 5.3.1 ); $NTESTS += scalar keys %ORD; - %SEQ = ( - qq/ed 9f bf/ => 0, # 2.3.1 - qq/ee 80 80/ => 0, # 2.3.2 - qq/f4 8f bf bf/ => 0, # 2.3.3 - qq/f4 90 80 80/ => 1, # 2.3.4 -- out of range so NG - # "3 Malformed sequences" are checked by perl. - # "4 Overlong sequences" are checked by perl. - ); + if (ord('A') == 193) { + %SEQ = ( + qq/dd 64 73 73/ => 0, # 2.3.1 + qq/dd 67 41 41/ => 0, # 2.3.2 + qq/ee 42 73 73 73/ => 0, # 2.3.3 + qq/f4 90 80 80/ => 1, # 2.3.4 -- out of range so NG + # "3 Malformed sequences" are checked by perl. + # "4 Overlong sequences" are checked by perl. + ); + } else { + %SEQ = ( + qq/ed 9f bf/ => 0, # 2.3.1 + qq/ee 80 80/ => 0, # 2.3.2 + qq/f4 8f bf bf/ => 0, # 2.3.3 + qq/f4 90 80 80/ => 1, # 2.3.4 -- out of range so NG + # "3 Malformed sequences" are checked by perl. + # "4 Overlong sequences" are checked by perl. + ); + } $NTESTS += scalar keys %SEQ; } use strict; End of Patch.
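
Note on the Encode.xs hunk at -481,7: the patch moves the increment out of the NATIVE_TO_UNI() argument and comments that the expansion "is safer this way". The following is a minimal C sketch of why that can matter, assuming a macro that mentions its argument more than once. DEMO_NATIVE_TO_UNI, demo_u8, demo_uv, demo_copy_safe and demo_copy_unsafe are hypothetical stand-ins for illustration only. They are not the definitions from perl's headers, and whether the real macro evaluates its argument more than once may depend on the platform.

```c
#include <stddef.h>

typedef unsigned char demo_u8;   /* hypothetical stand-in for U8 */
typedef unsigned long demo_uv;   /* hypothetical stand-in for UV */

/* Hypothetical macro: its argument appears in the test and again in
   whichever branch is selected, so it is evaluated twice per use. */
#define DEMO_NATIVE_TO_UNI(ch) \
    (((ch) < 0x80) ? (ch) : (demo_uv)((ch) ^ 0x80))

/* Patched style: the macro argument has no side effect and the pointer
   is advanced in a separate statement. Consumes exactly len bytes. */
size_t demo_copy_safe(const demo_u8 *s, size_t len, demo_uv *out)
{
    const demo_u8 *e = s + len;
    size_t n = 0;
    while (s < e) {
        demo_uv uv = DEMO_NATIVE_TO_UNI((demo_uv) *s);
        s++;
        out[n++] = uv;
    }
    return n;
}

/* Pre-patch style: the increment sits inside the macro argument, so with
   this macro the pointer advances twice per iteration and every other
   byte is skipped. Call only with an even len; an odd len would make the
   last iteration read one byte past the buffer. */
size_t demo_copy_unsafe(const demo_u8 *s, size_t len, demo_uv *out)
{
    const demo_u8 *e = s + len;
    size_t n = 0;
    while (s < e) {
        demo_uv uv = DEMO_NATIVE_TO_UNI((demo_uv) *s++);
        out[n++] = uv;
    }
    return n;
}
```

With a four-byte input, demo_copy_safe() produces four values while demo_copy_unsafe() produces only two, which is the kind of surprise that keeping the increment in its own statement rules out regardless of how the macro is defined.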