Gisle Aas wrote:
>Do you know if decoding always fails with this version or if it's only
>when \x{FFFD} is substituted?

Decoding generally works, it's just substitution that goes wrong:

$ for pver in 5.8.0 5.8.9; do perl$pver -MEncode -MData::Dumper -lwe \
'$Data::Dumper::Useqq=$Data::Dumper::Terse=1;$Data::Dumper::Indent=0; print
$Encode::VERSION, " ", Dumper($_), " ", Dumper(decode("UTF-8", $_, 0)) foreach
"a\xc2\x80", "a\xff"'; done
1.75 "a\302\200" "a\x{80}"
1.75 "a\377" undef
2.29 "a\302\200" "a\x{80}"
2.29 "a\377" "a\x{fffd}"
$

There appears to be no value of the CHECK parameter that will make
Encode-1.75 generate a replacement character.
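
For anyone who wants to repeat that check on their own installation, here
is a rough sketch that tries a few CHECK values against the same malformed
input.  The helper name probe_check and the choice of which CHECK values to
try are illustrative assumptions, not part of the patch.

```perl
use strict;
use warnings;
use Encode ();

# Decode the malformed octet string "a\xff" with the given CHECK value and
# return whatever Encode::decode() hands back (possibly undef).
# probe_check is a hypothetical helper name used only for this sketch.
sub probe_check {
    my ($check) = @_;
    my $octets = "a\xff";    # fresh copy: decode() may modify its source
    return scalar eval { Encode::decode("UTF-8", $octets, $check) };
}

# CHECK values to try; each is a constant provided by Encode.
my @checks = (
    [ "FB_DEFAULT" => Encode::FB_DEFAULT() ],
    [ "FB_QUIET"   => Encode::FB_QUIET() ],
    [ "FB_PERLQQ"  => Encode::FB_PERLQQ() ],
);

for my $c (@checks) {
    my ($name, $check) = @$c;
    my $result = probe_check($check);
    # Show the result as a list of code points, or "undef".
    printf "%s %-10s %s\n", $Encode::VERSION, $name,
        defined $result
            ? join(" ", map { sprintf "U+%04X", ord } split //, $result)
            : "undef";
}
```

A line containing U+FFFD means that CHECK value produced a replacement
character on the Encode being tested.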

>                                               If the decoding
>sometimes works, adding this might break stuff that used to work for the
>poor perl-5.8.0 users.

Yes, it would, unless we can force an upgrade of Encode at install time.
I haven't figured out a good way to do that, though, given that LWP is
meant to generally work in the absence of Encode.

>Otherwise I'm fine with disabling the failure for the test, but I
>would leave in some noise to warn that things are not really all good
>and that users should really upgrade to a newer perl.

I suggest a three-part strategy:

0. Warn at build time if the perl version is 5.8 or later (so Encode is
   possible) but Encode is either absent or exhibits this bug.

1. If decode() returns undef, treat it as an error.  This means that,
   depending on the value of the raise_error option, the method will
   either return undef or die.  (Currently in this situation it returns
   undef even if raise_error is true.)

2. Skip that decoding test if Encode exhibits the bug.

The attached patch implements this.
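
In outline, and only as a sketch: parts 0 and 2 share one probe, and part 1
turns an undef result into an exception.  The helper names encode_is_usable
and decode_or_die below are hypothetical; the patch itself inlines the same
expressions rather than defining subs.

```perl
use strict;
use warnings;

# Parts 0 and 2: true if Encode loads and substitutes for malformed UTF-8
# instead of returning undef.  (Hypothetical helper; the patch inlines it.)
sub encode_is_usable {
    return !!eval {
        require Encode;
        defined(Encode::decode("UTF-8", "\xff"));
    };
}

# Part 1: treat an undef result as an error, so that the caller's normal
# error handling decides between returning undef and dying.
# (Hypothetical helper; the patch adds the die inside HTTP::Message.)
sub decode_or_die {
    my ($charset, $octets) = @_;
    require Encode;
    my $text = Encode::decode($charset, $octets);
    die "Encode::decode() returned undef improperly" unless defined $text;
    return $text;
}

if (encode_is_usable()) {
    my $text = decode_or_die("UTF-8", "a\xc2\x80");
    print "decoded ", length($text), " characters\n";
}
else {
    warn "Encode is absent or exhibits the substitution bug\n";
}
```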

-zefram
diff -ur libwww-perl-5.823.orig/Makefile.PL libwww-perl-5.823.mod0/Makefile.PL
--- libwww-perl-5.823.orig/Makefile.PL  2008-11-25 20:07:09.000000000 +0000
+++ libwww-perl-5.823.mod0/Makefile.PL  2009-02-12 21:12:42.339056457 +0000
@@ -59,6 +59,14 @@
     },
     clean => { FILES => join(" ", map "bin/$_", grep /^[A-Z]+$/, @prog) },
 );
+
+if($] >= 5.008 && !(eval { require Encode; defined(Encode::decode("UTF-8", "\xff")) })) {
+    warn "\nYou lack a working Encode module, and so you will miss out on\n".
+           "lots of character set goodness from LWP.  However, your perl is\n".
+           "sufficiently recent to support it.  It is recommended that you\n".
+           "install the latest Encode from CPAN.\n\n";
+}
+
 exit;
 
 
diff -ur libwww-perl-5.823.orig/lib/HTTP/Message.pm libwww-perl-5.823.mod0/lib/HTTP/Message.pm
--- libwww-perl-5.823.orig/lib/HTTP/Message.pm  2008-11-25 20:07:09.000000000 +0000
+++ libwww-perl-5.823.mod0/lib/HTTP/Message.pm  2009-02-12 21:00:41.336457356 +0000
@@ -305,6 +305,7 @@
                }
                $content_ref = \Encode::decode($charset, $$content_ref,
                     ($opt{charset_strict} ? Encode::FB_CROAK() : 0) | Encode::LEAVE_SRC());
+               die "Encode::decode() returned undef improperly" unless defined $$content_ref;
            }
        }
     };
diff -ur libwww-perl-5.823.orig/t/base/message.t libwww-perl-5.823.mod0/t/base/message.t
--- libwww-perl-5.823.orig/t/base/message.t     2008-11-25 20:07:09.000000000 +0000
+++ libwww-perl-5.823.mod0/t/base/message.t     2009-02-12 21:12:29.158988889 +0000
@@ -383,8 +383,10 @@
 $m->remove_header("Content-Encoding");
 $m->content("a\xFF");
 
-skip($NO_ENCODE, sub { $m->decoded_content }, "a\x{FFFD}");
-skip($NO_ENCODE, sub { $m->decoded_content(charset_strict => 1) }, undef);
+my $BAD_ENCODE = $NO_ENCODE || !(eval { require Encode; defined(Encode::decode("UTF-8", "\xff")) });
+
+skip($BAD_ENCODE, sub { $m->decoded_content }, "a\x{FFFD}");
+skip($BAD_ENCODE, sub { $m->decoded_content(charset_strict => 1) }, undef);
 
 $m->header("Content-Encoding", "foobar");
 ok($m->decoded_content, undef);
