On Tue, Nov 29, 2005 at 12:34:11PM +0100, Michael Schroeder wrote:
> 
> On Mon, 28 Nov 2005 [EMAIL PROTECTED] wrote:
> > On Thu, Nov 24, 2005 at 11:42:08AM -0800, [EMAIL PROTECTED] de wrote:
> > > decode_utf8() doesn't return "false" if run with non-UTF-8 string. It just
> > > returns the non-UTF-8 string. To see this bug in action use convmv from
> > > http://j3e.de/linux/convmv/ and convert a filename from latin1 to utf8.
> > > It will tell you that the file is already UTF-8 encoded. convmv evaluates
> > > decode_utf8() to see if a file is already utf-8-encoded.
> > 
> > I don't see any indication in the Encode doc that decode_utf8 would
> > ever return false on error.  To use it to check for valid utf8, I
> > think you'd need to specify the CHECK parameter as FB_CROAK and wrap
> > the call in an eval {}; see:
> > http://perldoc.perl.org/Encode.html#Handling-Malformed-Data
> > 
> > Perhaps you should use utf8::decode() instead?
> 
> Well, the perluniintro manpage says:
> 
>  - How Do I Detect Data That's Not Valid In a Particular Encoding?
> 
>    Use the "Encode" package to try converting it.  For example,
> 
>        use Encode 'decode_utf8';
>        if (decode_utf8($string_of_bytes_that_I_think_is_utf8)) {
>          # valid
>        } else {
>          # invalid
>        }

Ah, I hadn't noticed that.  It doesn't agree with the doc in Encode
itself, but up through Encode 2.09 (2.08 was included with perl5.8.6),
decode_utf8 did actually just call utf8::decode when no check
parameter was passed.  Encode 2.10 (in perl5.8.7) now works as
described in the Encode doc, but no longer works as described in
perluniintro.

Dan, perhaps it would be a good idea to put back the old behavior
when no check parameter is passed (reversing the change you made for
http://rt.cpan.org/NoAuth/Bug.html?id=8872 and changing the doc
instead)?
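
For a check that does not depend on which Encode version is installed,
the eval/FB_CROAK approach mentioned earlier in the thread can be
sketched roughly as below.  The helper names is_valid_utf8 and
is_valid_utf8_via_core are made up for illustration and are not part of
Encode; treat this as a sketch rather than a recommended fix.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper (not part of Encode): returns true if $bytes
# decodes cleanly as UTF-8, false otherwise.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input instead of
    # substituting a replacement character; LEAVE_SRC asks decode()
    # not to modify $bytes.  eval {} turns the die into undef.
    my $chars = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return defined $chars ? 1 : 0;
}

# Hypothetical alternative using the core utf8::decode(), which works
# in place and returns false if its argument is not well-formed UTF-8.
sub is_valid_utf8_via_core {
    my ($bytes) = @_;
    my $copy = $bytes;    # keep the caller's data untouched
    return utf8::decode($copy) ? 1 : 0;
}

for my $name ("caf\xC3\xA9", "caf\xE9") {
    printf "%s: %s\n",
        join(' ', map { sprintf '%02X', ord } split //, $name),
        is_valid_utf8($name) ? 'valid UTF-8' : 'not valid UTF-8';
}
```

Testing the return value of decode() for truth would not be enough on
its own, since an empty string or "0" decodes successfully to a false
value; hence the defined() check.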
