Jean-Michel Hiver <[EMAIL PROTECTED]> writes:

> > How can I calculate the MD5 message digest of a Unicode string in Perl
> > 5.8? The MD5 hash algorithm naturally expects a sequence of bytes as its
> > input, and I have a string with a sequence of characters. I tried
> >
> >   $ perl -e 'use Digest::MD5 qw(md5_hex); print md5_hex("\x{20ac}");'
> >   Wide character in subroutine entry at -e line 1.
>
> I'd do something like that:
>
>   use Encode;
>   use Digest::MD5 qw(md5_hex);
>
>   sub md5_hex
>   {
>       my $string = shift;
>       Encode::_utf8_off ($string);
>       return md5_hex ($string);
>   }

I would argue that it is much better to write it as:

  md5_hex(Encode::encode_utf8($string))

Playing with _utf8_{on,off} is ugly for good reason and will break if the
internal representation changes. Calling encode_utf8() should be almost as
efficient and is future-proof.

Regards,
Gisle
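
For reference, here is a minimal sketch of the encode_utf8() approach as a complete script. The wrapper name md5_hex_utf8 is hypothetical (it is not part of Digest::MD5 or Encode); it is given a distinct name so that it does not clash with the imported md5_hex. The sketch assumes the digest is wanted over the UTF-8 encoding of the string.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);
use Digest::MD5 qw(md5_hex);

# Hypothetical helper (name chosen for illustration only).
# Encodes the character string to UTF-8 bytes, then hashes those bytes.
sub md5_hex_utf8 {
    my ($string) = @_;
    return md5_hex(encode_utf8($string));
}

# The euro sign is a wide character; passing it straight to md5_hex()
# dies with "Wide character in subroutine entry", but the encoded
# bytes are accepted.
print md5_hex_utf8("\x{20ac}"), "\n";
```

For a pure-ASCII string the UTF-8 bytes are the same as the characters, so the result should match a plain md5_hex() call on that string.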