Re: "use encoding 'utf8'" breaks Net::LDAP

Graham Barr Sat, 24 Sep 2005 07:28:42 -0700

On Sep 23, 2005, at 15:29 PM, Eugene Gladchenko wrote:

Maybe the pragma is doing more than it should


Oh no!

"encoding" pragma does much more than just indicating the encoding of a
script. Here is the quote:

The encoding pragma also modifies the filehandle layers of STDIN and
STDOUT to the specified encoding.


But we do not use STDIN/STDOUT so that should be a non-issue.

By default, if strings operating under byte semantics and strings with
Unicode character data are concatenated, the new string will be created
by decoding the byte strings as ISO 8859-1 (Latin-1).

The encoding pragma changes this to use the specified encoding instead.


For example:

    use encoding 'utf8';
    my $string = chr(20000); # a Unicode string
    utf8::encode($string);   # now it's a UTF-8 encoded byte string
    # concatenate with another Unicode string
    print length($string . chr(20000));

Will print 2, because $string is upgraded as UTF-8. Without "use
encoding 'utf8';", it will print 4 instead, since $string is three
octets when interpreted as Latin-1.
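
The two lengths quoted above can be reproduced without relying on the pragma. What follows is only a sketch: it contrasts perl's default Latin-1 upgrade with an explicit Encode::decode call, which stands in here for roughly what the pragma arranges implicitly. The variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $char  = chr(20000);   # one character, U+4E20
my $bytes = $char;
utf8::encode($bytes);     # now three UTF-8 octets with byte semantics

# Implicit upgrade on concatenation: each octet is read as one
# Latin-1 character, so the result is 3 + 1 characters long.
my $implicit = length($bytes . $char);

# Explicit decode first: the three octets become one character again,
# so the result is 1 + 1 characters long.
my $explicit = length(decode('UTF-8', $bytes) . $char);

printf "implicit=%d explicit=%d\n", $implicit, $explicit;
```

The first number corresponds to the "without the pragma" case in the quoted documentation, the second to the case where the byte string is treated as UTF-8.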

Right, but the effect should be lexical. So other than strings that are upgraded as utf-8 being passed through to Net::LDAP as arguments, it should have no effect. And Net::LDAP should already handle UTF-8 strings ok. But maybe there is one place that has been missed.
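
To make the "upgraded strings passed as arguments" case concrete, here is a minimal sketch of how byte-oriented code could normalise such an argument before use. The helper name as_octets is hypothetical and is not part of Net::LDAP; it only assumes the core utf8::is_utf8 and utf8::encode functions, and it relies on perl's internal flag, which is a heuristic rather than a statement about what the data means.

```perl
use strict;
use warnings;

# Hypothetical helper (not from Net::LDAP): return a copy of the value
# as UTF-8 octets if perl has it flagged as a character string,
# otherwise return the copy unchanged.
sub as_octets {
    my ($value) = @_;
    my $copy = $value;
    utf8::encode($copy) if utf8::is_utf8($copy);
    return $copy;
}

my $upgraded = chr(20000);            # character string, flag set
my $octets   = as_octets($upgraded);  # three octets, flag cleared

printf "flag=%d length=%d\n",
    (utf8::is_utf8($octets) ? 1 : 0), length $octets;
```

A string that is already a plain byte string passes through untouched, so calling the helper twice on the same value does not double-encode it.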

What I don't get is that the issue arises as a decode issue and all strings that are fed into the decode come from the socket and should not have this issue.


Graham.
