Hi,

On Friday 16 April 2004 16:08, Hollinger, Robert-Alexandre wrote:
> I've got a problem with displaying information that I retrieve from Active
> Directory 2003. For example, "Cécile" is displayed "CÃ©cile", and all the
> accented characters are badly displayed. So under
> ActivePerl 5.6.1, I do the following :
>
> sub unicode2latin1
> {
>         my $string = shift;
>
>         use utf8;
>
>         if ($string)
>         {
>                 $string =~ s/([\x{80}-\x{FFFF}])/'&#' . ord($1) .';'/gse;
>         }
>
>         no utf8;
>         return $string;
> }
>
> And it works perfectly, but under the last version of perl (5.8.3), this
> code doesn't work ! Any idea ?

This looks like a really strange hack ;-)
It seems to rely on the internal string encoding of Perl 5.6.1.
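
For what it is worth, here is a rough sketch of how the same substitution
might be written for Perl 5.8. It assumes the values arrive from the
directory as UTF-8 octets and decodes them explicitly with the core Encode
module before replacing anything. The name utf8_to_entities is made up for
this example and is not taken from any module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: UTF-8 octets in, ASCII with numeric entities out
sub utf8_to_entities
{
my $octets = shift;

  return $octets  unless (defined($octets) && length($octets));

  # decode the UTF-8 octets into a Perl character string
  my $string = decode('UTF-8', $octets);

  # replace every non-ASCII character by its numeric character reference
  $string =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;

  return $string;
}

print utf8_to_entities("C\xC3\xA9cile"), "\n";   # prints C&#233;cile
```

Decoding explicitly means the regular expression sees characters, not
bytes, so it no longer depends on how a given Perl version stores strings.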

Although the question is a general Perl question and has nothing to do with 
perl-ldap in particular, I'll give it a try.

In most of my Perl scripts dealing with LDAP I use the following
code to convert between UTF8 and the local character set
(which can be any one defined in Unicode::Map8):


use Unicode::Map8;
use Unicode::String qw(utf8);

## generate CharSet<->UTF8 en-/de-coding subroutines / closures ##
# Synopsis:  ($encoder,$decoder) = UTF8converters($charset)
sub UTF8converters($)
{
my $charset = shift;
my $map = Unicode::Map8->new($charset);

  if ($map)
  {
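  my $encoder = sub {
                my @arg = @_;
                  # convert each defined argument from the local charset to UTF8
                  foreach (@arg) { $_ = $map->tou($_)->utf8()  if (defined($_)); }
                  return(wantarray ? @arg : $arg[0]);
                };
  my $decoder = sub {
                my @arg = @_;
                  # convert each defined argument from UTF8 to the local charset
                  foreach (@arg) { $_ = $map->to8(utf8($_)->utf16())  if (defined($_)); }
                  return(wantarray ? @arg : $arg[0]);
                };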

    # die if creation of encoder or decoder failed
    die 'unable to create UTF8 encoder/decoder'
      if ((!$encoder) || (!$decoder));

    return($encoder, $decoder);
  }

  die 'unable to create UTF8 charset mapping';
}


And here is how I use it 
($opt{charset} is a variable holding the name of the charset):

# create UTF8 en-/decoders (default to ISO8859-15)
if ((!defined($opt{charset})) || ($opt{charset} !~ /^(?:none|utf8)$/))
{
my $charset = defined($opt{charset}) ? $opt{charset} : 'iso8859-15';

  ($encodeUTF8,$decodeUTF8) = UTF8converters($charset);
}

if ($encodeUTF8)
{
   $BaseDN = &$encodeUTF8($BaseDN);
   $BindDN = &$encodeUTF8($BindDN);
}

...

@cn = $entry->get_values('cn');
@cn = &$decodeUTF8(@cn)   if ($decodeUTF8);


This works fine in Perl 5.6.x as well as in Perl 5.8.x.
I do not know if the modules are available in/for ActivePerl (you may need a C 
compiler to compile them).

Another alternative that only works in Perl 5.8 or higher is the Encode.pm
module. IIRC there is also an Encode::compat module that makes the functions of 
Encode.pm available for Perl 5.6.1+.
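
As an untested-in-production sketch of that alternative, converters similar
to the closures above could be built on Encode instead of Unicode::Map8.
The name EncodeConverters is invented for this example, and the charset
name must be one that Encode recognises (e.g. 'iso-8859-1').

```perl
use strict;
use warnings;
use Encode qw(encode decode find_encoding);

# Hypothetical Encode-based counterpart of UTF8converters()
# Synopsis:  ($encoder,$decoder) = EncodeConverters($charset)
sub EncodeConverters
{
my $charset = shift;

  die "unknown charset '$charset'"  unless (find_encoding($charset));

  # local charset octets -> UTF-8 octets
  my $encoder = sub {
                my @arg = map { defined($_) ? encode('UTF-8', decode($charset, $_)) : $_ } @_;
                  return(wantarray ? @arg : $arg[0]);
                };
  # UTF-8 octets -> local charset octets
  my $decoder = sub {
                my @arg = map { defined($_) ? encode($charset, decode('UTF-8', $_)) : $_ } @_;
                  return(wantarray ? @arg : $arg[0]);
                };

  return($encoder, $decoder);
}

my ($encodeUTF8, $decodeUTF8) = EncodeConverters('iso-8859-1');
my $cn = $decodeUTF8->("C\xC3\xA9cile");   # Latin-1 octets "C\xE9cile"
```

With the default settings, characters that do not exist in the target
charset are replaced by a substitution character instead of raising an error.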

Hope it helps
Peter
-- 
Peter Marschall
eMail: [EMAIL PROTECTED]
