Hi,
On Friday 16 April 2004 16:08, Hollinger, Robert-Alexandre wrote:
> I've got a problem with displaying information that I retrieve from Active
> Directory 2003. For example, "Cécile" is displayed "CÃ©cile", and all the
> special characters (accented letters, etc.) are badly displayed. So under
> ActivePerl 5.6.1, I do the following:
>
> sub unicode2latin1
> {
> my $string = shift;
>
> use utf8;
>
> if ($string)
> {
> $string =~ s/([\x{80}-\x{FFFF}])/'&#' . ord($1) .';'/gse;
> }
>
> no utf8;
> return $string;
> }
>
> And it works perfectly, but under the last version of perl (5.8.3), this
> code doesn't work ! Any idea ?
This looks like a really strange hack ;-)
It seems to make use of some internal encoding in Perl 5.6.1.
Although the question is a general Perl question and has nothing to do with
perl-ldap in particular, I'll give it a try.
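
To illustrate what probably goes wrong, here is a minimal sketch. It assumes
the directory hands back undecoded UTF-8 bytes and that Perl 5.8 treats them
as plain bytes unless told otherwise; the variable names are made up for the
example. utf8::decode() is built into Perl 5.8 and later.

```perl
use strict;
use warnings;

# Assumption: the LDAP value arrives as raw UTF-8 bytes ("Cécile" = 7 bytes).
my $bytes = "C\xC3\xA9cile";

# Matching byte-wise turns each byte of the UTF-8 sequence into its own entity.
(my $bytewise = $bytes) =~ s/([\x80-\xFF])/'&#' . ord($1) . ';'/ge;

# Decoding first gives a string of 6 characters instead of 7 bytes,
# so the two-byte sequence is seen as the single character U+00E9.
my $chars = $bytes;
utf8::decode($chars);
(my $charwise = $chars) =~ s/([\x{80}-\x{FFFF}])/'&#' . ord($1) . ';'/ge;

print "$bytewise\n";   # C&#195;&#169;cile
print "$charwise\n";   # C&#233;cile
```

If that assumption holds, decoding the value before the substitution should
be enough to get the numeric entities back.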
In most of my Perl scripts dealing with LDAP I use the following
code to do conversion between UTF8 and the local character set
(which can be any one defined in Unicode::Map8):
use Unicode::Map8;
use Unicode::String qw(utf8);

## generate CharSet<->UTF8 en-/de-coding subroutines / closures ##
# Synopsis: ($encoder,$decoder) = UTF8converters($charset)
sub UTF8converters($)
{
  my $charset = shift;
  my $map = Unicode::Map8->new($charset);

  if ($map)
  {
    # local charset -> UTF8
    my $encoder = sub {
      my @arg = @_;
      map { $_ = $map->tou($_)->utf8() if (defined($_)); } @arg;
      return(wantarray ? @arg : $arg[0]);
    };

    # UTF8 -> local charset
    my $decoder = sub {
      my @arg = @_;
      map { $_ = $map->to8(utf8($_)->utf16()) if (defined($_)); } @arg;
      return(wantarray ? @arg : $arg[0]);
    };

    # die if creation of encoder or decoder failed
    die 'unable to create UTF8 encoder/decoder'
      if ((!$encoder) || (!$decoder));

    return($encoder, $decoder);
  }

  die 'unable to create UTF8 charset mapping';
}
And here is how I use it
($opt{charset} is a variable holding the name of the charset):
# create UTF8 en-/decoders (default to ISO8859-15)
if ((!defined($opt{charset})) || ($opt{charset} !~ /^(?:none|utf8)$/))
{
  my $charset = defined($opt{charset}) ? $opt{charset} : 'iso8859-15';
  ($encodeUTF8,$decodeUTF8) = UTF8converters($charset);
}

if ($encodeUTF8)
{
  $BaseDN = &$encodeUTF8($BaseDN);
  $BindDN = &$encodeUTF8($BindDN);
}
...
@cn = $entry->get_values('cn');
@cn = &$decodeUTF8(@cn) if ($decodeUTF8);
This works fine in Perl 5.6.x as well as in Perl 5.8.x.
I do not know if the modules are available in/for ActivePerl (you may need a C
compiler to compile them).
Another alternative that only works in Perl 5.8 or higher is the Encode
module (Encode.pm). IIRC there is also an Encode::compat that makes the
functions of Encode.pm available for Perl 5.6.1+.
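
A rough sketch of the same encoder/decoder pair built on the Encode module
that ships with Perl 5.8 might look like the following. The name
encode_converters() and the sample value are made up for illustration, and I
have not tried it against Encode::compat.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: returns ($encoder, $decoder) for a local charset,
# using Encode instead of Unicode::Map8 / Unicode::String.
sub encode_converters {
    my $charset = shift;
    die "unknown charset '$charset'" unless Encode::find_encoding($charset);

    # local charset -> UTF-8 (for values sent to the directory)
    my $encoder = sub {
        my @arg = map { my $v = $_;
                        defined($v) ? encode('UTF-8', decode($charset, $v)) : $v } @_;
        return wantarray ? @arg : $arg[0];
    };

    # UTF-8 -> local charset (for values read from the directory)
    my $decoder = sub {
        my @arg = map { my $v = $_;
                        defined($v) ? encode($charset, decode('UTF-8', $v)) : $v } @_;
        return wantarray ? @arg : $arg[0];
    };

    return ($encoder, $decoder);
}

my ($enc, $dec) = encode_converters('iso-8859-1');

my $latin1 = $dec->("C\xC3\xA9cile");   # UTF-8 bytes in, Latin-1 bytes out
my $utf8   = $enc->($latin1);           # and back again

# Print the bytes as hex so the output stays plain ASCII.
print join(' ', map { sprintf '%02X', ord } split //, $latin1), "\n";
# 43 E9 63 69 6C 65
```

The closures are called the same way as the Unicode::Map8 ones, e.g.
@cn = $dec->(@cn);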
Hope it helps
Peter
--
Peter Marschall