On Sep 13, 2005, at 07:42 , Paul Marquess wrote:
Dan, I'm not sure what is going on here. Can I walk through one of
the failing tests to see if it rings any bells with you?
Before that, I would like to make sure I understand the scope of
the problem correctly. We are talking about a problem on EBCDIC
platforms, not DBM_Filter vs. Encode at large, right?
I wrote a simple script and there seems to be no problem on ASCII platforms.
So I expect to get these k/v pairs back:
'alpha' => "\xCE\xB1",
'beta' => "\xCE\xB2",
"\xCE\xB3"=> "gamma",
But this is what I actually read from the DBM file.
'beta' => '¸ž'
'alpha' => '¸¨'
'¸ß' => 'gamma'
On ASCII platforms I got what you expected.
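As a cross-check that needs no DBM file at all, here is a minimal sketch
that prints the UTF-8 bytes of the three Greek letters directly. The helper
name hexbytes and the %greek table are made up for illustration, and the
byte values shown in the comments assume an ASCII platform; what an EBCDIC
platform prints here is exactly the open question.

```perl
use strict;
use warnings;
use Encode ();

# Render a byte string as \xHH escapes, one escape per byte.
sub hexbytes {
    return join '', map { sprintf "\\x%02X", $_ } unpack "C*", shift;
}

# Unicode code points of the three letters used in the test data.
my %greek = (alpha => 0x3B1, beta => 0x3B2, gamma => 0x3B3);

for my $name (sort keys %greek) {
    my $char  = chr $greek{$name};               # character string
    my $bytes = Encode::encode('UTF-8', $char);  # encoded byte string
    printf "%-5s U+%04X => %s\n", $name, $greek{$name}, hexbytes($bytes);
}
# On an ASCII platform this prints:
# alpha U+03B1 => \xCE\xB1
# beta  U+03B2 => \xCE\xB2
# gamma U+03B3 => \xCE\xB3
```

If the same sketch prints different bytes on the EBCDIC box, that would
point at Encode (or Perl's own string handling there) rather than at
DBM_Filter.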
Sastry, would you clarify the problem a little bit more? If it is
a problem of Encode vs. EBCDIC, say so. It seems to me that DBM_Filter
is not at fault in this case.
Dan the Encode Maintainer.
#
use strict;
use charnames 'greek';
use Encode; # for Encode::is_utf8 below
use DBM_Filter;
use Fcntl;
use DB_File;
use SDBM_File;
my %file = (
    DB_File   => 'test.db',
    SDBM_File => 'test.sdbm',
);
my %hash = (
    "beta"      => "\N{beta}",
    'alpha'     => "\N{alpha}",
    "\N{gamma}" => "gamma",
);
sub perlqq{
    join '',
        map { chr($_) =~ /\w/ ? chr $_ : sprintf "\\x%02X", $_ }
        unpack "C*", shift;
}
for my $dbmtype (keys %file){
    print "$dbmtype\n";
    # write pass: store each pair through the utf8 filter
    tie my %db, $dbmtype, $file{$dbmtype}, O_RDWR|O_CREAT, 0644
        or die "$dbmtype -> $file{$dbmtype} : $!";
    %db = (); # clear
    tied(%db)->Filter_Push('utf8');
    for my $k (keys %hash){
        $db{$k} = $hash{$k};
        printf "\$k = %s(%d), \$db{\$k} = %s(%d)\n",
            perlqq($k), Encode::is_utf8($k), perlqq($db{$k}),
            Encode::is_utf8($db{$k});
    }
    untie %db;
    # read pass: reopen without a filter to see the raw stored bytes
    # (reuse %db rather than redeclaring it with "my" in the same scope)
    tie %db, $dbmtype, $file{$dbmtype}, O_RDONLY, 0644
        or die "$dbmtype -> $file{$dbmtype} : $!";
    while(my ($k, $v) = each %db){
        printf "\$k = %s(%d), \$v = %s(%d)\n",
            perlqq($k), Encode::is_utf8($k), perlqq($v),
            Encode::is_utf8($v);
    }
    untie %db;
}
__END__