At 09:54 +0200 7/10/11, marcos rebelo wrote:

Unfortunately someone has the code: < use encoding 'utf8'; >

and now I get:

#######################################################

$VAR1 = {
          'Subject' => "\x{fffd}\x{fffd}my subject",
          'CreationDate' => 'D:20111006161347+02\'00\'',
          'Producer' => "\x{fffd}\x{fffd}LibreOffice 3.3",
          'Creator' => "\x{fffd}\x{fffd}Writer",
          'Author' => "\x{fffd}\x{fffd}Marcos Rebelo",
          'Title' => "\x{fffd}\x{fffd}my title",
          'Keywords' => "\x{fffd}\x{fffd}my keywords"
        };

#######################################################

I can't remove the < use encoding 'utf8'; >, but I need to clean the hash.

How can I clean the hash?

Without reiterating the demerits of encoding.pm: if the only Unicode character you are getting is \x{fffd} (REPLACEMENT CHARACTER), then you just need to get rid of it by looping through the hash -- or are you getting other spurious characters?



#!/usr/local/bin/perl
use strict;
use encoding 'utf8';
use Data::Dumper;
my $VAR1 = {
  'Subject' => "\x{fffd}\x{fffd}my subject",
  'CreationDate' => 'D:20111006161347+02\'00\'',
  'Producer' => "\x{fffd}\x{fffd}LibreOffice 3.3",
  'Creator' => "\x{fffd}\x{fffd}Writer",
  'Author' => "\x{fffd}\x{fffd}Marcos Rebelo",
  'Title' => "\x{fffd}\x{fffd}my title",
  'Keywords' => "\x{fffd}\x{fffd}my keywords"
};
my %pdf_hash = %$VAR1;
for (keys %pdf_hash){ $pdf_hash{$_} =~ s~\x{fffd}~~g }
print Dumper \%pdf_hash;
__END__

Result:
$VAR1 = {
          'Subject' => 'my subject',
          'CreationDate' => 'D:20111006161347+02\'00\'',
          'Producer' => 'LibreOffice 3.3',
          'Creator' => 'Writer',
          'Author' => 'Marcos Rebelo',
          'Title' => 'my title',
          'Keywords' => 'my keywords'
};
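
If you would rather clean the hash in place than work on a copy, the same idea can be wrapped in a small sub. This is only a sketch: the name strip_replacement_chars is made up for illustration, and it assumes the values are plain (already decoded) strings rather than nested structures. It leaves out the encoding pragma so that it runs on its own; the tr/// itself does not depend on it.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from any module): delete every U+FFFD
# REPLACEMENT CHARACTER from the values of the hash passed by reference.
sub strip_replacement_chars {
    my ($hash) = @_;
    # In a foreach loop, values() yields aliases to the hash values,
    # so changing $value changes the hash itself.
    for my $value (values %$hash) {
        next if !defined $value or ref $value;   # skip undef and nested data
        $value =~ tr/\x{FFFD}//d;                # /d deletes the matched character
    }
    return $hash;
}

my %pdf_hash = (
    'Title'        => "\x{fffd}\x{fffd}my title",
    'Author'       => "\x{fffd}\x{fffd}Marcos Rebelo",
    'CreationDate' => "D:20111006161347+02'00'",
);

strip_replacement_chars(\%pdf_hash);
print "$_ => $pdf_hash{$_}\n" for sort keys %pdf_hash;
```

Note that this only hides the symptom. The two replacement characters at the start of each string suggest that something was decoded with the wrong encoding, so if other odd characters turn up, the decoding step is the place to look.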
