[EMAIL PROTECTED] said:
I need to convert UTF-8 strings obtained from a MySQL database
into a file format to be uploaded to specific hardware (specifically
GPS units). Some of these formats may only allow unaccented characters, so I
need a way to convert accented characters into their respective base
characters, e.g. an accented Unicode 'o' into ASCII 'o', an accented 'a' into 'a', and so forth.
Is there an easy way to do this in Perl?
There's a prior thread on this list about this very topic:
http://www.mail-archive.com/perl-unicode@perl.org/msg02000.html
Also, I've posted a couple different approaches on www.perlmonks.org --
here's my favorite:
#!/usr/bin/perl -CDS
use strict;
require 5.008;
# split the character-name table from unicore/Name.pl into lines,
# keeping only the Latin letters
my @charnames = grep /\tLATIN \S+ LETTER/, split( /^/, do 'unicore/Name.pl' );
my %accents;
for my $c ( split //, qq/AEIOUCNYaeioucny/ ) {
    my $case = ( $c eq lc $c ) ? 'SMALL' : 'CAPITAL';
    # each matching line starts with a 4-digit hex code point;
    # collect every "LETTER X WITH ..." variant of this base letter
    $accents{$c} =
        join( '', map { chr hex( substr $_, 0, 4 ) }
            grep /\tLATIN $case LETTER \U$c WITH/, @charnames );
}
# now use each element of %accents as a character class:
while (<>) {
    for my $c ( keys %accents ) {
        s/[$accents{$c}]/$c/g;
    }
    print;
}
__END__
Another way would be to simply hard-code a set of tr/..././ steps, one
for each lower-case and upper-case unaccented letter (placed on the right),
with all its accented variants on the left. Tedious to code, but very fast
at run-time.
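To give an idea of what the tr/// route might look like, here is a minimal sketch. It assumes only a handful of Latin-1 accented letters matter; the sub name strip_accents is made up for this example, and the character lists are nowhere near complete. It relies on tr/// repeating the last character of a short replacement list, so each step needs only the one base letter on the right.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the script above): maps a small,
# incomplete set of Latin-1 accented letters to their base letters.
# Code points are written as \x{..} escapes so the source stays ASCII.
sub strip_accents {
    my ($s) = @_;
    $s =~ tr/\x{E0}-\x{E5}/a/;          # a with grave .. a with ring
    $s =~ tr/\x{C0}-\x{C5}/A/;
    $s =~ tr/\x{E8}-\x{EB}/e/;          # e with grave .. e with diaeresis
    $s =~ tr/\x{C8}-\x{CB}/E/;
    $s =~ tr/\x{EC}-\x{EF}/i/;
    $s =~ tr/\x{CC}-\x{CF}/I/;
    $s =~ tr/\x{F2}-\x{F6}\x{F8}/o/;    # includes o with stroke
    $s =~ tr/\x{D2}-\x{D6}\x{D8}/O/;
    $s =~ tr/\x{F9}-\x{FC}/u/;
    $s =~ tr/\x{D9}-\x{DC}/U/;
    $s =~ tr/\x{E7}/c/;                 # c with cedilla
    $s =~ tr/\x{C7}/C/;
    $s =~ tr/\x{F1}/n/;                 # n with tilde
    $s =~ tr/\x{D1}/N/;
    $s =~ tr/\x{FD}\x{FF}/y/;           # y with acute, y with diaeresis
    $s =~ tr/\x{DD}/Y/;
    return $s;
}

print strip_accents("caf\x{E9}"), "\n";    # prints "cafe"
```

Anything outside these lists passes through unchanged, so for a real target format you would still need to extend the left-hand sides (or fall back to the %accents approach above) to cover Latin Extended characters.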
Dave Graff