At 5:14 pm +0200 2/2/04, ALexander N. Treyner wrote:
Hello All,
I'm using a UTF-8 Postgres database, where I store strings in many languages.
I have to match the database against strings encoded in MIME base64 or quoted-printable format, like these:
=?utf-8?B?15TXoNeUINee16nXlNeZINeR16LXkdeo15nXqi4=?=
or
=?KOI8-R?Q?=F0=D2=C9=D7=C5=D4=2C_=ED=C9=D2!!!?=
I think I first need to convert these strings to UTF-8, but I cannot find out how to do it.
The script below will do it in the two cases you mention, though I think you would need to elaborate the regular expression; I've taken it only to the point where it copes with your examples. In this case Encode accepts both 'utf-8' and 'KOI8-R' as they appear in the encoded words, not just its own names 'utf8' and 'koi8-r', so I think a reading of perldoc Encode will reveal that dashes and case are properly interpreted in most cases.
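use strict;
use warnings;
use Encode;
use MIME::Base64;
use MIME::QuotedPrint;

my $string;
$_ = <<_;
=?utf-8?B?15TXoNeUINee16nXlNeZINeR16LXkdeo15nXqi4=?=
_
# Split the encoded word into charset, encoding (B or Q) and payload.
/=\?(.+)\?([BQ])\?(.+)\?=/ or die "No encoded word found\n";
my ( $charset, $encoding, $_7bit ) = ( $1, $2, $3 );
if ( $encoding eq 'B' ) { $string = decode_base64 $_7bit }
if ( $encoding eq 'Q' ) {
$_7bit =~ s/_/=20/g;    # in Q encoding "_" stands for a space
$string = decode_qp $_7bit;
}
# Convert the raw bytes in place from the declared charset to UTF-8.
defined Encode::from_to( $string, $charset, "utf8" )
or die "Cannot convert from $charset\n";
print $string;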
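If a string can contain several encoded words, or plain text around them, the same idea can be applied with a global substitution. What follows is only a minimal sketch under a few assumptions: the helper name decode_words is made up for illustration, the regular expression is still a simplification of the real syntax, and the whitespace that should be dropped between adjacent encoded words is left alone.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use MIME::Base64 qw(decode_base64);
use MIME::QuotedPrint qw(decode_qp);

# Hypothetical helper: replace every encoded word in $text with its
# UTF-8 bytes and leave everything else untouched.
sub decode_words {
    my ($text) = @_;
    $text =~ s{=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=}{
        # Copy the captures before calling anything else.
        my ( $charset, $encoding, $payload ) = ( $1, uc $2, $3 );
        my $bytes;
        if ( $encoding eq 'B' ) {
            $bytes = decode_base64($payload);
        }
        else {
            # In Q encoding "_" stands for a space; rewrite it as =20
            # so that decode_qp treats it like any other escaped byte.
            ( my $qp = $payload ) =~ s/_/=20/g;
            $bytes = decode_qp($qp);
        }
        # Bytes in the declared charset -> characters -> UTF-8 bytes.
        # decode() dies if Encode does not know the charset.
        encode( 'UTF-8', decode( $charset, $bytes ) );
    }ge;
    return $text;
}

# Usage with the two examples from the question:
print decode_words('=?utf-8?B?15TXoNeUINee16nXlNeZINeR16LXkdeo15nXqi4=?='), "\n";
print decode_words('=?KOI8-R?Q?=F0=D2=C9=D7=C5=D4=2C_=ED=C9=D2!!!?='), "\n";
```

Encode also ships an encoding named 'MIME-Header', so decode( 'MIME-Header', $header ) may be worth trying before elaborating the regular expression any further; check perldoc Encode::MIME::Header for what your installed version supports.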