Martin J. Evans napsal(a):
> On 31-Aug-2006 Daniel Zeman wrote:
>> Hi,
>> I wonder if anyone can help me.
>>
>> I am using
>> - Debian Linux
>> - Perl 5.8.8
>> - DBI (I do not know how to figure out its version)
>
> perl -MDBI -e 'print "$DBI::VERSION\n";'

1.51

>
>> - MySQL 5.0.22-Debian_3-log
>
> What DBD::mysql version are you using?
>
> perl -M'DBD::mysql' -e 'print "$DBD::mysql::VERSION\n";'

3.0004

>
>> I want to store and handle data in UTF-8 but so far I have not been able
>> to force Perl/DBI to do so.
>>
>> I have created a table using
>>
>>      my $sql = "CREATE TABLE $tbl (".join(", ", @columns).") CHARACTER
>> SET utf8 COLLATE utf8_czech_ci;";
>>      $dbh->do($sql);
>>
>> I feeded the table with data using
>>
>>      my $list_of_columns = join(", ", @names);
>> my $list_of_values = join(", ", map{"_utf8'$record->{$_}'"}(@names));
>>      my $sql = "INSERT INTO $tbl ($list_of_columns) VALUES
>> ($list_of_values);";
>>      $dbh->do($sql);
>
> You would be better using bound parameters as then only the data can be utf-8.
>
>> I have looked into the database using phpMyAdmin 2.8.2-Debian-0.1 and it
>> really looked like the data were stored in correct UTF-8.
>>
>> However, when I retrieve the data from Perl/DBI, something in the chain
>> (MySQL? the driver? DBI?) decides that another encoding (probably,
>> Latin1) would be better for me. It "converts" the strings from UTF-8 to
>> that encoding, which means, at the time the data arrives in my Perl
>> code, all the non-Latin1 characters have already been irrecoverably
>> converted to question marks. I would be happy to decode the data myself
>> but there is nothing I can do with the question marks.
>>
>> I am using the following code to retrieve the data:
>>
>>      my $sql = "SELECT kod, hry.nazev FROM hry INNER JOIN prodej ON
>> hry.kod = prodej.kod_hry GROUP BY kod, hry.nazev";
>>      my $sqlobj = $dbh->prepare($sql);
>>      $sqlobj->execute();
>>      while(my ($kod, $nazev) = $sqlobj->fetchrow_array())
>>      {
>>          ...
>>      }
>>
>> So far, the only workaround I have, is not to tell the DBI the data is
>> UTF-8 when I am inserting it (i.e., drop the "_utf8" part before the
>> single quote), and use Encode; decode("utf8", ...) on anything I fetch
>> from the database. This way, the database never knows the data was a
>> UTF-8 text, treats the bytes as Latin1 characters and returns them
>> undisturbed. However, I cannot access the data using phpMyAdmin (unless
>> I en/decode UTF in my brain), the string lengths do not reflect the
>> reality etc.
>>
>> Is there a better way to do it? I think there must be some small stupid
>> locale-like setting telling the machine that I am a UTF guy. But the
>> settings I was able to come up with did not help and I actually have no
>> idea which part of the MySQL-driver-DBI-Perl chain is responsible.
>>
>> Any hints are welcome.
>> Thanks
>>
>> Dan
>
> I posted a small patch (in dbi-dev a couple of months ago) to a recent(ish) > DBD::mysql which allows you to retrieve utf-8 data marked as utf-8 in perl. If
> is not a complete solution but works for most things.
>
> See
> http://www.nntp.perl.org/group/perl.dbi.dev/4548
>
> It may have been included in the 3.0006_1 development release of DBD::mysql.
>
> You also need to look up "set character set utf8" and "set names utf8" in mysql
> documentation.

This is it! Thanks a lot! Just doing

$dbh->prepare("SET NAMES 'utf8'")->execute();

after connecting solved my problem. It seems that I am getting the strings with the utf8 flag on so hopefully I can live without your patch (or someone has already patched the system I am on) - but I am going to store the link for future reference. Thanks!

Dan

Reply via email to