-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> When reading file names with e.g. Umlauts from a directory, either via
> readdir() or glob() and storing them in a db these strings are not
> correctly returned from the DB. This does not appear when the strings are
> ordinary Perl Strings.

I'm pretty sure this is because of a known problem with Perl: it doesn't
treat the results of glob() and the like as utf-8 when it should. To
illustrate, I modified the original script a bit and added:

    use utf8;
    use Data::Peek;

Then I took a look at the same-named file, both provided directly in the
script and returned from the glob. Note the difference via DDump($file):

    SV = PV(0x8c0f0e8) at 0x8cd6430
      REFCNT = 2
      FLAGS = (POK,pPOK,UTF8)
      PV = 0x8cd9bf8 "./files/K\303\266ln"\0 [UTF8 "./files/K\x{f6}ln"]
      CUR = 14
      LEN = 16

    SV = PV(0x8c0f0d0) at 0x8cd62c8
      REFCNT = 2
      FLAGS = (POK,pPOK)
      PV = 0x8cd9c38 "./files/K\303\266ln"\0
      CUR = 13
      LEN = 16

The first one, which Perl recognizes as a UTF8 string, goes into and comes
out of the database just fine. The second (via glob) does not: its UTF8
flag is off, so Perl treats it as a plain byte string.

Ideally Perl would be smart enough to set UTF8 on for such filenames, but
it does not. I'm not sure there is anything DBD::Pg could sensibly do. One
solution to the problem at hand may be to simply upgrade the string
yourself before handing it off to the database, like so:

    utf8::upgrade($file);

- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201308292202
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAlIf/VcACgkQvJuQZxSWSshmSQCg7//0IBH3+GeBtmM6PHIRw9qO
F6IAnA0ylRdrgh8xplMwNTn3h+Iqvi7J
=yPxj
-----END PGP SIGNATURE-----
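
For anyone who wants to see the flag difference described in the message
without installing Data::Peek or touching a database, here is a minimal
sketch using only core Perl. It is an illustration under assumptions, not
the original script: the hard-coded octet string stands in for what glob()
or readdir() would return, and the variable names are made up. It compares
utf8::upgrade (the call suggested in the message) with utf8::decode, a
related core function, by looking only at the flag and the character count.

```perl
use strict;
use warnings;

# Assumption: stand-in for a glob()/readdir() result, i.e. the file name
# as raw UTF-8 octets with the UTF8 flag off.
my $octets = "./files/K\xc3\xb6ln";

# utf8::upgrade switches the internal representation and turns the flag on,
# but the characters are unchanged: the two octets remain two characters.
my $upgraded = $octets;
utf8::upgrade($upgraded);

# utf8::decode interprets the octets as UTF-8, so the two octets
# become the single character U+00F6.
my $decoded = $octets;
utf8::decode($decoded);

for my $pair ( [ octets => $octets ], [ upgraded => $upgraded ], [ decoded => $decoded ] ) {
    my ( $label, $string ) = @$pair;
    printf "%-9s UTF8 flag=%d length=%d\n",
        $label, ( utf8::is_utf8($string) ? 1 : 0 ), length $string;
}
# octets    UTF8 flag=0 length=13
# upgraded  UTF8 flag=1 length=13
# decoded   UTF8 flag=1 length=12
```

Which of the two calls is appropriate depends on how the driver and the
client encoding are configured, so treat this only as a way to inspect
what each one does to the string itself.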