On Sep 22, 2011, at 11:57 AM, Martin J. Evans wrote:

> ok except what the oracle client libraries accept does not match with Encode
> accepted strings so someone would have to come up with some sort of mapping
> between the two.

Yes. That's one of the consequences of providing a single interface to
multiple databases.

>>> and what about when it conflicts with your locale/LANG?
>> So what?
> I'm not so sure this is a "So what" as Perl itself uses locale settings in
> some cases - just thought it needed mentioning for consideration.

I'm not really concerned about locales at this point. I tend to leave
collation, for example, up to the database. Right now I'm strictly concerned
about encoding.

>>> and what about PERL_UNICODE flags, do they come into this?
>> What are those?
> See http://perldoc.perl.org/perlrun.html
>
> In particular "UTF-8 is the default PerlIO layer for input streams" of which
> reading data from a database could be considered one?

That'd be cool, but it's not currently implemented that way, obviously. DBI
and PerlIO are completely independent AFAIK, and the DBI doesn't look like a
file handle.

> ok, I'm thinking through the ramifications of this.
>
> To add to the list I see DBD::SQLite has |sqlite_unicode| "strings coming
> from the database and passed to the collation function will be properly
> tagged with the utf8 flag; but this only works if the |sqlite_unicode|
> attribute is set before the first call to a perl collation sequence" and "The
> current FTS3 implementation in SQLite is far from complete with respect to
> utf8 handling : in particular, variable-length characters are not treated
> correctly by the builtin functions |offsets()| and |snippet()|."
>
> and DBD::CSV has
>
> f_encoding => "utf8",
>
> DBD::mysql has mysql_enable_utf8 which apparently "This attribute determines
> whether DBD::mysql should assume strings stored in the database are utf8.
> This feature defaults to off."
>
> I could not find any special flags for DBD::DB2.
>
> DBD::Sybase has syb_enable_utf8 "If this attribute is set then DBD::Sybase
> will convert UNIVARCHAR, UNICHAR, and UNITEXT data to Perl's internal utf-8
> encoding when they are retrieved. Updating a unicode column will cause Sybase
> to convert any incoming data from utf-8 to its internal utf-16 encoding."

Yeah, so I think that can be generalized.

Best,

David