Re: Add Unicode Support to the DBI

Jonathan Leffler Tue, 04 Oct 2011 16:07:33 -0700

On Tue, Oct 4, 2011 at 15:24, Martin J. Evans <martin.ev...@easysoft.com> wrote:


> On 04/10/2011 22:38, Tim Bunce wrote:
>
>> I've not had time to devote to this thread. Sorry.
>>
>> I'd be grateful if someone could post a summary of it if/when it
>> approaches some kind of consensus.
>>
> I don't think there is a "kind of consensus" right now (although there has been
> some useful discussion which will probably bear fruit) and I'd prefer to work out
> what unicode support already exists and how it is implemented first. For
> instance, Pg is very focussed on UTF-8 (as are most DBDs) and yet ODBC uses
> UCS2 under the hood and CSV can use anything you like. Greg/David/Postgres
> seem to have an immediate problem with unicode support in Postgres and I can
> imagine they are keen to resolve it and I'd suggest they do it now in the
> most appropriate way for DBD::Pg. I don't see why this should necessarily
> impact on any discussion as to what DBI should_do/should_say as already the
> DBDs which support unicode mostly do it in different ways.
>
> I've started gathering together details of what unicode support there is in
> DBDs, how it is implemented and what special flags there are to support it.
> However, this is a massive task. So far I've done ODBC, Oracle, CSV, Unify,
> mysql, SQLite, Firebird and sort of held off on Pg as I knew Greg was
> working on it. Some might disagree but DB2 is a main one I no longer have
> access to (please contact me if you use DBD::DB2 and are prepared to spare
> half an hour or so to modify examples I have which verify unicode support).
> Of course, if you use another DBD and can send me info on unicode support
> I'd love to hear from you.
>
> I thought the whole issue was an interesting topic and I had toyed with
> doing a talk for LPW but to be honest, it is already taking a lot of time
> and I have personal issues right now (and of course my $work) which mean my
> time is severely limited so I'm doubtful right now if I could have it ready
> in time as a talk. I might just post what I have gathered in a week's time in
> the hope I get a little more input in the meantime.



DBD::Informix has had a couple of UTF-8 patches sent and one has been
applied to the code.  The other arrived this morning and has to be
compared.  The attribute names chosen are different, but both contain 'ix_'
as a prefix and UTF-8 in some form or another.

What I'm not sure about is how to test the code.  Creating an Informix
database that has UTF-8 data in it is trivial (well, nearly trivial).  The
difficulty is demonstrating where there were problems before and that the
problems are gone after.

If anyone has suggestions for how to show that UTF-8 is working properly - in
the form of a fairly simple test case - I'd be very grateful to receive it
as guidance.
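
One possible shape for such a test is a round trip that compares character
and byte counts on both sides of the connection. The sketch below is only an
illustration of that idea: it is written in Python against an in-memory SQLite
database as a stand-in, not against DBI or DBD::Informix, and the helper name
round_trip, the table t and the sample string are all hypothetical. The same
three checks (value unchanged, character count agrees, byte count agrees)
would presumably carry over to a driver-specific test.

```python
import sqlite3


def round_trip(text):
    """Insert text, read it back, and report what the database stored.

    Returns (fetched_text, chars_seen_by_db, bytes_seen_by_db).
    """
    conn = sqlite3.connect(":memory:")  # stand-in database, UTF-8 by default
    try:
        conn.execute("CREATE TABLE t (s TEXT)")
        conn.execute("INSERT INTO t (s) VALUES (?)", (text,))
        # length(TEXT) counts characters; length(BLOB) counts bytes.
        row = conn.execute(
            "SELECT s, length(s), length(CAST(s AS BLOB)) FROM t"
        ).fetchone()
    finally:
        conn.close()
    return row


# Sample with one 2-byte character (i-diaeresis) and one 3-byte character (euro sign).
sample = "na\u00efve \u20ac5"
fetched, db_chars, db_bytes = round_trip(sample)

print("unchanged after round trip:", fetched == sample)
print("characters (client, db):", len(sample), db_chars)
print("bytes (client, db):", len(sample.encode("utf-8")), db_bytes)
```

If the data were being double-encoded or returned as raw bytes somewhere along
the way, the character counts would be expected to disagree even when the
value looks plausible on a terminal, which is why the sketch checks lengths
as well as equality.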

So, DBD::Informix is endeavouring to move forward, knowing that the
underlying database layers (ESQL/C on the client, and the Informix data
server) handle UTF-8 OK, so any problems are in persuading Perl (and perhaps
DBI) to handle it appropriately too.

[...Did I hear a chorus of "nice theory - shame about the practice"?...]

-- 
Jonathan Leffler <jonathan.leff...@gmail.com>  #include <disclaimer.h>
Guardian of DBD::Informix - v2011.0612 - http://dbi.perl.org
"Blessed are we who can laugh at ourselves, for we shall never cease to be
amused."
