Hi, Steve,

[...]

> The problem is: How do I trap all input/output to/from DBI to do these 
> conversions?

[...]

> I've asked about this on the dbi-users mailing list, and the answer 
> (from Tim Bunce, no less) was that it is really the responsibility of 
> the DBD driver to perform such conversions if the data in question is UTF-8.

After letting my thoughts settle, I have come to the conclusion that I
do not agree completely. I think that DBI should do 80% of the job and
leave about 20% to the driver authors.

It is true that the driver author may know how to detect the
encoding of columns in the database. However, this is *only* a
possibility, because doing a "SHOW COLUMNS FROM mytable" with every
prepare statement is not an option, IMO. There has to be at least a
way to say "this column is encoded in ISO-8859-1, but please be so
kind as to convert it into the UTF-8 that we use in Perl nowadays".
Whether or not MySQL 4.1 returns such a flag doesn't matter: the
driver still has to work with older versions, where encoding matters
just as much.

IMO a good approach would go as follows:

- DBI has to know about encodings. It is the task of the generic
  layer to decide on the character encoding of the input data, i.e.
  SQL statements and placeholder values.

  In short: DBI must decide "What is input data, and how is it
  encoded?"

- DBI should also know about the *desired* encoding of the input data,
  i.e. the one the database expects. Of course, it would be nice if
  driver authors could provide a way to determine this value
  automatically. However, IMO the DBI user should be able to override
  it. Also, we should not forget that people may use binary columns to
  store text data, in which case the driver author will never be able
  to give suitable information.

- Based on the above facts, DBI would be able to decide that "we need
  to convert from encoding1 to encoding2". If they are the same, DBI
  could skip the conversion. Otherwise DBI could invoke a method that
  does the conversion. That method ought to be overridable by the
  driver author, and it ought to be implementable in Perl or C.

- Likewise for the output.
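
To make the list above a bit more concrete, here is a rough sketch of
the decision logic. It is written in Python only to keep it short and
self-contained. It is not DBI code, and every name in it (GenericLayer,
set_column_encoding, bind_value, fetch_value, convert) is hypothetical,
made up for illustration. It also assumes that values travel as raw
bytes and that encodings are identified by name.

```python
import codecs


def same_encoding(a, b):
    """Compare two encoding names after normalising aliases."""
    return codecs.lookup(a).name == codecs.lookup(b).name


class GenericLayer:
    """Hypothetical stand-in for the generic (DBI-like) layer."""

    def __init__(self, input_encoding="utf-8", driver_encodings=None):
        # Encoding of the data as the application sees it.
        self.input_encoding = input_encoding
        # Per-column encodings the driver was able to detect (may be empty).
        self.driver_encodings = dict(driver_encodings or {})
        # Per-column encodings set by the user; these win over the driver.
        self.user_overrides = {}

    def set_column_encoding(self, column, encoding):
        """Let the user override whatever the driver reports."""
        self.user_overrides[column] = encoding

    def desired_encoding(self, column):
        """User override first, then driver information, else no change."""
        if column in self.user_overrides:
            return self.user_overrides[column]
        return self.driver_encodings.get(column, self.input_encoding)

    def convert(self, data, source, target):
        """Default conversion hook; a driver subclass may replace it."""
        return data.decode(source).encode(target)

    def bind_value(self, column, data):
        """Input direction: application encoding -> column encoding."""
        target = self.desired_encoding(column)
        if same_encoding(self.input_encoding, target):
            return data  # nothing to do, skip the conversion
        return self.convert(data, self.input_encoding, target)

    def fetch_value(self, column, data):
        """Output direction: column encoding -> application encoding."""
        source = self.desired_encoding(column)
        if same_encoding(source, self.input_encoding):
            return data
        return self.convert(data, source, self.input_encoding)
```

The point of the sketch is the split of responsibilities: the generic
layer decides *whether* to convert, while the driver only supplies
column information and, if it wants, its own convert method. A user
who stores text in a binary column would call set_column_encoding for
that column, since the driver cannot know what is in there.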


In short: Let DBI decide when to convert. And give it suitable hooks
that allow the driver authors to provide information.


Jochen
