On Wed, Mar 24, 2004 at 10:08:56AM +0100, Kristian Nielsen wrote:
> "Andy Hassall" <[EMAIL PROTECTED]> writes:
> 
> >     1b. If a Perl string with the utf8 flag is bound to a statement, it
> > is bound as UTF8 rather than the client character set. Otherwise it is bound
> > as normal (in the client character set).
> 
> Please do not do this. I will try to explain why.

> In Perl, the utf8 flag shouldn't carry any semantics, it should be
> purely a matter of internal representation of the string.

Agreed. But it is exactly that internal representation that's being
passed to the database API. It seems entirely obvious to me that
when passing into the database API a char* pointer to a string that
is utf8 encoded we should tell the database API that it is utf8 encoded.

(The real problems surround what to do when the string isn't utf8
or when it is but the database API can't be told that.)
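To make the representation question concrete, here is a minimal sketch (my illustration, not actual DBD driver code; `bind_param_sketch` is a hypothetical name) of a driver deciding how to bind a value based on the scalar's internal utf8 flag, using the core `utf8::is_utf8()` predicate:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical illustration of the binding decision being discussed:
# return how the value would be described to the database API, plus
# the octets actually handed over.
sub bind_param_sketch {
    my ($value) = @_;
    if (utf8::is_utf8($value)) {
        # Internal representation is utf8, so tell the database API that,
        # rather than pretending the bytes are in the client character set.
        return ('UTF8', encode('UTF-8', $value));
    }
    # Plain byte string: pass through as-is in the client character set.
    return ('CLIENT_CHARSET', $value);
}

my ($how, $octets) = bind_param_sketch("\x{263A}");  # wide char forces utf8 flag
```

The sketch also shows why relying on the flag alone is fragile: the same logical string "abc" can arrive with the flag on or off depending on its history, yet a sensible driver should bind it identically in both cases.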

> I haven't followed the discussion closely, but I believe the core of the
> problem is that some (old?) code may bind strings as sequences of bytes
> in the database character set. Whereas other (new?) code binds strings
> as sequences of unicode characters. As far as I can see, there is no way
> for DBI to reliably distinguish between these two situations; the user
> will have to tell it one way or the other (whether by handle attribute,
> bind option, or defaults based on environment/database config).

The important thing in the long run is to get the default behaviour
"right" (after agreeing what "right" means :). If that involves
incompatible changes then so be it.

Tim.
