Re: Add Unicode Support to the DBI

2011-10-13 Thread Greg Sabino Mullane


David E. Wheeler wrote:
 I think what I haven't said is that we should just use the same 
 names that Perl I/O uses. Er, well, for the :raw and :utf8 
 varieties I was, anyway. Perhaps we should adopt it wholesale, 
 so you'd use :encoding(UTF-8) instead of UTF-8.

That's pretty ugly. I don't think we need to adopt the I/O 
convention, as there is no direct mapping anyway, it just 
confuses the issue.

 For DBD::Pg, at least, if client-encoding is set to Big5, then 
 you *have* to encode to send it to the database. Or change the 
 client encoding, of course.

Not sure I'm following this completely. Or rather, why this should 
be the DBD's role.

 How would one map things - just demand that 
 whatever is given must be a literal encoding the particular database 
 can understand?

 I think we should standardize on the Perl IO names for these things. 
 Some databases may not support them all, of course.

Hm... I don't know enough about the various DBs' encodings to see 
how good an idea that is.

 So the above means these two actually behave very differently:
 
 $dbh->{encoding} = ':utf8';
 
 $dbh->{encoding} = 'utf8';
 
 Could be a little confusing, no? Methinks we need some long ugly name, maybe 
 even worse than perl_native. Perhaps perl_internal_utf8_flag? 1/2 :)

 No, I think just encoding, and utf8 would be invalid, 
 but :encoding(UTF-8) would not.

Again, ugh. Although a *little* less confusing when contrasting:

$dbh->{encoding} = ':encoding(utf-8)';

$dbh->{encoding} = 'utf8';

 Well, I think we might have to have it with the pg_ prefix until 
 this stuff is finalized here. Not sure, though.

That's my point - if we can get it finalized here, we can avoid the 
pg_ prefix entirely, rather than add it now and then deprecate it later.

-- 
Greg Sabino Mullane g...@turnstep.com
PGP Key: 0x14964AC8 201110130902
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8




Re: Add Unicode Support to the DBI

2011-10-13 Thread David E. Wheeler
On Oct 13, 2011, at 6:03 AM, Greg Sabino Mullane wrote:

 I think what I haven't said is that we should just use the same 
 names that Perl I/O uses. Er, well, for the :raw and :utf8 
 varieties I was, anyway. Perhaps we should adopt it wholesale, 
 so you'd use :encoding(UTF-8) instead of UTF-8.
 
 That's pretty ugly. I don't think we need to adopt the I/O 
 convention, as there is no direct mapping anyway, it just 
 confuses the issue.

Sure. In that case, I'd say :utf8, :raw, or $encoding.

 For DBD::Pg, at least, if client-encoding is set to Big5, then 
 you *have* to encode to send it to the database. Or change the 
 client encoding, of course.
 
 Not sure I'm following this completely. Or rather, why this should 
 be the DBD's role.

By default, yes, the DBD should DTRT here. But I think there also ought to be a 
way to tell it what to do.
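
To illustrate the Big5 point, here is a rough sketch of the round trip a driver would have to do on the caller's behalf. It uses only the core Encode module and no database connection; the $client_encoding variable is a stand-in I made up for whatever the driver learns from the server, not an existing DBI or DBD::Pg attribute.

```perl
use strict;
use warnings;
use Encode ();

# Assumption: the connection's client encoding is Big5, so the server
# expects Big5 bytes rather than Perl's internal string form.
my $client_encoding = 'Big5';

my $text  = "\x{4e2d}\x{6587}";                        # two CJK characters
my $bytes = Encode::encode($client_encoding, $text);   # what would go on the wire
my $back  = Encode::decode($client_encoding, $bytes);  # what a fetch would hand back
```

If the DBD does this by default, the caller never sees $bytes; the open question is only how to override the choice of encoding.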

 How would one map things - just demand that 
 whatever is given must be a literal encoding the particular database 
 can understand?
 
 I think we should standardize on the Perl IO names for these things. 
 Some databases may not support them all, of course.
 
 Hm... I don't know enough about the various DBs' encodings to see 
 how good an idea that is.

I assume that it's all over the map, so we should be as general as we can. 
Specifying an encoding by name should cover everything.

 No, I think just encoding, and utf8 would be invalid, 
 but :encoding(UTF-8) would not.
 
 Again, ugh. Although a *little* less confusing when contrasting:
 
 $dbh->{encoding} = ':encoding(utf-8)';
 
 $dbh->{encoding} = 'utf8';

Yeah, or we can go with my original suggestion:

$dbh->{encoding} = 'UTF-8';
$dbh->{encoding} = ':utf8';
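
To make the three spellings concrete, here is a minimal sketch of how a driver might act on such a setting when sending data. The encode_for_db helper is hypothetical and exists in neither DBI nor any DBD; it only uses Encode and utf8:: calls that ship with Perl, and it assumes ':raw' means "send the caller's bytes untouched".

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not part of DBI or any DBD: turn a Perl string into
# the bytes a driver would send, based on a proposed {encoding} setting.
sub encode_for_db {
    my ($setting, $string) = @_;

    # ':raw': pass the caller's bytes through untouched.
    return $string if $setting eq ':raw';

    if ($setting eq ':utf8') {
        # ':utf8': Perl's lax internal form, with no strict validation.
        my $bytes = $string;
        utf8::encode($bytes);
        return $bytes;
    }

    # Anything else is treated as an encoding name that Encode knows.
    my $enc = Encode::find_encoding($setting)
        or die "Unknown encoding '$setting'\n";

    # Die on characters the target encoding cannot represent.
    return $enc->encode($string, Encode::FB_CROAK() | Encode::LEAVE_SRC());
}

my $text = "caf\x{e9}";                      # 4 characters
my $utf8 = encode_for_db('UTF-8', $text);    # strict UTF-8 bytes
my $lax  = encode_for_db(':utf8', $text);    # same bytes for valid text
my $raw  = encode_for_db(':raw',  "abc");    # unchanged
```

For well-formed text 'UTF-8' and ':utf8' produce identical bytes; they differ only in how strictly invalid data is treated, which is why the similar names are easy to confuse.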

 Well, I think we might have to have it with the pg_ prefix until 
 this stuff is finalized here. Not sure, though.
 
 That's my point - if we can get it finalized here, we can avoid the 
 pg_ prefix entirely, rather than add it now and then deprecate it later.

Sure. I suspect this is going to take a while, though.

Best,

David