Re: Add Unicode Support to the DBI
David E. Wheeler wrote:

> I think what I haven't said is that we should just use the same names
> that Perl I/O uses. Er, well, for the :raw and :utf8 varieties I was,
> anyway. Perhaps we should adopt it wholesale, so you'd use
> :encoding(UTF-8) instead of UTF-8.

That's pretty ugly. I don't think we need to adopt the I/O convention: there is no direct mapping anyway, so it just confuses the issue.

> For DBD::Pg, at least, if client-encoding is set to Big5, then you
> *have* to encode to send it to the database. Or change the client
> encoding, of course.

Not sure I'm following this completely. Or rather, why this should be the DBD's role.

>> How would one map things - just demand that whatever is given must be
>> a literal encoding the particular database can understand?
>
> I think we should standardize on the Perl IO names for these things.
> Some databases may not support them all, of course.

Hm... I don't know enough about the various DBs' encodings to see how good an idea that is.

>> So the above means these two actually behave very differently:
>>
>>     $dbh->{encoding} = ':utf8';
>>     $dbh->{encoding} = 'utf8';
>>
>> Could be a little confusing, no? Methinks we need some long ugly name,
>> maybe even worse than perl_native. Perhaps perl_internal_utf8_flag? 1/2 :)
>
> No, I think just encoding, and utf8 would be invalid, but
> :encoding(UTF-8) would not.

Again, ugh. Although a *little* less confusing when contrasting:

    $dbh->{encoding} = ':encoding(utf-8)';
    $dbh->{encoding} = 'utf8';

> Well, I think we might have to have it with the pg_ prefix until this
> stuff is finalized here. Not sure, though.

That's my point - if we can get it finalized here, we can avoid the pg_ prefix entirely, rather than add it now and then deprecate it later.
--
Greg Sabino Mullane g...@turnstep.com
PGP Key: 0x14964AC8 201110130902
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Re: Add Unicode Support to the DBI
On Oct 13, 2011, at 6:03 AM, Greg Sabino Mullane wrote:

>> I think what I haven't said is that we should just use the same names
>> that Perl I/O uses. Er, well, for the :raw and :utf8 varieties I was,
>> anyway. Perhaps we should adopt it wholesale, so you'd use
>> :encoding(UTF-8) instead of UTF-8.
>
> That's pretty ugly. I don't think we need to adopt the I/O convention,
> as there is no direct mapping anyway, it just confuses the issue.

Sure. In that case, I'd say :utf8, :raw, or $encoding.

>> For DBD::Pg, at least, if client-encoding is set to Big5, then you
>> *have* to encode to send it to the database. Or change the client
>> encoding, of course.
>
> Not sure I'm following this completely. Or rather, why this should be
> the DBD's role.

By default, yes, the DBD should DTRT here. But I think there also ought to be a way to tell it what to do.

>>> How would one map things - just demand that whatever is given must be
>>> a literal encoding the particular database can understand?
>>
>> I think we should standardize on the Perl IO names for these things.
>> Some databases may not support them all, of course.
>
> Hm... I don't know enough about the various DBs' encodings to see how
> good an idea that is.

I assume that it's all over the map, so we should be as general as we can. Specifying an encoding by name should cover everything.

>> No, I think just encoding, and utf8 would be invalid, but
>> :encoding(UTF-8) would not.
>
> Again, ugh. Although a *little* less confusing when contrasting:
>
>     $dbh->{encoding} = ':encoding(utf-8)';
>     $dbh->{encoding} = 'utf8';

Yeah, or we can go with my original suggestion:

    $dbh->{encoding} = 'UTF-8';
    $dbh->{encoding} = ':utf8';

>> Well, I think we might have to have it with the pg_ prefix until this
>> stuff is finalized here. Not sure, though.
>
> That's my point - if we can get it finalized here, we can avoid the
> pg_ prefix entirely, rather than add it now and then deprecate it later.

Sure. I suspect this is going to take a while, though.

Best,

David
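
For illustration only, the three kinds of value floated in this thread (:raw, :utf8, or an encoding name such as 'UTF-8' or 'Big5') could be mapped onto Perl's existing tools roughly as below. This is a sketch under assumptions, not anything the DBI or DBD::Pg provides: resolve_encoding is a hypothetical helper, no such "encoding" attribute has been agreed on, and treating :utf8 as the lax utf8::encode/utf8::decode pair while named encodings go through strict Encode objects is just one reading of the proposal.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not part of the DBI or any DBD: maps a proposed
# "encoding" value to a pair of encode/decode coderefs.
sub resolve_encoding {
    my ($spec) = @_;

    # ':raw': hand octets through untouched in both directions.
    if ($spec eq ':raw') {
        return { encode => sub { $_[0] }, decode => sub { $_[0] } };
    }

    # ':utf8': Perl's lax utf8 handling; bad input never raises an exception.
    if ($spec eq ':utf8') {
        return {
            encode => sub { my $str = shift; utf8::encode($str); $str },
            # utf8::decode leaves the octets unchanged if they are malformed.
            decode => sub { my $octets = shift; utf8::decode($octets); $octets },
        };
    }

    # Anything else is taken as an Encode name ('UTF-8', 'Big5', ...)
    # and checked strictly: data that does not map makes Encode croak.
    my $enc = Encode::find_encoding($spec)
        or die "Unknown encoding '$spec'\n";
    return {
        encode => sub { my $str = shift; $enc->encode($str, Encode::FB_CROAK()) },
        decode => sub { my $octets = shift; $enc->decode($octets, Encode::FB_CROAK()) },
    };
}

my $text   = "caf\x{e9}";                     # 4 characters
my $strict = resolve_encoding('UTF-8');
my $octets = $strict->{encode}->($text);      # 5 octets
my $back   = $strict->{decode}->($octets);    # 4 characters again
```

With this reading, 'UTF-8' and ':utf8' differ only in how they treat bad data: the named encoding dies on malformed octets, while ':utf8' hands them back unchanged. That difference is exactly what makes the two similar-looking spellings easy to confuse.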