-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> Uh, say what? Just as I need to
>
> binmode STDOUT, ':utf8';
> before sending stuff to STDOUT (that is, turn off the flag), I would 
> expect DBDs to do the same before sending data to the database. 
> Unless, of course, it "just works".

I cannot imagine the flag really matters either way. We (Pg) simply dump a 
bunch of chars to the database, and build it by slurping in the string 
character by character until we hit a null. I suppose other databases 
may do things differently, but I can't imagine how/why.

>> Yes, very bad example. Let's call it utf8. Forget 'unicode' entirely.

> Yeah, better, though it just perpetuates Perl's unfortunate use of 
> the term "utf8" for "internal string representation." Though I suppose 
> that ship has sunk already.

Yep. To paraphrase horribly, "Perl's unicode support is the worst, except for 
all the other languages".

>> Because it may still need to convert things. See the ODBC discussion.
>
> Oh, so you're saying it will decode and encode between Perl's internal 
> form and UTF-8, rather than just flip the flag on and off?

Yes, that's a possibility.
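
To make the distinction concrete, here is a minimal sketch in plain Perl 
(not DBD code) of "flip the flag" versus "decode". The sample bytes and 
variable names are made up for illustration. Encode::_utf8_on is the 
Perl-level cousin of SvUTF8_on; its docs mark it as internal, so treat 
this as a demonstration only.

```perl
use strict;
use warnings;
use Encode ();

# Octets as a driver might receive them: "e with acute accent" in UTF-8.
my $octets = "\xC3\xA9";

# Option 1: flip the flag on a copy. No bytes change; Perl is simply told
# to interpret the existing buffer as UTF-8.
my $flagged = $octets;
Encode::_utf8_on($flagged);

# Option 2: really decode the octets into a character string.
my $copy    = $octets;
my $decoded = Encode::decode('UTF-8', $copy);

printf "flagged: length=%d\n", length $flagged;
printf "decoded: length=%d\n", length $decoded;

# For well-formed UTF-8 the two routes agree.
print "same string\n" if $flagged eq $decoded;

# For any other encoding, only a real decode works. The same character
# arriving as Latin-1 is the single octet 0xE9.
my $latin1      = "\xE9";
my $from_latin1 = Encode::decode('ISO-8859-1', $latin1);
print "latin1 route matches too\n" if $from_latin1 eq $decoded;
```

So flag flipping is only a shortcut for data that is already valid UTF-8; 
anything else (the ODBC case, presumably) needs the decode/encode route.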

> Yes, because you were only talking about utf8 and UTF-8, not any 
> other encodings. Unless I missed something. If the data coming back 
> from the DB is Big5, I may well want to have some way to decode it 
> (and to encode it for write statements).

You mean at the DBD level - such that you can say to the database, 
I don't care what encoding you stored it as, I want it encoded 
as X when you give it back to me? (update: yes, see below)

>> Well, because utf-8 is pretty much a de facto encoding, or at least 
>> way, way more popular than things like ucs2. Also, the Perl utf8 
>> flag encourages us to put everything into UTF-8.
>
> Yeah, but again, that might be some reason to call it something else, 
> like "perl_native" or something. The fact that it happens to be UTF-8 
> should be irrelevant. Er, except, I guess, you still have to know the 
> encoding of the database.

Well, I wouldn't call it irrelevant, but at the end of the day, we can 
call it perl_native, but that's just going to cause people to look it up 
in the docs and then say "aha! that means the utf8 flag is on" and then 
they have "perl_native -> utf8" burned into their head. Or worse, 
"perl_native -> unicode". :)

>> * 'A': the default, it means the DBD should do the best thing, which in most 
>> cases means setting SvUTF8_on if the data coming back is UTF-8.
>> * 'B': (on). The DBD should make every effort to set SvUTF8_on for returned 
>> data, even if it thinks it may not be UTF-8.
>> * 'C': (off). The DBD should not call SvUTF8_on, regardless of what it 
>> thinks the data is.

> I still prefer an encoding attribute that you can set as follows:

> * undef: Default; same as your A.
> * ':utf8': Same as your B
> * ':raw': Same as your C
> * $encoding: Encode/decode to/from $encoding

I like that. Although the names are still odd. I guess it does map 
though: raw means no utf8 flag. Still not sure about the encode 
'to', but I'll start thinking about how we could implement the 
'from' in DBD::Pg. How would one map things - just demand that 
whatever is given must be a literal encoding the particular database 
can understand?
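
For the 'from' direction, a rough sketch of how the four proposed settings 
might be handled for one returned value. Everything here is hypothetical: 
apply_encoding is a made-up helper, a real DBD would do this in C with 
SvUTF8_on, and the undef branch guesses by validity where a driver would 
more likely ask the database for its client encoding.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: apply the proposed 'encoding' setting to one value.
sub apply_encoding {
    my ($setting, $octets) = @_;    # $octets is a private copy

    if (!defined $setting) {
        # Default (A): turn the octets into characters only if they are
        # well-formed UTF-8; utf8::decode leaves them alone otherwise.
        utf8::decode($octets);
        return $octets;
    }

    # ':raw' (C): hand the octets back untouched.
    return $octets if $setting eq ':raw';

    if ($setting eq ':utf8') {
        # ':utf8' (B): set the flag without checking the data.
        Encode::_utf8_on($octets);
        return $octets;
    }

    # Anything else is taken as an Encode encoding name.
    return Encode::decode($setting, $octets);
}

# Round trip through Big5 using two CJK characters (U+4E2D, U+6587).
my $text = "\x{4E2D}\x{6587}";
my $big5 = Encode::encode('big5-eten', $text);
print "big5 ok\n" if apply_encoding('big5-eten', $big5) eq $text;

# The same UTF-8 octets under ':raw' and under the default.
print length(apply_encoding(':raw', "\xC3\xA9")), "\n";
print length(apply_encoding(undef,  "\xC3\xA9")), "\n";
```

This takes the encoding name in Perl's (Encode's) vocabulary and sidesteps 
the mapping question; mapping to names the database understands would be 
an extra lookup on top.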

> With an encoding attribute, you don't need the utf8_flag at all.

Right, +1

So the above means these two actually behave very differently:

$dbh->{encoding} = ':utf8';

$dbh->{encoding} = 'utf8';

Could be a little confusing, no? Methinks we need some long ugly name, maybe 
even worse than "perl_native". Perhaps "perl_internal_utf8_flag"? 1/2 :)

Thanks for plugging away at this. My short term goal is to get this finalized 
enough that I can release the next version of DBD::Pg without a 'pg_' prefix 
to control the encoding items.


- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201110061151
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAk6Nz28ACgkQvJuQZxSWSsiWJQCgt/F0r/sCPDa9GuYrGZpZHlQ2
WfYAn0asIYHmPKz1BDfcBo7wLADHmH7N
=eJmk
-----END PGP SIGNATURE-----

