On Sun, Aug 08, 2004 at 06:15:39PM +0100, Matt Sergeant wrote:
On 8 Aug 2004, at 17:35, David Wheeler wrote:
On Aug 8, 2004, at 9:14 AM, Matt Sergeant wrote:
i.e. for every fetch call, you need to do:
SvUTF8_off(AvARRAY(av)[i]);
Now, people using your DBD can decide to upgrade the variable if they
wish to, but most people who don't need to will be unaffected.
Or, more generally, explicitly call either SvUTF8_off or SvUTF8_on as
appropriate, but be sure to call one of them for each field.
Meanwhile I think it would be wise for the DBI to explicitly do SvUTF8_off
on the elements of the internal row buffer before each row is fetched.
That would avoid the utf8 flag 'leaking' from one row to the next.
I'll do that for DBI 1.44.
I think that this is fine as long as there's an easy way to upgrade
the variable. I could use Encode::_utf8_on(), but that seems like more
overhead than is necessary unless I've loaded Encode for some other
use already. Perhaps there could be a module or even a DBI method that
does the equivalent?
# Pseudocode:
sub utf8_on { SvUTF8_on($_[0]) }
Certainly fairly easy to export that from the DBI.
I'll do that (and utf8_off) for DBI 1.44.
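
For illustration, the flag-flipping in the pseudocode above can be sketched in plain Perl on top of Encode. This is only a sketch: the wrapper names utf8_on and utf8_off are hypothetical local subs, not an existing DBI export, and it assumes loading Encode is acceptable.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical wrappers. They only flip the flag on the caller's
# variable (via the $_[0] alias); the underlying bytes are untouched.
sub utf8_on  { Encode::_utf8_on($_[0]) }
sub utf8_off { Encode::_utf8_off($_[0]) }

my $field = "caf\xc3\xa9";       # UTF-8 encoded octets, as a driver might return them
printf "%d\n", length $field;    # 5 (octets)
utf8_on($field);
printf "%d\n", length $field;    # 4 (characters)
utf8_off($field);
printf "%d\n", length $field;    # 5 again
```

Note that forcing the flag on does no validation, so it is only safe when the bytes are already known to be well-formed UTF-8.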
Tim and I talked about long term plans for this, where the user might
specify in advance which columns he'd like UTF-8 turned on for, or some
(I thought horrible) heuristic method where the DBD automagically
decides to turn on the flag if it detects data that it can turn into
UTF-8 - but that sounds like a world of pain to me.
Sure, but some apps/drivers may need the choice.
I'm thinking in terms of something like $sth->{SetUTF8}->[$index] = $mode
0: Force SvUTF8_off regardless
undef: Do nothing (leave it up to the driver)
1: (value is well-formed utf8) ? SvUTF8_on : SvUTF8_off
2: Force SvUTF8_on regardless
(with a way to set it via bind_col as well)
And perhaps a $dbh->{SetUTF8} = $mode; to provide a default.
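
The four modes listed above could behave roughly as in the following pure-Perl sketch. The helper name apply_utf8_mode is hypothetical, and SetUTF8 itself is only a proposal in this thread. Mode 1 is approximated here with utf8::decode, which turns the flag on only when the octets are well-formed UTF-8 and contain at least one multi-byte character.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: apply one proposed mode to a fetched value in place.
sub apply_utf8_mode {
    my $mode = $_[1];
    return unless defined $mode;      # undef: do nothing, leave it to the driver
    if ($mode == 0) {
        Encode::_utf8_off($_[0]);     # force the flag off
    }
    elsif ($mode == 1) {
        Encode::_utf8_off($_[0]);     # start from raw octets
        utf8::decode($_[0]);          # flag ends up on only if well-formed
    }
    elsif ($mode == 2) {
        Encode::_utf8_on($_[0]);      # force the flag on, no validation
    }
    return;
}

my $good = "caf\xc3\xa9";   # valid UTF-8 octets
my $bad  = "caf\xe9";       # Latin-1 octets, not valid UTF-8
apply_utf8_mode($good, 1);
apply_utf8_mode($bad,  1);
print utf8::is_utf8($good) ? "on" : "off", "\n";   # on
print utf8::is_utf8($bad)  ? "on" : "off", "\n";   # off
```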
Umm, it's just dawned on me that the persistence of the utf8 flag
across sv_set functions means I could implement all but option 1 in DBI v1.
(Option 1 requires looking at the value that's just been set, and
that's not simple/efficient for DBI v1.)
Better IMHO would be an extension to bind_col - it should be trivial to
add an attribute in there. The downside being that not many people use
bind_col.
Those that need to control utf8 settings need to make code changes anyway.
Tim.