----- Original Message -----
From: "Tim Bunce" <[EMAIL PROTECTED]>
To: "Merijn van den Kroonenberg" <[EMAIL PROTECTED]>
Sent: Tuesday, August 20, 2002 6:35 PM
Subject: Re: perl, unicode and databases (mysql)


> On Tue, Aug 20, 2002 at 06:05:32PM +0200, Merijn van den Kroonenberg wrote:
> >
> > > In general the quote() method should be as aware of utf8 as the
> > > database is.  If the database supports utf8 then the quote() method
> > > should do-the-right-thing or else it's broken and needs fixing.
> >
> > Well, when I quote it manually:
> >
> > ############################################################
> > # utf8_quote(string)
> > sub utf8_quote($){
> >   my $astring = shift;
> >   $astring =~ s/(['"\\\0])/\\$1/g;
> >   return "'".$astring."'";
> > }# utf8_quote
> > ############################################################
> >
> > Then I can store and retrieve it just fine. So I guess it supports utf8 ;-)
>
> It may just be storing a sequence of bytes. (You can check by using
> SQL functions like LENGTH() and SUBSTRING() on it.)

Probably yes, but as long as I don't do any manipulation in the database,
like selecting on strings or sorting, it shouldn't matter, right? As long as
the app that retrieves it from the database can work with UTF-8.
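
To make the bytes-versus-characters point concrete, here is a minimal sketch on
the Perl side, assuming Perl 5.8 with the Encode module. The variable names are
made up for illustration. If the database is only storing bytes, I would expect
its length functions to report the byte count rather than the character count.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Two characters (e-acute and the euro sign), written as code points so the
# encoding of this source file does not matter.
my $chars = "\x{e9}\x{20ac}";

# The UTF-8 byte sequence the application would hand to the database.
my $octets = encode("utf8", $chars);

printf "characters: %d\n", length $chars;     # 2
printf "bytes:      %d\n", length $octets;    # 5 (2 bytes + 3 bytes)

# Decoding the bytes after fetching them restores the original characters.
my $roundtrip = decode("utf8", $octets);
print "round trip ok\n" if $roundtrip eq $chars;
```

So the round trip is safe as long as the bytes come back unchanged and the
application decodes them itself.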

>
> Tim.
>
> > > > Oh yeah, one other thing, since Encode::_utf8_on is an internal function,
> > > > wouldn't it be better to use Encode::decode("utf8",$somevar) instead? As far
> > > > as I can see, it should do exactly the same, but if I am mistaken, let me
> > > > know :)
> > >
> > > Encode::_utf8_on *just* sets the internal utf8 flag bit on the value
> > > which *must* be already valid utf8 (or else you'll get problems later).
> > >
> > > I believe Encode::decode is different (but I've never used either and
> > > could easily not know what I'm talking about :)
> >
> > from perldoc Encode
> >  CAVEAT: When you run "$string = decode("utf8",
> >          $octets)", then $string may not be equal to $octets.
> >          Though they both contain the same data, the utf8 flag
> >          for $string is on unless $octets entirely consists of
> >          ASCII data (or EBCDIC on EBCDIC machines).  See "The
> >          UTF-8 flag" below.
> >
> > That's why I got that idea, and why I wondered: it also seems to set the
> > utf8 flag and leave the data alone. Not sure, though.
> >
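
For what it's worth, here is a small sketch comparing the two calls, assuming
Perl 5.8 with Encode. The variable names are only for illustration. As far as I
can tell, both give the same result for valid input, but decode() checks the
bytes while _utf8_on() trusts them blindly.

```perl
use strict;
use warnings;
use Encode ();

# Valid input: the two UTF-8 bytes for U+00E9 (e-acute).
my $octets = "\xc3\xa9";

# decode() returns a new character string and leaves $octets alone.
my $decoded = Encode::decode("utf8", $octets);

# _utf8_on() only flips the flag, in place, so work on a copy.
my $flagged = $octets;
Encode::_utf8_on($flagged);

print length($octets),  "\n";    # 2 (bytes)
print length($decoded), "\n";    # 1 (character)
print length($flagged), "\n";    # 1 (character)

# Invalid input: a lone Latin-1 byte is not well-formed UTF-8.
my $bad = "\xe9";

# decode() with its default settings substitutes U+FFFD for the bad byte.
my $bad_decoded = Encode::decode("utf8", $bad);

# _utf8_on() sets the flag anyway, leaving a malformed string behind.
my $bad_flagged = $bad;
Encode::_utf8_on($bad_flagged);

# is_utf8() with a true second argument also checks well-formedness.
print Encode::is_utf8($bad_flagged, 1) ? "well-formed\n" : "malformed\n";
```

So for data that might not be clean UTF-8, decode() looks like the safer choice.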

