On Wed, Dec 18, 2002 at 06:09:33PM +0000, Dominic Mitchell wrote:
> David Wheeler wrote:
> >On Wednesday, December 18, 2002, at 01:27 AM, Dominic Mitchell wrote:
> >>% psql -l
> >>        List of databases
> >>   Name    |  Owner   | Encoding
> >>-----------+----------+-----------
> >> dom       | dom      | UNICODE
> >> template0 | postgres | SQL_ASCII
> >> template1 | postgres | SQL_ASCII
> >>(3 rows)
> >>
> >>I'm using the -PGDG rpm, and looking at the spec file, it seems to
> >>indicate that --enable-multibyte is not specified, but it should be
> >>the default anyway. Is there a way that I can verify this from the
> >>installed software?
> >
> >I think the above does. I don't think you could have the encoding set to
> >UNICODE if it hadn't been compiled with --enable-multibyte.
>
> Bother, that would have been easy. :-)
>
> Attached is a patch to the driver which makes it work as expected for
> me. I don't think it's the right patch, however... It should probably
> only set the UTF8 flag when there is a high bit set in the data.

High bit doesn't always mean utf8 - it may be latin1, etc. Is it not
possible to tell (from the database API) when data is utf8?

I don't know PostgreSQL, but from the psql -l output above it seems
that the charset is a per-database issue. So maybe the database
charset needs to be queried and recorded at connect() time.

(I think this is also going to be a big issue in MySQL 4.1, which
can have different charsets in different fields of the same table...)

Tim.
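
P.S. To make the high-bit point concrete, here is a rough sketch in Python (not the driver's Perl, and not a proposed patch). The function names are hypothetical, and treating "UNICODE" or "UTF8" as the encoding names that mean UTF-8 is an assumption based on the psql -l output quoted above. It shows latin1 bytes that have the high bit set but are not valid UTF-8, and a decision keyed off the database encoding recorded at connect time rather than off the bytes themselves.

```python
def has_high_bit(data: bytes) -> bool:
    """Return True if any byte in data is 0x80 or above."""
    return any(b >= 0x80 for b in data)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def should_flag_utf8(database_encoding: str) -> bool:
    """Hypothetical helper: decide from the encoding recorded at connect
    time. The set of names is an assumption, not taken from any driver."""
    return database_encoding.upper() in {"UNICODE", "UTF8"}


# The same text stored under two different encodings.
latin1_bytes = "caf\u00e9".encode("latin-1")  # one byte, 0xE9, for e-acute
utf8_bytes = "caf\u00e9".encode("utf-8")      # two bytes, 0xC3 0xA9

# Both contain a high-bit byte, so that test cannot tell them apart.
print(has_high_bit(latin1_bytes), has_high_bit(utf8_bytes))

# Only one of them is actually valid UTF-8.
print(is_valid_utf8(latin1_bytes), is_valid_utf8(utf8_bytes))

# Deciding from the database encoding avoids guessing from the data.
print(should_flag_utf8("UNICODE"), should_flag_utf8("SQL_ASCII"))
```

Even a successful UTF-8 decode is not proof: the bytes 0xC3 0xA9 are also a legal (if odd-looking) two-character latin1 string, which is why asking the database seems safer than inspecting the data.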
