Changing the charset seems to have fixed it, after a spot check of some pages I know had problems. If the client doesn't find more, we'll be in good shape.
Thanks, everyone! janine On Dec 2, 2010, at 2:08 AM, Bernhard van Woerden wrote: > Just found the same as Russell, it started life as a single byte but needs > some conversion to match it with the correct unicode character which can then > be stored in utf-8. > > What happens if you ask AOLserver to output iso-8859-1 does it convert 0xc2 > 0x92 to 0x92 ? > ns_param OutputCharset iso-8859-1 > > On 2 December 2010 07:39, Janine Ohmer <[email protected]> wrote: > On Dec 1, 2010, at 11:16 AM, Bernhard van Woerden wrote: > >> Can you check the byte sequence of the string in the db that's causing a >> problem. > > Sort of... the column in question is of type text, and the only function I > can find that will convert from text to bytea is decode, not encode. So > here's what I did: > > select decode(answer_3, 'escape') from public_places where > public_place_id=1012; > > The interesting part of the result is "Park\302\222s", which is supposed to > be "Park's". > > I then ran > > select encode('\\302\\222':bytea, 'hex'); > > and got "c292". > > Google tells me that this is the same as U+0092, which is defined simply as > <control>, or "private use 2". > > This doesn't seem very helpful, but it's possible that my initial assumptions > on how to do this were wrong and I'm really just trying to look up garbage > data. :) > > Bernhard, does this give you any clues? > > thanks, > > janine > >> >> select encode(column_name::bytea,'hex') from ... >> or use get_byte to isolate >> >> >> On 1 December 2010 18:44, Janine Ohmer <[email protected]> wrote: >> Hi Brian, >> >> The old system isn't accessible anymore (something went very wrong and even >> though it's RAID and all, fsck ended up removing a bunch of files and making >> the system unbootable). But from looking at the files in etc from the >> backups I can deduce that LANG was set to the same thing as it is on the new >> system, en_US.UTF-8. >> >> I had tried setting these: >> >> ns_param HackContentType 1 >> ns_param URLCharset "utf-8" >> ns_param OutputCharset "utf-8" >> ns_param HttpOpenCharset "utf-8" >> >> Which did not work. I will try adding DefaultCharset just in case that is >> the key, but I'm not holding my breath... >> >> There is an environment variable I can set, PGCLIENTENCODING, which will let >> me specify what encoding to use for the client, so I can add that to >> nsd-postgres. But I'm not sure what to set it to - everything's already in >> UTF8, so there shouldn't *be* any encoding issues (famous last words, I know >> :). >> >> I can't restart the site during the day (very picky client) so will not be >> able to try anything until late afternoon. >> >> thanks, >> >> janine >> >> On Dec 1, 2010, at 3:32 AM, Fenton, Brian wrote: >> >> > Hi Janine >> > >> > this may not be the answer but it can't do any harm. You should consider >> > adding these to the AOLserver tcl file: >> > ns_param HackContentType 1 >> > ns_param DefaultCharset utf-8 >> > ns_param HttpOpenCharset utf-8 >> > ns_param OutputCharset utf-8 >> > ns_param URLCharset utf-8 >> > >> > Also, rule out any OS differences by checking your locale, LANG etc. Does >> > Postgres have an equivalent to Oracle's NLS_LANG? Hopefully you won't have >> > to dive into codepages and all that stuff! >> > >> > best wishes >> > Brian Fenton >> > >> > >> > ________________________________________ >> > From: AOLserver Discussion [[email protected]] On Behalf Of >> > Janine Ohmer [[email protected]] >> > Sent: 01 December 2010 05:19 >> > To: [email protected] >> > Subject: [AOLSERVER] Charset differences between 3.3+ad13 and 4.0.10? >> > >> > Me again... still working out the last details on those sites I had to >> > move. >> > >> > We're having some issues with characters like apostrophes and dashes >> > either disappearing (Safari) or showing up as garbage characters (Firefox). >> > >> > I'm using the same version of Postgres and the same codebase. The >> > databases on both systems use the UNICODE encoding. The main thing that's >> > different is the AOLserver version. There isn't anything about charset in >> > either the old or the new config file. >> > >> > Has anyone done this conversion (in recent memory, that is :) and knows >> > what the problem might be? >> > >> > thanks, >> > >> > janine >> > >> > --- >> > Janine Ohmer (formerly Sisk) >> > President/CEO of furfly, LLC >> > 503-693-6407 >> > >> > >> > -- >> > AOLserver - http://www.aolserver.com/ >> > >> > To Remove yourself from this list, simply send an email to >> > <[email protected]> with the >> > body of "SIGNOFF AOLSERVER" in the email message. You can leave the >> > Subject: field of your email blank. >> > >> > >> > -- >> > AOLserver - http://www.aolserver.com/ >> > >> > To Remove yourself from this list, simply send an email to >> > <[email protected]> with the >> > body of "SIGNOFF AOLSERVER" in the email message. You can leave the >> > Subject: field of your email blank. >> >> --- >> Janine Ohmer (formerly Sisk) >> President/CEO of furfly, LLC >> 503-693-6407 >> >> >> -- >> AOLserver - http://www.aolserver.com/ >> >> To Remove yourself from this list, simply send an email to >> <[email protected]> with the >> body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: >> field of your email blank. >> >> >> -- >> AOLserver - http://www.aolserver.com/ >> >> >> To Remove yourself from this list, simply send an email to >> <[email protected]> with the >> body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: >> field of your email blank. >> > > --- > Janine Ohmer (formerly Sisk) > President/CEO of furfly, LLC > 503-693-6407 > > > > > > -- > AOLserver - http://www.aolserver.com/ > > > To Remove yourself from this list, simply send an email to > <[email protected]> with the > body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: > field of your email blank. > > > > > -- > AOLserver - http://www.aolserver.com/ > > > To Remove yourself from this list, simply send an email to > <[email protected]> with the > body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: > field of your email blank. > --- Janine Ohmer (formerly Sisk) President/CEO of furfly, LLC 503-693-6407 -- AOLserver - http://www.aolserver.com/ To Remove yourself from this list, simply send an email to <[email protected]> with the body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: field of your email blank.
