On Mon, Apr 16, 2018 at 12:09 PM, Tom Lane <[email protected]> wrote:
> Keith Fiske <[email protected]> writes: > > Running into an issue with helping a client upgrade from 8.3 to 10 (yes, > I > > know, please keep the out of support comments to a minimum, thanks :). > > > The old database was in SQL_ASCII and it needs to stay that way for now > > unfortunately. The dump and restore itself works fine, but we're now > > running into issues with some data returning encoding errors unless we > > specifically set the client_encoding value to SQL_ASCII. > > I'm guessing you might be hitting this 9.1 change: > > * Have psql set the client encoding from the operating system locale > by default (Heikki Linnakangas) > > This only happens if the PGCLIENTENCODING environment variable is > not set. > > I think the previous default was to set client encoding equal to the > server encoding. > > > Looking at the 8.3 database, it has the client_encoding value set to UTF8 > > and queries seem to work fine. Is this just a bug in the old 8.3 not > > enforcing encoding properly? > > Somewhere along the line we made SQL_ASCII -> something else conversions > check that the data was valid per the other encoding, even though no > actual data change happens. > > > The other thing I noticed on the 10 instance was that, while the LOCALE > was > > set to SQL_ASCII, > > You mean encoding, I assume. > > > the COLLATE and CTYPE values for the restored databases > > were en_US.UTF-8. Could this be having an affect? > > This is not a great idea, no. You could be getting strange misbehaviors > in e.g. string comparison, because strcoll() will expect UTF8 data and > will likely not cope well with data that isn't valid in that encoding. > > If you can't sanitize the encoding of your data, I'd suggest running > with lc_collate and lc_ctype set to "C". > > regards, tom lane > Thanks to both of you Adrian & Tom. It is the 9.1 change to the psql client that seems to be causing this. And pg_controldata was able to show that the CTYPE and COLLATE were UTF8 on the old system. If that's the case, do you still think it's a good idea to set the COLLATE and CTYPE to "C"? -- Keith Fiske Senior Database Engineer Crunchy Data - http://crunchydata.com
