While looking at bug #11431, I noticed that pg_upgrade still seems to
think that encoding and locale are cluster-wide properties. We got
per-database locale support in 8.4, and encoding has been per-database
much longer than that.
pg_upgrade checks the encoding and locale of template0 in both clusters,
and throws an error if they don't match. But it doesn't check the locale
or encoding of postgres or template1 databases. That leads to problems
if e.g. the postgres database was dropped and recreated with a different
encoding or locale in the old cluster. We will merrily upgrade it, but
strings in the database will be incorrectly encoded.
I propose the attached patch, for git master. It's more complicated in
back-branches, as they still support upgrading from pre-8.4 clusters. We
haven't heard any complaints from the field on this, so I don't think
it's worth trying to back-patch this.
This slightly changes the way the locale comparison works. First, it
ignores the encoding suffix of the locale name. It's of course important
that the databases have a compatible encoding, but pg_database has a
separate field for encoding, and that's now compared directly. Secondly,
it tries to canonicalize the names, by calling setlocale(). That seems
like a good idea, in response to bug #11431
(http://www.postgresql.org/message-id/5424090e.9060...@vmware.com).
- Heikki
>From ff44c80710ce16a8268ecfe63b6306026d4db87f Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas
Date: Tue, 7 Oct 2014 15:38:53 +0300
Subject: [PATCH 1/1] In pg_upgrade, check the encoding and locale of template1
and postgres dbs.
Lc_collate and lc_ctype have been per-database settings since server version
8.4, but we were still treating them as cluster-wide options, fetching the
values for template0 database, and comparing them. That's backwards; we
don't care about the encoding and locale of the template0 database, as
template0 is guaranteed to contain only ASCII characters. But if there are
any other databases that exist on both clusters (in particular template1 and
postgres databases), their encodings and locales must be compatible.
No backpatching, as earlier versions of pg_upgrade still support upgrading
from 8.3 servers. That would be more complicated.
---
contrib/pg_upgrade/check.c | 204 ++-
contrib/pg_upgrade/controldata.c | 34 ---
contrib/pg_upgrade/info.c| 14 ++-
contrib/pg_upgrade/pg_upgrade.h | 6 +-
4 files changed, 85 insertions(+), 173 deletions(-)
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
index bbfcab7..3df4b95 100644
--- a/contrib/pg_upgrade/check.c
+++ b/contrib/pg_upgrade/check.c
@@ -14,12 +14,10 @@
#include "pg_upgrade.h"
-static void set_locale_and_encoding(ClusterInfo *cluster);
static void check_new_cluster_is_empty(void);
-static void check_locale_and_encoding(ControlData *oldctrl,
- ControlData *newctrl);
-static bool equivalent_locale(const char *loca, const char *locb);
-static bool equivalent_encoding(const char *chara, const char *charb);
+static void check_databases_are_compatible(void);
+static void check_locale_and_encoding(DbInfo *olddb, DbInfo *newdb);
+static bool equivalent_locale(int category, const char *loca, const char *locb);
static void check_is_install_user(ClusterInfo *cluster);
static void check_for_prepared_transactions(ClusterInfo *cluster);
static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
@@ -81,8 +79,6 @@ check_and_dump_old_cluster(bool live_check)
if (!live_check)
start_postmaster(&old_cluster, true);
- set_locale_and_encoding(&old_cluster);
-
get_pg_database_relfilenode(&old_cluster);
/* Extract a list of databases and tables from the old cluster */
@@ -127,13 +123,10 @@ check_and_dump_old_cluster(bool live_check)
void
check_new_cluster(void)
{
- set_locale_and_encoding(&new_cluster);
-
- check_locale_and_encoding(&old_cluster.controldata, &new_cluster.controldata);
-
get_db_and_rel_infos(&new_cluster);
check_new_cluster_is_empty();
+ check_databases_are_compatible();
check_loadable_libraries();
@@ -279,93 +272,24 @@ check_cluster_compatibility(bool live_check)
/*
- * set_locale_and_encoding()
- *
- * query the database to get the template0 locale
- */
-static void
-set_locale_and_encoding(ClusterInfo *cluster)
-{
- ControlData *ctrl = &cluster->controldata;
- PGconn *conn;
- PGresult *res;
- int i_encoding;
- int cluster_version = cluster->major_version;
-
- conn = connectToServer(cluster, "template1");
-
- /* for pg < 80400, we got the values from pg_controldata */
- if (cluster_version >= 80400)
- {
- int i_datcollate;
- int i_datctype;
-
- res = executeQueryOrDie(conn,
-"SELECT datcollate, datctype "
-"FROM pg_catalog.pg_database "
-"WHERE datname = 'template0' ");
- assert(PQntuples(res) == 1);
-
- i_datcollate = PQfnumber(res, "datcollate");
- i_datctype = PQfnumber(res