Here's an updated version of the stripped-down patch, now with
documentation changes, plus a couple of minor bug fixes.
I think this is looking pretty good now, and I would appreciate review
from others before I go ahead committing this. Is there any more places
in the documentation that needs updating, or any grammar mistakes?
Further opinions on whether to allow using a database with different
collation order as template, requiring a manual reindex and facing weird
results if that's neglected, or to error out with a hint suggesting to
use template0 as template (as is in the patch now)?
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index f484db8..1e1786a 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -130,23 +130,24 @@ initdb --locale=sv_SE
<para>
The nature of some locale categories is that their value has to be
- fixed for the lifetime of a database cluster. That is, once
- <command>initdb</command> has run, you cannot change them anymore.
- <literal>LC_COLLATE</literal> and <literal>LC_CTYPE</literal> are
- those categories. They affect the sort order of indexes, so they
- must be kept fixed, or indexes on text columns will become corrupt.
- <productname>PostgreSQL</productname> enforces this by recording
- the values of <envar>LC_COLLATE</> and <envar>LC_CTYPE</> that are
- seen by <command>initdb</>. The server automatically adopts
- those two values when it is started.
+ fixed when the database is created. You can use different settings
+ for different databases, but once a database is created, you cannot
+ change them for that database anymore. <literal>LC_COLLATE</literal>
+ and <literal>LC_CTYPE</literal> are those categories. They affect
+ the sort order of indexes, so they must be kept fixed, or indexes on
+ text columns will become corrupt. The default values for these
+ categories are defined when <command>initdb</command> is run, and
+ those values are used when new databases are created, unless
+ explicitly specified otherwise in the <command>CREATE
+ DATABASE</command> command.
</para>
<para>
The other locale categories can be changed as desired whenever the
server is running by setting the run-time configuration variables
that have the same name as the locale categories (see <xref
- linkend="runtime-config-client-format"> for details). The defaults that are
- chosen by <command>initdb</command> are actually only written into
+ linkend="runtime-config-client-format"> for details). The defaults
+ that are chosen by <command>initdb</command> are actually only written into
the configuration file <filename>postgresql.conf</filename> to
serve as defaults when the server is started. If you delete these
assignments from <filename>postgresql.conf</filename> then the
@@ -261,7 +262,7 @@ initdb --locale=sv_SE
<para>
Check that <productname>PostgreSQL</> is actually using the locale
- that you think it is. <envar>LC_COLLATE</> and <envar>LC_CTYPE</>
+ that you think it is. The default <envar>LC_COLLATE</> and <envar>LC_CTYPE</>
settings are determined at <command>initdb</> time and cannot be
changed without repeating <command>initdb</>. Other locale
settings including <envar>LC_MESSAGES</> and <envar>LC_MONETARY</>
@@ -320,16 +321,10 @@ initdb --locale=sv_SE
<para>
An important restriction, however, is that each database character set
- must be compatible with the server's <envar>LC_CTYPE</> setting.
+ must be compatible with the database's <envar>LC_CTYPE</> setting.
When <envar>LC_CTYPE</> is <literal>C</> or <literal>POSIX</>, any
character set is allowed, but for other settings of <envar>LC_CTYPE</>
there is only one character set that will work correctly.
- Since the <envar>LC_CTYPE</> setting is frozen by <command>initdb</>, the
- apparent flexibility to use different encodings in different databases
- of a cluster is more theoretical than real, except when you select
- <literal>C</> or <literal>POSIX</> locale (thus disabling any real locale
- awareness). It is likely that these mechanisms will be revisited in future
- versions of <productname>PostgreSQL</productname>.
</para>
<sect2 id="multibyte-charset-supported">
@@ -734,19 +729,19 @@ initdb -E EUC_JP
</para>
<para>
- If you have selected <literal>C</> or <literal>POSIX</> locale,
- you can create a database with a different character set:
+ You can specify a non-default encoding at database creation time,
+ provided that the encoding is compatible with the selected locale:
<screen>
-createdb -E EUC_KR korean
+createdb -E EUC_KR -T template0 --lc-collate=ko_KR.euckr --lc-ctype=ko_KR.euckr korean
</screen>
This will create a database named <literal>korean</literal> that
- uses the character set <literal>EUC_KR</literal>. Another way to
- accomplish this is to use this SQL command:
+ uses the character set <literal>EUC_KR</literal>, and locale <literal>ko_KR</literal>.
+ Another way to accomplish this is to use this SQL command:
<programlisting>
-CREATE DATABASE korean WITH ENCODING 'EUC_KR';
+CREATE DATABASE korean WITH ENCODING 'EUC_KR' COLLATE='ko_KR.euckr' CTYPE='ko_KR.euckr' TEMPLATE=template0;
</programlisting>
The encoding for a database is stored in the system catalog
@@ -754,6 +749,8 @@ CREATE DATABASE korean WITH ENCODING 'EUC_KR';
<option>-l</option> option or the <command>\l</command> command
of <command>psql</command>.
+XXX It would be nice to show the "korean" database here. And we should
+modify psql to show locale as well
<screen>
$ <userinput>psql -l</userinput>
List of databases
diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml
index 2d4a7bf..f02a957 100644
--- a/doc/src/sgml/indices.sgml
+++ b/doc/src/sgml/indices.sgml
@@ -157,7 +157,7 @@ CREATE INDEX test1_id_index ON test1 (id);
<emphasis>if</emphasis> the pattern is a constant and is anchored to
the beginning of the string — for example, <literal>col LIKE
'foo%'</literal> or <literal>col ~ '^foo'</literal>, but not
- <literal>col LIKE '%bar'</literal>. However, if your server does not
+ <literal>col LIKE '%bar'</literal>. However, if your database does not
use the C locale you will need to create the index with a special
operator class to support indexing of pattern-matching queries. See
<xref linkend="indexes-opclass"> below. It is also possible to use
@@ -922,7 +922,7 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
according to the locale-specific collation rules. This makes
these operator classes suitable for use by queries involving
pattern matching expressions (<literal>LIKE</literal> or POSIX
- regular expressions) when the server does not use the standard
+ regular expressions) when the database does not use the standard
<quote>C</quote> locale. As an example, you might index a
<type>varchar</type> column like this:
<programlisting>
diff --git a/doc/src/sgml/ref/create_database.sgml b/doc/src/sgml/ref/create_database.sgml
index 95350c4..ed048b0 100644
--- a/doc/src/sgml/ref/create_database.sgml
+++ b/doc/src/sgml/ref/create_database.sgml
@@ -24,6 +24,8 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
[ [ WITH ] [ OWNER [=] <replaceable class="parameter">dbowner</replaceable> ]
[ TEMPLATE [=] <replaceable class="parameter">template</replaceable> ]
[ ENCODING [=] <replaceable class="parameter">encoding</replaceable> ]
+ [ COLLATE [=] <replaceable class="parameter">collation</replaceable> ]
+ [ CTYPE [=] <replaceable class="parameter">ctype</replaceable> ]
[ TABLESPACE [=] <replaceable class="parameter">tablespace</replaceable> ]
[ CONNECTION LIMIT [=] <replaceable class="parameter">connlimit</replaceable> ] ]
</synopsis>
@@ -113,6 +115,29 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
</listitem>
</varlistentry>
<varlistentry>
+ <term><replaceable class="parameter">collation</replaceable></term>
+ <listitem>
+ <para>
+ Collation order (<literal>LC_COLLATE</>) to use in the new database.
+ This affects the sort order applied to strings, e.g in queries with
+ ORDER BY, as well as the order used in indexes on text columns.
+ The default is to use the collation order of the template database.
+ See below for additional restrictions.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><replaceable class="parameter">ctype</replaceable></term>
+ <listitem>
+ <para>
+ Character classification (<literal>LC_CTYPE</>) to use in the new
+ database. This affects the categorization of characters, e.g. lower,
+ upper and digit. The default is to use the character classification of
+ the template database. See below for additional restrictions.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
<term><replaceable class="parameter">tablespace</replaceable></term>
<listitem>
<para>
@@ -181,12 +206,10 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
<para>
Any character set encoding specified for the new database must be
- compatible with the server's <envar>LC_CTYPE</> locale setting.
+ compatible with the chosen COLLATE and CTYPE settings.
If <envar>LC_CTYPE</> is <literal>C</> (or equivalently
<literal>POSIX</>), then all encodings are allowed, but for other
- locale settings there is only one encoding that will work properly,
- and so the apparent freedom to specify an encoding is illusory if
- you didn't initialize the database cluster in <literal>C</> locale.
+ locale settings there is only one encoding that will work properly.
<command>CREATE DATABASE</> will allow superusers to specify
<literal>SQL_ASCII</> encoding regardless of the locale setting,
but this choice is deprecated and may result in misbehavior of
@@ -195,6 +218,16 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
</para>
<para>
+ The <literal>COLLATE</> and <literal>CTYPE</> settings must match
+ those of the template database, except when template0 is used as
+ template. This is because <literal>COLLATE</> and <literal>CTYPE</>
+ affects the ordering in indexes, so that any indexes copied from the
+ template database would be invalid in the new database with different
+ settings. <literal>template0</literal>, however, is known to not
+ contain any indexes that would be affected.
+ </para>
+
+ <para>
The <literal>CONNECTION LIMIT</> option is only enforced approximately;
if two new sessions start at about the same time when just one
connection <quote>slot</> remains for the database, it is possible that
diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 10a9d1c..b313b1f 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -76,25 +76,30 @@ PostgreSQL documentation
<para>
<command>initdb</command> initializes the database cluster's default
- locale and character set encoding. The collation order
- (<literal>LC_COLLATE</>) and character set classes
- (<literal>LC_CTYPE</>, e.g. upper, lower, digit) are fixed for all
- databases and cannot be changed. Collation orders other than
- <literal>C</> or <literal>POSIX</> also have a performance penalty.
- For these reasons it is important to choose the right locale when
- running <command>initdb</command>. The remaining locale categories
- can be changed later when the server is started. All server locale
- values (<literal>lc_*</>) can be displayed via <command>SHOW ALL</>.
+ locale and character set encoding.
+
+ The character set encoding, collation order (<literal>LC_COLLATE</>)
+ and character set classes (<literal>LC_CTYPE</>, e.g. upper, lower,
+ digit) can be set separately for a database when it is created.
+ <command>initdb</command> determines those settings for the
+ <literal>template1</literal> database, which will serve as the
+ default for all other databases.
+
+ To alter the default collation order or character set classes, use the
+ <option>--lc-collate</option> and <option>--lc-ctype</option> options.
+ Collation orders other than <literal>C</> or <literal>POSIX</> also have
+ a performance penalty. For these reasons it is important to choose the
+ right locale when running <command>initdb</command>.
+
+ The remaining locale categories can be changed later when the server
+ is started. You can also use <option>--locale</option> to set the
+ default for all locale categories, including collation order and
+ character set classes. All server locale values (<literal>lc_*</>) can
+ be displayed via <command>SHOW ALL</>.
More details can be found in <xref linkend="locale">.
- </para>
- <para>
- The character set encoding can be set separately for a database when
- it is created. <command>initdb</command> determines the encoding for
- the <literal>template1</literal> database, which will serve as the
- default for all other databases. To alter the default encoding use
- the <option>--encoding</option> option. More details can be found in
- <xref linkend="multibyte">.
+ To alter the default encoding, use the <option>--encoding</option>.
+ More details can be found in <xref linkend="multibyte">.
</para>
</refsect1>
diff --git a/doc/src/sgml/ref/pg_controldata.sgml b/doc/src/sgml/ref/pg_controldata.sgml
index b379df0..7f50b7b 100644
--- a/doc/src/sgml/ref/pg_controldata.sgml
+++ b/doc/src/sgml/ref/pg_controldata.sgml
@@ -30,7 +30,7 @@ PostgreSQL documentation
<title>Description</title>
<para>
<command>pg_controldata</command> prints information initialized during
- <command>initdb</>, such as the catalog version and server locale.
+ <command>initdb</>, such as the catalog version.
It also shows information about write-ahead logging and checkpoint
processing. This information is cluster-wide, and not specific to any one
database.
diff --git a/doc/src/sgml/ref/pg_resetxlog.sgml b/doc/src/sgml/ref/pg_resetxlog.sgml
index 1933455..f7331e4 100644
--- a/doc/src/sgml/ref/pg_resetxlog.sgml
+++ b/doc/src/sgml/ref/pg_resetxlog.sgml
@@ -62,12 +62,9 @@ PostgreSQL documentation
by specifying the <literal>-f</> (force) switch. In this case plausible
values will be substituted for the missing data. Most of the fields can be
expected to match, but manual assistance might be needed for the next OID,
- next transaction ID and epoch, next multitransaction ID and offset,
- WAL starting address, and database locale fields.
- The first six of these can be set using the switches discussed below.
- <command>pg_resetxlog</command>'s own environment is the source for its
- guess at the locale fields; take care that <envar>LANG</> and so forth
- match the environment that <command>initdb</> was run in.
+ next transaction ID and epoch, next multitransaction ID and offset, and
+ WAL starting address fields.
+ The first five of these can be set using the switches discussed below.
If you are not able to determine correct values for all these fields,
<literal>-f</> can still be used, but
the recovered database must be treated with even more suspicion than
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index e463d15..a1be7cb 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -747,8 +747,7 @@ SELECT name FROM distributors ORDER BY code;
<para>
Character-string data is sorted according to the locale-specific
- collation order that was established when the database cluster
- was initialized.
+ collation order that was established when the database was created.
</para>
</refsect2>
diff --git a/doc/src/sgml/ref/show.sgml b/doc/src/sgml/ref/show.sgml
index 3d238c3..461a9c8 100644
--- a/doc/src/sgml/ref/show.sgml
+++ b/doc/src/sgml/ref/show.sgml
@@ -82,8 +82,8 @@ SHOW ALL
<para>
Shows the database's locale setting for collation (text
ordering). At present, this parameter can be shown but not
- set, because the setting is determined at
- <command>initdb</> time.
+ set, because the setting is determined at database creation
+ time.
</para>
</listitem>
</varlistentry>
@@ -94,8 +94,8 @@ SHOW ALL
<para>
Shows the database's locale setting for character
classification. At present, this parameter can be shown but
- not set, because the setting is determined at
- <command>initdb</> time.
+ not set, because the setting is determined at database creation
+ time.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 5ce6163..4b7f5eb 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -145,11 +145,12 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
Normally, it will just take the locale settings in the environment
and apply them to the initialized database. It is possible to
specify a different locale for the database; more information about
- that can be found in <xref linkend="locale">. The sort order used
- within a particular database cluster is set by
- <command>initdb</command> and cannot be changed later, short of
- dumping all data, rerunning <command>initdb</command>, and reloading
- the data. There is also a performance impact for using locales
+ that can be found in <xref linkend="locale">. The default sort order used
+ within the particular database cluster is set by
+ <command>initdb</command>, and while you can create new databases using
+ different sort order, the order used in the template databases that initdb
+ creates cannot be changed without dropping and recreating them.
+ There is also a performance impact for using locales
other than <literal>C</> or <literal>POSIX</>. Therefore, it is
important to make this choice correctly the first time.
</para>
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index e48f82b..98ac7f8 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -1896,7 +1896,7 @@ LIMIT 10;
<note>
<para>
- The parser's notion of a <quote>letter</> is determined by the server's
+ The parser's notion of a <quote>letter</> is determined by the database's
locale setting, specifically <varname>lc_ctype</>. Words containing
only the basic ASCII letters are reported as a separate token type,
since it is sometimes useful to distinguish them. In most European
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 5645271..edd1d74 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3847,7 +3847,6 @@ WriteControlFile(void)
{
int fd;
char buffer[PG_CONTROL_SIZE]; /* need not be aligned */
- char *localeptr;
/*
* Initialize version and compatibility-check fields
@@ -3876,18 +3875,6 @@ WriteControlFile(void)
ControlFile->float4ByVal = FLOAT4PASSBYVAL;
ControlFile->float8ByVal = FLOAT8PASSBYVAL;
- ControlFile->localeBuflen = LOCALE_NAME_BUFLEN;
- localeptr = setlocale(LC_COLLATE, NULL);
- if (!localeptr)
- ereport(PANIC,
- (errmsg("invalid LC_COLLATE setting")));
- StrNCpy(ControlFile->lc_collate, localeptr, LOCALE_NAME_BUFLEN);
- localeptr = setlocale(LC_CTYPE, NULL);
- if (!localeptr)
- ereport(PANIC,
- (errmsg("invalid LC_CTYPE setting")));
- StrNCpy(ControlFile->lc_ctype, localeptr, LOCALE_NAME_BUFLEN);
-
/* Contents are protected with a CRC */
INIT_CRC32(ControlFile->crc);
COMP_CRC32(ControlFile->crc,
@@ -4126,34 +4113,6 @@ ReadControlFile(void)
" but the server was compiled without USE_FLOAT8_BYVAL."),
errhint("It looks like you need to recompile or initdb.")));
#endif
-
- if (ControlFile->localeBuflen != LOCALE_NAME_BUFLEN)
- ereport(FATAL,
- (errmsg("database files are incompatible with server"),
- errdetail("The database cluster was initialized with LOCALE_NAME_BUFLEN %d,"
- " but the server was compiled with LOCALE_NAME_BUFLEN %d.",
- ControlFile->localeBuflen, LOCALE_NAME_BUFLEN),
- errhint("It looks like you need to recompile or initdb.")));
- if (pg_perm_setlocale(LC_COLLATE, ControlFile->lc_collate) == NULL)
- ereport(FATAL,
- (errmsg("database files are incompatible with operating system"),
- errdetail("The database cluster was initialized with LC_COLLATE \"%s\","
- " which is not recognized by setlocale().",
- ControlFile->lc_collate),
- errhint("It looks like you need to initdb or install locale support.")));
- if (pg_perm_setlocale(LC_CTYPE, ControlFile->lc_ctype) == NULL)
- ereport(FATAL,
- (errmsg("database files are incompatible with operating system"),
- errdetail("The database cluster was initialized with LC_CTYPE \"%s\","
- " which is not recognized by setlocale().",
- ControlFile->lc_ctype),
- errhint("It looks like you need to initdb or install locale support.")));
-
- /* Make the fixed locale settings visible as GUC variables, too */
- SetConfigOption("lc_collate", ControlFile->lc_collate,
- PGC_INTERNAL, PGC_S_OVERRIDE);
- SetConfigOption("lc_ctype", ControlFile->lc_ctype,
- PGC_INTERNAL, PGC_S_OVERRIDE);
}
void
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 4dd6262..2d5e27b 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -53,6 +53,7 @@
#include "utils/fmgroids.h"
#include "utils/guc.h"
#include "utils/lsyscache.h"
+#include "utils/pg_locale.h"
#include "utils/syscache.h"
#include "utils/tqual.h"
@@ -69,7 +70,7 @@ static bool get_db_info(const char *name, LOCKMODE lockmode,
Oid *dbIdP, Oid *ownerIdP,
int *encodingP, bool *dbIsTemplateP, bool *dbAllowConnP,
Oid *dbLastSysOidP, TransactionId *dbFrozenXidP,
- Oid *dbTablespace);
+ Oid *dbTablespace, char **dbCollation, char **dbCtype);
static bool have_createdb_privilege(void);
static void remove_dbtablespaces(Oid db_id);
static bool check_db_file_conflict(Oid db_id);
@@ -87,6 +88,8 @@ createdb(const CreatedbStmt *stmt)
Oid src_dboid;
Oid src_owner;
int src_encoding;
+ char *src_collation;
+ char *src_ctype;
bool src_istemplate;
bool src_allowconn;
Oid src_lastsysoid;
@@ -104,10 +107,14 @@ createdb(const CreatedbStmt *stmt)
DefElem *downer = NULL;
DefElem *dtemplate = NULL;
DefElem *dencoding = NULL;
+ DefElem *dcollation = NULL;
+ DefElem *dctype = NULL;
DefElem *dconnlimit = NULL;
char *dbname = stmt->dbname;
char *dbowner = NULL;
const char *dbtemplate = NULL;
+ char *lc_collate = NULL;
+ char *lc_ctype = NULL;
int encoding = -1;
int dbconnlimit = -1;
int ctype_encoding;
@@ -152,6 +159,22 @@ createdb(const CreatedbStmt *stmt)
errmsg("conflicting or redundant options")));
dencoding = defel;
}
+ else if (strcmp(defel->defname, "collate") == 0)
+ {
+ if (dcollation)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ dcollation = defel;
+ }
+ else if (strcmp(defel->defname, "ctype") == 0)
+ {
+ if (dctype)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ dctype = defel;
+ }
else if (strcmp(defel->defname, "connectionlimit") == 0)
{
if (dconnlimit)
@@ -205,6 +228,11 @@ createdb(const CreatedbStmt *stmt)
elog(ERROR, "unrecognized node type: %d",
nodeTag(dencoding->arg));
}
+ if (dcollation && dcollation->arg)
+ lc_collate = strVal(dcollation->arg);
+ if (dctype && dctype->arg)
+ lc_ctype = strVal(dctype->arg);
+
if (dconnlimit && dconnlimit->arg)
dbconnlimit = intVal(dconnlimit->arg);
@@ -243,7 +271,8 @@ createdb(const CreatedbStmt *stmt)
if (!get_db_info(dbtemplate, ShareLock,
&src_dboid, &src_owner, &src_encoding,
&src_istemplate, &src_allowconn, &src_lastsysoid,
- &src_frozenxid, &src_deftablespace))
+ &src_frozenxid, &src_deftablespace,
+ &src_collation, &src_ctype))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_DATABASE),
errmsg("template database \"%s\" does not exist",
@@ -262,9 +291,13 @@ createdb(const CreatedbStmt *stmt)
dbtemplate)));
}
- /* If encoding is defaulted, use source's encoding */
+ /* If encoding or locales are defaulted, use source's setting */
if (encoding < 0)
encoding = src_encoding;
+ if (lc_collate == NULL)
+ lc_collate = src_collation;
+ if (lc_ctype == NULL)
+ lc_ctype = src_ctype;
/* Some encodings are client only */
if (!PG_VALID_BE_ENCODING(encoding))
@@ -272,6 +305,16 @@ createdb(const CreatedbStmt *stmt)
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("invalid server encoding %d", encoding)));
+ /* Check that the chosen locales are valid */
+ if (!check_locale(LC_COLLATE, lc_collate))
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("invalid locale name %s", lc_collate)));
+ if (!check_locale(LC_CTYPE, lc_ctype))
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("invalid locale name %s", lc_ctype)));
+
/*
* Check whether encoding matches server locale settings. We allow
* mismatch in three cases:
@@ -290,7 +333,7 @@ createdb(const CreatedbStmt *stmt)
*
* Note: if you change this policy, fix initdb to match.
*/
- ctype_encoding = pg_get_encoding_from_locale(NULL);
+ ctype_encoding = pg_get_encoding_from_locale(lc_ctype);
if (!(ctype_encoding == encoding ||
ctype_encoding == PG_SQL_ASCII ||
@@ -299,12 +342,32 @@ createdb(const CreatedbStmt *stmt)
#endif
(encoding == PG_SQL_ASCII && superuser())))
ereport(ERROR,
- (errmsg("encoding %s does not match server's locale %s",
+ (errmsg("encoding %s does not match locale %s",
pg_encoding_to_char(encoding),
- setlocale(LC_CTYPE, NULL)),
- errdetail("The server's LC_CTYPE setting requires encoding %s.",
+ lc_ctype),
+ errdetail("The chosen LC_CTYPE setting requires encoding %s.",
pg_encoding_to_char(ctype_encoding))));
+ /*
+ * Check that the new locale is compatible with the source database.
+ *
+ * We know that template0 doesn't contain any indexes that depend on
+ * collation or ctype, so template0 can be used as template for
+ * any locale.
+ */
+ if (strcmp(dbtemplate, "template0") != 0)
+ {
+ if (strcmp(lc_collate, src_collation))
+ ereport(ERROR,
+ (errmsg("new collation is incompatible with the collation of the template database (%s)", src_collation),
+ errhint("Use the same collation as in the template database, or use template0 as template")));
+
+ if (strcmp(lc_ctype, src_ctype))
+ ereport(ERROR,
+ (errmsg("new ctype is incompatible with the ctype of the template database (%s)", src_ctype),
+ errhint("Use the same ctype as in the template database, or use template0 as template")));
+ }
+
/* Resolve default tablespace for new database */
if (dtablespacename && dtablespacename->arg)
{
@@ -421,6 +484,8 @@ createdb(const CreatedbStmt *stmt)
DirectFunctionCall1(namein, CStringGetDatum(dbname));
new_record[Anum_pg_database_datdba - 1] = ObjectIdGetDatum(datdba);
new_record[Anum_pg_database_encoding - 1] = Int32GetDatum(encoding);
+ new_record[Anum_pg_database_collation - 1] = DirectFunctionCall1(namein, CStringGetDatum(lc_collate));
+ new_record[Anum_pg_database_ctype - 1] = DirectFunctionCall1(namein, CStringGetDatum(lc_ctype));
new_record[Anum_pg_database_datistemplate - 1] = BoolGetDatum(false);
new_record[Anum_pg_database_datallowconn - 1] = BoolGetDatum(true);
new_record[Anum_pg_database_datconnlimit - 1] = Int32GetDatum(dbconnlimit);
@@ -629,7 +694,7 @@ dropdb(const char *dbname, bool missing_ok)
pgdbrel = heap_open(DatabaseRelationId, RowExclusiveLock);
if (!get_db_info(dbname, AccessExclusiveLock, &db_id, NULL, NULL,
- &db_istemplate, NULL, NULL, NULL, NULL))
+ &db_istemplate, NULL, NULL, NULL, NULL, NULL, NULL))
{
if (!missing_ok)
{
@@ -781,7 +846,7 @@ RenameDatabase(const char *oldname, const char *newname)
rel = heap_open(DatabaseRelationId, RowExclusiveLock);
if (!get_db_info(oldname, AccessExclusiveLock, &db_id, NULL, NULL,
- NULL, NULL, NULL, NULL, NULL))
+ NULL, NULL, NULL, NULL, NULL, NULL, NULL))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_DATABASE),
errmsg("database \"%s\" does not exist", oldname)));
@@ -1168,7 +1233,7 @@ get_db_info(const char *name, LOCKMODE lockmode,
Oid *dbIdP, Oid *ownerIdP,
int *encodingP, bool *dbIsTemplateP, bool *dbAllowConnP,
Oid *dbLastSysOidP, TransactionId *dbFrozenXidP,
- Oid *dbTablespace)
+ Oid *dbTablespace, char **dbCollation, char **dbCtype)
{
bool result = false;
Relation relation;
@@ -1259,6 +1324,11 @@ get_db_info(const char *name, LOCKMODE lockmode,
/* default tablespace for this database */
if (dbTablespace)
*dbTablespace = dbform->dattablespace;
+ /* default locale settings for this database */
+ if (dbCollation)
+ *dbCollation = pstrdup(NameStr(dbform->collation));
+ if (dbCtype)
+ *dbCtype = pstrdup(NameStr(dbform->ctype));
ReleaseSysCache(tuple);
result = true;
break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 5ff353d..0831a94 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -398,7 +398,7 @@ static TypeName *TableFuncTypeName(List *columns);
CLUSTER COALESCE COLLATE COLUMN COMMENT COMMIT
COMMITTED CONCURRENTLY CONFIGURATION CONNECTION CONSTRAINT CONSTRAINTS
CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE CREATEDB
- CREATEROLE CREATEUSER CROSS CSV CURRENT_P CURRENT_DATE CURRENT_ROLE
+ CREATEROLE CREATEUSER CROSS CSV CTYPE CURRENT_P CURRENT_DATE CURRENT_ROLE
CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
DATABASE DAY_P DEALLOCATE DEC DECIMAL_P DECLARE DEFAULT DEFAULTS
@@ -5458,6 +5458,22 @@ createdb_opt_item:
{
$$ = makeDefElem("encoding", NULL);
}
+ | COLLATE opt_equal Sconst
+ {
+ $$ = makeDefElem("collate", (Node *)makeString($3));
+ }
+ | COLLATE opt_equal DEFAULT
+ {
+ $$ = makeDefElem("collate", NULL);
+ }
+ | CTYPE opt_equal Sconst
+ {
+ $$ = makeDefElem("ctype", (Node *)makeString($3));
+ }
+ | CTYPE opt_equal DEFAULT
+ {
+ $$ = makeDefElem("ctype", NULL);
+ }
| CONNECTION LIMIT opt_equal SignedIconst
{
$$ = makeDefElem("connectionlimit", (Node *)makeInteger($4));
@@ -9216,6 +9232,7 @@ unreserved_keyword:
| CREATEROLE
| CREATEUSER
| CSV
+ | CTYPE
| CURRENT_P
| CURSOR
| CYCLE
diff --git a/src/backend/parser/keywords.c b/src/backend/parser/keywords.c
index b30a478..e2fe4bf 100644
--- a/src/backend/parser/keywords.c
+++ b/src/backend/parser/keywords.c
@@ -114,6 +114,7 @@ const ScanKeyword ScanKeywords[] = {
{"createuser", CREATEUSER, UNRESERVED_KEYWORD},
{"cross", CROSS, TYPE_FUNC_NAME_KEYWORD},
{"csv", CSV, UNRESERVED_KEYWORD},
+ {"ctype", CTYPE, UNRESERVED_KEYWORD},
{"current", CURRENT_P, UNRESERVED_KEYWORD},
{"current_date", CURRENT_DATE, RESERVED_KEYWORD},
{"current_role", CURRENT_ROLE, RESERVED_KEYWORD},
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index c0a01ae..4a55be5 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -189,6 +189,30 @@ pg_perm_setlocale(int category, const char *locale)
}
+/*
+ * Is the locale name valid for the locale category?
+ */
+bool
+check_locale(int category, const char *value)
+{
+ char *save;
+ bool ret;
+
+ save = setlocale(category, NULL);
+ if (!save)
+ return false; /* won't happen, we hope */
+
+ /* save may be pointing at a modifiable scratch variable, see above */
+ save = pstrdup(save);
+
+ ret = (setlocale(category, value) != NULL);
+
+ setlocale(category, save); /* assume this won't fail */
+ pfree(save);
+
+ return ret;
+}
+
/* GUC assign hooks */
/*
@@ -203,21 +227,9 @@ pg_perm_setlocale(int category, const char *locale)
static const char *
locale_xxx_assign(int category, const char *value, bool doit, GucSource source)
{
- char *save;
-
- save = setlocale(category, NULL);
- if (!save)
- return NULL; /* won't happen, we hope */
-
- /* save may be pointing at a modifiable scratch variable, see above */
- save = pstrdup(save);
-
- if (!setlocale(category, value))
+ if (!check_locale(category, value))
value = NULL; /* set failure return marker */
- setlocale(category, save); /* assume this won't fail */
- pfree(save);
-
/* need to reload cache next time? */
if (doit && value != NULL)
{
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 461edd9..08fbf3b 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -159,6 +159,8 @@ CheckMyDatabase(const char *name, bool am_superuser)
{
HeapTuple tup;
Form_pg_database dbform;
+ char *collate;
+ char *ctype;
/* Fetch our real pg_database row */
tup = SearchSysCache(DATABASEOID,
@@ -240,6 +242,28 @@ CheckMyDatabase(const char *name, bool am_superuser)
/* If we have no other source of client_encoding, use server encoding */
SetConfigOption("client_encoding", GetDatabaseEncodingName(),
PGC_BACKEND, PGC_S_DEFAULT);
+
+ /* assign locale variables */
+ collate = NameStr(dbform->collation);
+ ctype = NameStr(dbform->ctype);
+
+ if (setlocale(LC_COLLATE, collate) == NULL)
+ ereport(FATAL,
+ (errmsg("database locale is incompatible with operating system"),
+ errdetail("The database was initialized with LC_COLLATE \"%s\", "
+ " which is not recognized by setlocale().", collate),
+ errhint("Recreate the database with another locale or install the missing locale.")));
+
+ if (setlocale(LC_CTYPE, ctype) == NULL)
+ ereport(FATAL,
+ (errmsg("database locale is incompatible with operating system"),
+ errdetail("The database was initialized with LC_CTYPE \"%s\", "
+ " which is not recognized by setlocale().", ctype),
+ errhint("Recreate the database with another locale or install the missing locale.")));
+
+ /* Make the locale settings visible as GUC variables, too */
+ SetConfigOption("lc_collate", collate, PGC_INTERNAL, PGC_S_DATABASE);
+ SetConfigOption("lc_ctype", ctype, PGC_INTERNAL, PGC_S_DATABASE);
/*
* Lastly, set up any database-specific configuration variables.
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 3effafb..4b85281 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -1353,6 +1353,10 @@ bootstrap_template1(char *short_version)
bki_lines = replace_token(bki_lines, "ENCODING", encodingid);
+ bki_lines = replace_token(bki_lines, "LC_COLLATE", lc_collate);
+
+ bki_lines = replace_token(bki_lines, "LC_CTYPE", lc_ctype);
+
/*
* Pass correct LC_xxx environment to bootstrap.
*
@@ -2378,12 +2382,12 @@ usage(const char *progname)
printf(_("\nOptions:\n"));
printf(_(" [-D, --pgdata=]DATADIR location for this database cluster\n"));
printf(_(" -E, --encoding=ENCODING set default encoding for new databases\n"));
- printf(_(" --locale=LOCALE initialize database cluster with given locale\n"));
+ printf(_(" --locale=LOCALE set default locale for new databases\n"));
printf(_(" --lc-collate, --lc-ctype, --lc-messages=LOCALE\n"
" --lc-monetary, --lc-numeric, --lc-time=LOCALE\n"
- " initialize database cluster with given locale\n"
- " in the respective category (default taken from\n"
- " environment)\n"));
+ " set default locale in the respective\n"
+ " category for new databases (default\n"
+ " taken from environment)\n"));
printf(_(" --no-locale equivalent to --locale=C\n"));
printf(_(" -T, --text-search-config=CFG\n"
" default text search configuration\n"));
@@ -2806,10 +2810,12 @@ main(int argc, char *argv[])
strcmp(lc_ctype, lc_numeric) == 0 &&
strcmp(lc_ctype, lc_monetary) == 0 &&
strcmp(lc_ctype, lc_messages) == 0)
- printf(_("The database cluster will be initialized with locale %s.\n"), lc_ctype);
+ printf(_("The template databases will be initialized with locale %s.\n"), lc_ctype);
else
{
- printf(_("The database cluster will be initialized with locales\n"
+ /* XXX only collate and ctype are actually set in stone here, others
+ * are userset gucs */
+ printf(_("The template databases will be initialized with locales\n"
" COLLATE: %s\n"
" CTYPE: %s\n"
" MESSAGES: %s\n"
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 7970c32..08e102f 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -220,12 +220,5 @@ main(int argc, char *argv[])
(ControlFile.float4ByVal ? _("by value") : _("by reference")));
printf(_("Float8 argument passing: %s\n"),
(ControlFile.float8ByVal ? _("by value") : _("by reference")));
- printf(_("Maximum length of locale name: %u\n"),
- ControlFile.localeBuflen);
- printf(_("LC_COLLATE: %s\n"),
- ControlFile.lc_collate);
- printf(_("LC_CTYPE: %s\n"),
- ControlFile.lc_ctype);
-
return 0;
}
diff --git a/src/bin/pg_resetxlog/pg_resetxlog.c b/src/bin/pg_resetxlog/pg_resetxlog.c
index 345b89c..0df796b 100644
--- a/src/bin/pg_resetxlog/pg_resetxlog.c
+++ b/src/bin/pg_resetxlog/pg_resetxlog.c
@@ -493,22 +493,6 @@ GuessControlValues(void)
#endif
ControlFile.float4ByVal = FLOAT4PASSBYVAL;
ControlFile.float8ByVal = FLOAT8PASSBYVAL;
- ControlFile.localeBuflen = LOCALE_NAME_BUFLEN;
-
- localeptr = setlocale(LC_COLLATE, "");
- if (!localeptr)
- {
- fprintf(stderr, _("%s: invalid LC_COLLATE setting\n"), progname);
- exit(1);
- }
- strlcpy(ControlFile.lc_collate, localeptr, sizeof(ControlFile.lc_collate));
- localeptr = setlocale(LC_CTYPE, "");
- if (!localeptr)
- {
- fprintf(stderr, _("%s: invalid LC_CTYPE setting\n"), progname);
- exit(1);
- }
- strlcpy(ControlFile.lc_ctype, localeptr, sizeof(ControlFile.lc_ctype));
/*
* XXX eventually, should try to grovel through old XLOG to develop more
@@ -584,12 +568,6 @@ PrintControlValues(bool guessed)
(ControlFile.float4ByVal ? _("by value") : _("by reference")));
printf(_("Float8 argument passing: %s\n"),
(ControlFile.float8ByVal ? _("by value") : _("by reference")));
- printf(_("Maximum length of locale name: %u\n"),
- ControlFile.localeBuflen);
- printf(_("LC_COLLATE: %s\n"),
- ControlFile.lc_collate);
- printf(_("LC_CTYPE: %s\n"),
- ControlFile.lc_ctype);
}
diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c
index 286667e..37d41de 100644
--- a/src/bin/scripts/createdb.c
+++ b/src/bin/scripts/createdb.c
@@ -32,6 +32,8 @@ main(int argc, char *argv[])
{"tablespace", required_argument, NULL, 'D'},
{"template", required_argument, NULL, 'T'},
{"encoding", required_argument, NULL, 'E'},
+ {"lc-collate", required_argument, NULL, 1},
+ {"lc-ctype", required_argument, NULL, 2},
{NULL, 0, NULL, 0}
};
@@ -50,6 +52,8 @@ main(int argc, char *argv[])
char *tablespace = NULL;
char *template = NULL;
char *encoding = NULL;
+ char *lc_collate = NULL;
+ char *lc_ctype = NULL;
PQExpBufferData sql;
@@ -95,6 +99,12 @@ main(int argc, char *argv[])
case 'E':
encoding = optarg;
break;
+ case 1:
+ lc_collate = optarg;
+ break;
+ case 2:
+ lc_ctype = optarg;
+ break;
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
exit(1);
@@ -152,6 +162,11 @@ main(int argc, char *argv[])
appendPQExpBuffer(&sql, " ENCODING '%s'", encoding);
if (template)
appendPQExpBuffer(&sql, " TEMPLATE %s", fmtId(template));
+ if (lc_collate)
+ appendPQExpBuffer(&sql, " COLLATE '%s'", lc_collate);
+ if (lc_ctype)
+ appendPQExpBuffer(&sql, " CTYPE '%s'", lc_ctype);
+
appendPQExpBuffer(&sql, ";\n");
conn = connectDatabase(strcmp(dbname, "postgres") == 0 ? "template1" : "postgres",
@@ -209,6 +224,9 @@ help(const char *progname)
printf(_("\nOptions:\n"));
printf(_(" -D, --tablespace=TABLESPACE default tablespace for the database\n"));
printf(_(" -E, --encoding=ENCODING encoding for the database\n"));
+ printf(_(" --lc-collate=LOCALE LC_COLLATE setting for the database\n"));
+ printf(_(" --lc-ctype=LOCALE LC_CTYPE setting for the database\n"));
+
printf(_(" -O, --owner=OWNER database user to own the new database\n"));
printf(_(" -T, --template=TEMPLATE template database to copy\n"));
printf(_(" -e, --echo show the commands being sent to the server\n"));
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 38b5a84..aee6934 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -144,11 +144,6 @@ typedef struct ControlFileData
bool float4ByVal; /* float4 pass-by-value? */
bool float8ByVal; /* float8, int8, etc pass-by-value? */
- /* active locales */
- uint32 localeBuflen;
- char lc_collate[LOCALE_NAME_BUFLEN];
- char lc_ctype[LOCALE_NAME_BUFLEN];
-
/* CRC of all above ... MUST BE LAST! */
pg_crc32 crc;
} ControlFileData;
diff --git a/src/include/catalog/pg_database.h b/src/include/catalog/pg_database.h
index 6e9e5d2..1b10e60 100644
--- a/src/include/catalog/pg_database.h
+++ b/src/include/catalog/pg_database.h
@@ -33,6 +33,8 @@ CATALOG(pg_database,1262) BKI_SHARED_RELATION
NameData datname; /* database name */
Oid datdba; /* owner of database */
int4 encoding; /* character encoding */
+ NameData collation; /* LC_COLLATE of database */
+ NameData ctype; /* LC_CTYPE of database */
bool datistemplate; /* allowed as CREATE DATABASE template? */
bool datallowconn; /* new connections allowed? */
int4 datconnlimit; /* max connections allowed (-1=no limit) */
@@ -54,20 +56,22 @@ typedef FormData_pg_database *Form_pg_database;
* compiler constants for pg_database
* ----------------
*/
-#define Natts_pg_database 11
+#define Natts_pg_database 13
#define Anum_pg_database_datname 1
#define Anum_pg_database_datdba 2
#define Anum_pg_database_encoding 3
-#define Anum_pg_database_datistemplate 4
-#define Anum_pg_database_datallowconn 5
-#define Anum_pg_database_datconnlimit 6
-#define Anum_pg_database_datlastsysoid 7
-#define Anum_pg_database_datfrozenxid 8
-#define Anum_pg_database_dattablespace 9
-#define Anum_pg_database_datconfig 10
-#define Anum_pg_database_datacl 11
+#define Anum_pg_database_collation 4
+#define Anum_pg_database_ctype 5
+#define Anum_pg_database_datistemplate 6
+#define Anum_pg_database_datallowconn 7
+#define Anum_pg_database_datconnlimit 8
+#define Anum_pg_database_datlastsysoid 9
+#define Anum_pg_database_datfrozenxid 10
+#define Anum_pg_database_dattablespace 11
+#define Anum_pg_database_datconfig 12
+#define Anum_pg_database_datacl 13
-DATA(insert OID = 1 ( template1 PGUID ENCODING t t -1 0 0 1663 _null_ _null_ ));
+DATA(insert OID = 1 ( template1 PGUID ENCODING "LC_COLLATE" "LC_CTYPE" t t -1 0 0 1663 _null_ _null_));
SHDESCR("default template database");
#define TemplateDbOid 1
diff --git a/src/include/utils/pg_locale.h b/src/include/utils/pg_locale.h
index 5a49823..2b60027 100644
--- a/src/include/utils/pg_locale.h
+++ b/src/include/utils/pg_locale.h
@@ -39,6 +39,7 @@ extern const char *locale_numeric_assign(const char *value,
extern const char *locale_time_assign(const char *value,
bool doit, GucSource source);
+extern bool check_locale(int category, const char *locale);
extern char *pg_perm_setlocale(int category, const char *locale);
extern bool lc_collate_is_c(void);
diff --git a/src/interfaces/ecpg/preproc/preproc.y b/src/interfaces/ecpg/preproc/preproc.y
index 0a8b62b..949e76b 100644
--- a/src/interfaces/ecpg/preproc/preproc.y
+++ b/src/interfaces/ecpg/preproc/preproc.y
@@ -428,7 +428,7 @@ add_typedef(char *name, char * dimension, char * length, enum ECPGttype type_enu
CLUSTER COALESCE COLLATE COLUMN COMMENT COMMIT
COMMITTED CONCURRENTLY CONFIGURATION CONNECTION CONSTRAINT CONSTRAINTS
CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE CREATEDB
- CREATEROLE CREATEUSER CROSS CSV CURRENT_P CURRENT_DATE CURRENT_ROLE
+ CREATEROLE CREATEUSER CROSS CSV CTYPE CURRENT_P CURRENT_DATE CURRENT_ROLE
CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
DATABASE DAY_P DEALLOCATE DEC DECIMAL_P DECLARE DEFAULT DEFAULTS
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers