Re: [HACKERS] Can ICU be used for a database's default sort order?
Peter Geoghegan writes:
> On Fri, Jun 23, 2017 at 11:32 AM, Peter Eisentraut wrote:
>> 1) Associate by name only. That is, you can create a database with any
>> COLLATION "foo" that you want, and it's only checked when you first
>> connect to or do anything in the database.
>>
>> 2) Create shared collations. Then we'd need a way to manage having a
>> mix of shared and non-shared collations around.
>>
>> There are significant pros and cons to all of these ideas. Some people
>> I talked to appeared to prefer the shared collations approach.

> I strongly prefer the second approach. The only downside that occurs
> to me is that that approach requires more code. Is there something
> that I've missed?

I'm not very clear on how you'd bootstrap template1 into anything
other than C locale in the second approach. With our existing
libc-based stuff, it's possible to define what the database's locale
is before there are any catalogs. It's not apparent how to do that
with a collation-based solution.

In my mind, collations are just a SQL-syntax wrapper for locales that
are really defined one level down. I think we'd be well advised to
carry that same approach into the database properties, because
otherwise we have circularities to deal with. So I'm imagining
something more like

    create database encoding 'utf8' lc_collate 'icu-en_US' lc_ctype ...

where lc_collate is just a string that we know how to interpret, the
same as now.

We could optionally reduce the amount of notation involved by merging
the lc_collate and lc_ctype parameters into one, say

    create database encoding 'utf8' locale 'icu-en_US' ...

I'm not too clear on how this would play with other libc locale
functionality (lc_monetary and so on), but we'd have to deal with
that question anyway.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
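The "string that we know how to interpret" idea in the message above can be sketched roughly as follows. This is only an illustration in Python, not PostgreSQL code: the function name `parse_locale_spec`, the set of recognized prefixes, and the rule that an unprefixed string falls back to libc are all assumptions; the thread proposes only the 'icu-en_US' spelling and does not define the parsing rules.

```python
from typing import Tuple

# Hypothetical: the provider prefixes this sketch recognizes.
KNOWN_PREFIXES = ("icu",)


def parse_locale_spec(spec: str) -> Tuple[str, str]:
    """Split a locale string into (provider, locale_name).

    'icu-en_US' is treated as provider 'icu' with locale 'en_US'.
    Anything without a recognized prefix is assumed to be a plain
    libc locale name and is returned unchanged.
    """
    prefix, sep, rest = spec.partition("-")
    if sep and rest and prefix in KNOWN_PREFIXES:
        return prefix, rest
    return "libc", spec
```

Because the provider is derived from the string alone, a lookup of this kind would need no catalog access, which is the property the message asks for when bootstrapping template1.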
Re: [HACKERS] Can ICU be used for a database's default sort order?
On Fri, Jun 23, 2017 at 11:32 AM, Peter Eisentraut wrote:
> It's something I hope to address soon.

I hope you do. I think that we'd realize significant benefits by
having ICU become the de facto standard collation provider, one that
most users get without even realizing it. As things stand, you have
to make a point of specifying an ICU collation as your per-column
collation within every CREATE TABLE. That's a significant barrier to
adoption.

> 1) Associate by name only. That is, you can create a database with any
> COLLATION "foo" that you want, and it's only checked when you first
> connect to or do anything in the database.
>
> 2) Create shared collations. Then we'd need a way to manage having a
> mix of shared and non-shared collations around.
>
> There are significant pros and cons to all of these ideas. Some people
> I talked to appeared to prefer the shared collations approach.

I strongly prefer the second approach. The only downside that occurs
to me is that that approach requires more code. Is there something
that I've missed?

--
Peter Geoghegan
Re: [HACKERS] Can ICU be used for a database's default sort order?
On 6/22/17 23:10, Peter Geoghegan wrote:
> On Thu, Jun 22, 2017 at 7:10 PM, Tom Lane wrote:
>> Is there some way I'm missing, or is this just a not-done-yet feature?
>
> It's a not-done-yet feature.

It's something I hope to address soon.

The main definitional challenge is how to associate a pg_database
entry with a collation. What we currently effectively do is duplicate
the fields of pg_collation in pg_database. But I imagine over time
we'll add more properties in pg_collation, along with additional
ALTER COLLATION commands etc., so duplicating all of that would be a
significant amount of code complication and result in a puzzling user
interface.

Ideally, I'd like to see CREATE DATABASE ... COLLATION "foo". But the
problem is of course that collations are per-database objects.
Possible solutions:

1) Associate by name only. That is, you can create a database with any
COLLATION "foo" that you want, and it's only checked when you first
connect to or do anything in the database.

2) Create shared collations. Then we'd need a way to manage having a
mix of shared and non-shared collations around.

There are significant pros and cons to all of these ideas. Some people
I talked to appeared to prefer the shared collations approach.

Other ideas?

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
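To make option 1 ("associate by name only") concrete, here is a toy Python model of the behavior described above: the collation name is recorded without any check when the database is created, and is resolved only on first use. The `Cluster` class, its fields, and the error text are hypothetical and exist only for this sketch; they do not correspond to PostgreSQL catalogs or code.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Cluster:
    # database name -> collation name recorded when the database is created
    db_collation: Dict[str, str] = field(default_factory=dict)
    # database name -> collations that actually exist inside that database
    collations: Dict[str, Set[str]] = field(default_factory=dict)

    def create_database(self, db: str, collation: str) -> None:
        # Option 1: store the name only; no existence check happens here,
        # because collations are per-database objects.
        self.db_collation[db] = collation
        self.collations.setdefault(db, set())

    def connect(self, db: str) -> str:
        # The recorded name is resolved only when the database is used.
        name = self.db_collation[db]
        if name not in self.collations[db]:
            raise LookupError(
                f'collation "{name}" does not exist in database "{db}"'
            )
        return name
```

In this model a mistyped collation name goes unnoticed at creation time and surfaces only at connect time, which is one of the trade-offs of the name-only approach.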
Re: [HACKERS] Can ICU be used for a database's default sort order?
On Thu, Jun 22, 2017 at 7:10 PM, Tom Lane wrote:
> Is there some way I'm missing, or is this just a not-done-yet feature?

It's a not-done-yet feature.

--
Peter Geoghegan
[HACKERS] Can ICU be used for a database's default sort order?
I tried to arrange $subject via

    create database icu encoding 'utf8' lc_ctype "en-US-x-icu" lc_collate "en-US-x-icu" template template0;

and got only

    ERROR:  invalid locale name: "en-US-x-icu"

which is unsurprising after looking into the code, because createdb()
checks those parameters with check_locale() which only knows about
libc-defined locale names.

Is there some way I'm missing, or is this just a not-done-yet feature?

regards, tom lane
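The failure above can be approximated outside PostgreSQL by asking the C library directly whether it accepts a locale name. The Python sketch below assumes that a libc-only check amounts to a trial setlocale() call; the helper name `libc_accepts` is made up for this illustration and is not the PostgreSQL function.

```python
import locale


def libc_accepts(name: str, category: int = locale.LC_COLLATE) -> bool:
    """Return True if the C library's setlocale() accepts `name`."""
    saved = locale.setlocale(category)  # query the current setting
    try:
        locale.setlocale(category, name)  # trial call; raises on rejection
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(category, saved)  # always restore the old setting
```

Calling `libc_accepts("C")` succeeds everywhere. Whether an ICU-style name such as "en-US-x-icu" is rejected depends on the platform's C library, so the result for that name should be checked locally rather than assumed.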