Re: [HACKERS] CREATE COLLATION definitional questions for ICU
Peter Eisentrautwrites: > On 6/25/17 11:45, Tom Lane wrote: >> * Now that it's possible for user-created collations to have encoding -1, >> I do not think that the "shadowing" tests in CollationCreate and >> IsThereCollationInNamespace are sufficient. They don't prevent a new >> collation with encoding -1 from shadowing an existing encoding-specific >> collation that doesn't happen to match the current DB's encoding. >> Now, you'd have to work at it for that to cause problems --- say, >> create such a situation in template0 and then copy template0 specifying >> that other encoding. But none of that is forbidden. > I see. Do you have an idea how to fix it without too much overhead? We > couldn't use the existing syscache for it. I think you could use SearchSysCacheList to acquire a list of all collations with the given name, and then run through it looking for entries matching the target namespace. This doesn't seem too inefficient, since the list would typically have length at most one. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE COLLATION definitional questions for ICU
On 6/27/17 11:17, Tom Lane wrote: > Moreover, if you insist on defining it this way, it's going to limit > our freedom of action in future. It's possible that, either in some > future version of ICU or for some other provider, there could be more > than one version of a collation simultaneously available. Good point, but I think this is highly theoretical and would probably work differently and separately from what we have now. If someone invented a feature that allows linking against multiple versions of ICU at once (thus allowing old versions to be continued to be used), then you'd still want the safety check of the current style that the version currently in use is the one previously used. If ICU offered a way to use multiple collation versions from the same library version, then I'd imagine that you would select the version with an attribute in the locale name, not with a separate field. And arguably, considering how the ICU locale name resolution works, having a safety check that records the actual version would still be of use. So "the version I want" and "the version I got" would continue to be separate attributes. I would not mind a gentle renaming of the current functionality if it could make that clearer. > Lastly, I'm not seeing the use-case for having CREATE COLLATION magically > make a working collation from a broken one. Wouldn't you normally > expect to need to do ALTER COLLATION REFRESH on an obsolete collation > before doing anything with it? My question would be, what's the use case for doing it the other way around? Thinking of an analogy, if there is a table with an index that is somehow affected by pg_upgrade and needs reindexing after pg_upgrade, and I run CREATE TABLE LIKE to make a new table like that old table, would it create the new table in a state that it would also require reindexing before it can be used? That doesn't seem very useful. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE COLLATION definitional questions for ICU
On 6/25/17 11:45, Tom Lane wrote: > * Now that it's possible for user-created collations to have encoding -1, > I do not think that the "shadowing" tests in CollationCreate and > IsThereCollationInNamespace are sufficient. They don't prevent a new > collation with encoding -1 from shadowing an existing encoding-specific > collation that doesn't happen to match the current DB's encoding. > Now, you'd have to work at it for that to cause problems --- say, > create such a situation in template0 and then copy template0 specifying > that other encoding. But none of that is forbidden. I see. Do you have an idea how to fix it without too much overhead? We couldn't use the existing syscache for it. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE COLLATION definitional questions for ICU
On 6/25/17 11:45, Tom Lane wrote: > * For an ICU collation, should we not insist that collcollate and > collctype be equal? If not, what does it mean for them to be different? I have fixed that for now by enforcing them to be the same. In the longer term, I'm thinking about converting these two columns into a more flexible array of properties. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE COLLATION definitional questions for ICU
On 6/25/17 11:45, Tom Lane wrote: > * Also (and this would be a pre-existing bug), why doesn't the FROM > path copy the old collation's encoding? For example, if you attempted > to clone the "C" encoding, you wouldn't get a true clone but something > that's specific to the current DB's encoding. Fixed that. This has probably no practical impact, so I don't plan to backpatch it. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE COLLATION definitional questions for ICU
Peter Eisentrautwrites: > On 6/25/17 11:45, Tom Lane wrote: >> * Should not the FROM code path copy the old collation's version? >> It seems a little bit weird that "cloning" a collation takes the >> liberty of installing a new version. > I think this is working correctly. Specifying the version explicitly is > really only useful for pg_upgrade, because it needs to reproduce the > state that is on disk. When you create a new collation, you want to > work with the collation version that the currently running software > provides. Hm. I certainly buy that for the "long form" of CREATE COLLATION, but it seems to me that the charter of CREATE COLLATION FROM is to clone the source collation's properties exactly. The C.C. ref page says that in so many words: The name of an existing collation to copy. The new collation will have the same properties as the existing one, but it ^ will be an independent object. Moreover, if you insist on defining it this way, it's going to limit our freedom of action in future. It's possible that, either in some future version of ICU or for some other provider, there could be more than one version of a collation simultaneously available. In that scenario, if an existing collation object referred to one of those available versions, it would be quite surprising for CREATE COLLATION FROM to silently substitute another one. But we'd be kind of stuck with doing that if we allow this precedent to be set in v10. Lastly, I'm not seeing the use-case for having CREATE COLLATION magically make a working collation from a broken one. Wouldn't you normally expect to need to do ALTER COLLATION REFRESH on an obsolete collation before doing anything with it? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE COLLATION definitional questions for ICU
On 6/25/17 11:45, Tom Lane wrote: > * Should not the FROM code path copy the old collation's version? > It seems a little bit weird that "cloning" a collation takes the > liberty of installing a new version. I think this is working correctly. Specifying the version explicitly is really only useful for pg_upgrade, because it needs to reproduce the state that is on disk. When you create a new collation, you want to work with the collation version that the currently running software provides. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] CREATE COLLATION definitional questions for ICU
Reading over DefineCollation, I wondered: * Should not the FROM code path copy the old collation's version? It seems a little bit weird that "cloning" a collation takes the liberty of installing a new version. * Also (and this would be a pre-existing bug), why doesn't the FROM path copy the old collation's encoding? For example, if you attempted to clone the "C" encoding, you wouldn't get a true clone but something that's specific to the current DB's encoding. * For an ICU collation, should we not insist that collcollate and collctype be equal? If not, what does it mean for them to be different? * Now that it's possible for user-created collations to have encoding -1, I do not think that the "shadowing" tests in CollationCreate and IsThereCollationInNamespace are sufficient. They don't prevent a new collation with encoding -1 from shadowing an existing encoding-specific collation that doesn't happen to match the current DB's encoding. Now, you'd have to work at it for that to cause problems --- say, create such a situation in template0 and then copy template0 specifying that other encoding. But none of that is forbidden. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers