On Fri, Sep 22, 2017 at 4:46 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> But you are *already* canonicalizing ICU collation names as BCP 47. My
> point here is: Why not finish the job off, and *also* canonicalize
> colcollate in the same way?

Peter, with respect, it's time to let this argument go. We're scheduled to wrap a GA release in just over 72 hours. It is far too late to change behavior like this. There is no time for other people who may be interested in this issue to form a well-considered opinion on the topic and carefully review a proposed patch. There is also no time for users to notice it in the next beta and complain before we go final. This ship has sailed.

On the substantive issue, I am inclined (admittedly without deep study) to agree with Peter Eisentraut. We have never canonicalized collations before and therefore it is not essential that we do that now. That would be a new feature, and I don't think I'd be prepared to endorse adding it three days after feature freeze let alone three days before the GA wrap. I do agree that the lack of canonicalization is utterly terrible. The APIs that Unix-like operating systems provide for collations are poorly suited to our purposes and hopelessly squishy about semantics, and it's not clear how much better ICU will be. But that's a problem that we should address, if at all, at a deliberate pace and with adequate time for reflection, research, and comment, not precipitously and under extreme time pressure.

I simply do not buy the theory that this cannot be changed later. It's been the case for as long as we've had pg_collate that a new system could have different collations than the old one, resulting in a dump/restore failure. I expect somebody's had that problem at some point, but I don't think it's become a major pain point because most people don't use exotic collations, and if they do they probably understand that they need those exotic collations to be on the new system too. So, if we decide to change this later, we'll want to find ways to make the upgrade as pain-free as possible and document whatever the situation may be, but we've made many backward-incompatible changes in the past and this one would hardly be the worst.

I also believe that Peter Eisentraut is entirely correct to be concerned about whether BCP 47 (or anything else) can really be regarded as a stable canonical form for ICU purposes. His email indicates that the acceptable and canonical forms have changed multiple times in the course of releases new enough for us to care about them. Assuming that statement is correct, it would be extremely short-sighted of us to bank on them not changing any more.

But even if all of the above argumentation is utterly and completely wrong, dredged up from the universe's deepest and most profound reserves of stupidity and destined for future entry into Webster's as the canonical example of cluelessness, we still shouldn't change it the weekend before the GA wraps. I'm afraid that this new RMT process has lulled us into believing that the release will happen on time no matter how much stuff we whack around at the last minute, which is a very dangerous idea for a group of software engineers to have. Before, we thought we had infinite time to fix our bugs; now, we think we have infinite latitude to classify anything we don't like as a bug. Neither of those ideas is good software engineering.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company