Re: [HACKERS] Collations and Replication; Next Steps

Matthew Kelly Wed, 17 Sep 2014 07:11:12 -0700

Let me double check that assertion before we go too far with it.

Most of the problems I've seen are across the CentOS 5 and 6 boundary.  I
thought I had a case where it was within a minor release, but I can't find it
right now.  I'm going to dig.


That being said, the sort order changes depending on whether you statically or
dynamically link (demonstrated on 4+ machines running different Linux flavors),
so at this point I have no reason to trust the stability of the sort across any
build.  I legitimately question whether strcoll is buggy.  For example, I have
cases where for three strings a, b and c:  a > b, but (a || c) < (b || c).
That's right, postfixing doesn't hold.  It actually calls into question the
index scan optimization that occurs when you do LIKE 'test%', even on a single
machine, but I don't want to bite that off at the moment.
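
A minimal sketch of how one might hunt for such triples, assuming Python's
locale.strcoll is an acceptable stand-in for the C strcoll call.  The function
name suffix_violations and the sample strings are made up for illustration.
Pairs where one string is a prefix of the other are skipped, since those can
flip even under plain byte-order comparison ("ab" > "a", but "abc" < "ac").

```python
import itertools
import locale


def suffix_violations(strings, suffixes, cmp=locale.strcoll):
    """Yield (a, b, c) where a sorts after b but a + c sorts before b + c.

    cmp is any strcoll-style comparator returning <0, 0 or >0.
    """
    for a, b in itertools.permutations(strings, 2):
        # Prefix pairs can flip under any lexicographic order; skip them
        # so only the surprising cases are reported.
        if a.startswith(b) or b.startswith(a):
            continue
        if cmp(a, b) <= 0:
            continue
        for c in suffixes:
            if cmp(a + c, b + c) < 0:
                yield a, b, c


if __name__ == "__main__":
    # Use whatever collation the environment (LC_ALL / LC_COLLATE) selects.
    locale.setlocale(locale.LC_COLLATE, "")
    # Hypothetical sample data; real test cases would come from the table.
    samples = ["a b", "a-b", "ab", "a_b", "A b", "aB"]
    for a, b, c in suffix_violations(samples, ["c", " c", "-c"]):
        print(f"{a!r} > {b!r} but {a + c!r} < {b + c!r}")
```

Running the same script on a CentOS 5 and a CentOS 6 box (or against a
statically and a dynamically linked build) and diffing the output would show
whether the two disagree on any of the sampled strings.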

My mentality has switched to 'don't trust any change until shown otherwise', so 
that may have bled into my last email.

- Matt K.




On Sep 17, 2014, at 8:17 AM, Robert Haas <robertmh...@gmail.com> wrote:

> On Wed, Sep 17, 2014 at 9:07 AM, Matthew Kelly <mke...@tripadvisor.com> wrote:
>> Here is where I think the timezone and PostGIS cases are fundamentally 
>> different:
>> I can pretty easily make sure that all my servers run in the same timezone.  
>> That's just good practice.  I'm also going to install the same version of 
>> PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies 
>> from the exact same source files, regardless of when I build the machine.
>> 
>> Timezone is a user level setting; PostGIS is a user level library used by a 
>> subset.
>> 
>> glibc is a system level library, and text is a core data type, however.  
>> Changing versions to something that doesn't match the kernel can lead to 
>> system level instability, broken linkers, etc.  (I know because I tried).  
>> Here are some subtle other problems that fall out:
>> 
>> * Upgrading glibc, the kernel, and linker through the package manager in 
>> order to get security updates can cause the corruption.
>> * A basebackup that is taken in production and placed on a backup server 
>> might not be valid on that server, or your desktop machine, or on the spare 
>> you keep to do PITR when someone screws up.
>> * Unless you keep _all_ of your clusters on the same OS, machines from your 
>> database spare pool probably won't be the right OS when you add them to the 
>> cluster because a member failed.
>> 
>> Keep in mind here, by OS I mean CentOS versions.  (we're running a mix of 
>> late 5.x and 6.x, because of our numerous issues with the 6.x kernel)
>> 
>> The problem with LC_IDENTIFICATION is that every machine I have seen reports 
>> revision "1.0", date "2000-06-24".  It doesn't seem like the versioning is 
>> being actively maintained.
>> 
>> I'm with Martijn here, let's go ICU, if only because it moves sorting to a 
>> user level library, instead of a system level.  Martijn, do you have a link 
>> to the out of tree patch?  If not I'll find it.  I'd like to apply it to a 
>> branch and start playing with it.
> 
> What I find astonishing is that whoever maintains glibc (or the Red
> Hat packaging for it) thinks it's OK to change the collation order in
> a minor release.  I'd understand changing it between, say, RHEL 6 and
> RHEL 7.  But the idea that minor-release, supposedly safe updates
> think they can whack this around without breaking applications really
> kind of blows my mind.
> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



