Hello,

Last month, I brought up the following issue to the general mailing list about 
how running streaming replication between machines running different versions 
of glibc can cause corrupt indexes.
http://www.postgresql.org/message-id/ba6132ed-1f6b-4a0b-ac22-81278f5ab...@tripadvisor.com

In the month following, we have done further investigation here at TripAdvisor 
and have found that scope of this issue is far more troubling than initially 
thought.  Hackers seems like appropriate place to present this update because 
it will certainly motivate some discussion about the approach to collation 
support going forward.

After the initial episode, we thought it was necessary to find the true scope 
of the problem.  We developed a quick smoke test to evaluate the integrity of 
the indexes on a given machine.  We understood that the test was not 
exhaustive, but it would catch most instances of corrupt indexes given 
TripAdvisor's normal database usage pattern.  The source code with 
documentation about how it works is available at 
(https://github.com/mkellycs/postgres_index_integrity_check) for those 
interested.

What we found with this simple check was simply frightening.  In every single 
streaming replica cluster where one or more machines had been commissioned at a 
different time, that member was found to be corrupt.  When hardware upgrades of 
the master had been accomplished with a streaming replication, the new master 
was also found to have similar issues.  The following numbers are only as small 
as they are because our adoption of streaming replication has barely just 
begun.  So far we have found:

  *   8 internal production databases, and 2 live site database servers 
effected.
  *   Up to 3771 rows out of place in a single index (more correctly: 3771 
times a row was smaller then the row before it when sorted in ascending order, 
the actual number of incorrectly placed rows is probably much higher)
  *   On the worst offender, there were 12806 rows out of place across 26 
indexes
  *   On average roughly 15% of indexes containing text keys on tables larger 
100MB were found to exhibit this issue
  *   In at least one case, rebuilding a unique index on a master revealed that 
the database had allowed 100+ primary key violations.

It sounds like we as a community knew that these issues were theoretically 
possible, but I now have empirical evidence demonstrating the prevalence of 
this issue on our corpus of international data.  Instances of this issue showed 
up in indexes of member usernames, location/property names, and even Facebook 
url's.  I encourage other sufficiently large operations who index 
internationalized text to run similar tests; its highly likely they have 
similar latent issues that they just have not detected yet.

Here is the simple reality.  Collation based indexes, streaming replication, 
and multiple versions of glibc/os cannot coexist in a sufficiently large 
operation and not cause corrupt indexes.  The current options are to collate 
all of your indexes in C, or to ensure that all of your machines run exactly 
the same OS version.

The first and immediate TODO is to patch the documentation to add warnings 
regarding this issue.  I can propose a doc patch explaining the issue, if no 
one has any objections.

The second and far more challenging problem is how do we fix this issue?  As of 
our last discussion, Peter Geoghegan revived the proposal of using ICU as an 
alternative.  
(http://www.postgresql.org/message-id/CAEYLb_WvdCzuL=cyf1xyzjwn-1cvo6kzeawmkbxts3jphtj...@mail.gmail.com)
  I do not feel qualified to compare the value of this library to other 
options, but I am certainly willing to help with the patch process once a 
direction has been selected.

I will be at Postgres Open in Chicago this week, and I will be more than 
willing to further discuss the details of what we have found.

Regards,
Matt Kelly

Reply via email to