Re: Corruption with duplicate primary key

Tomas Vondra Thu, 05 Dec 2019 15:46:10 -0800

On Thu, Dec 05, 2019 at 09:14:12PM +0000, Alex Adriaanse wrote:

We have a Postgres 10 database that we recently upgraded to Postgres 12 using 
pg_upgrade. We recently discovered that there are rows in one of the tables 
that have duplicate primary keys:


record_loader=# \d loader.sync
                             Table "loader.sync"
     Column       |           Type           | Collation | Nullable | Default
-------------------+--------------------------+-----------+----------+---------
source            | text                     |           | not null |
natural_key       | text                     |           | not null |
payload           | jsonb                    |           |          |
dispatched        | timestamp with time zone |           | not null | now()
initial_load_id   | text                     |           |          |
deleted_load_id   | text                     |           |          |
created_timestamp | timestamp with time zone |           |          | now()
updated_timestamp | timestamp with time zone |           |          | now()
deleted_timestamp | timestamp with time zone |           |          |
Indexes:
   "sync_pkey" PRIMARY KEY, btree (source, natural_key)
Publications:
   "debezium"

This table is modified via triggers that fire off when a COPY command inserts 
many rows into another table.

Here are two example duplicate rows:

# SELECT xmin, xmax, cmin, cmax, source, md5(natural_key) AS natural_key_hash, 
dispatched, created_timestamp, updated_timestamp, deleted_timestamp FROM 
loader.sync WHERE (source, natural_key) = ('ok_lease', '...') ORDER BY 
xmin::text::int, cmin::text::int;
-[ RECORD 1 ]-----+---------------------------------
xmin              | 116649
xmax              | 0
cmin              | 5304404
cmax              | 5304404
source            | ok_lease
natural_key_hash  | de3e9a567b90025c3399c4c63c823fe9
dispatched        | 2019-11-24 05:09:36.099686+00
created_timestamp | 2019-11-24 05:09:36.099686+00
updated_timestamp | 2019-11-24 05:09:36.099686+00
deleted_timestamp |
-[ RECORD 2 ]-----+---------------------------------
xmin              | 116649
xmax              | 118583
cmin              | 5312208
cmax              | 5312208
source            | ok_lease
natural_key_hash  | de3e9a567b90025c3399c4c63c823fe9
dispatched        | 2019-11-10 05:09:24.214964+00
created_timestamp | 2019-05-17 21:24:19.558219+00
updated_timestamp | 2019-11-24 05:09:36.099686+00
deleted_timestamp | 2019-11-24 05:09:36.099686+00

It appears that the second row was in place originally, then got updated by a 
trigger (and even deleted later on, although it doesn't appear that the delete 
transaction got committed), and then the first row was inserted within the same 
transaction that updated the second row.

Another example:
-[ RECORD 1 ]-----+---------------------------------
xmin              | 116649
xmax              | 0
cmin              | 5304403
cmax              | 5304403
source            | ok_lease
natural_key_hash  | 1c8031348701a32cb5fee26839d6b0b4
dispatched        | 2019-11-10 05:09:24.214964+00
created_timestamp | 2019-05-31 06:00:33.765547+00
updated_timestamp | 2019-11-24 05:09:36.099686+00
deleted_timestamp | 2019-11-24 05:09:36.099686+00
-[ RECORD 2 ]-----+---------------------------------
xmin              | 116649
xmax              | 0
cmin              | 5304404
cmax              | 5304404
source            | ok_lease
natural_key_hash  | 1c8031348701a32cb5fee26839d6b0b4
dispatched        | 2019-11-24 05:09:36.099686+00
created_timestamp | 2019-11-24 05:09:36.099686+00
updated_timestamp | 2019-11-24 05:09:36.099686+00
deleted_timestamp |

Both examples have in common that the two duplicate rows were touched within 
the same transaction.

This database runs inside Docker, with the data directory bind-mounted to a 
reflink-enabled XFS filesystem. The VM is running Debian's 4.19.16-1~bpo9+1 
kernel inside an AWS EC2 instance. We have Debezium stream data from this 
database via pgoutput.

Recreating the primary key confirms that the constraint doesn't (or at least 
shouldn't) permit these duplicate rows:

record_loader=# BEGIN;
BEGIN
record_loader=# ALTER TABLE loader.sync DROP CONSTRAINT sync_pkey;
ALTER TABLE
record_loader=# ALTER TABLE loader.sync ADD CONSTRAINT sync_pkey PRIMARY KEY 
(source, natural_key);
ERROR:  could not create unique index "sync_pkey"
DETAIL:  Key (source, natural_key)=(ok_lease, ...) is duplicated.
CONTEXT:  parallel worker

Any ideas on what might cause this behavior?


Not sure. At first I thought maybe this might be due to collations
changing and breaking the index silently. What collation are you using?

A couple questions:

1) When you do the queries, do they use index scan or sequential scan?
Perhaps it does sequential scan, and if you force index scan (e.g. by
rewriting the query) it'll only find one of those rows.

2) Can you check in backups if this data corruption was present in the

PG10 cluster, before running pg_upgrade?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Corruption with duplicate primary key

Reply via email to