I am seeing a duplicate, stale copy of the same row when performing a
pg_dump or copying a specific table, but not when directly selecting from
it. I'm running PostgreSQL 9.3.9 on Amazon RDS, with 9.3.10 client tools.
It's happening on a users table, which has a primary key and enforces a
unique email address:
Table "public.users"
Column | Type | Modifiers
---------------+-----------------------------+---------------------------------------------------
id | integer | not null default
nextval('users_id_seq'::regclass)
email | character varying(255) | not null default
''::character varying
last_activity | timestamp without time zone |
Indexes:
"users_pkey" PRIMARY KEY, btree (id)
"users_unique_email" UNIQUE, btree (email)
I first noticed the problem when doing copying the table to another
database. Roughly this:
% pg_dump --column-inserts -Fp -h remotedb remote_db > users.txt
% psql test_db < users.txt
[...]
ERROR: could not create unique index "users_pkey"
DETAIL: Key (id)=(123) is duplicated.
[...]
ERROR: could not create unique index "users_unique_email"
DETAIL: Key (email)=([email protected]) is duplicated.
It appears there's some sort of duplicate record for a single user in the
database. Checking the pg_dump output, I saw that a single user's record
was being exported twice:
% grep -i [email protected] users.txt
INSERT INTO users (id, email, last_activity) VALUES (123, '
[email protected]', '2015-10-21 10:32:15.997887');
INSERT INTO users (id, email, last_activity) VALUES (123, '
[email protected]', '2015-10-02 11:32:58.615743');
The rows were not exactly the same. Connecting to the source database
directly, I tried this:
remote_db=> select count(1) from users where id = 123;
count
-------
1
(1 row)
remote_db=> select count(1) from users where email = '[email protected]';
count
-------
1
(1 row)
To eliminate any risk of it being a weird locking issue, I restored a
snapshot of the database into a new RDS instance but I got the same
results. I then tried the following:
remote_db=> create table users_copy_with_indexes (like users including
defaults including constraints including indexes including storage
including comments);
CREATE TABLE
remote_db=> insert into users_copy_with_indexes select * from users;
ERROR: duplicate key value violates unique constraint
"users_copy_with_indexes_pkey"
DETAIL: Key (id)=(123) already exists.
However, when I created a copy without the indexes, I can see the duplicate
rows:
remote_db=> create table users_copy_without_indexes (like users);
CREATE TABLE
remote_db=> insert into users_copy_without_indexes select * from users;
INSERT 0 523342
remote_db=> select count(1) from users_copy_without_indexes where id =
123;
count
-------
2
(1 row)
remote_db=> select count(1) from users_copy_without_indexes where email =
'[email protected]';
count
-------
2
(1 row)
Any suggestions for what to look for next? Is it table corruption?
Chaz