On 10/24/2015 12:35 PM, Chaz Yoon wrote:
I am seeing a duplicate, stale copy of the same row when performing a
pg_dump or copying a specific table, but not when directly selecting
from it. I'm running PostgreSQL 9.3.9 on Amazon RDS, with 9.3.10 client
tools.
It's happening on a users table, which has a primary key and enforces a
unique email address:
Table "public.users"
Column | Type | Modifiers
---------------+-----------------------------+---------------------------------------------------
id | integer | not null default
nextval('users_id_seq'::regclass)
email | character varying(255) | not null default
''::character varying
last_activity | timestamp without time zone |
Indexes:
"users_pkey" PRIMARY KEY, btree (id)
"users_unique_email" UNIQUE, btree (email)
I first noticed the problem when doing copying the table to another
database. Roughly this:
% pg_dump --column-inserts -Fp -h remotedb remote_db > users.txt
% psql test_db < users.txt
[...]
ERROR: could not create unique index "users_pkey"
DETAIL: Key (id)=(123) is duplicated.
[...]
ERROR: could not create unique index "users_unique_email"
DETAIL: Key (email)=(this_u...@xyz.com <mailto:this_u...@xyz.com>)
is duplicated.
It appears there's some sort of duplicate record for a single user in
the database. Checking the pg_dump output, I saw that a single user's
record was being exported twice:
% grep -i this_u...@xyz.com <mailto:this_u...@xyz.com> users.txt
INSERT INTO users (id, email, last_activity) VALUES (123,
'this_u...@xyz.com <mailto:this_u...@xyz.com>', '2015-10-21
10:32:15.997887');
INSERT INTO users (id, email, last_activity) VALUES (123,
'this_u...@xyz.com <mailto:this_u...@xyz.com>', '2015-10-02
11:32:58.615743');
The rows were not exactly the same. Connecting to the source database
directly, I tried this:
remote_db=> select count(1) from users where id = 123;
count
-------
1
(1 row)
remote_db=> select count(1) from users where email =
'this_u...@xyz.com <mailto:this_u...@xyz.com>';
count
-------
1
(1 row)
To eliminate any risk of it being a weird locking issue, I restored a
snapshot of the database into a new RDS instance but I got the same
results. I then tried the following:
remote_db=> create table users_copy_with_indexes (like users
including defaults including constraints including indexes including
storage including comments);
CREATE TABLE
remote_db=> insert into users_copy_with_indexes select * from users;
ERROR: duplicate key value violates unique constraint
"users_copy_with_indexes_pkey"
DETAIL: Key (id)=(123) already exists.
However, when I created a copy without the indexes, I can see the
duplicate rows:
remote_db=> create table users_copy_without_indexes (like users);
CREATE TABLE
remote_db=> insert into users_copy_without_indexes select * from users;
INSERT 0 523342
remote_db=> select count(1) from users_copy_without_indexes where id
= 123;
count
-------
2
(1 row)
remote_db=> select count(1) from users_copy_without_indexes where
email = 'this_u...@xyz.com <mailto:this_u...@xyz.com>';
count
-------
2
(1 row)
Any suggestions for what to look for next? Is it table corruption?
I would say the smoking gun is the copy w/o indexes shows both records
and the one with indexes only one. I would DROP/CREATE index on the
original table, with the usual caveat that this does place a load on the
table. Using Concurrently might help, but I would read the information here:
http://www.postgresql.org/docs/9.3/interactive/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY
Building Indexes Concurrently
Chaz
--
Adrian Klaver
adrian.kla...@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general