Hello,
I'm come across a situation where a table in our production DB has a
relatively small number of duplicative rows that seemingly defy the
unique constraint present on that table.
We're running MySQL 5.6.19a via Amazon RDS. The table in question is
~250M rows.
`show create table` gives:
app_customergroupmembership | CREATE TABLE `app_customergroupmembership` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`customer_id` varchar(6) COLLATE utf8_bin NOT NULL,
`group_id` varchar(6) COLLATE utf8_bin NOT NULL,
`created` datetime NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `app_customergroupmembership_customer_id_31afe160_uniq`
(`customer_id`,`group_id`),
KEY `app_customergroupmembership_group_id_18aedd38e3f8a4a0`
(`group_id`,`created`)
) ENGINE=InnoDB AUTO_INCREMENT=21951158253 DEFAULT CHARSET=utf8
COLLATE=utf8_bin
Despite that, records with duplicate customer_id/group_id do exist:
mysql> select * from app_customergroupmembership where customer_id =
'ajEiQA';
+-------------+-------------+----------+---------------------+
| id | customer_id | group_id | created |
+-------------+-------------+----------+---------------------+
| 20279608258 | ajEiQA | ddH6Ev | 2015-02-17 00:14:54 |
| 20279608269 | ajEiQA | ddH6Ev | 2015-02-17 00:14:54 |
+-------------+-------------+----------+---------------------+
Interestingly, these dupe records can't seem to be queried when using
both columns from the unique constraint in the WHERE clause:
mysql> select * from app_customergroupmembership where customer_id =
'ajEiQA' and group_id = 'ddH6Ev';
+-------------+-------------+----------+---------------------+
| id | customer_id | group_id | created |
+-------------+-------------+----------+---------------------+
| 20279608258 | ajEiQA | ddH6Ev | 2015-02-17 00:14:54 |
+-------------+-------------+----------+---------------------+
Any thoughts on how this situation came to pass, and how to prevent it
from happening?
Thanks,
--
Chris Hornung