The vacuum full tracks just finished and I still have the duplicated IDs.
2014-08-12 9:17 GMT+02:00 Alexis Bernard <ale...@bernard.io>: > Hi, > > Thanks for your help Alban ! > > Alexis. > > > 2014-08-09 18:46 GMT+02:00 Alban Hertroys <haram...@gmail.com>: > > On 09 Aug 2014, at 13:24, Alexis Bernard <ale...@bernard.io> wrote: >> >> > Primary key definition: "tracks_pkey" PRIMARY KEY, btree (id) >> > select version(): PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, >> compiled by gcc (Ubuntu 4.8.2-16ubuntu6) 4.8.2, 64-bit >> > Ubuntu trusty with kernel 3.13.0-29-generic. >> > >> > I have a autovacuum process running and configuration is by default: ps >> aux | grep vacuum | grep -v grep >> > postgres 587 0.0 0.0 1174304 3720 ? Ss juil.30 0:05 >> postgres: autovacuum launcher process >> > >> > I tried to vacuum manualy. >> >> What’s the output of: VACUUM VERBOSE tracks; >> Anything that looks abnormal? >> >> > INFO: vacuuming "public.tracks" > INFO: scanned index "tracks_pkey" to remove 25777 row versions > DETAIL: CPU 0.03s/0.12u sec elapsed 0.87 sec. > INFO: scanned index "index_tracks_on_permalink" to remove 25777 row > versions > DETAIL: CPU 0.07s/0.40u sec elapsed 5.30 sec. > INFO: "tracks": removed 25777 row versions in 5374 pages > DETAIL: CPU 0.00s/0.14u sec elapsed 0.66 sec. > INFO: index "tracks_pkey" now contains 1657345 row versions in 4594 pages > DETAIL: 17739 index row versions were removed. > 0 index pages have been deleted, 0 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: index "index_tracks_on_permalink" now contains 1657345 row versions > in 6143 pages > DETAIL: 25715 index row versions were removed. > 0 index pages have been deleted, 0 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: "tracks": found 17560 removable, 1657349 nonremovable row versions > in 35586 out of 35586 pages > DETAIL: 0 dead row versions cannot be removed yet. > There were 64749 unused item pointers. > 0 pages are entirely empty. > CPU 0.14s/0.86u sec elapsed 7.30 sec. > INFO: vacuuming "pg_toast.pg_toast_17228" > INFO: index "pg_toast_17228_index" now contains 0 row versions in 1 pages > DETAIL: 0 index row versions were removed. > 0 index pages have been deleted, 0 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: "pg_toast_17228": found 0 removable, 0 nonremovable row versions in > 0 out of 0 pages > DETAIL: 0 dead row versions cannot be removed yet. > There were 0 unused item pointers. > 0 pages are entirely empty. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > VACUUM > > >> What do below queries return? They’re from the manual at: >> http://www.postgresql.org/docs/9.3/static/routine-vacuuming.html >> >> SELECT c.oid::regclass as table_name, >> greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age >> FROM pg_class c >> LEFT JOIN pg_class t ON c.reltoastrelid = t.oid >> WHERE c.relkind IN ('r', 'm'); >> >> > table_name | age > --------------------------------------------+---------- > pg_statistic | 37049759 > pg_type | 37049759 > artists | 37042311 > pg_authid | 37049759 > pg_attribute | 37049759 > pg_proc | 37049759 > pg_class | 37049759 > pg_user_mapping | 37049759 > pg_constraint | 37049759 > pg_inherits | 37049759 > pg_index | 37049759 > pg_operator | 37049759 > pg_opfamily | 37049759 > pg_opclass | 37049759 > pg_am | 37049759 > pg_amop | 37049759 > pg_amproc | 37049759 > pg_language | 37049759 > pg_largeobject_metadata | 37049759 > pg_database | 37049759 > pg_rewrite | 37049759 > pg_trigger | 37049759 > pg_event_trigger | 37049759 > pg_description | 37049759 > pg_cast | 37049759 > pg_enum | 37049759 > pg_namespace | 37049759 > pg_conversion | 37049759 > pg_depend | 37049759 > explorers | 36847481 > pg_db_role_setting | 37049759 > pg_tablespace | 37049759 > pg_pltemplate | 37049759 > pg_auth_members | 37049759 > pg_shdepend | 37049759 > pg_shdescription | 37049759 > pg_ts_config | 37049759 > pg_ts_config_map | 37049759 > pg_ts_dict | 37049759 > pg_ts_parser | 37049759 > pg_ts_template | 37049759 > pg_extension | 37049759 > pg_foreign_data_wrapper | 37049759 > pg_foreign_server | 37049759 > pg_foreign_table | 37049759 > pg_default_acl | 37049759 > pg_seclabel | 37049759 > pg_shseclabel | 37049759 > pg_range | 37049759 > pg_largeobject | 37049759 > information_schema.sql_implementation_info | 37049759 > information_schema.sql_languages | 37049759 > information_schema.sql_packages | 37049759 > information_schema.sql_sizing | 37049759 > information_schema.sql_sizing_profiles | 37049759 > delayed_jobs | 37042298 > pg_attrdef | 37049759 > pg_aggregate | 37049759 > invitation_requests | 37042286 > imports | 37042289 > information_schema.sql_features | 37049759 > pg_collation | 37049759 > information_schema.sql_parts | 37049759 > schema_migrations | 37042277 > tracks | 37042273 > invitations | 37042283 > users | 37042270 > playlists | 37042280 > sources | 37042276 > absorbed_tracks | 37042314 > countries | 37042302 > bridge_errors | 37042308 > bridges | 37042305 > countries_sources | 37042299 > favorites | 37042295 > import_entries | 37042292 > > > >> SELECT datname, age(datfrozenxid) FROM pg_database; >> >> > datname | age > ------------------+---------- > template1 | 37049756 > template0 | 37049756 > postgres | 37049756 > disco_production | 37049756 > > >> I’m wondering if perhaps autovacuum doesn’t run to completion, the above >> should determine that. >> >> To get out of this pickle, you might want to try VACUUM FULL, but that >> puts an exclusive lock on the table and creates a new copy of it while it’s >> running, so you better have enough disk space to contain that table twice. >> It does rewrite the table though, which is what I hope fixes your issue. >> The normal VACUUM should have updated the frozenid’s appropriately >> though, a VACUUM FULL should not actually be necessary. >> >> > I tried reindex: >> > >> > => reindex index tracks_pkey; >> > ERROR: failed to find parent tuple for heap-only tuple at (38802,116) >> in table “tracks >> >> The problem is probably not in the index then, or it wouldn’t have >> trouble re-creating it... >> >> > Here are more information about storage (raid 1): >> >> Do I read that as disks 0 to 2 are part of an LSI SAS? Are you really >> using raid-1 then, as if I’m reading that correctly only 1 of the 3 disks >> is actually attached to the raid array (namely the Dell disk) and the >> others are "not claimed" (i.e. spares)? >> > > Not sure, but I understood that the Dell RAID 1 controller contains 2 > Seagate disks. > > >> >> Anyway, from this it looks like you are using reliable storage that >> properly honours fsync commands. Or is this a hardware raid-controller with >> no battery backup unit and caching set to write-back instead of write-thru? >> >> > No idea. I can ask to my hosting provider if needed. > > > >> Oh, one last thing, the convention in this mailing list is to reply >> inline instead of top-posting. >> >> > Got it :-) > > >> > lshw -class disk -class storage >> > *-storage >> > description: Serial Attached SCSI controller >> > produit: SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] >> > fabriquant: LSI Logic / Symbios Logic >> > identifiant matériel: 0 >> > information bus: pci@0000:01:00.0 >> > nom logique: scsi0 >> > version: 03 >> > bits: 64 bits >> > horloge: 33MHz >> > fonctionnalités: storage pm pciexpress vpd msi msix bus_master >> cap_list rom >> > configuration: driver=mpt2sas latency=0 >> > ressources: irq:16 portE/S:2000(taille=256) >> mémoire:c5140000-c514ffff mémoire:c5100000-c513ffff >> mémoire:c5400000-c54fffff >> > *-disk:0 NON-RÉCLAMÉ >> > description: ATA Disk >> > produit: ST1000NM0033-9ZM >> > fabriquant: Seagate >> > identifiant matériel: 0.0.0 >> > information bus: scsi@0:0.0.0 >> > version: GA04 >> > numéro de série: Z1W0HGB9 >> > configuration: ansiversion=5 >> > *-disk:1 >> > description: SCSI Disk >> > produit: Virtual Disk >> > fabriquant: Dell >> > identifiant matériel: 1.0.0 >> > information bus: scsi@0:1.0.0 >> > nom logique: /dev/sda >> > version: 1028 >> > taille: 931GiB (999GB) >> > capacité: 931GiB (999GB) >> > fonctionnalités: 15000rpm partitioned partitioned:dos >> > configuration: ansiversion=6 sectorsize=512 signature=000e59a7 >> > *-disk:2 NON-RÉCLAMÉ >> > description: ATA Disk >> > produit: ST1000NM0033-9ZM >> > fabriquant: Seagate >> > identifiant matériel: 0.1.0 >> > information bus: scsi@0:0.1.0 >> > version: GA04 >> > numéro de série: Z1W0HFYZ >> > configuration: ansiversion=5 >> > *-storage >> > description: SATA controller >> > produit: 6 Series/C200 Series Chipset Family SATA AHCI Controller >> > fabriquant: Intel Corporation >> > identifiant matériel: 1f.2 >> > information bus: pci@0000:00:1f.2 >> > version: 04 >> > bits: 32 bits >> > horloge: 66MHz >> > fonctionnalités: storage msi pm ahci_1.0 bus_master cap_list >> > configuration: driver=ahci latency=0 >> > ressources: irq:43 portE/S:3048(taille=8) portE/S:3054(taille=4) >> portE/S:3040(taille=8) portE/S:3050(taille=4) portE/S:3020(taille=32) >> mémoire:c5204000-c52047ff >> > >> > Cheers, >> > Alexis. >> > >> > >> > 2014-08-09 12:35 GMT+02:00 Alban Hertroys <haram...@gmail.com>: >> > On 09 Aug 2014, at 11:38, Alexis Bernard <ale...@bernard.io> wrote: >> > >> > > Hi, >> > > >> > > I am having regurlarly duplicated rows with the same primary key. >> > > >> > > => select id, created_at, updated_at from tracks where created_at = >> '2014-07-03 15:09:16.336488'; >> > > id | created_at | updated_at >> > > --------+----------------------------+---------------------------- >> > > 331804 | 2014-07-03 15:09:16.336488 | 2014-07-03 15:37:55.253729 >> > > 331804 | 2014-07-03 15:09:16.336488 | 2014-07-03 15:37:55.253801 >> > > >> > > => select id from tracks where id = 331804; >> > > id >> > > ---- >> > > (0 rows) >> > > >> > >> > First of all, what is the definition of that primary key? >> > What exact version of PG are you using?: select version(); >> > What exact OS is this on? What kind of storage? >> > To me it looks like you may have run into transaction wrap-around or a >> corrupted index. >> > >> > Before you do anything, make a backup. >> > >> > Theorising that the issue here indeed is transaction wrap-around, what >> you’re seeing may be data from older transactions that has become newer >> because your current transaction txid is lower (due to the wraparound) than >> the txid of the transactions those rows belong(ed) to. If those >> transactions were committed, then you’re possibly seeing deleted or updated >> rows that are still around. TX wraparound can occur if you do not vacuum >> frequently enough and another thing that vacuum does is mark old rows >> obsolete so that the DB can reclaim the space they use. Seeing data from >> rows that are no longer there or that has been modified since seems to fit >> the bill here. >> > >> > Hence the question: When did you last (auto-)vacuum this table? Did you >> perhaps turn autovacuum off? Did it fall behind? >> > >> > If you have been vacuuming and the issue is a corrupt index: Does it >> help to reindex that table? You said that you have fsync on; what kind of >> storage is this database on? Something allowed that index to get corrupted. >> It is more likely that it’s caused by something in the underlying storage >> system (including the OS) than that it is a bug in PG. >> > >> > > => delete from tracks where created_at = '2014-07-03 15:09:16.336488' >> and updated_at = '2014-07-03 15:37:55.253801'; >> > > ERROR: update or delete on table "tracks" violates foreign key >> constraint "fk_sources_on_track_id" on table "sources" >> > > DETAIL: Key (id)=(331804) is still referenced from table "sources”. >> > >> > Apparently there is a row from another table referencing this one. So >> either the referenced row does actually exist (corrupt index theory) or it >> doesn’t and the referencing row is from an older transaction as well (TX >> wraparound theory). >> > >> > Considering that you’re seeing this regularly, my bet is on TX >> wraparound. >> > >> > Alban Hertroys >> > -- >> > If you can't see the forest for the trees, >> > cut the trees and you'll find there is no forest. >> > >> > >> >> Alban Hertroys >> -- >> If you can't see the forest for the trees, >> cut the trees and you'll find there is no forest. >> >> >