On 3/5/2012 11:12 AM, DrYSG wrote:
I have a large table (20M records) but mostly short text fields. There are
duplicates that I am trying to remove. I have a bigseriel index that I
unique, but duplicates in another field.
I have an 8 core, 24GB memory computer with RAID disks.
This request has been running for 70 hours (is it safe to kill it?).
Yes, it is safe.
How can I make this run faster? This is a one time processing task, but it
is taking a long time.
CREATE TABLE portal.new_metatdata AS
select distinct on (data_object.unique_id) * FROM portal.metadata;
Or something of this ilk should be faster because it only need to do a
sort on data_object.unique_id and then an insert. After you have
verified the results you can do:
BEGIN;
ALTER TABLE portal.metatdata rename TO portal.new_metatdata_old;
ALTER TABLE portal.new_metatdata rename TO portal.metatdata_old;
COMMIT;
Other people might have other ideas.
-Steve
DELETE FROM portal.metadata
WHERE idx NOT IN
(
SELECT MIN(idx)
FROM portal.metadata
GROUP BY "data_object.unique_id"
);
CREATE TABLE
metadata
(
data_object.unique_id CHARACTER(64) NOT NULL,
size_bytes BIGINT,
object_date TIMESTAMP(6) WITHOUT TIME ZONE,
object_type CHARACTER VARYING(25),
classification CHARACTER VARYING(7),
object_source CHARACTER VARYING(50),
object_managed_date TIMESTAMP(6) WITHOUT TIME ZONE,
clevel INTEGER,
fsctlh CHARACTER VARYING(50),
oname CHARACTER VARYING(40),
description CHARACTER VARYING(80),
horizontal_datum CHARACTER VARYING(20),
do_location CHARACTER VARYING(200),
elevation_ft INTEGER,
location USER-DEFINED,
idx BIGINT DEFAULT nextval('portal.metadata_idx_seq'::regclass) NOT
NULL,
bbox CHARACTER VARYING(160),
CONSTRAINT MetaDataKey PRIMARY KEY (idx)
)
--
View this message in context:
http://postgis.17.n6.nabble.com/Slow-delete-of-duplicate-rows-tp4548251p4548251.html
Sent from the PostGIS - User mailing list archive at Nabble.com.
_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users