Thomas Kellerer spam_ea...@gmx.net wrote:
I finally found a solution that runs fine:
DELETE FROM dupes a
WHERE EXISTS (SELECT 1
FROM dupes b
WHERE b.first_name = a.first_name
AND b.last_name = a.last_name
AND b.ctid
Tom Lane, 24.07.2012 19:12:
Well, it would only help if you're running a PG version that's new
enough to recognize the NOT EXISTS as an anti-join; and even then,
it's possible that joining on a tid column forecloses enough plan
types that you don't get any real benefit. But I'm just guessing.
Hi,
I was testing a query to delete duplicates to see how well using ctid works if
the table doesn't have a unique identifier available.
The table definition is:
create table dupes
(
id integer primary key,
first_name text,
last_name text
);
My test table has 100.000 rows with
Thomas Kellerer spam_ea...@gmx.net writes:
DELETE FROM dupes
WHERE id NOT IN (SELECT min(b.id)
FROM dupes b
GROUP BY first_name, last_Name
HAVING count(*) 1);
Doesn't that kill the non-duplicates too?
Why does the usage of the CTID
Tom Lane, 24.07.2012 16:23:
Thomas Kellerer spam_ea...@gmx.net writes:
DELETE FROM dupes
WHERE id NOT IN (SELECT min(b.id)
FROM dupes b
GROUP BY first_name, last_Name
HAVING count(*) 1);
Doesn't that kill the non-duplicates too?
Ah
Thomas Kellerer spam_ea...@gmx.net writes:
Tom Lane, 24.07.2012 16:23:
IIRC, type tid doesn't have any hash support.
So the bad plan is expected?
Joins on tid columns just aren't supported very well at the moment.
Partly that's from lack of round tuits, and partly it's because it
doesn't seem
Tom Lane wrote on 24.07.2012 17:55:
Joins on tid columns just aren't supported very well at the moment.
Partly that's from lack of round tuits, and partly it's because it
doesn't seem all that wise to encourage people to use them. There
are gotchas if any of the rows receive concurrent updates.
Thomas Kellerer spam_ea...@gmx.net writes:
Tom Lane wrote on 24.07.2012 17:55:
FWIW, it might be helpful to cast this as a NOT EXISTS rather than
NOT IN subquery.
Hmm. How would you change that into an NOT EXISTS clause (so that one of the
duplicates remains)
Everything I come up with is