Re: [GENERAL] Deleting Multiple Rows Based on Multiple Columns

David Johnston Sat, 13 Aug 2011 12:09:27 -0700

>  If it is possible to leave one row with specific values in the columns
> (loc_name, sample_date, param) and delete the additional ones, I'd like to
> learn how to do so. I know that I'll have use for these techniques with
> future data.
> 
>  Else, I'd like to delete all those rows with multiple copies. Then I'll
> manually remove the extra rows in emacs, and insert the remainder in the
> original table.


It will be easiest to delete all and add back the single desired record later.

The general form will be:

DELETE FROM chemistry c WHERE EXISTS (SELECT 1 FROM chem_too ct WHERE c.field 
=ct.field AND c.field2=ct.field2 etc...)

Not tested so minor syntax tweaks by be needed.  Pretty sure docs cover this 
use case.  You can also do:

DELETE FROM chemistry
USING chem_too
WHERE chemistry.fields =AND chem_too.fields;

The other way to group multiple columns in a single "row" column.

E.g.,  WHERE (field1, field2, field3) =/IN (SELECT field1, field2, field3 FROM 
...)

Note the parenthesis grouping the three columns into a single unit for 
comparison.  It is useful shorthand for the ( field1=field1 AND field2=field2 
etc... ) construct.  You can readily use this form in the FROM/USING form's 
WHERE clause.

In your chem_too table you can use the ROW_NUMBER window function over the 
desired key columns to number the duplicate rows (in a sub-query) and then, in 
the outer query, remove any rows that have a ROW_NUMBER > 1.  Try to write the 
query yourself and post your best effort if you get stumped.  Using this query 
on the chem_too table you can select a single record per key to insert back 
into the main table.  You can also use ORDER BY in the WINDOW definition to 
sort on secondary (non-partition by) fields if desired.

David J.


-- 
Sent via pgsql-general mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Deleting Multiple Rows Based on Multiple Columns

Reply via email to