Suhen -
 
Wouldn't it be more efficient to have something like:
 
delete from invaudee i1
  where rowid not in
    (select min(rowid)
    from invaudee i2
    where i1.unique_value = i2.unique_value);
 
You have to do a full table scan of the table as part of the delete.  With the query I proposed, if you have an index on invaudee then it should be used to do a range scan for the rowid.  If you use the inner select you have it will do a full table scan for every row in invaudee.
 
I just tried it on one of my tables, here are my explain plans :
 
for delete from test_table t1
    where rowid not in
      (select min(rowid)
      from test_table t2
      group by unique_value_1,unique_value_2)
 
EXPLAIN_PLAN                                                                                                             OPT
------------------------------------------------------------------------------------------------------------------------ ------
1.0 DELETE STATEMENT    1 Cost= n/a                                                                                      CHOOSE
  2.1 DELETE  TEST_TABLE  TEST_TABLE
    3.1 FILTER
      4.1 TABLE ACCESS FULL TEST_TABLE  TEST_TABLE
      4.2 FILTER
        5.1 SORT GROUP BY
          6.1 TABLE ACCESS FULL TEST_TABLE  TEST_TABLE
 
7 rows selected.
 
for delete from test_table t1
    where rowid not in
      (select min(rowid)
      from test_table t2
      where t2.unique_value_1 = t1.unique_value_1)
/
 
EXPLAIN_PLAN                                                                                                             OPT
------------------------------------------------------------------------------------------------------------------------ ------
1.0 DELETE STATEMENT    3 Cost= n/a                                                                                      CHOOSE
  2.1 DELETE  TEST_TABLE  TEST_TABLE
    3.1 FILTER
      4.1 TABLE ACCESS FULL TEST_TABLE  TEST_TABLE
      4.2 SORT AGGREGATE
        5.1 INDEX RANGE SCAN TEST_TABLE_N1 NON-UNIQUE TEST_TABLE_N1
 
6 rows selected.
 
HTH
 
Linda
 
-----Original Message-----
From: Suhen Pather [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, September 11, 2001 10:50 PM
To: Multiple recipients of list ORACLE-L
Subject: help with deleting duplicate records from very large table

List,

 

I need to delete duplicate records from a very large table (60 millions records +).

There would be about 3 million duplicate entries.

 

What is the quickest way to do this?

 

The syntax that I am using is

delete from invaudee

where rowid not in (select min(rowid) from invaudee

group by audit_number);

 

This is taking a long time to run. I cannot see any entries in v$transaction

for the delete.

 

There is no indexes on the INVAUDEE table.

I created an index on the primary key column but it still takes forever to run.

 

I do not have the space to CTAS.

 

Or should I write the duplicates to an EXCEPTIONS table and perform the delete based on the entries

in the EXCEPTIONS table.

 

Any help would be greatly appreciated.

 

Suhen

 

Reply via email to