Re: [PLUG] Removing Duplicate Rows from SQL Dump

Hal Pomeranz Tue, 16 Aug 2011 07:42:26 -0700

Rich--

I'm not certain from your email exactly which columns you want to
de-duplicate on, but the solution is to use sort:


        sort -u -k1,4 inputfile >inputfile.de-duped

The "-k" option specifies the range of fields (columns) to use as the
sort key, here fields 1 through 4.  Fields are separated by whitespace
unless you set a delimiter with "-t".  The "-u" tells sort to output
only one line for each distinct value of that key.  Of course your
output will also end up re-sorted on those columns, so you may have to
re-order it again after you're finished de-duping.
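
Here is a rough sketch of both cases, assuming whitespace-separated
columns and a key of columns 1 through 4.  The function names are made
up for illustration.  The second function is not sort at all: it swaps
in a common awk idiom that keeps the first occurrence of each key in
its original position, which may save the re-ordering step.  Note that
awk ignores leading whitespace when splitting fields while sort does
not, so the two can disagree on oddly indented input.

```shell
# dedupe_sorted FILE
# The sort approach: one line per distinct value of columns 1-4.
# Output comes back sorted on those columns.
dedupe_sorted() {
    sort -u -k1,4 "$1"
}

# dedupe_keep_order FILE
# Alternative sketch using awk (not sort): print a line only the first
# time its columns 1-4 are seen, so the original order is preserved.
dedupe_keep_order() {
    awk '!seen[$1, $2, $3, $4]++' "$1"
}
```

Usage would look like "dedupe_keep_order inputfile >inputfile.de-duped".
If the dump is comma-separated rather than whitespace-separated, you
would need to add "-t," to the sort command and "-F," to the awk
command.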

-- 
Hal Pomeranz, Founder/CEO      Deer Run Associates      [email protected]
   Computer Forensic Investigations, Information Security, Training
_______________________________________________
PLUG mailing list
[email protected]
http://lists.pdxlinux.org/mailman/listinfo/plug
