"shaun thornburgh" <[EMAIL PROTECTED]> wrote on 02/15/2005 
04:53:54 PM:

> Hi,
> 
> I have a table with 26 fields, and each row in this table must be unique.
> I can't include all of the fields in the primary key, as the limit is 16
> columns. Therefore, before I insert data I have to check that each row is
> unique. Data is to be inserted into this table from files uploaded to the
> server - CSV files. Is it possible to use the LOAD DATA INFILE statement
> to check for unique rows, or does anyone have a better solution to my
> problem?
> 
> Thanks for any advice offered.


What you want to do is called data scrubbing. Exactly how you do it depends 
on your resources and the data itself, but the general procedure works 
something like this:

* Create an import table that matches your source data (see the first 
sketch after this list).
* Import your data (without changes or omissions, if at all possible) into 
this intermediate table.
* Validate the imported data to make sure the LOAD DATA INFILE command gave 
you everything you wanted.
* Eliminate any duplicate rows from your import table (there are many ways 
to do this; comparing a hash value for each row is one idea - see the 
second sketch below).
* Use your import table as the source for your final update/insert.
* Drop your intermediate table.
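
As a rough illustration of the first three steps, here is a minimal sketch 
in MySQL. The table name (import_data), the column names (c1..c3), and the 
file path are all hypothetical, and three columns stand in for your 26; 
adjust the column list and the FIELDS/LINES clauses to match your actual 
CSV:

    -- Staging table that mirrors the CSV layout (c1..c3 stand in for 26).
    CREATE TABLE import_data (
        c1 VARCHAR(255),
        c2 VARCHAR(255),
        c3 VARCHAR(255)
    );

    -- Bulk-load the file exactly as uploaded, no filtering yet.
    LOAD DATA INFILE '/tmp/upload.csv'
    INTO TABLE import_data
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\n';

    -- Quick validation: does the row count match the file?
    SELECT COUNT(*) FROM import_data;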
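
And a sketch of the hash-based de-duplication and the final load, again 
with hypothetical names (import_dedup, final_table). MD5 over the 
concatenated columns gives you a single value to compare per row; a UNIQUE 
key on it plus INSERT IGNORE silently drops the duplicates:

    -- One hash value per row. CONCAT_WS skips NULL arguments, so wrap
    -- nullable columns in IFNULL() if NULL vs. empty string matters.
    ALTER TABLE import_data ADD COLUMN row_hash CHAR(32);
    UPDATE import_data
       SET row_hash = MD5(CONCAT_WS('|', c1, c2, c3));

    -- A unique index on the hash enforces row uniqueness; INSERT IGNORE
    -- skips any row whose hash is already present.
    CREATE TABLE import_dedup LIKE import_data;  -- LIKE needs MySQL 4.1+
    ALTER TABLE import_dedup ADD UNIQUE KEY (row_hash);
    INSERT IGNORE INTO import_dedup SELECT * FROM import_data;

    -- Final load from the scrubbed table, then clean up.
    INSERT INTO final_table (c1, c2, c3)
    SELECT c1, c2, c3 FROM import_dedup;

    DROP TABLE import_data;
    DROP TABLE import_dedup;

The same trick also sidesteps your 16-column primary key limit on the 
final table: keep a row_hash column there too and put the UNIQUE key on 
it instead of on all 26 fields.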

I would normally get into more details but it's time to head home.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine
