Your suggested temp table approach is how I would solve this; if
everything is properly indexed it won't be too bad.  Even if it is
slow, it'll still be better than updating a flag column on every row
and then deleting rows based on that flag.
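
   As a rough sketch, here is one way to do the whole thing set-wise (a
slight variation on your temp_ids idea: the temp table holds the
incoming MD5s rather than the ids of matched rows).  It assumes the
main table is called "imports" with an indexed md5 column plus a
payload column; all the names are placeholders, not from your mail:

      CREATE TEMP TABLE incoming (md5 TEXT PRIMARY KEY, payload BLOB);
      -- ... bulk-load the import file into incoming ...

      BEGIN;
      -- add rows whose MD5 is not in the main table yet
      INSERT INTO imports (md5, payload)
           SELECT i.md5, i.payload
             FROM incoming i
            WHERE NOT EXISTS (SELECT 1 FROM imports m
                               WHERE m.md5 = i.md5);
      -- drop rows that were not part of this import
      DELETE FROM imports
       WHERE md5 NOT IN (SELECT md5 FROM incoming);
      COMMIT;

   With md5 indexed on both sides (the PRIMARY KEY covers the temp
table), neither statement has to scan the other table once per row.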

   Another alternative is to:

   1. Load all new rows into a temp table
   2. Select the old matching rows into a second temp table
   3. Insert all the remaining new rows into that second temp table
   4. Drop the original table and rename the second temp table

   That's likely to be slower on small data sets and faster on large
ones, I think.  It depends on how much data is already in the database
vs. how much is being loaded.
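
   A sketch of that variant, reusing the placeholder names from above
(the replacement table is created as an ordinary table rather than a
TEMP table so that the final RENAME leaves it in the main database):

      BEGIN;
      -- 1. incoming is the temp table loaded as in the sketch above
      CREATE TABLE imports_new (md5 TEXT PRIMARY KEY, payload BLOB);

      -- 2. carry over the old rows whose MD5 appears in this import
      INSERT INTO imports_new
           SELECT m.md5, m.payload
             FROM imports m
            WHERE m.md5 IN (SELECT md5 FROM incoming);

      -- 3. add the genuinely new rows
      INSERT INTO imports_new
           SELECT i.md5, i.payload
             FROM incoming i
            WHERE i.md5 NOT IN (SELECT md5 FROM imports);

      -- 4. swap the tables
      DROP TABLE imports;
      ALTER TABLE imports_new RENAME TO imports;
      COMMIT;

   Any other indexes or triggers on the original table go away with the
DROP and would have to be recreated on the new one.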

   -Tom  

> -----Original Message-----
> From: Andre du Plessis [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, September 06, 2007 5:41 AM
> To: sqlite-users@sqlite.org
> Subject: [sqlite] SQL approach to Import only new items, 
> delete other items
> 
> I'm importing data.
> 
> The data has a unique value, call it MD5 for now, that can serve as
> a unique key for the data.
> 
> Each record that gets imported is converted to MD5 and a lookup is
> done on the table for that MD5: if it is found, the record must be
> left alone; if not, a new record must be inserted...
> 
> All the items in the table that were not part of the import must
> then be deleted...
> 
> The only feasible approach I have is to add a column to the table,
> UPDATE_FLAG for example.  During the import, update_flag gets set to
> 0; once a record is found, its update_flag gets set to 1.  At the
> end of the import, all records with update_flag = 0 get deleted...
> 
> However, I did not want to add a column to the table.  The reason is
> that I'm also version controlling the DB, and when an import runs
> and nothing has been added or removed I don't want modifications to
> the DB, as the import can run many times over.
> 
> I was considering the following: create a temp table, call it
> temp_ids for example, then insert a row into it for each record that
> was found...  At the end do something like
> 
> Delete from imports where import_id not in (select id from temp_ids)
> 
> But that might be horribly slow and memory expensive, remembering
> that the import table may have millions of records...
> 
> What could the people here suggest to me?
> 
> Thanks.
> 
> 
