Re: [PERFORM] Massive Inserts Strategies

2006-03-28 Thread Jim C. Nasby
Is there some other unique key you can test on?

Take a look at http://lnk.nu/cvs.distributed.net/8qt.sql lines 169-216
for an example. In this case we use a different method for assigning
IDs than you probably will, but the idea remains the same.
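
A minimal sketch of that idea, assuming temp_table(col1, col2, col3), that
col1 happens to be a natural unique key, and that original_table's col0 is
fed by a sequence named original_table_col0_seq (the key column and the
sequence name are both assumptions, not something stated in this thread):

```sql
-- Sketch only: col1 is assumed to be a natural unique key, and the
-- sequence name is assumed; adjust both to your actual schema.
INSERT INTO original_table (col0, col1, col2, col3)
SELECT nextval('original_table_col0_seq'), t.col1, t.col2, t.col3
FROM temp_table t
WHERE NOT EXISTS (
    SELECT 1 FROM original_table o WHERE o.col1 = t.col1
);
```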

On Tue, Mar 28, 2006 at 10:59:49AM -0500, ashah wrote:
> I tried this solution, but ran into the following problem.
> 
> The temp_table has columns (col1, col2, col3).
> 
> The original_table has columns (col0, col1, col2, col3)
> 
> Now the extra col0 on the original_table is a unique ID generated by the
> database.
> 
> How can I make your suggestions work in that case?
> 
> On Wednesday 22 March 2006 11:09 am, Jim C. Nasby wrote:
> > Load the files into a temp table and go from there...
> >
> > COPY ... FROM file;
> > UPDATE existing_table SET ... WHERE ...;
> > INSERT INTO existing_table SELECT * FROM temp_table WHERE NOT EXISTS(
> > SELECT * FROM existing_table WHERE ...)
> >
> > On Wed, Mar 22, 2006 at 10:32:10AM -0500, ashah wrote:
> > > I have a database with foreign keys enabled on the schema. I receive
> > > different files, some of them huge, and I need to load these files into
> > > the database every night. There are several scenarios that I want to
> > > design an optimal solution for:
> > >
> > > 1. One of the files has around 80K records, and I have to delete
> > > everything from the table and load this file. The provider never
> > > provides a "delta file", so I don't have a way to identify which records
> > > are already present and which are new. If I don't delete everything and
> > > insert fresh, I have to make around 80K SELECTs to decide whether the
> > > records exist or not. A lot of tables have foreign keys linked with this
> > > table, so unless I disable the foreign keys, I cannot really delete
> > > anything from this table. What would be good practice here?
> > >
> > > 2. Another file that I receive has around 150K records that I need to
> > > load into the database. One of the fields is logically a "foreign key"
> > > to another table, but it is linked to the parent table via a
> > > database-generated unique ID instead of the actual value, while the file
> > > comes with the actual value. So once again, I have to either drop the
> > > foreign key or make 150K SELECTs to determine the serial ID so that the
> > > foreign key is satisfied. What would be a good strategy in this
> > > scenario?
> > >
> > > Please pardon my inexperience with databases!
> > >
> > > Thanks,
> > > Amit
> > >
> > > ---(end of broadcast)---
> > > TIP 4: Have you searched our list archives?
> > >
> > >    http://archives.postgresql.org

-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Massive Inserts Strategies

2006-03-28 Thread Markus Schaber
Hi, ashah,

ashah wrote:
> I tried this solution, but ran into the following problem.
> 
> The temp_table has columns (col1, col2, col3).
> 
> The original_table has columns (col0, col1, col2, col3)

> Now the extra col0 on the original_table is a unique ID generated by
> the database.

INSERT INTO original_table (col1, col2, col3) SELECT col1, col2, col3
FROM temp_table WHERE ...
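
One hedged way to fill in the elided WHERE clause above, assuming col1 can
serve as the deduplication key (an assumption about the schema, not
something stated in the thread):

```sql
-- col0 is omitted so the database generates it; the NOT EXISTS test
-- assumes col1 identifies a row (adjust to your real unique key).
INSERT INTO original_table (col1, col2, col3)
SELECT t.col1, t.col2, t.col3
FROM temp_table t
WHERE NOT EXISTS (
    SELECT 1 FROM original_table o WHERE o.col1 = t.col1
);
```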

HTH,
Markus




-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] Massive Inserts Strategies

2006-03-28 Thread ashah
I tried this solution, but ran into the following problem.

The temp_table has columns (col1, col2, col3).

The original_table has columns (col0, col1, col2, col3)

Now the extra col0 on the original_table is a unique ID generated by the
database.

How can I make your suggestions work in that case?

On Wednesday 22 March 2006 11:09 am, Jim C. Nasby wrote:
> Load the files into a temp table and go from there...
>
> COPY ... FROM file;
> UPDATE existing_table SET ... WHERE ...;
> INSERT INTO existing_table SELECT * FROM temp_table WHERE NOT EXISTS(
> SELECT * FROM existing_table WHERE ...)
>
> On Wed, Mar 22, 2006 at 10:32:10AM -0500, ashah wrote:
> > I have a database with foreign keys enabled on the schema. I receive
> > different files, some of them huge, and I need to load these files into
> > the database every night. There are several scenarios that I want to
> > design an optimal solution for:
> >
> > 1. One of the files has around 80K records, and I have to delete
> > everything from the table and load this file. The provider never provides
> > a "delta file", so I don't have a way to identify which records are
> > already present and which are new. If I don't delete everything and
> > insert fresh, I have to make around 80K SELECTs to decide whether the
> > records exist or not. A lot of tables have foreign keys linked with this
> > table, so unless I disable the foreign keys, I cannot really delete
> > anything from this table. What would be good practice here?
> >
> > 2. Another file that I receive has around 150K records that I need to
> > load into the database. One of the fields is logically a "foreign key" to
> > another table, but it is linked to the parent table via a
> > database-generated unique ID instead of the actual value, while the file
> > comes with the actual value. So once again, I have to either drop the
> > foreign key or make 150K SELECTs to determine the serial ID so that the
> > foreign key is satisfied. What would be a good strategy in this scenario?
> >
> > Please pardon my inexperience with databases!
> >
> > Thanks,
> > Amit



Re: [PERFORM] Massive Inserts Strategies

2006-03-22 Thread Jim C. Nasby
Load the files into a temp table and go from there...

COPY ... FROM file;
UPDATE existing_table SET ... WHERE ...;
INSERT INTO existing_table SELECT * FROM temp_table WHERE NOT EXISTS(
SELECT * FROM existing_table WHERE ...)
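
A fleshed-out version of this skeleton might look as follows; the staging
table layout, the file path, and the choice of col1 as the matching key are
all hypothetical and need to be adapted to the real schema:

```sql
-- Hypothetical staging layout and file path; col1 assumed unique.
CREATE TEMP TABLE temp_table (col1 text, col2 text, col3 text);
COPY temp_table FROM '/path/to/nightly_file.dat';

-- Refresh rows that already exist...
UPDATE existing_table e
   SET col2 = t.col2, col3 = t.col3
  FROM temp_table t
 WHERE e.col1 = t.col1;

-- ...then insert the rows that don't.
INSERT INTO existing_table (col1, col2, col3)
SELECT t.col1, t.col2, t.col3
  FROM temp_table t
 WHERE NOT EXISTS (
         SELECT 1 FROM existing_table e WHERE e.col1 = t.col1);
```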

On Wed, Mar 22, 2006 at 10:32:10AM -0500, ashah wrote:
> I have a database with foreign keys enabled on the schema. I receive
> different files, some of them huge, and I need to load these files into
> the database every night. There are several scenarios that I want to
> design an optimal solution for:
> 
> 1. One of the files has around 80K records, and I have to delete
> everything from the table and load this file. The provider never provides
> a "delta file", so I don't have a way to identify which records are
> already present and which are new. If I don't delete everything and insert
> fresh, I have to make around 80K SELECTs to decide whether the records
> exist or not. A lot of tables have foreign keys linked with this table, so
> unless I disable the foreign keys, I cannot really delete anything from
> this table. What would be good practice here?
> 
> 2. Another file that I receive has around 150K records that I need to
> load into the database. One of the fields is logically a "foreign key" to
> another table, but it is linked to the parent table via a
> database-generated unique ID instead of the actual value, while the file
> comes with the actual value. So once again, I have to either drop the
> foreign key or make 150K SELECTs to determine the serial ID so that the
> foreign key is satisfied. What would be a good strategy in this scenario?
> 
> Please pardon my inexperience with databases!
> 
> Thanks,
> Amit

-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Massive Inserts Strategies

2006-03-22 Thread PFC


	For both cases, you could COPY your file into a temporary table and
do big JOINs with your existing table: one query to insert new rows, and
one to update existing rows.
	Doing one large bulk query is a lot more efficient than doing many
individual SELECTs. VACUUM afterwards, and you'll be fine.
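
As a sketch, the two bulk queries could look like this (temp_table,
existing_table, and the join key col1 are all hypothetical names):

```sql
-- One bulk UPDATE for rows already present...
UPDATE existing_table e
   SET col2 = t.col2, col3 = t.col3
  FROM temp_table t
 WHERE e.col1 = t.col1;

-- ...one anti-join INSERT for rows that are new...
INSERT INTO existing_table (col1, col2, col3)
SELECT t.col1, t.col2, t.col3
  FROM temp_table t
  LEFT JOIN existing_table e ON e.col1 = t.col1
 WHERE e.col1 IS NULL;

-- ...and a vacuum to clean up the dead tuples the UPDATE left behind.
VACUUM ANALYZE existing_table;
```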




[PERFORM] Massive Inserts Strategies

2006-03-22 Thread ashah
I have a database with foreign keys enabled on the schema. I receive
different files, some of them huge, and I need to load these files into the
database every night. There are several scenarios that I want to design an
optimal solution for:

1. One of the files has around 80K records, and I have to delete everything
from the table and load this file. The provider never provides a "delta
file", so I don't have a way to identify which records are already present
and which are new. If I don't delete everything and insert fresh, I have to
make around 80K SELECTs to decide whether the records exist or not. A lot
of tables have foreign keys linked with this table, so unless I disable the
foreign keys, I cannot really delete anything from this table. What would
be good practice here?
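
One possible shape for scenario 1, assuming the foreign keys are (or can be
re-declared) DEFERRABLE and that the reload restores every key the
referencing tables point at; table and file names are hypothetical:

```sql
BEGIN;
-- Only works if the FK constraints were declared DEFERRABLE.
SET CONSTRAINTS ALL DEFERRED;
DELETE FROM big_table;
COPY big_table FROM '/path/to/file.dat';
COMMIT;  -- the FKs are checked here, once, in bulk
VACUUM ANALYZE big_table;
```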

2. Another file that I receive has around 150K records that I need to load
into the database. One of the fields is logically a "foreign key" to
another table, but it is linked to the parent table via a
database-generated unique ID instead of the actual value, while the file
comes with the actual value. So once again, I have to either drop the
foreign key or make 150K SELECTs to determine the serial ID so that the
foreign key is satisfied. What would be a good strategy in this scenario?
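
For scenario 2, the 150K per-row lookups can usually be collapsed into a
single join against the parent table once the file is in a staging table;
child_table, parent_table, and the column names below are hypothetical:

```sql
-- Resolve the generated parent IDs in one pass instead of 150K SELECTs.
INSERT INTO child_table (parent_id, value1, value2)
SELECT p.id, t.value1, t.value2
  FROM temp_table t
  JOIN parent_table p ON p.natural_value = t.natural_value;
```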

Please pardon my inexperience with databases!

Thanks,
Amit
