Re: [PERFORM] Massive Inserts Strategies
Is there some other unique key you can test on? Take a look at
http://lnk.nu/cvs.distributed.net/8qt.sql lines 169-216 for an example.
In that case we use a different method for assigning IDs than you
probably will, but the idea remains the same.

On Tue, Mar 28, 2006 at 10:59:49AM -0500, ashah wrote:
> I tried this solution, but ran into the following problem.
>
> The temp_table has columns (col1, col2, col3).
>
> The original_table has columns (col0, col1, col2, col3).
>
> Now the extra col0 on the original_table is the unique ID generated by
> the database.
>
> How can I make your suggestions work in that case?
>
> On Wednesday 22 March 2006 11:09 am, Jim C. Nasby wrote:
> > Load the files into a temp table and go from there...
> >
> > COPY ... FROM file;
> > UPDATE existing_table SET ... WHERE ...;
> > INSERT INTO existing_table
> >     SELECT * FROM temp_table
> >     WHERE NOT EXISTS (SELECT * FROM existing_table WHERE ...);
> >
> > On Wed, Mar 22, 2006 at 10:32:10AM -0500, ashah wrote:
> > > I have a database with foreign keys enabled on the schema. I receive
> > > different files, some of them huge, and I need to load these files
> > > into the database every night. There are several scenarios that I
> > > want to design an optimal solution for:
> > >
> > > 1. One of the files has around 80K records, and I have to delete
> > > everything from the table and load this file. The provider never
> > > provides a "delta file", so I don't have a way to identify which
> > > records are already present and which are new. If I don't delete
> > > everything and insert fresh, I have to make around 80K selects to
> > > decide whether each record exists or not. Now there are a lot of
> > > tables that have foreign keys linked with this table, so unless I
> > > disable the foreign keys, I cannot really delete anything from this
> > > table. What would be good practice here?
> > >
> > > 2. Another file that I receive has around 150K records that I need
> > > to load into the database. Now one of the fields is logically a
> > > "foreign key" to another table, and it is linked to the parent
> > > table via a database-generated unique ID instead of the actual
> > > value. But the file comes with the actual value. So once again, I
> > > have to either drop the foreign key, or make 150K selects to
> > > determine the serial ID so that the foreign key is satisfied. What
> > > would be a good strategy in this scenario?
> > >
> > > Please pardon my inexperience with databases!
> > >
> > > Thanks,
> > > Amit
--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf         cell: 512-569-9461

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match
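Putting the pieces of this thread together, a minimal sketch of the load Jim describes might look like the following. It assumes col1 is a natural unique key present in both the file and original_table; all table, column, and file names here are hypothetical and would need to be adapted to the real schema.

```sql
BEGIN;

-- Stage the nightly file (path is hypothetical).
COPY temp_table (col1, col2, col3) FROM '/path/to/nightly_file';

-- Refresh rows that already exist, matching on the natural key.
UPDATE original_table o
   SET col2 = t.col2,
       col3 = t.col3
  FROM temp_table t
 WHERE o.col1 = t.col1;

-- Insert only the genuinely new rows. col0 is omitted so the
-- database fills in the generated ID itself.
INSERT INTO original_table (col1, col2, col3)
SELECT t.col1, t.col2, t.col3
  FROM temp_table t
 WHERE NOT EXISTS (SELECT 1 FROM original_table o WHERE o.col1 = t.col1);

COMMIT;
```

This replaces the 80K per-row selects with two set-based statements that the planner can execute as joins.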
Re: [PERFORM] Massive Inserts Strategies
Hi, ashah,

ashah wrote:
> I tried this solution, but ran into the following problem.
>
> The temp_table has columns (col1, col2, col3).
>
> The original_table has columns (col0, col1, col2, col3).
>
> Now the extra col0 on the original_table is the unique ID generated by
> the database.

INSERT INTO original_table (col1, col2, col3)
SELECT col1, col2, col3 FROM temp_table WHERE ...

HTH,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
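Markus leaves the WHERE clause open. One plausible completion, assuming col1 is the column that identifies a record in the file (an assumption; use whatever columns are actually unique), is a NOT EXISTS guard so re-runs don't duplicate rows:

```sql
-- Hypothetical completion of the snippet above: skip rows that are
-- already present, matched on col1.
INSERT INTO original_table (col1, col2, col3)
SELECT t.col1, t.col2, t.col3
  FROM temp_table t
 WHERE NOT EXISTS (
         SELECT 1 FROM original_table o WHERE o.col1 = t.col1
       );
```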
Re: [PERFORM] Massive Inserts Strategies
I tried this solution, but ran into the following problem.

The temp_table has columns (col1, col2, col3).

The original_table has columns (col0, col1, col2, col3).

Now the extra col0 on the original_table is the unique ID generated by the
database.

How can I make your suggestions work in that case?

On Wednesday 22 March 2006 11:09 am, Jim C. Nasby wrote:
> Load the files into a temp table and go from there...
>
> COPY ... FROM file;
> UPDATE existing_table SET ... WHERE ...;
> INSERT INTO existing_table
>     SELECT * FROM temp_table
>     WHERE NOT EXISTS (SELECT * FROM existing_table WHERE ...);
>
> On Wed, Mar 22, 2006 at 10:32:10AM -0500, ashah wrote:
> > I have a database with foreign keys enabled on the schema. I receive
> > different files, some of them huge, and I need to load these files
> > into the database every night. There are several scenarios that I
> > want to design an optimal solution for:
> >
> > 1. One of the files has around 80K records, and I have to delete
> > everything from the table and load this file. The provider never
> > provides a "delta file", so I don't have a way to identify which
> > records are already present and which are new. If I don't delete
> > everything and insert fresh, I have to make around 80K selects to
> > decide whether each record exists or not. Now there are a lot of
> > tables that have foreign keys linked with this table, so unless I
> > disable the foreign keys, I cannot really delete anything from this
> > table. What would be good practice here?
> >
> > 2. Another file that I receive has around 150K records that I need to
> > load into the database. Now one of the fields is logically a "foreign
> > key" to another table, and it is linked to the parent table via a
> > database-generated unique ID instead of the actual value. But the
> > file comes with the actual value. So once again, I have to either
> > drop the foreign key, or make 150K selects to determine the serial ID
> > so that the foreign key is satisfied. What would be a good strategy
> > in this scenario?
> >
> > Please pardon my inexperience with databases!
> >
> > Thanks,
> > Amit
Re: [PERFORM] Massive Inserts Strategies
Load the files into a temp table and go from there...

COPY ... FROM file;
UPDATE existing_table SET ... WHERE ...;
INSERT INTO existing_table
    SELECT * FROM temp_table
    WHERE NOT EXISTS (SELECT * FROM existing_table WHERE ...);

On Wed, Mar 22, 2006 at 10:32:10AM -0500, ashah wrote:
> I have a database with foreign keys enabled on the schema. I receive
> different files, some of them huge, and I need to load these files into
> the database every night. There are several scenarios that I want to
> design an optimal solution for:
>
> 1. One of the files has around 80K records, and I have to delete
> everything from the table and load this file. The provider never
> provides a "delta file", so I don't have a way to identify which
> records are already present and which are new. If I don't delete
> everything and insert fresh, I have to make around 80K selects to
> decide whether each record exists or not. Now there are a lot of tables
> that have foreign keys linked with this table, so unless I disable the
> foreign keys, I cannot really delete anything from this table. What
> would be good practice here?
>
> 2. Another file that I receive has around 150K records that I need to
> load into the database. Now one of the fields is logically a "foreign
> key" to another table, and it is linked to the parent table via a
> database-generated unique ID instead of the actual value. But the file
> comes with the actual value. So once again, I have to either drop the
> foreign key, or make 150K selects to determine the serial ID so that
> the foreign key is satisfied. What would be a good strategy in this
> scenario?
>
> Please pardon my inexperience with databases!
>
> Thanks,
> Amit
--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf         cell: 512-569-9461
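The skeleton above assumes the temp table already exists. One way to sketch the staging step (all names hypothetical) is to create it per-transaction and analyze it before the big statements, so the planner has statistics and can choose joins over repeated index probes:

```sql
BEGIN;

-- Temp table lives only for this transaction; column types should
-- mirror the file being loaded (hypothetical schema shown).
CREATE TEMP TABLE temp_table (
    col1 text,
    col2 text,
    col3 text
) ON COMMIT DROP;

COPY temp_table FROM '/path/to/nightly_file';

-- Gather statistics on the freshly loaded data so the UPDATE/INSERT
-- that follow get sensible plans.
ANALYZE temp_table;

-- ... UPDATE existing_table ... ;
-- ... INSERT INTO existing_table ... ;

COMMIT;
```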
Re: [PERFORM] Massive Inserts Strategies
For both cases, you could COPY your file into a temporary table and do a
big JOIN with your existing table: one join for inserting new rows, and
one for updating existing rows. Doing one large bulk query is a lot more
efficient than doing a lot of individual selects. Vacuum afterwards, and
you'll be fine.
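For the second case in the original question (the file carries the actual value, but the table wants the database-generated ID), the "big JOIN" could look like this sketch. The names parent(id, value), child_table, and temp_load are hypothetical stand-ins for the real schema:

```sql
-- Resolve all 150K "actual value -> serial ID" lookups in one join
-- instead of 150K individual selects.
INSERT INTO child_table (parent_id, payload)
SELECT p.id, t.payload
  FROM temp_load t
  JOIN parent p ON p.value = t.value;

-- The inner join silently drops rows whose value has no parent; if
-- that matters, find the orphans first with an anti-join:
SELECT t.*
  FROM temp_load t
  LEFT JOIN parent p ON p.value = t.value
 WHERE p.id IS NULL;
```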
[PERFORM] Massive Inserts Strategies
I have a database with foreign keys enabled on the schema. I receive
different files, some of them huge, and I need to load these files into
the database every night. There are several scenarios that I want to
design an optimal solution for:

1. One of the files has around 80K records, and I have to delete
everything from the table and load this file. The provider never provides
a "delta file", so I don't have a way to identify which records are
already present and which are new. If I don't delete everything and
insert fresh, I have to make around 80K selects to decide whether each
record exists or not. Now there are a lot of tables that have foreign
keys linked with this table, so unless I disable the foreign keys, I
cannot really delete anything from this table. What would be good
practice here?

2. Another file that I receive has around 150K records that I need to
load into the database. Now one of the fields is logically a "foreign
key" to another table, and it is linked to the parent table via a
database-generated unique ID instead of the actual value. But the file
comes with the actual value. So once again, I have to either drop the
foreign key, or make 150K selects to determine the serial ID so that the
foreign key is satisfied. What would be a good strategy in this scenario?

Please pardon my inexperience with databases!

Thanks,
Amit