Are you treating each insertion as a separate transaction? If so the performance will suffer. I am doing the same thing in building a data warehouse using PG. I have to load millions of records each night. I do two different things:

1) If I need to keep the insertions inside the java process I turn off auto-commit and every n insertions (5000 seems to give me the best performance for my setup) issue a commit. Make sure you do a final commit in a finally block so you don't miss anything.

2) Dump all the data to a file and then use a psql COPY <table> (columns) FROM 'file path' call to load it. Very fast.


John Pagakis wrote:


I have a table that will require 100,000 rows initially.

Assume the following (some of the field names have been changed for
confidentiality reasons):

   baz_number CHAR(15) NOT NULL,
   customer_id CHAR(39),
   foobar_id INTEGER,
   is_cancelled BOOL DEFAULT false NOT NULL,
   create_user VARCHAR(60) NOT NULL,
   create_datetime TIMESTAMP DEFAULT 'now()' NOT NULL,
   last_update_user VARCHAR(60) NOT NULL,
   last_update_datetime TIMESTAMP DEFAULT 'now()' NOT NULL,
   CONSTRAINT PK_baz PRIMARY KEY (baz_number)

   ADD FOREIGN KEY (customer_id) REFERENCES purchase (customer_id);

   ADD FOREIGN KEY (foobar_id) REFERENCES foobar (foobar_id);

Using JDBC, it took approximately one hour to insert 100,000 records. I have an algorithm to generate a unique baz_number - it is a mixture of alpha and numerics.

There is a purchase table; one purchase can have many associated baz
records, but the baz records will be pre-allocated - baz.customer_id allows
null.  The act of purchasing a baz will cause baz.customer_id to be
populated from the customer_id (key) field in the purchase table.

If it took an hour to insert 100,000 records, I can only imagine how much
time it will take if one customer were to attempt to purchase all 100,000
baz.  Certainly too long for a web page.

I've not had to deal with this kind of volume in Postgres before; I have my
suspicions on what is wrong here (could it be using a CHAR( 15 ) as a key?)
but I'd *LOVE* any thoughts.

Would I be better off making the key an identity field and not indexing on

Thanks in advance for any help.

John Pagakis

"The best way to make your dreams come true is to wake up." -- Paul Valery

This signature generated by
    ... and I Quote!!(tm) Copyright (c) 1999 SpaZmodic Frog Software, Inc.

---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
     joining column's datatypes do not match

Reply via email to