Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-14 Thread Lou O'Quin
unless you specify otherwise, every insert carries its own transaction
begin/commit.  That's a lot of overhead for a single insert, no?  Why
not use a single transaction for, say, every 1000 inserts?  That would
strike a nice balance of safety and efficiency.

pseudo code for the insert:

Begin Transaction;
FOR i IN 1..200000 LOOP
  INSERT INTO viewfoo (x) VALUES (i);
  IF i % 1000 = 0 THEN
    Commit Transaction;
    Begin Transaction;
  END IF;
END LOOP;
Commit Transaction;


This approach should speed things up dramatically.
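Two caveats on the pseudo code, both picked up later in the thread: a
plpgsql function body already executes inside a single transaction and
cannot issue COMMIT mid-call, so explicit batching like this only applies
to inserts sent from a client. And for a bulk load of sequential values, a
single set-based statement is usually faster still than any row-at-a-time
loop. A hedged sketch (assuming viewfoo accepts inserts of x):

BEGIN;
INSERT INTO viewfoo (x) SELECT i FROM generate_series(1, 200000) AS i;
COMMIT;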


 
 Karen Hill [EMAIL PROTECTED] 2/6/2007 2:39 PM 
On Feb 5, 9:33 pm, [EMAIL PROTECTED] (Tom Lane) wrote:
 Karen Hill [EMAIL PROTECTED] writes:
  I have a pl/pgsql function that is inserting 200,000 records for
  testing purposes.  What is the expected time frame for this operation
  on a pc with 1/2 a gig of ram and a 7200 RPM disk?

 I think you have omitted a bunch of relevant facts.  Bare INSERT is
 reasonably quick:

 regression=# create table foo (f1 int);
 CREATE TABLE
 regression=# \timing
 Timing is on.
 regression=# insert into foo select x from generate_series(1,200000) x;
 INSERT 0 200000
 Time: 5158.564 ms
 regression=#

 (this on a not-very-fast machine) but if you weigh it down with a ton
 of index updates, foreign key checks, etc, it could get slow ...
 also you haven't mentioned what else that plpgsql function is doing.


The postgres version is 8.2.1 on Windows.   The pl/pgsql function is
inserting to an updatable view (basically two tables).

CREATE TABLE foo1
(


) ;

CREATE TABLE foo2
(

);

CREATE VIEW viewfoo AS
(

);
CREATE RULE ruleFoo AS ON INSERT TO viewfoo DO INSTEAD
(

);

CREATE OR REPLACE FUNCTION functionFoo() RETURNS VOID AS $$
BEGIN
FOR i in 1..200000 LOOP
INSERT INTO viewfoo (x) VALUES (i);
END LOOP;
END;
$$ LANGUAGE plpgsql;



---(end of broadcast)---
TIP 6: explain analyze is your friend




Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Csaba Nagy
On Tue, 2007-02-06 at 01:35, Karen Hill wrote:
 [snip] So far I've been sitting here for about 2 million ms
 waiting for it to complete, and I'm not sure how many inserts postgres
 is doing per second.

One way is to run ANALYZE VERBOSE on the target table and note how many
pages it has, then do it again 1 minute later and check how many pages
it grew. Multiply the page increase by the records-per-page ratio you
can get from the same ANALYZE output, and you'll have an estimated
growth rate. Of course this will only work if you didn't have lots of
free space in the table to start with... if you do have lots of free
space, you can still estimate the growth based on the ANALYZE results,
but it will be more complicated.
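A sketch of that procedure (the table name is illustrative, not from the
original post):

ANALYZE VERBOSE foo1;   -- note the reported page count and row estimate
-- let the load run for ~60 seconds, then:
ANALYZE VERBOSE foo1;   -- note the page count again
-- rows/minute ~= (pages_after - pages_before) * (rows / pages)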


In any case, it would be very nice to have more tools to attach to
running queries and see how they are doing... starting with what exactly
they are doing (are they in RI checks maybe?), the actual execution
plan they are using, and how far along they are... it would help a lot
in debugging performance problems.

Cheers,
Csaba.





Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Scott Marlowe
On Tue, 2007-02-06 at 10:40, Merlin Moncure wrote:
 On 2/6/07, Scott Marlowe [EMAIL PROTECTED] wrote:
  On Mon, 2007-02-05 at 18:35, Karen Hill wrote:
   I have a pl/pgsql function that is inserting 200,000 records for
   testing purposes.  What is the expected time frame for this operation
   on a pc with 1/2 a gig of ram and a 7200 RPM disk?   The processor is
   a 2ghz cpu.  So far I've been sitting here for about 2 million ms
   waiting for it to complete, and I'm not sure how many inserts postgres
   is doing per second.
 
  That really depends.  Doing 200,000 inserts as individual transactions
  will be fairly slow.  Since PostgreSQL generally runs in autocommit
  mode, this means that if you didn't expressly begin a transaction, you
  are in fact inserting each row as a transaction.  i.e. this:
 
 I think OP is doing insertion inside a pl/pgsql loop...transaction is
 implied here. 

Yeah, I noticed that about 10 seconds after hitting send... :)



Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Merlin Moncure

On 2/6/07, Scott Marlowe [EMAIL PROTECTED] wrote:

On Tue, 2007-02-06 at 10:40, Merlin Moncure wrote:
 On 2/6/07, Scott Marlowe [EMAIL PROTECTED] wrote:
  On Mon, 2007-02-05 at 18:35, Karen Hill wrote:
   I have a pl/pgsql function that is inserting 200,000 records for
   testing purposes.  What is the expected time frame for this operation
   on a pc with 1/2 a gig of ram and a 7200 RPM disk?   The processor is
   a 2ghz cpu.  So far I've been sitting here for about 2 million ms
   waiting for it to complete, and I'm not sure how many inserts postgres
   is doing per second.
 
  That really depends.  Doing 200,000 inserts as individual transactions
  will be fairly slow.  Since PostgreSQL generally runs in autocommit
  mode, this means that if you didn't expressly begin a transaction, you
  are in fact inserting each row as a transaction.  i.e. this:

 I think OP is doing insertion inside a pl/pgsql loop...transaction is
 implied here.

Yeah, I noticed that about 10 seconds after hitting send... :)


actually, I get the stupid award also because an RI check against an
unindexed column is not possible :)  (this haunts deletes, not inserts).

merlin



Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Mark Lewis
On Tue, 2007-02-06 at 12:01 -0500, Merlin Moncure wrote:
 On 2/6/07, Scott Marlowe [EMAIL PROTECTED] wrote:
  On Tue, 2007-02-06 at 10:40, Merlin Moncure wrote:
   On 2/6/07, Scott Marlowe [EMAIL PROTECTED] wrote:
On Mon, 2007-02-05 at 18:35, Karen Hill wrote:
 I have a pl/pgsql function that is inserting 200,000 records for
 testing purposes.  What is the expected time frame for this operation
 on a pc with 1/2 a gig of ram and a 7200 RPM disk?   The processor is
 a 2ghz cpu.  So far I've been sitting here for about 2 million ms
 waiting for it to complete, and I'm not sure how many inserts postgres
 is doing per second.
   
That really depends.  Doing 200,000 inserts as individual transactions
will be fairly slow.  Since PostgreSQL generally runs in autocommit
mode, this means that if you didn't expressly begin a transaction, you
are in fact inserting each row as a transaction.  i.e. this:
  
   I think OP is doing insertion inside a pl/pgsql loop...transaction is
   implied here.
 
  Yeah, I noticed that about 10 seconds after hitting send... :)
 
 actually, I get the stupid award also because RI check to unindexed
 column is not possible :)  (this haunts deletes, not inserts).

Sure it's possible:

CREATE TABLE parent (col1 int4);
-- insert many millions of rows into parent
CREATE TABLE child  (col1 int4 REFERENCES parent(col1));
-- insert many millions of rows into child, very very slowly.


- Mark Lewis





Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Bruno Wolff III
On Tue, Feb 06, 2007 at 10:31:26 -0800,
  Mark Lewis [EMAIL PROTECTED] wrote:
 
 Sure it's possible:
 
 CREATE TABLE parent (col1 int4);
 -- insert many millions of rows into parent
 CREATE TABLE child  (col1 int4 REFERENCES parent(col1));
 -- insert many millions of rows into child, very very slowly.

I don't think Postgres allows this. You don't need an index on the
child table, but you do need one on the parent table.
Quote from http://developer.postgresql.org/pgdocs/postgres/sql-createtable.html:
The referenced columns must be the columns of a unique or primary key
constraint in the referenced table.
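Trying Mark's schema as posted confirms this; without a unique constraint
on parent.col1, the REFERENCES clause is rejected:

CREATE TABLE parent (col1 int4);
CREATE TABLE child  (col1 int4 REFERENCES parent(col1));
-- ERROR:  there is no unique constraint matching given keys for
-- referenced table "parent"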



Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Merlin Moncure

On 2/6/07, Mark Lewis [EMAIL PROTECTED] wrote:

 actually, I get the stupid award also because RI check to unindexed
 column is not possible :)  (this haunts deletes, not inserts).

Sure it's possible:

CREATE TABLE parent (col1 int4);
-- insert many millions of rows into parent
CREATE TABLE child  (col1 int4 REFERENCES parent(col1));
-- insert many millions of rows into child, very very slowly.


the database will not allow you to create an RI link out unless the
parent table has a primary key/unique constraint, which the database
backs with an index... and you can't even trick it afterwards by
dropping the constraint.

it's the other direction, when you cascade forwards, that you can have
a problem.  this is most common with a delete, but can also happen on
an update of a table's primary key with child tables referencing it.

merlin
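The usual fix for that forward-cascade problem is an index on the
referencing column, so each parent delete (or key update) can look up the
matching child rows instead of scanning the whole child table. A sketch
(the index name is illustrative):

CREATE TABLE parent (col1 int4 PRIMARY KEY);
CREATE TABLE child  (col1 int4 REFERENCES parent(col1));
-- without this index, every DELETE FROM parent scans all of child:
CREATE INDEX child_col1_idx ON child (col1);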



Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Mark Lewis
On Tue, 2007-02-06 at 14:06 -0500, Merlin Moncure wrote:
 On 2/6/07, Mark Lewis [EMAIL PROTECTED] wrote:
   actually, I get the stupid award also because RI check to unindexed
   column is not possible :)  (this haunts deletes, not inserts).
 
  Sure it's possible:
 
  CREATE TABLE parent (col1 int4);
  -- insert many millions of rows into parent
  CREATE TABLE child  (col1 int4 REFERENCES parent(col1));
  -- insert many millions of rows into child, very very slowly.
 
 the database will not allow you to create an RI link out unless the
 parent table has a primary key/unique constraint, which the database
 backs with an index... and you can't even trick it afterwards by
 dropping the constraint.
 
 it's the other direction, when you cascade forwards, that you can have
 a problem.  this is most common with a delete, but can also happen on
 an update of a table's primary key with child tables referencing it.
 

Hmmm, should check my SQL before hitting send I guess.  Well, at least
you no longer have to wear the stupid award, Merlin :)

-- Mark Lewis



Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Hiltibidal, Robert
What is your row size?

Have you checked to see what your current inserts per second are?



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Scott Marlowe
Sent: Tuesday, February 06, 2007 10:56 AM
To: Merlin Moncure
Cc: Karen Hill; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How long should it take to insert 200,000 records?

On Tue, 2007-02-06 at 10:40, Merlin Moncure wrote:
 On 2/6/07, Scott Marlowe [EMAIL PROTECTED] wrote:
  On Mon, 2007-02-05 at 18:35, Karen Hill wrote:
   I have a pl/pgsql function that is inserting 200,000 records for
   testing purposes.  What is the expected time frame for this operation
   on a pc with 1/2 a gig of ram and a 7200 RPM disk?   The processor is
   a 2ghz cpu.  So far I've been sitting here for about 2 million ms
   waiting for it to complete, and I'm not sure how many inserts postgres
   is doing per second.
 
  That really depends.  Doing 200,000 inserts as individual transactions
  will be fairly slow.  Since PostgreSQL generally runs in autocommit
  mode, this means that if you didn't expressly begin a transaction, you
  are in fact inserting each row as a transaction.  i.e. this:
 
 I think OP is doing insertion inside a pl/pgsql loop...transaction is
 implied here.

Yeah, I noticed that about 10 seconds after hitting send... :)










Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Karen Hill
On Feb 5, 9:33 pm, [EMAIL PROTECTED] (Tom Lane) wrote:
 Karen Hill [EMAIL PROTECTED] writes:
  I have a pl/pgsql function that is inserting 200,000 records for
  testing purposes.  What is the expected time frame for this operation
  on a pc with 1/2 a gig of ram and a 7200 RPM disk?

 I think you have omitted a bunch of relevant facts.  Bare INSERT is
 reasonably quick:

 regression=# create table foo (f1 int);
 CREATE TABLE
 regression=# \timing
 Timing is on.
 regression=# insert into foo select x from generate_series(1,200000) x;
 INSERT 0 200000
 Time: 5158.564 ms
 regression=#

 (this on a not-very-fast machine) but if you weigh it down with a ton
 of index updates, foreign key checks, etc, it could get slow ...
 also you haven't mentioned what else that plpgsql function is doing.


The postgres version is 8.2.1 on Windows.   The pl/pgsql function is
inserting to an updatable view (basically two tables).

CREATE TABLE foo1
(


) ;

CREATE TABLE foo2
(

);

CREATE VIEW viewfoo AS
(

);
CREATE RULE ruleFoo AS ON INSERT TO viewfoo DO INSTEAD
(

);

CREATE OR REPLACE FUNCTION functionFoo() RETURNS VOID AS $$
BEGIN
FOR i in 1..200000 LOOP
INSERT INTO viewfoo (x) VALUES (i);
END LOOP;
END;
$$ LANGUAGE plpgsql;





Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Mark Kirkwood

Karen Hill wrote:



The postgres version is 8.2.1 on Windows.   The pl/pgsql function is
inserting to an updatable view (basically two tables).

CREATE TABLE foo1
(


) ;

CREATE TABLE foo2
(

);

CREATE VIEW viewfoo AS
(

);
CREATE RULE ruleFoo AS ON INSERT TO viewfoo DO INSTEAD
(

);

CREATE OR REPLACE FUNCTION functionFoo() RETURNS VOID AS $$
BEGIN
FOR i in 1..200000 LOOP
INSERT INTO viewfoo (x) VALUES (i);
END LOOP;
END;
$$ LANGUAGE plpgsql;




Sorry - but we probably need *still* more detail! - the definition of
viewfoo is likely to be critical. For instance, a simplified variant of
your setup does 200000 inserts in 5s on my PIII Tualatin machine:


CREATE TABLE foo1 (x INTEGER);

CREATE VIEW viewfoo AS SELECT * FROM foo1;

CREATE RULE ruleFoo AS ON INSERT TO viewfoo DO INSTEAD
(
 INSERT INTO foo1 VALUES (new.x);
);

CREATE OR REPLACE FUNCTION functionFoo() RETURNS VOID AS $$
BEGIN
FOR i in 1..200000 LOOP
INSERT INTO viewfoo (x) VALUES (i);
END LOOP;
END;
$$ LANGUAGE plpgsql;


postgres=# \timing
Timing is on.
postgres=# SELECT functionFoo() ;
 functionfoo
-------------

(1 row)

Time: 4659.477 ms

postgres=# SELECT count(*) FROM viewfoo;
 count
--------
 200000
(1 row)

Cheers

Mark



Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-06 Thread Tom Lane
Karen Hill [EMAIL PROTECTED] writes:
 On Feb 5, 9:33 pm, [EMAIL PROTECTED] (Tom Lane) wrote:
 I think you have omitted a bunch of relevant facts.

 The postgres version is 8.2.1 on Windows.   The pl/pgsql function is
 inserting to an updatable view (basically two tables).
 [ sketch of schema ]

I think the problem is probably buried in the parts you left out.  Can
you show us the full schemas for those tables, as well as the rule
definition?  The plpgsql function itself can certainly go a lot faster
than what you indicated.  On my slowest active machine:

regression=# create table viewfoo(x int);
CREATE TABLE
regression=# CREATE OR REPLACE FUNCTION functionFoo() RETURNS VOID AS $$
BEGIN
FOR i in 1..200000 LOOP
INSERT INTO viewfoo (x) VALUES (i);
END LOOP;
END;
$$ LANGUAGE plpgsql;
CREATE FUNCTION
regression=# \timing
Timing is on.
regression=# select functionFoo();
 functionfoo
-------------

(1 row)

Time: 16939.667 ms
regression=# 

regards, tom lane



[PERFORM] How long should it take to insert 200,000 records?

2007-02-05 Thread Karen Hill
I have a pl/pgsql function that is inserting 200,000 records for
testing purposes.  What is the expected time frame for this operation
on a pc with 1/2 a gig of ram and a 7200 RPM disk?   The processor is
a 2ghz cpu.  So far I've been sitting here for about 2 million ms
waiting for it to complete, and I'm not sure how many inserts postgres
is doing per second.

regards,
karen
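A back-of-the-envelope reading of those numbers: 2,000,000 ms for 200,000
rows is about 10 ms per row, and a 7200 RPM disk completes a rotation in
60000/7200, about 8.3 ms. That is roughly one platter rotation (one WAL
flush, i.e. one commit) per insert, which is what you would expect if
each row is being committed in its own transaction:

SELECT 2000000.0 / 200000 AS ms_per_row,      -- 10 ms
       60000.0 / 7200     AS ms_per_rotation; -- about 8.3 ms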




Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-05 Thread Tom Lane
Karen Hill [EMAIL PROTECTED] writes:
 I have a pl/pgsql function that is inserting 200,000 records for
 testing purposes.  What is the expected time frame for this operation
 on a pc with 1/2 a gig of ram and a 7200 RPM disk?

I think you have omitted a bunch of relevant facts.  Bare INSERT is
reasonably quick:

regression=# create table foo (f1 int);
CREATE TABLE
regression=# \timing
Timing is on.
regression=# insert into foo select x from generate_series(1,200000) x;
INSERT 0 200000
Time: 5158.564 ms
regression=# 

(this on a not-very-fast machine) but if you weigh it down with a ton
of index updates, foreign key checks, etc, it could get slow ...
also you haven't mentioned what else that plpgsql function is doing.

regards, tom lane
