[PERFORM] How import big amounts of data?

2005-12-29 Thread Arnau

Hi all,

  What is the best way to import data into tables? I have to import 
9 rows into a table, and doing it as individual INSERTs takes ages. Would it be 
faster with COPY? Is there any other alternative to INSERT/COPY?


Cheers!



Re: [PERFORM] How import big amounts of data?

2005-12-29 Thread Steinar H. Gunderson
On Thu, Dec 29, 2005 at 10:48:26AM +0100, Arnau wrote:
   What is the best way to import data into tables? I have to import 
 9 rows into a table, and doing it as individual INSERTs takes ages. Would it be 
 faster with COPY? Is there any other alternative to INSERT/COPY?

There are multiple reasons why your INSERT might be slow:

- Are you running each INSERT in its own transaction instead of batching them
  into one or a few transactions? (Usually, the per-transaction cost is a lot
  higher than the per-row insertion cost.)
- Do you have a foreign key without a matching index in the other table? (In
  newer versions of PostgreSQL, EXPLAIN ANALYZE can help with this; do a
  single insert and see where it ends up doing work. Older versions won't show
  such things, though.)
- Do you have an insertion trigger taking time? (Ditto wrt. EXPLAIN ANALYZE.)

COPY will be faster than INSERT regardless, though (for more than a few rows,
at least).
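
For illustration, a hedged sketch of both checks (the table, columns and file
path below are hypothetical, not from Arnau's schema):

-- Time a single insert and see where the work goes (newer versions
-- break out time spent in triggers; EXPLAIN ANALYZE really executes
-- the statement, so roll it back afterwards):
BEGIN;
EXPLAIN ANALYZE
  INSERT INTO import_target (id, amount) VALUES (1, 9.95);
ROLLBACK;

-- Bulk-load with COPY instead of row-by-row INSERTs (the file must be
-- readable by the server; the default format is tab-delimited text):
COPY import_target (id, amount) FROM '/tmp/import_data.txt';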

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] How import big amounts of data?

2005-12-29 Thread Dennis Bjorklund
On Thu, 29 Dec 2005, Arnau wrote:

What is the best way to import data into tables? I have to import 
 9 rows into a table, and doing it as individual INSERTs takes ages. Would it be 
 faster with COPY? Is there any other alternative to INSERT/COPY?

Wrap the inserts inside a BEGIN/COMMIT block and it will be a lot faster.
Copy is even faster, but for just 9 rows I wouldn't bother.
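
A minimal sketch of what that looks like (table and values are hypothetical):

BEGIN;
INSERT INTO import_target (id, amount) VALUES (1, 10.0);
INSERT INTO import_target (id, amount) VALUES (2, 20.0);
-- ... all the remaining rows ...
COMMIT;

The per-transaction commit overhead (including the WAL flush) is then paid
once for the whole batch instead of once per row.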

-- 
/Dennis Björklund




Re: [PERFORM] How import big amounts of data?

2005-12-29 Thread Teemu Torma
On Thursday 29 December 2005 10:48, Arnau wrote:
What is the best way to import data into tables? I have to import
 9 rows into a table, and doing it as individual INSERTs takes ages. Would it be
 faster with COPY? Is there any other alternative to INSERT/COPY?

I am doing imports twice that size daily, and found the following method 
most efficient (other than using COPY):

- Use a plpgsql function to do the actual insert (or update/insert if 
needed). 

- Inside a transaction, execute SELECT statements with the maximum possible 
number of insert function calls in one go.  This minimizes the number 
of round trips between the client and the server.

Teemu



Re: [PERFORM] How import big amounts of data?

2005-12-29 Thread Arnau


I am doing imports twice that size daily, and found the following method 
most efficient (other than using COPY):


- Use a plpgsql function to do the actual insert (or update/insert if 
needed). 

- Inside a transaction, execute SELECT statements with the maximum possible 
number of insert function calls in one go.  This minimizes the number 
of round trips between the client and the server.


Thanks Teemu! Could you paste an example of one of those functions? ;-) 
An example of those SELECTs would also be great; I'm not sure I have 
completely understood what you mean.


--
Arnau



Re: [PERFORM] How import big amounts of data?

2005-12-29 Thread Teemu Torma
On Thursday 29 December 2005 17:19, Arnau wrote:
  - Use plpgsql function to do the actual insert (or update/insert if
  needed).
 
  - Inside a transaction, execute SELECT statements with maximum
  possible number of insert function calls in one go.  This minimizes
  the number of round trips between the client and the server.

 Thanks Teemu! Could you paste an example of one of those functions?
 ;-) An example of those SELECTs would also be great; I'm not sure I
 have completely understood what you mean.

An insert function like:

CREATE OR REPLACE FUNCTION
insert_values (the_value1 numeric, the_value2 numeric)
RETURNS void
LANGUAGE plpgsql VOLATILE AS $$
BEGIN
  INSERT INTO values (value1, value2)
    VALUES (the_value1, the_value2);
  RETURN;
END;
$$;

Then execute queries like

SELECT insert_values(1,2), insert_values(2,3), insert_values(3,4);

with as many insert_values calls as possible.

I think the transaction (BEGIN/COMMIT) has little time benefit if you 
have at least hundreds of calls in one SELECT.
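
For the "update/insert" case mentioned above, a hedged sketch along the same
lines (table and column names are hypothetical) could look like:

CREATE OR REPLACE FUNCTION
upsert_values (the_id integer, the_value numeric)
RETURNS void
LANGUAGE plpgsql VOLATILE AS $$
BEGIN
  -- Try the update first; fall back to an insert if no row matched.
  UPDATE sample_values SET value = the_value WHERE id = the_id;
  IF NOT FOUND THEN
    INSERT INTO sample_values (id, value)
      VALUES (the_id, the_value);
  END IF;
  RETURN;
END;
$$;

Under concurrent writers this simple UPDATE-then-INSERT can still hit a
duplicate-key error, but for a single bulk-load session it is usually enough.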

Teemu



[PERFORM] Process executing COPY opens and reads every table on the system

2005-12-29 Thread Jeffrey W. Baker
I have an instance of PG 7.4 where I would really like to execute some
schema changes, but every schema change is blocked waiting for a process
doing a COPY.  That query is:

COPY drill.trades (manager, sec_id, ticker, bridge_tkr, date, type,
short, quantity, price, prin, net_money, factor) TO stdout;

So it's only involved with a single table in a single schema.
Unfortunately, what this process is doing is opening and reading every
table in the database:

# strace -e open,close -p 29859
Process 29859 attached - interrupt to quit
open(/var/lib/postgres/data/base/7932340/2442094542, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/358185104.16, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2023517557.1, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2023517557.3, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2023517557.5, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2023517557.5, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2023517557.3, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2023517557.6, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2023517557.9, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2023517557.3, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/358185104.16, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/358185104.16, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/358185104.16, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2414561511, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2426495316, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2426495316, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2414561511, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2426495316, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2426495316, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2414561511, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/358185104.16, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2429205386, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2429205433, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/358185104.16, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2426495316, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2414561511, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/358185104.16, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/358185104.16, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2429205441, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/358185104.16, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2414561511, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2426495316, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2023517557.3, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2023517557.10, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/358185104.16, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/358185104.16, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2023517557.9, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/2298808676/2298808939.10, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/358185104.15, O_RDWR) = 49
close(49)   = 0
open(/var/lib/postgres/data/base/7932340/2414561511, O_RDWR) = 49
close(49)   = 0
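
For reference, the numbers in those file names are relfilenodes (the .N
suffixes are just 1 GB segment files), so queries like the following
(assuming the 7.4-era catalogs) should show which relations the backend is
really touching and what the blocked schema change is queued behind:

-- Map the file names from the trace back to relations:
SELECT c.relfilenode, n.nspname, c.relname, c.relkind
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE c.relfilenode IN (2442094542, 358185104, 2023517557,
                         2414561511, 2426495316, 2429205386);

-- List ungranted locks, i.e. what the schema change is waiting for:
SELECT relation::regclass, pid, mode, granted
  FROM pg_locks
 WHERE NOT granted;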

[PERFORM] Invulnerable VACUUM process thrashing everything

2005-12-29 Thread Jeffrey W. Baker
A few WEEKS ago, the autovacuum on my instance of pg 7.4 unilaterally
decided to VACUUM a table which has not been updated in over a year and
is more than one terabyte on the disk.  Because of the very high
transaction load on this database, this VACUUM has been ruining
performance for almost a month.  Unfortunately it seems invulnerable to
killing by signals:

# ps ax | grep VACUUM
15308 ?  D    588:00 postgres: postgres skunk [local] VACUUM
# kill -HUP 15308
# ps ax | grep VACUUM
15308 ?  D    588:00 postgres: postgres skunk [local] VACUUM
# kill -INT 15308
# ps ax | grep VACUUM
15308 ?  D    588:00 postgres: postgres skunk [local] VACUUM
# kill -PIPE 15308
# ps ax | grep VACUUM
15308 ?  D    588:00 postgres: postgres skunk [local] VACUUM

o/~ But the cat came back, the very next day ...

I assume that if I kill this with SIGKILL, that will bring down every
other postgres process, so that should be avoided.  But surely there is
a way to interrupt this.  If I had some reason to shut down the
instance, I'd be screwed, it seems.
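
For reference, a query like this (assuming the 7.4-era pg_stat_activity
column names, with stats_command_string enabled) should confirm which backend
is the VACUUM before any signal is sent:

-- Match the PID seen in ps against the server's own view of its backends:
SELECT procpid, usename, current_query
  FROM pg_stat_activity
 WHERE current_query LIKE '%VACUUM%';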

-jwb



Re: [PERFORM] Invulnerable VACUUM process thrashing everything

2005-12-29 Thread Ron
Ick.  Can you get users and foreign connections off that machine, 
lock them out for some period, and renice the VACUUM?


Shedding load and keeping it off while VACUUM runs high priority 
might allow it to finish in a reasonable amount of time.

Or
Shedding load and dropping the VACUUM priority might allow a kill 
signal to get through.


Hope this helps,
Ron


At 05:09 PM 12/29/2005, Jeffrey W. Baker wrote:

A few WEEKS ago, the autovacuum on my instance of pg 7.4 unilaterally
decided to VACUUM a table which has not been updated in over a year and
is more than one terabyte on the disk.  Because of the very high
transaction load on this database, this VACUUM has been ruining
performance for almost a month.  Unfortunately is seems invulnerable to
killing by signals:

# ps ax | grep VACUUM
15308 ?D588:00 postgres: postgres skunk [local] VACUUM
# kill -HUP 15308
# ps ax | grep VACUUM
15308 ?D588:00 postgres: postgres skunk [local] VACUUM
# kill -INT 15308
# ps ax | grep VACUUM
15308 ?D588:00 postgres: postgres skunk [local] VACUUM
# kill -PIPE 15308
# ps ax | grep VACUUM
15308 ?D588:00 postgres: postgres skunk [local] VACUUM

o/~ But the cat came back, the very next day ...

I assume that if I kill this with SIGKILL, that will bring down every
other postgres process, so that should be avoided.  But surely there is
a way to interrupt this.  If I had some reason to shut down the
instance, I'd be screwed, it seems.







Re: [PERFORM] Invulnerable VACUUM process thrashing everything

2005-12-29 Thread Jeffrey W. Baker
On Thu, 2005-12-29 at 22:53 +, Russ Garrett wrote:
 In my experience a kill -9 has never resulted in any data loss in this 
 situation (it will cause postgres to detect that the process died, shut 
 down, then recover), and most of the time it only causes a 5-10sec 
 outage. I'd definitely hesitate to recommend it in a production context 
 though, especially since I think there are some known race-condition 
 bugs in 7.4.
 
 VACUUM *will* respond to a SIGTERM, but it doesn't check very often - 
 I've often had to wait hours for it to determine that it's been killed, 
 and my tables aren't anywhere near 1TB. Maybe this is a place where 
 things could be improved...

FWIW, I murdered this process with SIGKILL, and the recovery was very
short.


 Incidentally, I have to kill -9 some of our MySQL instances quite 
 regularly because they do odd things. Not something you want to be 
 doing, especially when MySQL takes 30mins to recover.

Agreed.  After a MySQL shutdown with MyISAM, all tables must be checked
and usually many need to be repaired.  This takes a really long
time.

-jwb




Re: [PERFORM] Invulnerable VACUUM process thrashing everything

2005-12-29 Thread Russ Garrett
In my experience a kill -9 has never resulted in any data loss in this 
situation (it will cause postgres to detect that the process died, shut 
down, then recover), and most of the time it only causes a 5-10sec 
outage. I'd definitely hesitate to recommend it in a production context 
though, especially since I think there are some known race-condition 
bugs in 7.4.


VACUUM *will* respond to a SIGTERM, but it doesn't check very often - 
I've often had to wait hours for it to determine that it's been killed, 
and my tables aren't anywhere near 1TB. Maybe this is a place where 
things could be improved...


Incidentally, I have to kill -9 some of our MySQL instances quite 
regularly because they do odd things. Not something you want to be 
doing, especially when MySQL takes 30mins to recover.


Russ Garrett
Last.fm Ltd.
[EMAIL PROTECTED]

Ron wrote:

Ick.  Can you get users and foreign connections off that machine, lock 
them out for some period, and renice the VACUUM?


Shedding load and keeping it off while VACUUM runs high priority might 
allow it to finish in a reasonable amount of time.

Or
Shedding load and dropping the VACUUM priority might allow a kill 
signal to get through.


Hope this helps,
Ron








Re: [PERFORM] Invulnerable VACUUM process thrashing everything

2005-12-29 Thread Tom Lane
Russ Garrett [EMAIL PROTECTED] writes:
 VACUUM *will* respond to a SIGTERM, but it doesn't check very often - 
 I've often had to wait hours for it to determine that it's been killed, 
 and my tables aren't anywhere near 1TB. Maybe this is a place where 
 things could be improved...

Hmm, there are CHECK_FOR_INTERRUPTS calls in all the loops that seem
significant to me.  Is there anything odd about your database schema?
Unusual index types or data types maybe?  Also, what PG version are
you using?

If you notice a VACUUM not responding to SIGTERM promptly, it'd be
useful to attach to the backend process with gdb and get a stack trace
to find out what it's doing.

regards, tom lane


