Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Alex Stapleton

On 23 Oct 2006, at 22:59, Jim C. Nasby wrote:

http://stats.distributed.net used to use a perl script to do some
transformations before loading data into the database. IIRC, when we
switched to using C we saw 100x improvement in speed, so I suspect  
that

if you want performance perl isn't the way to go. I think you can
compile perl into C, so maybe that would help some.



http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=perl&lang2=gcc


100x doesn't sound totally impossible if that is even vaguely accurate and
you happen to be using bits of Perl which are a lot slower than the C
implementation would be...
The slowest things appear to involve calling functions; all of the
slowest tests involve lots of function calls.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Worky Workerson

On 10/25/06, Craig A. James [EMAIL PROTECTED] wrote:

Jim C. Nasby wrote:
 Well, given that perl is using an entire CPU, it sounds like you should
 start looking either at ways to remove some of the overhead from perl,
 or to split that perl into multiple processes.

I use Perl for big database copies (usually with some
processing/transformation along the way) and I've never seen 100% CPU usage
except for brief periods, even when copying BLOBS and such.  My typical copy
divides operations into blocks, for example doing


I'm just doing CSV style transformations (and calling a lot of
functions along the way), but the end result is a straight bulk load
of data into a blank database.  And we've established that Postgres
can do *way* better than what I am seeing, so it's not surprising that
perl is using 100% of a CPU.

However, I am still curious as to the rather slow COPYs from psql to
local disks.  Like I mentioned previously, I was only seeing about 5.7
MB/s (1.8 GB / 330 seconds), where it seemed like others were doing
substantially better.  What sorts of things should I look into?

Thanks!

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Markus Schaber
Hi, Worky,

Worky Workerson wrote:

 $ psql -c "COPY my_table TO STDOUT" > my_data
 $ ls my_data
 2018792 edgescape_pg_load
 $ time cat my_data | psql -c "COPY mytable FROM STDIN"
 real    5m43.194s
 user    0m35.412s
 sys     0m9.567s

That's via PSQL, and you get about 5 MB/Sec.

 On a table with no indices, triggers and constraints, we managed to
 COPY about 7-8 megabytes/second with psql over our 100 MBit network, so
 here the network was the bottleneck.
 
 hmm, this makes me think that either my PG config is really lacking,
 or that the SAN is badly misconfigured, as I would expect it to
 outperform a 100Mb network.  As it is, with a straight pipe to psql
 COPY, I'm only working with a little over 5.5 MB/s.  Could this be due
 to the primary key index updates?

Yes, index updates cause both CPU load, and random disk access (which is
slow by nature).
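
As a rough illustration of working around that, assuming the load really is
dominated by primary-key maintenance (the table, column and constraint names
below are placeholders, not from the thread), one could drop the index for
the bulk load and rebuild it afterwards:

  psql mydb -c "ALTER TABLE my_table DROP CONSTRAINT my_table_pkey;"
  psql mydb -c "COPY my_table FROM STDIN" < my_data
  psql mydb -c "ALTER TABLE my_table ADD PRIMARY KEY (id);"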


HTH,
Markus
-- 
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Merlin Moncure

On 10/23/06, Worky Workerson [EMAIL PROTECTED] wrote:

The disk load is where I start to get a little fuzzy, as I haven't
played with iostat to figure what is normal.  The local drives
contain PG_DATA as well as all the log files, but there is a
tablespace on the FibreChannel SAN that contains the destination
table.  The disk usage pattern that I see is that there is a ton of
consistent activity on the local disk, with iostat reporting an
average of 30K Blk_wrtn/s, which I assume is the log files.  Every
several seconds there is a massive burst of activity on the FC
partition, to the tune of 250K Blk_wrtn/s.

 On a table with no indices, triggers and constraints, we managed to
 COPY about 7-8 megabytes/second with psql over our 100 MBit network, so
 here the network was the bottleneck.


I'm guessing the high bursts are checkpoints.  Can you check your log
files for pg and see if you are getting warnings about checkpoint
frequency?   You can get some mileage here by increasing wal files.

Have you determined that pg is not swapping?  try upping maintenance_work_mem.

What exactly is your architecture?  is your database server direct
attached to the san? if so, 2gb/4gb fc?  what san?  have you bonnie++
the san?  basically, you can measure iowait to see if pg is waiting on
your disks.

regarding perl, imo the language performance is really about which
libraries you use. the language itself is plenty fast.

merlin

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Luke Lonergan
Mr. Worky Workerson,

On 10/25/06 5:03 AM, Worky Workerson [EMAIL PROTECTED] wrote:

 However, I am still curious as to the rather slow COPYs from psql to
 local disks.  Like I mentioned previously, I was only seeing about 5.7
 MB/s (1.8 GB / 330 seconds), where it seemed like others were doing
 substantially better.  What sorts of things should I look into?

It's probable that you have a really poorly performing disk configuration.
Judging from earlier results, you may only be getting 3 x 5.7 = 17 MB/s of
write performance to your disks, which is about 1/4 of a single disk drive.

Please run this test and report the time here:

1) Calculate the size of 2x memory in 8KB blocks:
  # of blocks = 250,000 x memory_in_GB

Example:
  250,000 x 16GB = 4,000,000 blocks

2) Benchmark the time taken to write 2x RAM sequentially to your disk:
  time bash -c "dd if=/dev/zero of=bigfile bs=8k count=<# of blocks> && sync"

3) Benchmark the time taken to read same:
  time dd if=bigfile of=/dev/null bs=8k
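
For example, on the 16GB machine above the two runs might look like this
(the file path is a placeholder):

  time bash -c "dd if=/dev/zero of=/data/bigfile bs=8k count=4000000 && sync"
  time dd if=/data/bigfile of=/dev/null bs=8k count=4000000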

- Luke



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Jim C. Nasby
On Tue, Oct 24, 2006 at 10:36:04PM -0700, Craig A. James wrote:
 Jim C. Nasby wrote:
 Well, given that perl is using an entire CPU, it sounds like you should
 start looking either at ways to remove some of the overhead from perl,
 or to split that perl into multiple processes.
 
 I use Perl for big database copies (usually with some 
 processing/transformation along the way) and I've never seen 100% CPU usage 
 except for brief periods, even when copying BLOBS and such.  My typical 
 copy divides operations into blocks, for example doing
 
  N = 0
  while (more rows to go) {
 begin transaction
 select ... where primary_key > N order by primary_key limit 1000
 while (fetch a row)
insert into ...
 N = (highest value found in last block)
 commit
   }
 
 Doing it like this in Perl should keep Postgres busy, with Perl using only 
 moderate resources.  If you're seeing high Perl CPU usage, I'd look first 
 at the Perl code.

Wait... so you're using perl to copy data between two tables? And using
a cursor to boot? I can't think of any way that could be more
inefficient...

What's wrong with a plain old INSERT INTO ... SELECT? Or if you really
need to break it into multiple transaction blocks, at least don't
shuffle the data from the database into perl and then back into the
database; do an INSERT INTO ... SELECT with that same where clause.
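
A rough sketch of what that could look like, with hypothetical table and key
names; the WHERE bounds would step through the key range, one batch per
transaction:

  psql mydb <<'SQL'
  BEGIN;
  INSERT INTO dest_table
      SELECT * FROM source_table
      WHERE primary_key > 0 AND primary_key <= 1000000;
  COMMIT;
  SQL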
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Jim C. Nasby
On Wed, Oct 25, 2006 at 08:03:38AM -0400, Worky Workerson wrote:
 I'm just doing CSV style transformations (and calling a lot of
 functions along the way), but the end result is a straight bulk load
 of data into a blank database.  And we've established that Postgres
 can do *way* better than what I am seeing, so it's not surprising that
 perl is using 100% of a CPU.

If you're loading into an empty database, there's a number of tricks
that will help you:

Turn off fsync
Add constraints and indexes *after* you've loaded the data (best to add
as much of them as possible on a per-table basis right after the table
is loaded so that it's hopefully still in cache)
Crank up maintenance_work_mem, especially for tables that won't fit into
cache anyway
Bump up checkpoint segments and wal_buffers.
Disable PITR
Create a table and load its data in a single transaction (8.2 will
avoid writing any WAL data if you do this and PITR is turned off)
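
A minimal sketch of how those suggestions might be combined for a load into
an empty 8.x database; all names and values below are illustrative, not
tuned recommendations:

  # postgresql.conf, for the duration of the load only:
  #   fsync = off
  #   checkpoint_segments = 64
  #   wal_buffers = 64
  #   maintenance_work_mem = 524288
  #   archive_command = ''          # i.e. PITR disabled
  psql mydb <<'SQL'
  BEGIN;
  CREATE TABLE big_table (id integer, payload text);
  \copy big_table from '/path/to/data.csv' with csv
  COMMIT;
  -- indexes and constraints only after the data is in, per table
  ALTER TABLE big_table ADD PRIMARY KEY (id);
  SQL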
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Problems using a function in a where clause

2006-10-25 Thread Jim C. Nasby
On Wed, Oct 25, 2006 at 07:55:38AM -0300, Mara Dalponte wrote:
 On 10/24/06, Jim C. Nasby [EMAIL PROTECTED] wrote:
 On Mon, Oct 23, 2006 at 04:54:00PM -0300, Mara Dalponte wrote:
  Hello,
 
  I have a query with several join operations and applying the same
  filter condition over each involved table. This condition is a complex
  predicate over an indexed  timestamp field, depending on some
  parameters.
  To factorize code, I wrote the filter into a plpgsql function, but
  the resulting query is much slower than the first one!
 
 A view would probably be a better idea... or create some code that
 generates the code for you.
 
 Thanks, but the filter function needs some external parameters, so a
 view won't be appropriate. Anyway, your second possibility could work!
 
  The explain command over the original query gives the following info
  for the WHERE clause that uses the filter:
 
  ...
  Index Cond: ((_timestamp >= '2006-02-23 03:00:00'::timestamp without
  time zone) AND (_timestamp <= '2006-02-27 20:00:00.98'::timestamp
  without time zone))
  ...
 
  The explain command for the WHERE clause using the filtering function is:
 
  ...
  Filter: include_time_date('2006-02-23'::date, '2006-02-27'::date,
  '03:00:00'::time without time zone, '20:00:00'::time without time
  zone, (_timestamp)::timestamp without time zone)
  ...
 
  It seems not to be using the index, and I think this is the reason for
  the performance gap between both solutions.
 
 Well, it looks like include_time_date just returns a boolean, so how
 could it use the index?
 
 I mean that in the old query the index is used (because it is a
 comparison condition over an indexed timestamp field), but not in the
 new one, where the function is used. Is there some kind of inline
 function type?

No, unfortunately. Your best bet is to add the most important filter
criteria by hand, or write code that writes the code (which is what I'd
probably do).
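
One way to "write the code that writes the code" is to have a thin wrapper
expand the parameters into a plain, indexable comparison; a sketch with
made-up names, assuming the filter boils down to a timestamp range:

  FROM_TS='2006-02-23 03:00:00'
  TO_TS='2006-02-27 20:00:00'
  psql mydb -c "SELECT count(*) FROM my_table
                WHERE _timestamp >= '$FROM_TS' AND _timestamp <= '$TO_TS';"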
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Merlin Moncure

On 10/25/06, Worky Workerson [EMAIL PROTECTED] wrote:

 I'm guessing the high bursts are checkpoints.  Can you check your log
 files for pg and see if you are getting warnings about checkpoint
 frequency?   You can get some mileage here by increasing wal files.

Nope, nothing in the log.  I have set:
 wal_buffers=128
 checkpoint_segments=128
 checkpoint_timeout=3000
which I thought was rather generous.  Perhaps I should set it even
higher for the loads?

 Have you determined that pg is not swapping?  try upping maintenance_work_mem.

maintenance_work_mem = 524288 ... should I increase it even more?
Doesn't look like pg is swapping ...


nah, you already addressed it.  either pg is swapping or it isn't, and
i'm guessing it isn't.


I'm currently running bonnie++ with the defaults ... should I change
the execution to better mimic Postgres' behavior?


just post what you have...


RHEL 4.3 x86_64
HP DL585, 4 Dual Core Opteron 885s
  16 GB RAM
  2x300GB 10K SCSI320, RAID10
HP MSA1000 SAN direct connected via single 2GB Fibre Channel Arbitrated Loop
  10x300GB 10K SCSI320, RAID10


in theory, with 10 10k disks in raid 10, you should be able to keep
your 2fc link saturated all the time unless your i/o is extremely
random.  random i/o is the wild card here; ideally you should see at
least 2000 seeks in bonnie...let's see what comes up.

hopefully, bonnie will report close to 200 mb/sec.  in extreme
sequential cases, the 2fc link should be a bottleneck if the raid
controller is doing its job.

if you are having cpu issues, try breaking your process down to at
least 4 processes (you have a quad dual-core box after all)...that's a
no-brainer.
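
A rough sketch of such a split, with placeholder file and table names:
break the input into chunks and run several COPY streams in parallel:

  split -l 5000000 my_data.csv chunk_
  for f in chunk_*; do
      psql mydb -c "COPY my_table FROM STDIN" < "$f" &
  done
  wait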

merlin

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[PERFORM] commit so slow program looks frozen

2006-10-25 Thread Carlo Stonebanks
(I tried this question on the interface forum and got no result, but I don't 
know how to tell if it's an interface issue or not)

I have a TCL app which typically takes hours to complete. I found out that 
it is taking longer than it should because it occasionally stalls 
inexplicably (for tens of minutes at a time), then usually continues.

There are a minimum of four apps running at the same time, all reading 
different sections of the same table, all writing to the same db and the 
same tables. The other apps seem unaffected by the one app that freezes.

This happens running pg_exec $conn "commit" from within a TCL script on a 
client app.


The delays are so long that I used to think the app was hopelessly frozen. 
By accident, I left the app alone in its frozen state and came back a good 
deal later and saw that it was running again.

Sometimes I decide it *IS* frozen and have to restart. Because Ctrl-C will 
not cause the script to break, it appears the app is stuck in non-TCL code 
(either waiting for postgres or stuck in the interface code?)

The application loops through an import file, reading one row at a time, and 
issues a bunch of inserts and updates to various tables. There's a simple 
pg_exec $conn "start transaction" at the beginning of the loop and the 
commit at the end. The commit actually appears to be going through.

There are no messages of any significance in the log. There do not appear to 
be any outstanding locks or transactions.

I am not doing any explicit locking, all transaction settings are set to 
default.

Any thoughts on the cause and possible solutions would be appreciated.

Carlo



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Worky Workerson

I'm guessing the high bursts are checkpoints.  Can you check your log
files for pg and see if you are getting warnings about checkpoint
frequency?   You can get some mileage here by increasing wal files.


Nope, nothing in the log.  I have set:
wal_buffers=128
checkpoint_segments=128
checkpoint_timeout=3000
which I thought was rather generous.  Perhaps I should set it even
higher for the loads?


Have you determined that pg is not swapping?  try upping maintenance_work_mem.


maintenance_work_mem = 524288 ... should I increase it even more?
Doesn't look like pg is swapping ...


What exactly is your architecture?  is your database server direct
attached to the san? if so, 2gb/4gb fc?  what san?  have you bonnie++
the san?  basically, you can measure iowait to see if pg is waiting on
your disks.


I'm currently running bonnie++ with the defaults ... should I change
the execution to better mimic Postgres' behavior?

RHEL 4.3 x86_64
HP DL585, 4 Dual Core Opteron 885s
 16 GB RAM
 2x300GB 10K SCSI320, RAID10
HP MSA1000 SAN direct connected via single 2GB Fibre Channel Arbitrated Loop
 10x300GB 10K SCSI320, RAID10

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Luke Lonergan
Merlin,

On 10/25/06 8:38 AM, Merlin Moncure [EMAIL PROTECTED] wrote:

 in theory, with 10 10k disks in raid 10, you should be able to keep
 your 2fc link saturated all the time unless your i/o is extremely
 random.  random i/o is the wild card here, ideally you should see at
 least 2000 seeks in bonnie...lets see what comes up.

The 2000 seeks/sec are irrelevant to Postgres with one user doing COPY.
Because the I/O is single threaded, you will get one disk worth of seeks for
one user, roughly 150/second on a 10K RPM drive.

I suspect the problem here is the sequential I/O rate - let's wait and see
what the dd test results look like.

- Luke



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Craig A. James

Jim C. Nasby wrote:

Wait... so you're using perl to copy data between two tables? And using
a cursor to boot? I can't think of any way that could be more
inefficient...

What's wrong with a plain old INSERT INTO ... SELECT? Or if you really
need to break it into multiple transaction blocks, at least don't
shuffle the data from the database into perl and then back into the
database; do an INSERT INTO ... SELECT with that same where clause.


The data are on two different computers, and I do processing of the data as it 
passes through the application.  Otherwise, the INSERT INTO ... SELECT is my 
first choice.

Craig

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Spiegelberg, Greg
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Craig A. James
 Sent: Wednesday, October 25, 2006 12:52 PM
 To: Jim C. Nasby
 Cc: Worky Workerson; Merlin Moncure; pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] Best COPY Performance
 
 Jim C. Nasby wrote:
  Wait... so you're using perl to copy data between two tables? And 
  using a cursor to boot? I can't think of any way that could be more 
  inefficient...
  
  What's wrong with a plain old INSERT INTO ... SELECT? Or if 
 you really 
  need to break it into multiple transaction blocks, at least don't 
  shuffle the data from the database into perl and then back into the 
  database; do an INSERT INTO ... SELECT with that same where clause.
 
 The data are on two different computers, and I do processing 
 of the data as it passes through the application.  Otherwise, 
 the INSERT INTO ... SELECT is my first choice.

Would dblink() help in any way?

Greg



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Craig A. James

Spiegelberg, Greg wrote:
The data are on two different computers, and I do processing 
of the data as it passes through the application.  Otherwise, 
the INSERT INTO ... SELECT is my first choice.


Would dblink() help in any way?


It might if perl wasn't so damned good at this. ;-)

Craig


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] ACCESS EXCLUSIVE lock

2006-10-25 Thread Atesz

Tom Lane wrote:

This isn't going to be changed, because the likely direction of future
development is that the planner will start making use of constraints
even for SELECT queries.  This means that a DROP CONSTRAINT operation
could invalidate the plan of a SELECT query, so the locking will be
essential.
  

Hi!

I also think constraints can increase query performance, if the planner can 
use them. It will be a great feature in the future! But I have more 
questions about the coherency between a constraint and a transaction. Can a 
constraint live in different isolation levels? If I drop a constraint in a 
transaction (T1), it is no longer visible after the drop operation within 
T1. But it should still be visible in another transaction (T2) running 
concurrently with T1 (if T2 started between T1's begin and commit!). If T2 
starts after T1's commit, the constraint should not be visible in T2, so the 
planner cannot use it. If I understand correctly, this means the constraint 
follows the isolation level of its transaction.


How does it work in the current release?

If a constraint follows its transaction, why would dropping it invalidate 
the plan of a SELECT query?  A SELECT could use a given constraint if it was 
dropped without committing, or if it still exists when the SELECT or the 
transaction of the SELECT starts. I know we have to examine which rows can 
affect the result of the SELECT. The main question in this case is: can a 
wrong row (one which breaks the dropped constraint) affect the result of the 
SELECT? In my opinion there are no wrong rows. Do you know of a special case 
where this can happen, i.e. where wrong rows could show up in the SELECT?


I know my original problem is not too common, but PostgreSQL's parallel 
performance is very important in multiprocessor environments. I see that you 
are moving in this direction, and that 8.2 improves the locking behaviour in 
more cases. Generally, DROP CONSTRAINT operations run on their own or in 
short transactions.


We have an optimization trick for when we have to insert several million 
rows into a table in the same transaction. After the transaction begins we 
drop the foreign key constraints, and we recreate them after the insertions. 
This method is faster than the conventional solution. These transactions are 
longer (5-40 minutes on a SunFire V40z).


I read the TODO list and found several features about deferrability. 
Are you planning to implement deferrable foreign key constraints? If so, 
I think the questions I raise here will touch on that work.


Thank you in anticipation!

Regards,
Antal Attila

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] commit so slow program looks frozen

2006-10-25 Thread Alvaro Herrera
Carlo Stonebanks wrote:

 The delays are so long that I used to think the app was hopelessly frozen. 
 By accident, I left the app alone in its frozen state and came back a good 
 deal later and saw that it was running again.
 
 Sometimes I decide it *IS* frozen and have to restart. Because Ctrl-C will 
 not cause the script to break, it appears the app is stuck in non-TCL code 
 (either waiting for postgres or stuck in the interface code?)

You may try to figure out what's the process doing (the backend
obviously, not the frontend (Tcl) process) by attaching to it with
strace.  Is it doing system calls?  Maybe it's busy reading from or
writing to disk.  Maybe it's swamped by a context switch storm (but in
that case, probably the other processes would be affected as well).

Or you may want to attach to it with GDB and see what the backtrace
looks like.  If nothing obvious pops up, do it several times and compare
them.

I wouldn't expect it to be stuck on locks, because if it's only on
commit, then it probably has all the locks it needs.  But try to see if
you can find something not granted in pg_locks that it may be stuck on.
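
On a Unix-ish system those checks would look roughly like this, where
BACKEND_PID is the pid of the stuck backend and the binary path is a
placeholder:

  strace -p $BACKEND_PID                  # watch the system calls it makes
  gdb /path/to/postgres $BACKEND_PID      # then type "bt" for a backtrace
  psql mydb -c "SELECT * FROM pg_locks WHERE NOT granted;"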

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Worky Workerson

Merlin/Luke:


 in theory, with 10 10k disks in raid 10, you should be able to keep
 your 2fc link saturated all the time unless your i/o is extremely
 random.  random i/o is the wild card here, ideally you should see at
 least 2000 seeks in bonnie...lets see what comes up.



I suspect the problem here is the sequential I/O rate - let's wait and see
what the dd test results look like.


Here are the tests that you suggested that I do, on both the local
disks (WAL) and the SAN (tablespace).  The random seeks seem to be far
below what Merlin said was good, so I am a bit concerned.  There is
a bit of other activity on the box at the moment which is hard to
stop, so that might have had an impact on the processing.

Here is the bonnie++ output:

Version 1.03   --Sequential Output-- --Sequential Input- --Random-
  -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
 Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
Local Disks 31G 45119  85 56548  21 27527   8 35069  66 86506  13 499.6   1
SAN         31G 53544  98 93385  35 18266   5 24970  47 57911   8 611.8   1

And here are the dd results for 16GB RAM, i.e. 4,000,000 8K blocks:
# Local Disks
$ time bash -c "dd if=/dev/zero of=/home/myhome/bigfile bs=8k count=4000000 && sync"
4000000+0 records in
4000000+0 records out

real    10m0.382s
user    0m1.117s
sys     2m45.681s
$ time dd if=/home/myhome/bigfile of=/dev/null bs=8k count=4000000
4000000+0 records in
4000000+0 records out

real    6m22.904s
user    0m0.717s
sys     0m53.766s

# Fibre Channel SAN
$ time bash -c "dd if=/dev/zero of=/data/test/bigfile bs=8k count=4000000 && sync"
4000000+0 records in
4000000+0 records out

real    5m58.846s
user    0m1.096s
sys     2m18.026s
$ time dd if=/data/test/bigfile of=/dev/null bs=8k count=4000000
4000000+0 records in
4000000+0 records out

real    14m9.560s
user    0m0.739s
sys     0m53.806s

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Markus Schaber
Hi, Craig,

Craig A. James wrote:

 Would dblink() help in any way?
 
 It might if perl wasn't so damned good at this. ;-)

You know that you can use Perl inside PostgreSQL via PL/Perl?

HTH,
Markus

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] commit so slow program looks frozen

2006-10-25 Thread Carlo Stonebanks
 You may try to figure out what's the process doing (the backend
 obviously, not the frontend (Tcl) process) by attaching to it with
 strace.

It's so sad when us poor Windows guys get helpful hints from people who assume 
that we're smart enough to run *NIX... ;-)

 Maybe it's swamped by a context switch storm (but in that case, probably 
 the other processes would be affected as well).

What is a context switch storm? (and what a great name for a heavy metal 
rock band!)

Interestingly enough, last night (after the original post) I watched three 
of the processes slow down, one after the other - and then stall for so long 
that I had assumed they had frozen. They were all stalled on a message that 
I had put in the script that indicated they had never returned from a 
commit. I have looked into this, and I believe the commits are actually 
going through.

The remaining 4th process continued to run, and actually picked up speed as 
the CPU gave its cycles over. The Windows task manager shows the postgresql 
processes that (I assume) are associated with the stalled processes as 
consuming zero CPU time.

Sometimes I have seen all of the apps slow down and momentarily freeze at 
the same time... but then continue. I have autovacuum off, although 
stats_row_level and stats_start_collector remain on (I thought these were 
only relevant if autovacuum was on).

I have seen the apps slow down (and perhaps stall) when specific tables 
have vacuum/analyze running, and that makes sense. I did notice that on one 
occasion a frozen app came back to life after I shut down EMS PostgreSQL 
Manager in another session. Maybe a coincidence, or maybe an indication that 
the apps are straining resources... on a box with two twin-core XEONs and 
4GB of memory? Mind you, the config file is configured for the database 
loading phase we are in now - with lots of resources devoted to a few 
connections.

 I wouldn't expect it to be stuck on locks, because if it's only on
 commit, then it probably has all the locks it needs.  But try to see if
 you can find something not granted in pg_locks that it may be stuck on.

Looking at the pgadmin server status pages, no locks or transactions are 
pending when this happens.

Carlo 



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Luke Lonergan
Mr. Worky,

On 10/25/06 11:26 AM, Worky Workerson [EMAIL PROTECTED] wrote:

 And here are the dd results for 16GB RAM, i.e. 4,000,000 8K blocks:

So, if we divide 32,000 MB by the real time, we get:

/home (WAL):
53 MB/s write
84 MB/s read

/data (data):
89 MB/s write
38 MB/s read

The write and read speeds on /home look like a single disk drive, which is
not good if you have more drives in a RAID.  OTOH, it should be sufficient
for WAL writing, so the COPY speed shouldn't be limited by WAL.

The read speed on your /data volume is awful to the point where you should
consider it broken and find a fix.  A quick comparison: the same number on a
16 drive internal SATA array with 7200 RPM disks gets 950 MB/s read, about
25 times faster for about 1/4 the price.

But again, this may not have anything to do with the speed of your COPY
statements.

Can you provide about 10 seconds worth of vmstat 1 while running your COPY
so we can get a global view of the I/O and CPU?
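
That capture is simply the following, run in another terminal while the COPY
is going:

  vmstat 1 10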

- Luke



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] commit so slow program looks frozen

2006-10-25 Thread Carlo Stonebanks

 I have a question
for you: did you have a long running query keeping open a transaction?
I've just noticed the same problem here, but things cleaned up immediately when
I aborted the long-running transaction.



No, the only processes are from those in
the import applications themselves: short transactions never lasting more than
a fraction of a second.



Carlo








[PERFORM] Configuration Issue ?

2006-10-25 Thread Mark Lonsdale

Hi 



Please help. I have got a postgres 7.3.4
database running on RedHat ES 3, with 8GB of physical memory in it.
The machine is shared with my application which is pretty intensive
in doing selects and updates against the database, but there are usually no
more than 10 connections to the database at any time.



Despite having 8GB of RAM on the machine, the machine
is frequently running out of physical memory and swapping, which is hurting
performance. I have read around on various message boards, and
I suspect that the SHARED_BUFFERS setting on this server is set way too high,
and that this in fact may be hurting performance. My
current configuration settings are as follows:



shared_buffers = 393216        # min max_connections*2 or 16, 8KB each

max_fsm_relations = 1          # min 10, fsm is free space map, ~40 bytes

max_fsm_pages = 160001         # min 1000, fsm is free space map, ~6 bytes

sort_mem = 409600              # min 64, size in KB

vacuum_mem = 81920             # min 1024, size in KB


From what I've read, I've not seen anyone recommend a
SHARED_BUFFERS setting higher than 50,000. Is a setting of 393216
going to cause significant problems, or does this sound about right on an 8GB
system, bearing in mind that I'd like to reserve at least a couple of GB for my
application?
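
A quick back-of-the-envelope check of what those settings actually reserve,
at 8 KB per buffer:

  echo $(( 393216 * 8 / 1024 ))   # shared_buffers: 3072 MB, ~3 GB of the 8 GB
  echo $(( 409600 / 1024 ))       # sort_mem: 400 MB *per sort*, per backend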



Also, if you have any recommendations regarding effective_cache_size
I'd be interested, as from what I've read this sounds important as well.



Thanks



Mark 


Re: [PERFORM] Configuration Issue ?

2006-10-25 Thread Joshua D. Drake
Mark Lonsdale wrote:
  
 
 Hi 
 
  
 
 Please help.  I have got a postgres 7.3.4 database running on RedHat ES
 3, with 8GB of physical memory in it.   The machine is shared with my
 application which is pretty intensive in doing selects and updates
 against the database, but there are usually no more than 10 connections
 to the database at any time.
 
 
 shared_buffers = 393216 # min max_connections*2 or 16, 8KB each

The above is likely hurting you more than helping you with 7.3.

 
 max_fsm_relations = 1   # min 10, fsm is free space map, ~40
 bytes
 
 max_fsm_pages = 160001  # min 1000, fsm is free space map, ~6
 
 bytes
 
 sort_mem = 409600   # min 64, size in KB

The above will likely kill you :). Try 4096 or 8192, maybe 16384
depending on workload.

 
 vacuum_mem = 81920  # min 1024, size in KB

This is fine.

 
 Also if you have any recommendations regarding effective_cache_size Id
 be interested as reading around this sounds important as well

About 20-25% of available ram for 7.3.


The long and short is you need to upgrade to at least 7.4, preferably 8.1.

Joshua D. Drake



 
  
 
 Thanks
 
  
 
 Mark 
 
  
 
 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] commit so slow program looks frozen

2006-10-25 Thread Brian Hurt




Carlo Stonebanks wrote:

  
You may try to figure out what's the process doing (the backend
obviously, not the frontend (Tcl) process) by attaching to it with
strace.

  
  
It's so sad when us poor Windows guys get helpful hints from people who assume 
that we're smart enough to run *NIX... ;-)

  
  
Maybe it's swamped by a context switch storm (but in that case, probably 
the other processes would be affected as well).

  
  
What is a context switch storm? (and what a great name for a heavy metal 
rock band!)

Interestingly enough, last night (after the original post) I watched three 
of the processes slow down, one after the other - and then stall for so long 
that I had assumed they had frozen. They were all stalled on a message that 
I had put in the script that indicated they had never returned from a 
commit. I have looked into this, and I believe the commits are actually 
going through.
  

I have a question for you: did you have a long running query keeping
open a transaction? I've just noticed the same problem here, but
things cleaned up immediately when I aborted the long-running
transaction.

Note that in my case the long-running transaction wasn't idle in
transaction, it was just doing a whole lot of work.

Brian





Re: [PERFORM] Configuration Issue ?

2006-10-25 Thread Mark Lonsdale


Hi Josh

Thanks for the feedback, that is most useful.  When you said one of the
settings was likely killing us, was it all of the settings for
max_fsm_relations, max_fsm_pages, and sort_mem, or just the setting for
sort_mem?

Can you explain why the setting would be killing me :-)

Thanks

Mark

-Original Message-
From: Joshua D. Drake [mailto:[EMAIL PROTECTED] 
Sent: 25 October 2006 21:52
To: Mark Lonsdale
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Configuration Issue ?

Mark Lonsdale wrote:
  
 
 Hi 
 
  
 
 Please help.  I have got a postgres 7.3.4 database running on RedHat
ES
 3, with 8GB of physical memory in it.   The machine is shared with my
 application which is pretty intensive in doing selects and updates
 against the database, but there are usually no more than 10
connections
 to the database at any time.
 
 
 shared_buffers = 393216 # min max_connections*2 or 16, 8KB
each

The above is likely hurting you more than helping you with 7.3.

 
 max_fsm_relations = 1   # min 10, fsm is free space map, ~40
 bytes
 
 max_fsm_pages = 160001  # min 1000, fsm is free space map, ~6
 
 bytes
 
 sort_mem = 409600   # min 64, size in KB

The above will likely kill you :). Try 4096 or 8192, maybe 16384
depending on workload.

 
 vacuum_mem = 81920  # min 1024, size in KB

This is fine.

 
 Also if you have any recommendations regarding effective_cache_size Id
 be interested as reading around this sounds important as well

About 20-25% of available ram for 7.3.


The long and short is you need to upgrade to at least 7.4, preferrably
8.1.

Joshua D. Drake



 
  
 
 Thanks
 
  
 
 Mark 
 
  
 
 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Configuration Issue ?

2006-10-25 Thread Richard Huxton

Mark Lonsdale wrote:


Hi Josh

Thanks for the feedback, that is most usefull.  When you said one of the
settings was likely killing us, was it all of the settings for
max_fsm_relations, max_fsm_pages, and sort_mem or just the setting for
sort_mem ?

Can you explain why the setting would be killing me :-)


The sort_mem is crucial. It's memory *per sort*, which means one query 
can use several times that amount.



The long and short is you need to upgrade to at least 7.4, preferrably
8.1.


Joshua means this too. Upgrade to 7.3.16 within the next few days, then 
test out something more recent. You should see some useful performance 
gains from 8.1.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Configuration Issue ?

2006-10-25 Thread Joshua D. Drake
Richard Huxton wrote:
 Mark Lonsdale wrote:

 Hi Josh

 Thanks for the feedback, that is most usefull.  When you said one of the
 settings was likely killing us, was it all of the settings for
 max_fsm_relations, max_fsm_pages, and sort_mem or just the setting for
 sort_mem ?

 Can you explain why the setting would be killing me :-)
 
 The sort_mem is crucial. It's memory *per sort*, which means one query
 can use several times that amount.

Worse than that, it is:

((sort memory) * (number of sorts)) * (number of connections) = amount
of ram possible to use.

Now... take the following query:

SELECT * FROM foo
  JOIN bar on (bar.id = foo.id)
  JOIN baz on (baz.id = foo_baz.id)
ORDER BY baz.name, foo.salary;

Over 5 million rows... How much RAM do you think you just used?
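
A back-of-the-envelope sketch using the settings from the original post; the
number of sorts is a guess:

  # 400 MB sort_mem * 2 sorts in the plan * 10 connections
  echo $(( 400 * 2 * 10 ))        # => 8000 MB, i.e. the entire 8 GB machine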

 
 The long and short is you need to upgrade to at least 7.4, preferrably
 8.1.
 
 Joshua means this too. Upgrade to 7.3.16 within the next few days, then
 test out something more recent. You should see some useful performance
 gains from 8.1.

Right. The reason I suggested 7.4 is that he gets VACUUM VERBOSE in a
reasonable fashion but of course 8.1 is better.

Sincerely,

Joshua D. Drake


 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Configuration Issue ?

2006-10-25 Thread Mark Lonsdale


Thanks guys, I think we'll certainly look to get the app certified with
7.4 and 8.x, but that may take a little while.   In the interim, I'm
thinking of making the following changes:

Change Shared_buffers from 393216 to 80,000 ( ~15% of 4GB of RAM.
Server is 8GB but I want to leave space for App as well )

Set my effective_cache_size to 125,000 ( ~25% of 4GB of RAM )

Set my sort_mem to 8192

Do those numbers look a bit better?   Will probably see if we can make
these changes asap as the server is struggling a bit now, which doesn't
really make sense given how much memory is in it.

Really appreciate your help and fast turnaround on this

Mark

-Original Message-
From: Joshua D. Drake [mailto:[EMAIL PROTECTED] 
Sent: 25 October 2006 22:17
To: Richard Huxton
Cc: Mark Lonsdale; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Configuration Issue ?

Richard Huxton wrote:
 Mark Lonsdale wrote:

 Hi Josh

 Thanks for the feedback, that is most usefull.  When you said one of
the
 settings was likely killing us, was it all of the settings for
 max_fsm_relations, max_fsm_pages, and sort_mem or just the setting
for
 sort_mem ?

 Can you explain why the setting would be killing me :-)
 
 The sort_mem is crucial. It's memory *per sort*, which means one query
 can use several times that amount.

Worse then that it is:

((sort memory) * (number of sorts)) * (number of connections) = amount
of ram possible to use.

Now... take the following query:

SELECT * FROM foo
  JOIN bar on (bar.id = foo.id)
  JOIN baz on (baz.id = foo_baz.id)
ORDER BY baz.name, foo.salary;

Over 5 million rows... How much ram you think you just used?

 
 The long and short is you need to upgrade to at least 7.4,
preferrably
 8.1.
 
 Joshua means this too. Upgrade to 7.3.16 within the next few days,
then
 test out something more recent. You should see some useful performance
 gains from 8.1.

Right. The reason I suggested 7.4 is that he gets VACUUM VERBOSE in a
reasonable fashion but of course 8.1 is better.

Sincerely,

Joshua D. Drake


 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Configuration Issue ?

2006-10-25 Thread Joshua D. Drake
Mark Lonsdale wrote:
 
 Thanks guys, I think we'll certainly look to get the app certified with
 7.4 and 8.x but that may take a little while.   In the interim, Im
 thinking of making the following changes then:-
 
 Change Shared_buffers from 393216 to 80,000 ( ~15% of 4GB of RAM.
 Server is 8GB but I want to leave space for App as well )

You'll likely run into issues with anything over 16384. I have never seen a
benefit from shared_buffers over 12k or so with 7.3.

 
 Set my effective_cache_size to 125,000 ( ~25% of 4GB of RAM )
 
 Set my sort_mem to 8192

:)

Sincerely,

Joshua D. Drake


 
 Do those numbers look a bit better?   Will probably see if we can make
 these changes asap as the server is struggling a bit now, which doesn't
 really make sense given how much memory is in it.
 
 Really appreciate your help and fast turnaround on this
 
 Mark
 
 -Original Message-
 From: Joshua D. Drake [mailto:[EMAIL PROTECTED] 
 Sent: 25 October 2006 22:17
 To: Richard Huxton
 Cc: Mark Lonsdale; pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] Configuration Issue ?
 
 Richard Huxton wrote:
 Mark Lonsdale wrote:
 Hi Josh

 Thanks for the feedback, that is most usefull.  When you said one of
 the
 settings was likely killing us, was it all of the settings for
 max_fsm_relations, max_fsm_pages, and sort_mem or just the setting
 for
 sort_mem ?

 Can you explain why the setting would be killing me :-)
 The sort_mem is crucial. It's memory *per sort*, which means one query
 can use several times that amount.
 
 Worse then that it is:
 
 ((sort memory) * (number of sorts)) * (number of connections) = amount
 of ram possible to use.
 
 Now... take the following query:
 
 SELECT * FROM foo
   JOIN bar on (bar.id = foo.id)
   JOIN baz on (baz.id = foo_baz.id)
 ORDER BY baz.name, foo.salary;
 
 Over 5 million rows... How much ram you think you just used?
 
 The long and short is you need to upgrade to at least 7.4,
 preferrably
 8.1.
 Joshua means this too. Upgrade to 7.3.16 within the next few days,
 then
 test out something more recent. You should see some useful performance
 gains from 8.1.
 
 Right. The reason I suggested 7.4 is that he gets VACUUM VERBOSE in a
 reasonable fashion but of course 8.1 is better.
 
 Sincerely,
 
 Joshua D. Drake
 
 
 
 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] commit so slow program looks frozen

2006-10-25 Thread Scott Marlowe
On Wed, 2006-10-25 at 15:07, Carlo Stonebanks wrote:
  You may try to figure out what's the process doing (the backend
  obviously, not the frontend (Tcl) process) by attaching to it with
  strace.
 
 It's so sad when us poor Windows guys get helpful hints from people who assume 
 that we're smart enough to run *NIX... ;-)

You should try a google search on strace and NT or windows or XP...  I
was surprised how many various implementations of it I found.

 
  Maybe it's swamped by a context switch storm (but in that case, probably 
  the other processes would be affected as well).
 
 What is a context switch storm? (and what a great name for a heavy metal 
 rock band!)

I can just see the postgresql group getting together at the next
O'Reilly conference and creating that band.  And it will all be your
fault.

A context switch storm is when your machine spends more time trying to
figure out what to do than actually doing anything.  The CPU spends more
of its time switching between programs than actually running them.
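
On Linux/Unix this shows up in the "cs" column of vmstat; Windows exposes a
similar counter in Performance Monitor. A sustained rate in the tens of
thousands per second usually indicates a storm:

  vmstat 1 10     # watch the "cs" column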


 I have seen the apps slow down (and perhaps stall) when specific tables 
 have vacuum/analyze running, and that makes sense. I did notice that on one 
 occasion a frozen app came back to life after I shut down EMS PostgreSQL 
 Manager in another session. Maybe a coincidence, or maybe an indication that 
 the apps are straining resources... on a box with two twin-core XEONs and 
 4GB of memory? Mind you, the config file is configured for the database 
 loading phase we are in now - with lots of resources devoted to a few 
 connections.

Seeing as PostgreSQL runs one thread / process per connection, it's
pretty unlikely that the problem here is one hungry thread.  Do all
four CPUs show busy, or just one?  Do you have a way of measuring how
much time is spent waiting on I/O on a windows machine like top / vmstat
does in unix?

Is it possible your machine is going into a swap storm?  i.e. you've
used all physical memory somehow and it's swapping out?  If your current
configuration is too aggressive on sort / work mem then it can happen
with only a few connections.  

Note that if you have an import process that needs a big chunk of
memory, you can set just that one connection to use a large setting and
leave the default smaller.
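
For example, the import connection could be given a larger sort memory for
just its own session; a sketch using the 8.x setting name (it is sort_mem on
older releases), with an illustrative value:

  psql mydb <<'SQL'
  SET work_mem = 262144;   -- 256 MB for this session only
  -- ... run the import statements here ...
  SQL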

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] commit so slow program looks frozen

2006-10-25 Thread Joshua D. Drake
 
 Maybe it's swamped by a context switch storm (but in that case, probably 
 the other processes would be affected as well).
 What is a context switch storm? (and what a great name for a heavy metal 
 rock band!)
 
 I can just see the postgresql group getting together at the next
 O'Reilley's conference and creating that band.  And it will all be your
 fault.

Well now you let the secret out!

Joshua D. Drake

-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Jim C. Nasby
On Wed, Oct 25, 2006 at 11:25:01AM -0400, Worky Workerson wrote:
 I'm guessing the high bursts are checkpoints.  Can you check your log
 files for pg and see if you are getting warnings about checkpoint
 frequency?   You can get some mileage here by increasing wal files.
 
 Nope, nothing in the log.  I have set:
 wal_buffers=128
 checkpoint_segments=128
 checkpoint_timeout=3000
 which I thought was rather generous.  Perhaps I should set it even
 higher for the loads?

But depending on your shared_buffers and bgwriter settings (as well as
how much WAL traffic you're generating), you could still end up with big
slugs of work to be done when checkpoints happen.

If you set checkpoint_warning to 3001, you'll see exactly when
checkpoints are happening, so you can determine if that's an issue.
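
That is a one-line change in postgresql.conf, after which the warnings show
up in the server log (the log path below is a placeholder):

  # postgresql.conf:  checkpoint_warning = 3001
  grep -i checkpoint /path/to/postgresql.log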
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] commit so slow program looks frozen

2006-10-25 Thread Jim C. Nasby
On Wed, Oct 25, 2006 at 04:32:16PM -0400, Carlo Stonebanks wrote:
  I have a question for you: did you have a long running query keeping open
 a transaction?  I've just noticed the same problem here, but things cleaned
 up immediately when I aborted the long-running transaction.
 
 No, the only processes are from those in the import applications themselves:
 short transactions never lasting more than a fraction of a second.

Do you have a linux/unix machine you could reproduce this on?
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster