Re: [PERFORM] Best COPY Performance
On 23 Oct 2006, at 22:59, Jim C. Nasby wrote:
> http://stats.distributed.net used to use a perl script to do some
> transformations before loading data into the database. IIRC, when we
> switched to using C we saw 100x improvement in speed, so I suspect
> that if you want performance perl isn't the way to go. I think you
> can compile perl into C, so maybe that would help some.

http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=perl&lang2=gcc

100x doesn't sound totally impossible if that is even vaguely accurate
and you happen to be using bits of Perl which are a lot slower than the
C implementation would be... The slowest cases appear to involve
calling functions; all of the slowest tests make lots of function
calls.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Best COPY Performance
On 10/25/06, Craig A. James [EMAIL PROTECTED] wrote:
> Jim C. Nasby wrote:
> > Well, given that perl is using an entire CPU, it sounds like you
> > should start looking either at ways to remove some of the overhead
> > from perl, or to split that perl into multiple processes.
>
> I use Perl for big database copies (usually with some
> processing/transformation along the way) and I've never seen 100% CPU
> usage except for brief periods, even when copying BLOBS and such. My
> typical copy divides operations into blocks, for example doing ...

I'm just doing CSV style transformations (and calling a lot of
functions along the way), but the end result is a straight bulk load
of data into a blank database. And we've established that Postgres can
do *way* better than what I am seeing, so it's not surprising that
perl is using 100% of a CPU.

However, I am still curious as to the rather slow COPYs from psql to
local disks. Like I mentioned previously, I was only seeing about
5.7 MB/s (1.8 GB / 330 seconds), where it seemed like others were
doing substantially better. What sorts of things should I look into?

Thanks!

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [PERFORM] Best COPY Performance
Hi, Worky,

Worky Workerson wrote:
> $ psql -c "COPY my_table TO STDOUT" > my_data
> $ ls my_data
> 2018792 edgescape_pg_load
> $ time cat my_data | psql -c "COPY mytable FROM STDIN"
> real    5m43.194s
> user    0m35.412s
> sys     0m9.567s

That's via psql, and you get about 5 MB/sec.

> > On a table with no indices, triggers and constraints, we managed to
> > COPY about 7-8 megabytes/second with psql over our 100 MBit
> > network, so here the network was the bottleneck.
>
> hmm, this makes me think that either my PG config is really lacking,
> or that the SAN is badly misconfigured, as I would expect it to
> outperform a 100Mb network. As it is, with a straight pipe to psql
> COPY, I'm only working with a little over 5.5 MB/s. Could this be due
> to the primary key index updates?

Yes, index updates cause both CPU load, and random disk access (which
is slow by nature).

HTH,
Markus
--
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Best COPY Performance
On 10/23/06, Worky Workerson [EMAIL PROTECTED] wrote:
> The disk load is where I start to get a little fuzzy, as I haven't
> played with iostat to figure out what is normal. The local drives
> contain PG_DATA as well as all the log files, but there is a
> tablespace on the FibreChannel SAN that contains the destination
> table. The disk usage pattern that I see is that there is a ton of
> consistent activity on the local disk, with iostat reporting an
> average of 30K Blk_wrtn/s, which I assume is the log files. Every
> several seconds there is a massive burst of activity on the FC
> partition, to the tune of 250K Blk_wrtn/s.
>
> > On a table with no indices, triggers and constraints, we managed to
> > COPY about 7-8 megabytes/second with psql over our 100 MBit
> > network, so here the network was the bottleneck.

I'm guessing the high bursts are checkpoints. Can you check your log
files for pg and see if you are getting warnings about checkpoint
frequency? You can get some mileage here by increasing wal files.

Have you determined that pg is not swapping? Try upping
maintenance_work_mem.

What exactly is your architecture? Is your database server direct
attached to the san? If so, 2gb/4gb fc? What san? Have you bonnie++'d
the san? Basically, you can measure iowait to see if pg is waiting on
your disks.

Regarding perl, imo the language performance is really about which
libraries you use. The language itself is plenty fast.

merlin

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
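To make Merlin's checkpoint check concrete: one way to see when
checkpoints happen is to inspect the checkpoint settings and raise
checkpoint_warning so that segment-driven checkpoints get logged. A
minimal sketch, assuming an 8.x server; the values shown are
illustrative, not recommendations:

    -- Inspect the current checkpoint/WAL settings:
    SELECT name, setting
      FROM pg_settings
     WHERE name IN ('checkpoint_segments', 'checkpoint_timeout',
                    'checkpoint_warning', 'wal_buffers');

    -- Then, in postgresql.conf (needs a reload; not per-session):
    --   checkpoint_segments = 128  -- more segments, fewer forced checkpoints
    --   checkpoint_warning  = 300  -- log checkpoints that occur closer
    --                              -- together than this many seconds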
Re: [PERFORM] Best COPY Performance
Mr. Worky Workerson,

On 10/25/06 5:03 AM, Worky Workerson [EMAIL PROTECTED] wrote:
> However, I am still curious as to the rather slow COPYs from psql to
> local disks. Like I mentioned previously, I was only seeing about
> 5.7 MB/s (1.8 GB / 330 seconds), where it seemed like others were
> doing substantially better. What sorts of things should I look into?

It's probable that you have a really poorly performing disk
configuration. Judging from earlier results, you may only be getting
3 x 5.7 = 17 MB/s of write performance to your disks, which is about
1/4 of a single disk drive.

Please run this test and report the time here:

1) Calculate the size of 2x memory in 8KB blocks:
   # of blocks = 250,000 x memory_in_GB
   Example: 250,000 x 16GB = 4,000,000 blocks

2) Benchmark the time taken to write 2x RAM sequentially to your disk:
   time bash -c "dd if=/dev/zero of=bigfile bs=8k count=<# of blocks> && sync"

3) Benchmark the time taken to read same:
   time dd if=bigfile of=/dev/null bs=8k

- Luke

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
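For anyone checking Luke's constant, it falls straight out of the
block size (my arithmetic, not from the original mail):

    1 GB / 8 KB = 1,073,741,824 / 8,192 = 131,072 blocks per GB
    2 x that    = 262,144 blocks per GB of RAM, rounded down to 250,000

The 2x-RAM file size is presumably chosen so the read test cannot be
satisfied from the OS page cache.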
Re: [PERFORM] Best COPY Performance
On Tue, Oct 24, 2006 at 10:36:04PM -0700, Craig A. James wrote:
> Jim C. Nasby wrote:
> > Well, given that perl is using an entire CPU, it sounds like you
> > should start looking either at ways to remove some of the overhead
> > from perl, or to split that perl into multiple processes.
>
> I use Perl for big database copies (usually with some
> processing/transformation along the way) and I've never seen 100% CPU
> usage except for brief periods, even when copying BLOBS and such. My
> typical copy divides operations into blocks, for example doing
>
>    N = 0
>    while (more rows to go) {
>       begin transaction
>       select ... where primary_key > N order by primary_key limit 1000
>       while (fetch a row)
>          insert into ...
>       N = (highest value found in last block)
>       commit
>    }
>
> Doing it like this in Perl should keep Postgres busy, with Perl using
> only moderate resources. If you're seeing high Perl CPU usage, I'd
> look first at the Perl code.

Wait... so you're using perl to copy data between two tables? And
using a cursor to boot? I can't think of any way that could be more
inefficient...

What's wrong with a plain old INSERT INTO ... SELECT? Or if you really
need to break it into multiple transaction blocks, at least don't
shuffle the data from the database into perl and then back into the
database; do an INSERT INTO ... SELECT with that same where clause.
--
Jim Nasby                [EMAIL PROTECTED]
EnterpriseDB             http://enterprisedb.com      512.569.9461 (cell)

---(end of broadcast)---
TIP 6: explain analyze is your friend
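For reference, a minimal sketch of the set-based rewrite Jim is
describing; the table and column names (src, dst, pk) are hypothetical
stand-ins, not from the thread:

    -- One batch per transaction; the rows never leave the server.
    BEGIN;
    INSERT INTO dst
        SELECT * FROM src
        WHERE pk > 0 AND pk <= 1000;  -- next batch: pk > 1000 AND pk <= 2000
    COMMIT;

The driving loop that advances the key range can stay in Perl (or any
client); only the WHERE bounds change per iteration, and no row data
crosses the wire.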
Re: [PERFORM] Best COPY Performance
On Wed, Oct 25, 2006 at 08:03:38AM -0400, Worky Workerson wrote:
> I'm just doing CSV style transformations (and calling a lot of
> functions along the way), but the end result is a straight bulk load
> of data into a blank database. And we've established that Postgres
> can do *way* better than what I am seeing, so it's not surprising
> that perl is using 100% of a CPU.

If you're loading into an empty database, there are a number of tricks
that will help you:

- Turn off fsync
- Add constraints and indexes *after* you've loaded the data (best to
  add as many of them as possible on a per-table basis right after the
  table is loaded, so that it's hopefully still in cache)
- Crank up maintenance_work_mem, especially for tables that won't fit
  into cache anyway
- Bump up checkpoint_segments and wal_buffers
- Disable PITR
- Create a table and load its data in a single transaction (8.2 will
  avoid writing any WAL data if you do this and PITR is turned off)
--
Jim Nasby                [EMAIL PROTECTED]
EnterpriseDB             http://enterprisedb.com      512.569.9461 (cell)

---(end of broadcast)---
TIP 6: explain analyze is your friend
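A minimal sketch of Jim's recipe for a single table, assuming an
8.1-era server; the table and file names (items, /tmp/items.csv) are
made up for illustration, and COPY FROM a server-side file requires
superuser:

    -- postgresql.conf for the duration of the load (not per-session):
    --   fsync = off
    --   checkpoint_segments = 128
    --   wal_buffers = 128
    SET maintenance_work_mem = 524288;  -- KB, for the index builds below

    BEGIN;
    CREATE TABLE items (id integer, payload text);
    COPY items FROM '/tmp/items.csv' WITH CSV;
    COMMIT;

    -- Constraints and indexes only after the data is in:
    ALTER TABLE items ADD PRIMARY KEY (id);
    CREATE INDEX items_payload_idx ON items (payload);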
Re: [PERFORM] Problems using a function in a where clause
On Wed, Oct 25, 2006 at 07:55:38AM -0300, Mara Dalponte wrote:
> On 10/24/06, Jim C. Nasby [EMAIL PROTECTED] wrote:
> > On Mon, Oct 23, 2006 at 04:54:00PM -0300, Mara Dalponte wrote:
> > > Hello,
> > >
> > > I have a query with several join operations, applying the same
> > > filter condition over each involved table. This condition is a
> > > complex predicate over an indexed timestamp field, depending on
> > > some parameters. To avoid duplicating code, I wrote the filter
> > > into a plpgsql function, but the resulting query is much slower
> > > than the first one!
> >
> > A view would probably be a better idea... or create some code that
> > generates the code for you.
>
> Thank you, but the filter function needs some external parameters, so
> a view won't be appropriate. Anyway, your second possibility could
> work!
>
> The EXPLAIN output for the original query gives the following info
> for the WHERE clause:
>
>   ... Index Cond: ((_timestamp >= '2006-02-23 03:00:00'::timestamp
>   without time zone) AND (_timestamp <= '2006-02-27
>   20:00:00.98'::timestamp without time zone)) ...
>
> The EXPLAIN output for the WHERE clause using the filtering function
> is:
>
>   ... Filter: include_time_date('2006-02-23'::date,
>   '2006-02-27'::date, '03:00:00'::time without time zone,
>   '20:00:00'::time without time zone, (_timestamp)::timestamp
>   without time zone) ...
>
> It seems not to be using the index, and I think this is the reason
> for the performance gap between both solutions.
>
> > Well, it looks like include_time_date just returns a boolean, so
> > how could it use the index?
>
> I mean that in the old query the index is used (because it is a
> comparison over an indexed timestamp field), but not in the new one,
> where the function is used. Is there some kind of inline function
> type?

No, unfortunately. Your best bet is to add the most important filter
criteria by hand, or write code that writes the code (which is what
I'd probably do).
--
Jim Nasby                [EMAIL PROTECTED]
EnterpriseDB             http://enterprisedb.com      512.569.9461 (cell)

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
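For the archives, a sketch of the "code that writes the code" idea,
assuming PostgreSQL 8.1 or later (for plpgsql's EXECUTE ... INTO); the
table, column, and function names here are hypothetical:

    CREATE OR REPLACE FUNCTION count_in_window(d1 date, d2 date,
                                               t1 time, t2 time)
    RETURNS bigint AS $$
    DECLARE
        n bigint;
    BEGIN
        -- Build the predicate as plain comparisons on the indexed
        -- column, so the planner can use the timestamp index:
        EXECUTE 'SELECT count(*) FROM my_table '
                || 'WHERE _timestamp >= ' || quote_literal((d1 + t1)::text)
                || ' AND _timestamp <= '  || quote_literal((d2 + t2)::text)
        INTO n;
        RETURN n;
    END;
    $$ LANGUAGE plpgsql;

The same trick works from any client language: generate the WHERE text
with the parameters already folded in, and the query plans exactly
like the hand-written version.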
Re: [PERFORM] Best COPY Performance
On 10/25/06, Worky Workerson [EMAIL PROTECTED] wrote:
> > I'm guessing the high bursts are checkpoints. Can you check your
> > log files for pg and see if you are getting warnings about
> > checkpoint frequency? You can get some mileage here by increasing
> > wal files.
>
> Nope, nothing in the log. I have set:
>
>   wal_buffers=128
>   checkpoint_segments=128
>   checkpoint_timeout=3000
>
> which I thought was rather generous. Perhaps I should set it even
> higher for the loads?
>
> > Have you determined that pg is not swapping? Try upping
> > maintenance_work_mem.
>
> maintenance_work_mem = 524288 ... should I increase it even more?
> Doesn't look like pg is swapping ...

nah, you already addressed it. either pg is swapping or it isn't, and
i'm guessing it isn't.

> I'm currently running bonnie++ with the defaults ... should I change
> the execution to better mimic Postgres' behavior?

just post what you have...

> RHEL 4.3 x86_64
> HP DL585, 4 Dual Core Opteron 885s
> 16 GB RAM
> 2x300GB 10K SCSI320, RAID10
> HP MSA1000 SAN direct connected via single 2GB Fibre Channel
> Arbitrated Loop
> 10x300GB 10K SCSI320, RAID10

in theory, with 10 10k disks in raid 10, you should be able to keep
your 2fc link saturated all the time unless your i/o is extremely
random. random i/o is the wild card here; ideally you should see at
least 2000 seeks in bonnie... let's see what comes up. hopefully,
bonnie will report close to 200 mb/sec. in extreme sequential cases,
the 2fc link should be the bottleneck if the raid controller is doing
its job.

if you are having cpu issues, try breaking your process down into at
least 4 processes (you have a quad dual-core box, after all)... that's
a no-brainer.

merlin

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not match
[PERFORM] commit so slow program looks frozen
(I tried this question on the interface forum and got no result, but I
don't know how to tell if it's an interface issue or not.)

I have a TCL app which typically takes hours to complete. I found out
that it is taking longer than it should because it occasionally stalls
inexplicably (for tens of minutes at a time) then usually continues.
There are a minimum of four apps running at the same time, all reading
different sections of the same table, all writing to the same db and
the same tables. The other apps seem unaffected by the one app that
freezes.

This happens running pg_exec $conn "commit" from within a TCL script
on a client app. The delays are so long that I used to think the app
was hopelessly frozen. By accident, I left the app alone in its frozen
state and came back a good deal later and saw that it was running
again. Sometimes I decide it *IS* frozen and have to restart. Because
Ctrl-C will not cause the script to break, it appears the app is stuck
in non-TCL code (either waiting for postgres or stuck in the interface
code?)

The application loops through an import file, reading one row at a
time, and issues a bunch of inserts and updates to various tables.
There's a simple pg_exec $conn "start transaction" at the beginning of
the loop and the commit at the end. The commit actually appears to be
going through.

There are no messages of any significance in the log. There do not
appear to be any outstanding locks or transactions. I am not doing any
explicit locking; all transaction settings are set to default.

Any thoughts on the cause and possible solutions would be appreciated.

Carlo

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
Re: [PERFORM] Best COPY Performance
> I'm guessing the high bursts are checkpoints. Can you check your log
> files for pg and see if you are getting warnings about checkpoint
> frequency? You can get some mileage here by increasing wal files.

Nope, nothing in the log. I have set:

  wal_buffers=128
  checkpoint_segments=128
  checkpoint_timeout=3000

which I thought was rather generous. Perhaps I should set it even
higher for the loads?

> Have you determined that pg is not swapping? Try upping
> maintenance_work_mem.

maintenance_work_mem = 524288 ... should I increase it even more?
Doesn't look like pg is swapping ...

> What exactly is your architecture? Is your database server direct
> attached to the san? If so, 2gb/4gb fc? What san? Have you bonnie++'d
> the san? Basically, you can measure iowait to see if pg is waiting on
> your disks.

I'm currently running bonnie++ with the defaults ... should I change
the execution to better mimic Postgres' behavior?

RHEL 4.3 x86_64
HP DL585, 4 Dual Core Opteron 885s
16 GB RAM
2x300GB 10K SCSI320, RAID10
HP MSA1000 SAN direct connected via single 2GB Fibre Channel
Arbitrated Loop
10x300GB 10K SCSI320, RAID10

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your message can
get through to the mailing list cleanly
Re: [PERFORM] Best COPY Performance
Merlin,

On 10/25/06 8:38 AM, Merlin Moncure [EMAIL PROTECTED] wrote:
> in theory, with 10 10k disks in raid 10, you should be able to keep
> your 2fc link saturated all the time unless your i/o is extremely
> random. random i/o is the wild card here; ideally you should see at
> least 2000 seeks in bonnie... let's see what comes up.

The 2000 seeks/sec are irrelevant to Postgres with one user doing
COPY. Because the I/O is single threaded, you will get one disk's
worth of seeks for one user, roughly 150/second on a 10K RPM drive.

I suspect the problem here is the sequential I/O rate - let's wait and
see what the dd test results look like.

- Luke

---(end of broadcast)---
TIP 6: explain analyze is your friend
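A rough back-of-the-envelope behind Luke's 150/second figure, using
typical 10K RPM drive specs (assumed values, not numbers from the
thread):

    rotational latency = half a revolution = 0.5 / (10,000/60) ~ 3.0 ms
    average seek time  ~ 3.5 ms (typical for a 10K SCSI drive)
    per random I/O     ~ 3.0 + 3.5 = 6.5 ms -> 1000/6.5 ~ 150 I/Os/sec

Merlin's 2000 seeks/sec expectation assumes many requests in flight at
once, so that all ten spindles can seek concurrently.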
Re: [PERFORM] Best COPY Performance
Jim C. Nasby wrote:
> Wait... so you're using perl to copy data between two tables? And
> using a cursor to boot? I can't think of any way that could be more
> inefficient...
>
> What's wrong with a plain old INSERT INTO ... SELECT? Or if you
> really need to break it into multiple transaction blocks, at least
> don't shuffle the data from the database into perl and then back into
> the database; do an INSERT INTO ... SELECT with that same where
> clause.

The data are on two different computers, and I do processing of the
data as it passes through the application. Otherwise, INSERT INTO ...
SELECT would be my first choice.

Craig

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Re: [PERFORM] Best COPY Performance
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
> Craig A. James
> Sent: Wednesday, October 25, 2006 12:52 PM
> To: Jim C. Nasby
> Cc: Worky Workerson; Merlin Moncure; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Best COPY Performance
>
> Jim C. Nasby wrote:
> > Wait... so you're using perl to copy data between two tables? And
> > using a cursor to boot? I can't think of any way that could be more
> > inefficient...
> >
> > What's wrong with a plain old INSERT INTO ... SELECT? Or if you
> > really need to break it into multiple transaction blocks, at least
> > don't shuffle the data from the database into perl and then back
> > into the database; do an INSERT INTO ... SELECT with that same
> > where clause.
>
> The data are on two different computers, and I do processing of the
> data as it passes through the application. Otherwise, INSERT INTO ...
> SELECT would be my first choice.

Would dblink() help in any way?

Greg

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Re: [PERFORM] Best COPY Performance
Spiegelberg, Greg wrote:
> > The data are on two different computers, and I do processing of the
> > data as it passes through the application. Otherwise, INSERT INTO
> > ... SELECT would be my first choice.
>
> Would dblink() help in any way?

It might if perl wasn't so damned good at this. ;-)

Craig

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Re: [PERFORM] ACCESS EXCLUSIVE lock
Tom Lane wrote:
> This isn't going to be changed, because the likely direction of
> future development is that the planner will start making use of
> constraints even for SELECT queries. This means that a DROP
> CONSTRAINT operation could invalidate the plan of a SELECT query, so
> the locking will be essential.

Hi!

I also think that constraints can increase the performance of queries
if the planner can use them. It will be a great feature in the future!
But I have more questions about the coherency between a constraint and
a transaction. Can a constraint live under different isolation levels?

If I drop a constraint in a transaction (T1), it is no longer visible
within T1 after the drop operation. But it should still be visible in
another transaction (T2) running concurrently with T1 (if T2 started
between T1's begin and commit!). If T2 starts after T1's commit, the
constraint should not be visible in T2, so the planner cannot use it.
If I understand correctly, this means a constraint follows the
isolation rules of its transaction. How does this work in the current
release?

If constraints follow their transaction's isolation, why could a drop
invalidate the plan of a SELECT query? A SELECT could use a given
constraint if the constraint was dropped without being committed, or
if it existed when the SELECT (or the transaction containing the
SELECT) started. I know we have to examine which rows can affect the
result of the SELECT. The main question in this case is: can a "wrong"
row (one which breaks the dropped constraint) affect the result of the
SELECT? In my opinion there are no such wrong rows. Do you know of a
special case where this can happen, so that wrong rows could show up
in the SELECT?

I know my original problem is not too common, but the parallel
performance of PostgreSQL is very important in multiprocessor
environments. I see you are following this direction, and you have
improved the locking behaviour in 8.2 in more cases. Generally, drop
constraint operations run on their own or in short transactions. We
use an optimization trick when we have to insert several million rows
into a table in the same transaction: after the transaction begins, we
drop the foreign key constraints, insert the rows, and then recreate
the constraints. This method is faster than the conventional solution.
These transactions are long (5-40 minutes on a SunFire V40z).

I read the TODO list and found more features about deferrability.
Would you like to implement deferrable foreign key constraints? If so,
in my opinion my questions will touch on it.

Thank you in anticipation!

Regards,
Antal Attila

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
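A minimal sketch of the trick Antal describes, with hypothetical table
and constraint names (orders, customers, orders_customer_fk):

    BEGIN;
    -- Drop the FK so the bulk insert skips per-row reference checks:
    ALTER TABLE orders DROP CONSTRAINT orders_customer_fk;

    -- ... INSERT/COPY several million rows into orders ...

    -- Recreating the constraint re-validates all rows in one pass,
    -- which is usually faster than millions of per-row checks:
    ALTER TABLE orders ADD CONSTRAINT orders_customer_fk
        FOREIGN KEY (customer_id) REFERENCES customers (id);
    COMMIT;

Note the locking consequence Tom describes: the ALTER TABLE takes an
exclusive lock on orders, so concurrent queries against it block for
the duration of the transaction, exactly Antal's 5-40 minute window.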
Re: [PERFORM] commit so slow program looks frozen
Carlo Stonebanks wrote:
> The delays are so long that I used to think the app was hopelessly
> frozen. By accident, I left the app alone in its frozen state and
> came back a good deal later and saw that it was running again.
> Sometimes I decide it *IS* frozen and have to restart. Because Ctrl-C
> will not cause the script to break, it appears the app is stuck in
> non-TCL code (either waiting for postgres or stuck in the interface
> code?)

You may try to figure out what the process is doing (the backend,
obviously, not the frontend (Tcl) process) by attaching to it with
strace. Is it doing system calls? Maybe it's busy reading from or
writing to disk. Maybe it's swamped by a context switch storm (but in
that case, probably the other processes would be affected as well).

Or you may want to attach to it with GDB and see what the backtrace
looks like. If nothing obvious pops up, do it several times and
compare them.

I wouldn't expect it to be stuck on locks, because if it's only on
commit, then it probably has all the locks it needs. But try to see if
you can find something not granted in pg_locks that it may be stuck
on.

--
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
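Alvaro's pg_locks check, spelled out as a query (a sketch; the catalog
columns are as in the 8.x releases):

    -- Any lock request that has not been granted, with the table name
    -- where the lock is on a relation:
    SELECT l.pid, l.locktype, l.mode, l.granted, c.relname
      FROM pg_locks l
      LEFT JOIN pg_class c ON c.oid = l.relation
     WHERE NOT l.granted;

An empty result here while the commit is stalled would point away from
locking and toward I/O or the client interface.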
Re: [PERFORM] Best COPY Performance
Merlin/Luke:

> > in theory, with 10 10k disks in raid 10, you should be able to keep
> > your 2fc link saturated all the time unless your i/o is extremely
> > random. random i/o is the wild card here; ideally you should see at
> > least 2000 seeks in bonnie... let's see what comes up.
>
> I suspect the problem here is the sequential I/O rate - let's wait
> and see what the dd test results look like.

Here are the tests that you suggested that I do, on both the local
disks (WAL) and the SAN (tablespace). The random seeks seem to be far
below what Merlin said was good, so I am a bit concerned. There is a
bit of other activity on the box at the moment which is hard to stop,
so that might have had an impact on the processing.

Here is the bonnie++ output:

Version  1.03      ------Sequential Output------ --Sequential Input- --Random-
                   -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
              Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
Local Disks    31G 45119  85 56548  21 27527   8 35069  66 86506  13 499.6   1
SAN            31G 53544  98 93385  35 18266   5 24970  47 57911   8 611.8   1

And here are the dd results for 16GB RAM, i.e. 4,000,000 8K blocks:

# Local Disks
$ time bash -c "dd if=/dev/zero of=/home/myhome/bigfile bs=8k count=4000000 && sync"
4000000+0 records in
4000000+0 records out

real    10m0.382s
user    0m1.117s
sys     2m45.681s

$ time dd if=/home/myhome/bigfile of=/dev/null bs=8k count=4000000
4000000+0 records in
4000000+0 records out

real    6m22.904s
user    0m0.717s
sys     0m53.766s

# Fibre Channel SAN
$ time bash -c "dd if=/dev/zero of=/data/test/bigfile bs=8k count=4000000 && sync"
4000000+0 records in
4000000+0 records out

real    5m58.846s
user    0m1.096s
sys     2m18.026s

$ time dd if=/data/test/bigfile of=/dev/null bs=8k count=4000000
4000000+0 records in
4000000+0 records out

real    14m9.560s
user    0m0.739s
sys     0m53.806s

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Re: [PERFORM] Best COPY Performance
Hi, Craig,

Craig A. James wrote:
> > Would dblink() help in any way?
>
> It might if perl wasn't so damned good at this. ;-)

You know that you can use Perl inside PostgreSQL via plperl?

HTH,
Markus

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Re: [PERFORM] commit so slow program looks frozen
> You may try to figure out what the process is doing (the backend,
> obviously, not the frontend (Tcl) process) by attaching to it with
> strace.

It's so sad when us poor Windows guys get helpful hints from people
who assume that we're smart enough to run *NIX... ;-)

> Maybe it's swamped by a context switch storm (but in that case,
> probably the other processes would be affected as well).

What is a context switch storm? (And what a great name for a heavy
metal rock band!)

Interestingly enough, last night (after the original post) I watched
three of the processes slow down, one after the other - and then stall
for so long that I had assumed they had frozen. They were all stalled
on a message that I had put in the script indicating they had never
returned from a commit. I have looked into this, and I believe the
commits are actually going through. The remaining 4th process
continued to run, and actually picked up speed as the CPU gave its
cycles over. The Windows task manager shows the postgresql processes
that (I assume) are associated with the stalled processes as consuming
zero CPU time.

Sometimes I have seen all of the apps slow down and momentarily freeze
at the same time... but then continue. I have autovacuum off, although
stats_row_level and stats_start_collector remain on (I thought these
were only relevant if autovacuum was on). I have seen the apps slow
down (and perhaps stall) when specific tables have vacuum/analyze
running, and that makes sense.

I did notice that on one occasion a frozen app came back to life after
I shut down EMS PostgreSQL Manager in another session. Maybe a
coincidence, or maybe an indication that the apps are straining
resources... on a box with two dual-core XEONs and 4GB of memory? Mind
you, the config file is configured for the database loading phase we
are in now - with lots of resources devoted to a few connections.

> I wouldn't expect it to be stuck on locks, because if it's only on
> commit, then it probably has all the locks it needs. But try to see
> if you can find something not granted in pg_locks that it may be
> stuck on.

Looking at the pgAdmin server status pages, no locks or transactions
are pending when this happens.

Carlo

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Best COPY Performance
Mr. Worky,

On 10/25/06 11:26 AM, Worky Workerson [EMAIL PROTECTED] wrote:
> And here are the dd results for 16GB RAM, i.e. 4,000,000 8K blocks:

So, if we divide 32,000 MB by the real time, we get:

  /home (WAL):  53 MB/s write (32,000 MB / 600 s)
                84 MB/s read
  /data (data): 89 MB/s write
                38 MB/s read

The write and read speeds on /home look like a single disk drive,
which is not good if you have more drives in a RAID. OTOH, it should
be sufficient for WAL writing, so the COPY speed shouldn't be limited
by WAL.

The read speed on your /data volume is awful, to the point where you
should consider it broken and find a fix. A quick comparison: the same
test on a 16-drive internal SATA array with 7200 RPM disks gets
950 MB/s read, about 25 times faster for about 1/4 the price.

But again, this may not have anything to do with the speed of your
COPY statements. Can you provide about 10 seconds worth of vmstat 1
while running your COPY so we can get a global view of the I/O and
CPU?

- Luke

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] commit so slow program looks frozen
> I have a question for you: did you have a long running query keeping
> open a transaction? I've just noticed the same problem here, but
> things cleaned up immediately when I aborted the long-running
> transaction.

No, the only processes are those of the import applications
themselves: short transactions never lasting more than a fraction of a
second.

Carlo
[PERFORM] Configuration Issue ?
Hi

Please help. I have got a postgres 7.3.4 database running on RedHat
ES 3, with 8GB of physical memory in it. The machine is shared with my
application, which is pretty intensive in doing selects and updates
against the database, but there are usually no more than 10
connections to the database at any time.

Despite having 8GB of RAM on the machine, the machine is frequently
running out of physical memory and swapping, which is hurting
performance. I have read around on various of the message boards, and
I suspect that the SHARED_BUFFERS setting on this server is set way
too high, and that this in fact may be hurting performance. My current
configuration settings are as follows:

shared_buffers = 393216     # min max_connections*2 or 16, 8KB each
max_fsm_relations = 1       # min 10, fsm is free space map, ~40 bytes
max_fsm_pages = 160001      # min 1000, fsm is free space map, ~6 bytes
sort_mem = 409600           # min 64, size in KB
vacuum_mem = 81920          # min 1024, size in KB

From what I've read, I've not seen anyone recommend a SHARED_BUFFERS
setting higher than 50,000. Is a setting of 393216 going to cause
significant problems, or does this sound about right on an 8GB system,
bearing in mind that I'd like to reserve at least a couple of GB for
my application?

Also, if you have any recommendations regarding effective_cache_size
I'd be interested, as reading around, this sounds important as well.

Thanks

Mark
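For scale, here is what those two settings work out to (my arithmetic,
not from the original post):

    shared_buffers: 393,216 x 8 KB = 3,145,728 KB, about 3 GB
                    permanently pinned by the server
    sort_mem:       409,600 KB, about 400 MB *per sort operation*

With 10 connections each able to run one or more sorts, the worst case
alone exceeds the 8 GB in the machine, which is consistent with the
swapping Mark reports.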
Re: [PERFORM] Configuration Issue ?
Mark Lonsdale wrote:
> Please help. I have got a postgres 7.3.4 database running on RedHat
> ES 3, with 8GB of physical memory in it. The machine is shared with
> my application, which is pretty intensive in doing selects and
> updates against the database, but there are usually no more than 10
> connections to the database at any time.
>
> shared_buffers = 393216     # min max_connections*2 or 16, 8KB each

The above is likely hurting you more than helping you with 7.3.

> max_fsm_relations = 1       # min 10, fsm is free space map, ~40 bytes
> max_fsm_pages = 160001      # min 1000, fsm is free space map, ~6 bytes
> sort_mem = 409600           # min 64, size in KB

The above will likely kill you :). Try 4096 or 8192, maybe 16384
depending on workload.

> vacuum_mem = 81920          # min 1024, size in KB

This is fine.

> Also, if you have any recommendations regarding effective_cache_size
> I'd be interested, as reading around, this sounds important as well.

About 20-25% of available ram for 7.3.

The long and short is you need to upgrade to at least 7.4, preferably
8.1.

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
Re: [PERFORM] commit so slow program looks frozen
Carlo Stonebanks wrote:
> > You may try to figure out what the process is doing (the backend,
> > obviously, not the frontend (Tcl) process) by attaching to it with
> > strace.
>
> It's so sad when us poor Windows guys get helpful hints from people
> who assume that we're smart enough to run *NIX... ;-)
>
> > Maybe it's swamped by a context switch storm (but in that case,
> > probably the other processes would be affected as well).
>
> What is a context switch storm? (And what a great name for a heavy
> metal rock band!)
>
> Interestingly enough, last night (after the original post) I watched
> three of the processes slow down, one after the other - and then
> stall for so long that I had assumed they had frozen. They were all
> stalled on a message that I had put in the script indicating they had
> never returned from a commit. I have looked into this, and I believe
> the commits are actually going through.

I have a question for you: did you have a long running query keeping
open a transaction? I've just noticed the same problem here, but
things cleaned up immediately when I aborted the long-running
transaction.

Note that in my case the long-running transaction wasn't idle in
transaction, it was just doing a whole lot of work.

Brian
Re: [PERFORM] Configuration Issue ?
Hi Josh

Thanks for the feedback, that is most useful. When you said one of the
settings was likely killing us, was it all of the settings for
max_fsm_relations, max_fsm_pages, and sort_mem, or just the setting
for sort_mem? Can you explain why the setting would be killing me :-)

Thanks

Mark

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Configuration Issue ?
Mark Lonsdale wrote:
> Thanks for the feedback, that is most useful. When you said one of
> the settings was likely killing us, was it all of the settings for
> max_fsm_relations, max_fsm_pages, and sort_mem, or just the setting
> for sort_mem? Can you explain why the setting would be killing me :-)

The sort_mem is crucial. It's memory *per sort*, which means one query
can use several times that amount.

> > The long and short is you need to upgrade to at least 7.4,
> > preferably 8.1.

Joshua means this too. Upgrade to 7.3.16 within the next few days,
then test out something more recent. You should see some useful
performance gains from 8.1.

--
Richard Huxton
Archonet Ltd

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Configuration Issue ?
Richard Huxton wrote:
> Mark Lonsdale wrote:
> > Thanks for the feedback, that is most useful. When you said one of
> > the settings was likely killing us, was it all of the settings for
> > max_fsm_relations, max_fsm_pages, and sort_mem, or just the setting
> > for sort_mem? Can you explain why the setting would be killing me
> > :-)
>
> The sort_mem is crucial. It's memory *per sort*, which means one
> query can use several times that amount.

Worse than that, it is:

  (sort memory) * (number of sorts) * (number of connections)
     = amount of ram possible to use

Now... take the following query:

  SELECT * FROM foo
  JOIN bar ON (bar.id = foo.id)
  JOIN baz ON (baz.id = foo_baz.id)
  ORDER BY baz.name, foo.salary;

Over 5 million rows... How much ram do you think you just used?

> > The long and short is you need to upgrade to at least 7.4,
> > preferably 8.1.
>
> Joshua means this too. Upgrade to 7.3.16 within the next few days,
> then test out something more recent. You should see some useful
> performance gains from 8.1.

Right. The reason I suggested 7.4 is that he gets VACUUM VERBOSE in a
reasonable fashion, but of course 8.1 is better.

Sincerely,

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
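Plugging Mark's actual settings into Joshua's formula (my arithmetic,
as an illustration):

    sort_mem = 409,600 KB, about 400 MB per sort
    a query like the one above can easily run two sorts
    400 MB x 2 sorts x 10 connections = 8 GB, the machine's entire RAM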
Re: [PERFORM] Configuration Issue ?
Thanks guys, I think we'll certainly look to get the app certified
with 7.4 and 8.x, but that may take a little while. In the interim,
I'm thinking of making the following changes then:

- Change shared_buffers from 393216 to 80,000 (~15% of 4GB of RAM;
  the server is 8GB, but I want to leave space for the app as well)
- Set my effective_cache_size to 125,000 (~25% of 4GB of RAM)
- Set my sort_mem to 8192

Do those numbers look a bit better? We will probably see if we can
make these changes asap, as the server is struggling a bit now, which
doesn't really make sense given how much memory is in it.

Really appreciate your help and fast turnaround on this.

Mark

---(end of broadcast)---
TIP 6: explain analyze is your friend
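Checking those numbers (my arithmetic, not from the original mail):

    shared_buffers:       80,000 x 8 KB  =   640,000 KB ~ 625 MB (~15% of 4 GB)
    effective_cache_size: 125,000 x 8 KB = 1,000,000 KB ~ 977 MB (~24% of 4 GB)
    sort_mem:             8,192 KB = 8 MB per sort, 50x smaller than before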
Re: [PERFORM] Configuration Issue ?
Mark Lonsdale wrote:
> Thanks guys, I think we'll certainly look to get the app certified
> with 7.4 and 8.x, but that may take a little while. In the interim,
> I'm thinking of making the following changes then:
>
> - Change shared_buffers from 393216 to 80,000 (~15% of 4GB of RAM;
>   the server is 8GB, but I want to leave space for the app as well)

You'll likely run into issues with anything over 16384. I have never
seen a benefit from shared_buffers over 12k or so with 7.3.

> - Set my effective_cache_size to 125,000 (~25% of 4GB of RAM)
> - Set my sort_mem to 8192

:)

Sincerely,

Joshua D. Drake

> Do those numbers look a bit better? We will probably see if we can
> make these changes asap, as the server is struggling a bit now, which
> doesn't really make sense given how much memory is in it.
>
> Really appreciate your help and fast turnaround on this.
>
> Mark

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [PERFORM] commit so slow program looks frozen
On Wed, 2006-10-25 at 15:07, Carlo Stonebanks wrote:
> > You may try to figure out what the process is doing (the backend,
> > obviously, not the frontend (Tcl) process) by attaching to it with
> > strace.
>
> It's so sad when us poor Windows guys get helpful hints from people
> who assume that we're smart enough to run *NIX... ;-)

You should try a google search on strace and NT or windows or XP... I
was surprised how many various implementations of it I found.

> > Maybe it's swamped by a context switch storm (but in that case,
> > probably the other processes would be affected as well).
>
> What is a context switch storm? (And what a great name for a heavy
> metal rock band!)

I can just see the postgresql group getting together at the next
O'Reilly conference and creating that band. And it will all be your
fault.

A context switch storm is when your machine spends more time trying to
figure out what to do than actually doing anything. The CPU spends
most of its time switching between programs rather than running them.

> I have seen the apps slow down (and perhaps stall) when specific
> tables have vacuum/analyze running, and that makes sense. I did
> notice that on one occasion a frozen app came back to life after I
> shut down EMS PostgreSQL Manager in another session. Maybe a
> coincidence, or maybe an indication that the apps are straining
> resources... on a box with two dual-core XEONs and 4GB of memory?
> Mind you, the config file is configured for the database loading
> phase we are in now - with lots of resources devoted to a few
> connections.

Seeing as PostgreSQL runs one thread / process per connection, it's
pretty unlikely that the problem here is one hungry thread. Do all
four CPUs show busy, or just one? Do you have a way of measuring how
much time is spent waiting on I/O on a windows machine, like
top / vmstat does in unix?

Is it possible your machine is going into a swap storm? I.e., you've
used all physical memory somehow and it's swapping out? If your
current configuration is too aggressive on sort / work mem, then it
can happen with only a few connections. Note that if you have an
import process that needs a big chunk of memory, you can set just that
one connection to use a large setting and leave the default smaller.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
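A sketch of the per-connection idea Scott closes with; the setting
names are as in 8.x (on 7.x the sort knob is sort_mem rather than
work_mem), and the values are illustrative only:

    -- In the import connection only; every other session keeps the
    -- small server-wide default:
    SET work_mem = 262144;              -- KB: generous sort memory here only
    SET maintenance_work_mem = 524288;  -- KB: faster index builds here only
    -- ... run the import ...
    RESET work_mem;                     -- back to the server defaults
    RESET maintenance_work_mem;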
Re: [PERFORM] commit so slow program looks frozen
> > > Maybe it's swamped by a context switch storm (but in that case,
> > > probably the other processes would be affected as well).
> >
> > What is a context switch storm? (And what a great name for a heavy
> > metal rock band!)
>
> I can just see the postgresql group getting together at the next
> O'Reilly conference and creating that band. And it will all be your
> fault.

Well now you let the secret out!

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Best COPY Performance
On Wed, Oct 25, 2006 at 11:25:01AM -0400, Worky Workerson wrote:
> > I'm guessing the high bursts are checkpoints. Can you check your
> > log files for pg and see if you are getting warnings about
> > checkpoint frequency? You can get some mileage here by increasing
> > wal files.
>
> Nope, nothing in the log. I have set:
>
>   wal_buffers=128
>   checkpoint_segments=128
>   checkpoint_timeout=3000
>
> which I thought was rather generous. Perhaps I should set it even
> higher for the loads?

But depending on your shared_buffers and bgwriter settings (as well as
how much WAL traffic you're generating), you could still end up with
big slugs of work to be done when checkpoints happen. If you set
checkpoint_warning to 3001, you'll see exactly when checkpoints are
happening, so you can determine if that's an issue.
--
Jim Nasby                [EMAIL PROTECTED]
EnterpriseDB             http://enterprisedb.com      512.569.9461 (cell)

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
Re: [PERFORM] commit so slow program looks frozen
On Wed, Oct 25, 2006 at 04:32:16PM -0400, Carlo Stonebanks wrote:
> > I have a question for you: did you have a long running query
> > keeping open a transaction? I've just noticed the same problem
> > here, but things cleaned up immediately when I aborted the
> > long-running transaction.
>
> No, the only processes are those of the import applications
> themselves: short transactions never lasting more than a fraction of
> a second.

Do you have a linux/unix machine you could reproduce this on?
--
Jim Nasby                [EMAIL PROTECTED]
EnterpriseDB             http://enterprisedb.com      512.569.9461 (cell)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster