Re: [PERFORM] hardware and For PostgreSQL
Joe Uhl wrote:
> I realize there are people who discourage looking at Dell, but I've been
> very happy with a larger ball of equipment we ordered recently from them.
> Our database servers consist of a PowerEdge 2950 connected to a
> PowerVault MD1000 with a 1 meter SAS cable.

We have a similar piece of equipment from Dell (the PowerEdge), and when we had a problem with it we received excellent service from them. When our RAID controller went down (the machine was a year old), Dell helped to diagnose the problem and installed a new one at our hosting facility, all within 24 hours.

fyi
Ron

---(end of broadcast)---
TIP 6: explain analyze is your friend
[PERFORM] 12 hour table vacuums
We vacuum only a few of our tables nightly; this one is the last one because it takes longer to run. I'll probably re-index it soon, but I would appreciate any advice on how to speed up the vacuum process (and the db in general).

Okay, here's our system:
   postgres 8.1.4
   Linux version 2.4.21
   Red Hat Linux 3.2.3
   8 GB RAM
   Intel(R) Xeon(TM) CPU 3.20GHz
   RAID 5
   autovacuum=off
   serves as the application server and database server
   server is co-located in another city; hardware upgrade is not currently an option

Here's the table information: the table has 140,000 rows, 130 columns (mostly NUMERIC), 60 indexes. It is probably our 'key' table in the database and gets called by almost every query (usually joined to others). The table gets updated only about 10 times a day. We were running autovacuum but it interfered with the updates, so we shut it off. We vacuum this table nightly, and it currently takes about 12 hours to vacuum it. Not much else is running during this period, nothing that should affect the table.

Here are the current non-default postgresql.conf settings:
   max_connections = 100
   shared_buffers = 5
   work_mem = 9192
   maintenance_work_mem = 786432
   max_fsm_pages = 7
   vacuum_cost_delay = 200
   vacuum_cost_limit = 100
   bgwriter_delay = 1
   fsync = on
   checkpoint_segments = 64
   checkpoint_timeout = 1800
   effective_cache_size = 27
   random_page_cost = 2
   log_destination = 'stderr'
   redirect_stderr = on
   client_min_messages = warning
   log_min_messages = warning
   stats_start_collector = off
   stats_command_string = on
   stats_block_level = on
   stats_row_level = on
   autovacuum = off
   autovacuum_vacuum_threshold = 2000
   deadlock_timeout = 1
   max_locks_per_transaction = 640
   add_missing_from = on

As I mentioned, any insights into changing the configuration to optimize performance are most welcome.

Thanks
Ron
Re: [PERFORM] 12 hour table vacuums
Bill Moran wrote:
> In response to Ron St-Pierre [EMAIL PROTECTED]:
>> We vacuum only a few of our tables nightly, this one is the last one
>> because it takes longer to run. I'll probably re-index it soon, but I
>> would appreciate any advice on how to speed up the vacuum process (and
>> the db in general).
> I doubt anyone can provide meaningful advice without the output of
> vacuum verbose.

The cron job is still running:

  /usr/local/pgsql/bin/vacuumdb -d imperial -t stock.fdata -v -z > /usr/local/pgsql/bin/fdata.txt

I'll post the output when it's finished.

Ron
Re: [PERFORM] 12 hour table vacuums
Tom Lane wrote:
> Here is your problem:
>    vacuum_cost_delay = 200
> If you are only vacuuming when nothing else is happening, you shouldn't
> be using vacuum_cost_delay at all: set it to 0. In any case this value
> is probably much too high. I would imagine that if you watch the machine
> while the vacuum is running you'll find both CPU and I/O load near zero
> ... which is nice, unless you would like the vacuum to finish sooner.

Yeah, I've noticed that CPU, mem and I/O load are really low when this is running. I'll change that setting.

> In unrelated comments:
>    maintenance_work_mem = 786432
> That seems awfully high, too.

Any thoughts on a more reasonable value?

>    max_fsm_pages = 7
> And this possibly too low ---

The default appears to be 2, so I upped it to 7. I'll try 16 (max_fsm_relations*16).

> are you sure you are not leaking disk space?

What do you mean leaking disk space?

>    stats_start_collector = off
>    stats_command_string = on
>    stats_block_level = on
>    stats_row_level = on
> These are not self-consistent.
>
> regards, tom lane
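Since vacuum_cost_delay can be set per-session, Tom's advice can be applied to the nightly job alone without touching postgresql.conf for daytime traffic. A sketch; the table name is taken from the vacuumdb command quoted elsewhere in this thread:

```sql
-- Nightly maintenance session only: turn off cost-based vacuum
-- throttling so the vacuum can use the idle machine's full I/O.
SET vacuum_cost_delay = 0;
VACUUM VERBOSE ANALYZE stock.fdata;
```

Daytime sessions are unaffected, since SET only changes the current connection.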
Re: [PERFORM] 12 hour table vacuums
Alvaro Herrera wrote:
> Ron St-Pierre wrote:
>> Okay, here's our system:
>>    postgres 8.1.4
> Upgrade to 8.1.10

Any particular fixes in 8.1.10 that would help with this?

>> Here's the table information: The table has 140,000 rows, 130 columns
>> (mostly NUMERIC), 60 indexes.
> 60 indexes? You gotta be kidding. You really have 60 columns on which
> to scan?

Really. 60 indexes. They're the most commonly requested columns for company information (we believe). Any ideas on testing our assumptions about that? I would like to know definitively what the most popular columns are. Do you think that rules would be a good approach for this? (Sorry if I'm getting way off topic here)

>>    vacuum_cost_delay = 200
>>    vacuum_cost_limit = 100
> Isn't this a bit high? What happens if you cut the delay to, say, 10?
> (considering you've lowered the limit to half the default)

Yes, Tom pointed this out too. I'll lower it and check out the results.

Ron
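One low-effort way to test which of the 60 indexes actually earn their keep is the statistics collector rather than rules: with the collector running (note the thread's config has stats_start_collector = off, so that would need to be turned on), pg_stat_user_indexes tracks how often each index has been scanned. A sketch; the schema and table names are assumed from the vacuumdb command earlier in the thread:

```sql
-- Indexes with idx_scan = 0 since the last stats reset have never been
-- chosen by the planner and are candidates for dropping.
SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE schemaname = 'stock'
  AND relname = 'fdata'
ORDER BY idx_scan ASC;
```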
[PERFORM] Tuning New Server (slow function)
We just purchased a new Dell PowerEdge 2800 (dual xeon, 8GB RAM, raid 4, RHEL, postgres 8.1) and ported our old database over to it (single cpu, 2GB RAM, no raid, postgres 7.4). Our apps perform great on it, however some queries are super slow. One function in particular, which used to take 15-30 minutes on the old server, has been running now for over 12 hours:

  BEGIN
    TRUNCATE stock.datacount;
    FOR rec IN SELECT itemID, item, hexValue FROM stock.activeitem LOOP
      histdate := (SELECT updatedate FROM stock.historical s
                   WHERE s.itemID=rec.itemID
                   ORDER BY updatedate DESC LIMIT 1);
      IF histdate IS NOT NULL THEN
        funddate := (SELECT updatedate FROM stock.funddata s WHERE s.itemID=rec.itemID);
        techdate := (SELECT updatedate FROM stock.techsignals s WHERE s.itemID=rec.itemID);
        IF (histdate > funddate) OR (histdate > techdate)
           OR (funddate IS NULL) OR (techdate IS NULL) THEN
          counter := counter + 1;
          outrec.itemID := rec.itemID;
          outrec.item := rec.item;
          outrec.hexvalue := rec.hexvalue;
          RETURN NEXT outrec;
        END IF;
      END IF;
    END LOOP;
    INSERT INTO stock.datacount (itemcount) VALUES (counter);
    COPY stock.datacount TO ''/tmp/datacount'';
    RETURN;
  END;

top shows:

  CPU states:  cpu    user   nice   system  irq    softirq  iowait  idle
               total  5.8%   0.6%   31.2%   0.0%   0.0%     0.5%    61.6%
  Mem:  8152592k av, 8143012k used, 9580k free, 0k shrd, 179888k buff
        6342296k actv, 1206340k in_d, 137916k in_c
  Swap: 8385760k av, 259780k used, 8125980k free, 7668624k cached

  PID   USER      PRI NI SIZE  RSS   SHARE STAT %CPU %MEM  TIME   CPU COMMAND
  17027 postgres  25  0  566M  561M  560M  R    24.9  7.0  924:34   1  postmaster

I've likely set some parameter(s) to the wrong values, but I don't know which one(s).
Here are my relevant postgresql.conf settings:
   shared_buffers = 7
   work_mem = 9192
   maintenance_work_mem = 131072
   max_fsm_pages = 7
   fsync = off (temporarily, will be turned back on)
   checkpoint_segments = 64
   checkpoint_timeout = 1800
   effective_cache_size = 7

[EMAIL PROTECTED] root]# cat /proc/sys/kernel/shmmax
66000

We want to put this into production soon, but this is a showstopper. Can anyone help me out with this?

Thanks
Ron St.Pierre
Re: [PERFORM] Tuning New Server (slow function)
Jim C. Nasby wrote:
> On Wed, Jun 21, 2006 at 02:27:41PM -0700, Ron St-Pierre wrote:
>> We just purchased a new Dell PowerEdge 2800 (dual xeon, 8GB RAM, raid 4,
>> RHEL, postgres 8.1) and ported our old database over to it (single cpu,
> RAID *4*?

oops, RAID 5 (but we are getting good io throughput...)

> If you do any kind of updating at all, you're likely to be real unhappy
> with that...

>> 2GB RAM, no raid, postgres 7.4). Our apps perform great on it, however
>> some queries are super slow. One function in particular, which used to
>> take 15-30 minutes on the old server, has been running now for over 12
>> hours:
>>   [the function, quoted in full in the original post]
>> note: stock.activeitem contains about 75000 rows

> Getting EXPLAIN ANALYZE from the queries would be good. Adding debug
> output via NOTICE to see how long each step is taking would be a good
> idea, too.

I set client_min_messages = debug2, log_min_messages = debug2 and log_statement = 'all' and am running the query with EXPLAIN ANALYZE. I don't know how long it will take until something useful returns, but I will let it run for a while.

> Of course, even better would be to do away with the cursor...

How would I rewrite it to do away with the cursor?
>> top shows:
>>   CPU states:  cpu    user   nice   system  irq    softirq  iowait  idle
>>                total  5.8%   0.6%   31.2%   0.0%   0.0%     0.5%    61.6%
>>   Mem: 8152592k av, 8143012k used, 9580k free, 0k shrd, 179888k buff

> The high system % (if I'm reading this correctly) makes me wonder if
> this is some kind of locking issue.

But it's the only postgres process running.

>>        6342296k actv, 1206340k in_d, 137916k in_c
>>   Swap: 8385760k av, 259780k used, 8125980k free, 7668624k cached
>>   PID   USER      PRI NI SIZE  RSS   SHARE STAT %CPU %MEM  TIME   CPU COMMAND
>>   17027 postgres  25  0  566M  561M  560M  R    24.9  7.0  924:34   1  postmaster
>>
>> I've likely set some parameter(s) to the wrong values, but I don't know
>> which one(s). Here are my relevant postgresql.conf settings:
>>    shared_buffers = 7
>>    work_mem = 9192
>>    maintenance_work_mem = 131072
>>    max_fsm_pages = 7
>>    fsync = off (temporarily, will be turned back on)
>>    checkpoint_segments = 64
>>    checkpoint_timeout = 1800
>>    effective_cache_size = 7
>> [EMAIL PROTECTED] root]# cat /proc/sys/kernel/shmmax
>> 66000
>> We want to put this into production soon, but this is a showstopper.
>> Can anyone help me out with this?
>> Thanks
>> Ron St.Pierre
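On the "do away with the cursor" question: the three per-row subselects can be folded into one set-based query over stock.activeitem. A hedged sketch using the names from the function; the > comparisons are an assumption, since the archive stripped the operators from the original IF test:

```sql
-- Set-based replacement for the row-by-row loop: items whose latest
-- historical date is newer than (or missing from) the fund/tech data.
SELECT a.itemID, a.item, a.hexValue
FROM stock.activeitem a
JOIN (SELECT itemID, max(updatedate) AS histdate
      FROM stock.historical
      GROUP BY itemID) h ON h.itemID = a.itemID
LEFT JOIN stock.funddata    f ON f.itemID = a.itemID
LEFT JOIN stock.techsignals t ON t.itemID = a.itemID
WHERE h.histdate > f.updatedate
   OR h.histdate > t.updatedate
   OR f.updatedate IS NULL
   OR t.updatedate IS NULL;
```

The inner join on h plays the role of the "histdate IS NOT NULL" test, since only items with at least one historical row survive it.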
Re: [PERFORM] [GENERAL] disk performance benchmarks
Jeffrey W. Baker wrote:
> All these replies are really interesting, but the point is not that my
> RAIDs are too slow, or that my CPUs are too slow. My point is that, for
> long stretches of time, my database doesn't come anywhere near using the
> capacity of the hardware. And I think that's odd and would like to fix
> it.

What motherboard are you using, and what distro? Earlier you mentioned that you're on linux 2.6.7 and a 64-bit Opteron 244 machine with 8GB main memory, two 4-disk RAID5 arrays (one for database, one for xlogs). Also, did you have a chance to test performance before you implemented RAID?

Ron
[PERFORM] Table UPDATE is too slow
We have a web based application with data that is updated daily. The biggest bottleneck occurs when we try to update one of the tables. This table contains 58,000 rows and 62 columns, and EVERY column is indexed. Every column is queryable by the users through the web interface, so we are reluctant to remove the indexes (recreating them would be time consuming too). The primary key is an INT and the rest of the columns are a mix of NUMERIC, TEXT, and DATEs. A typical update is:

  UPDATE dataTable SET field01=44.5, field02=44.5, field03='Bob',
      field04='foo', ... , field60='2004-08-30', field61='2004-08-29'
  WHERE id = 1234;

Also of note is that the update is run about 10 times per day; we get blocks of data from 10 different sources, so we pre-process the data and then update the table. We also run VACUUM FULL ANALYZE on a nightly basis. Does anyone have some idea on how we can increase speed, either by changing the updates, designing the database differently, etc? This is currently a big problem for us.

Other notables:
   The UPDATE is run from within a function: FOR rec IN SELECT ... LOOP RETURN NEXT rec; UPDATE dataTable ...
   Postgres 7.4.3
   debian stable
   2 GB RAM
   80 GB IDE drive (we can't change it)
   shared_buffers = 2048
   sort_mem = 1024
   max_fsm_pages = 4
   checkpoint_segments = 5
   random_page_cost = 3

Thanks
Ron
Re: [PERFORM] Table UPDATE is too slow
Thomas F. O'Connell wrote:
> What is the datatype of the id column?

The id column is INTEGER.

Ron
Re: [PERFORM] Table UPDATE is too slow
Steinar H. Gunderson wrote:
> On Tue, Aug 31, 2004 at 11:11:02AM -0700, Ron St-Pierre wrote:
>> We have a web based application with data that is updated daily. The
>> biggest bottleneck occurs when we try to update one of the tables. This
>> table contains 58,000 rows and 62 columns, and EVERY column is indexed.
> That is usually a very bad idea; for every update, PostgreSQL has to
> update 62 indexes. Do you really do queries on all those 62 columns?

Yes, I know that it's not a very good idea, however queries are allowed against all of those columns. One option is to disable some or all of the indexes when we update, run the update, and recreate the indexes afterwards, however that may slow down user queries. Because there are so many indexes, it is time consuming to recreate them after the update.

>> A typical update is:
>>   UPDATE dataTable SET field01=44.5, field02=44.5, field03='Bob',
>>       field04='foo', ... , field60='2004-08-30', field61='2004-08-29'
>>   WHERE id = 1234;
> That looks like poor database normalization, really. Are you sure you
> don't want to split this into multiple tables instead of having 62
> columns?

No, it is properly normalized. The data in this table is stock fundamentals, stuff like 52 week high, ex-dividend date, etc.

>> Other notables: The UPDATE is run from within a function: FOR rec IN
>> SELECT ... LOOP RETURN NEXT rec; UPDATE dataTable ... Postgres 7.4.3,
>> debian stable, 2 GB RAM, 80 GB IDE drive (we can't change it).
> Are you doing all this in multiple transactions, or in a single one?
> Wrapping the FOR loop in a transaction might help speed.

We're doing it in multiple transactions within the function. Could we do something like this?

  BEGIN
    FOR rec IN SELECT field01, field02, ... FROM otherTable LOOP
      RETURN NEXT rec;
      UPDATE dataTable SET field01=rec.field01, field02=rec.field02,
             field03=rec.field03, ... WHERE id = rec.id;
    END LOOP;
  COMMIT;

If we can do it this way, are there any other gotcha's we should be aware of?

Ron
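For what it's worth, PostgreSQL 7.4 also supports UPDATE ... FROM, which would replace the per-row loop (and its 62 index updates one row at a time) with a single set-based statement inside one transaction. A sketch with the thread's column names, abbreviated; whether it fits depends on whether the RETURN NEXT output is still needed:

```sql
BEGIN;
-- One statement updates every matching row; the three fields shown
-- are illustrative -- the real table has 62 columns.
UPDATE dataTable
SET field01 = o.field01,
    field02 = o.field02,
    field60 = o.field60
FROM otherTable o
WHERE dataTable.id = o.id;
COMMIT;
```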
Re: [PERFORM] Table UPDATE is too slow
Thanks for everyone's comments (Thomas, Steinar, Frank, Matt, William). Right now I'm benchmarking the time it takes for each step in the end-of-day update process, and then I am going to test a few things:
 - dropping most indexes, then checking the full processing time and seeing if there is any noticeable performance degradation on the web end
 - wrapping the updates in a transaction, and checking the timing
 - combining the two
 - reviewing my shared_buffers and sort_memory settings

Thanks
Ron
Re: [PERFORM] Help with performance problems
Josh Berkus wrote:
> Chris,
>> Sorry for the confusion here. I can't run any sort of vacuum during the
>> day due to performance hits. However, I have run vacuums at night.
>> Several nights a week I run a vacuumdb -f -z on all of the clusters. It
>> can take several hours to complete, but it does complete.
> Well, here's your first problem: since your FSM pages is low, and you're
> only vacuuming once a day, you've got to have some serious table and
> index bloat. So you're going to need to do VACUUM FULL on all of your
> databases, and then REINDEX on all of your indexes.
>
> After that, raise your max_fsm_pages to something useful, like
> 1,000,000. Of course, data on your real rate of updates would help more.
>
> If you're getting severe disk choke when you vacuum, you probably are
> I/O bound. You may want to try something which allows you to vacuum one
> table at a time, either pg_autovacuum or a custom script.

Tom and Josh recently gave me some help about setting the fsm settings which was quite useful. The full message is at http://archives.postgresql.org/pgsql-performance/2004-04/msg00229.php and the most interesting portion was:

  Actually, since he's running 7.4, there's an even better way. Do a
  VACUUM VERBOSE (full-database vacuum --- doesn't matter whether you
  ANALYZE or not). At the end of the very voluminous output, you'll see
  something like
     INFO: free space map: 240 relations, 490 pages stored; 4080 total pages needed
     DETAIL: Allocated FSM size: 1000 relations + 2 pages = 178 kB shared memory.
  Here, I would need max_fsm_relations = 240 and max_fsm_pages = 4080 to
  exactly cover the present freespace needs of my system. I concur with
  the suggestion to bump that up a good deal, of course, but that gives
  you a real number to start from. The DETAIL part of the message shows
  my current settings (which are the defaults) and what the FSM is
  costing me in shared memory space.

Good luck
Ron
Re: [PERFORM] Help with performance problems
Chris Hoover wrote:
> On Friday 23 April 2004 14:57, Ron St-Pierre wrote:
> Does this apply to 7.3.4 also?

No it doesn't; I didn't look back through the thread far enough to see what you were running. I tried it on 7.3.4 and none of the summary info listed below was returned. FWIW, one of our DBs was slowing down considerably on an update (30+ minutes) and after I changed max_fsm_pages from the 7.4 default of 20,000 to 50,000, it completed in about eight minutes.

Ron

>> Actually, since he's running 7.4, there's an even better way. Do a
>> VACUUM VERBOSE (full-database vacuum --- doesn't matter whether you
>> ANALYZE or not). At the end of the very voluminous output, you'll see
>> something like
>>    INFO: free space map: 240 relations, 490 pages stored; 4080 total pages needed
>>    DETAIL: Allocated FSM size: 1000 relations + 2 pages = 178 kB shared memory.
>> Here, I would need max_fsm_relations = 240 and max_fsm_pages = 4080 to
>> exactly cover the present freespace needs of my system. I concur with
>> the suggestion to bump that up a good deal, of course, but that gives
>> you a real number to start from. The DETAIL part of the message shows
>> my current settings (which are the defaults) and what the FSM is
>> costing me in shared memory space.
>>
>> Good luck
>> Ron
[PERFORM] Index Problem?
I am using postgres 7.4.1 and have a problem with a plpgsql function. When I run the function on the production server it takes approx 33 minutes to run. I dumped the DB and copied it to a similarly configured box and ran the function and it ran in about 10 minutes. Can anyone offer advice on tuning the function or my database? Here are the lengthy, gory details.

F u n c t i o n

It updates seven columns of a table 1 to 4 times daily. Current data = 42,000 rows, new data = 30,000 rows.

  CREATE TYPE employeeType AS (empID INTEGER, updateDate DATE, bDate INTEGER,
      val1 NUMERIC, val2 NUMERIC, val3 NUMERIC, val4 NUMERIC, favNum NUMERIC);

  CREATE OR REPLACE FUNCTION updateEmployeeData() RETURNS SETOF employeeType AS '
  DECLARE
    rec RECORD;
  BEGIN
    FOR rec IN SELECT empID, updateDate, bDate, val1, val2, val3, val4, favNum
               FROM newData LOOP
      RETURN NEXT rec;
      UPDATE currentData
      SET val1=rec.val1, val2=rec.val2, val3=rec.val3, val4=rec.val4,
          favNum=rec.favNum, updateDate=rec.updateDate
      WHERE empID=rec.empID;
    END LOOP;
    RETURN;
  END;
  ' LANGUAGE 'plpgsql';

The emp table has 60 columns, all indexed, about two-thirds are numeric, but they are not affected by this update. The other 50+ columns are updated in the middle of the night and the amount of time that update takes isn't a concern.

Late last night I dumped the table, dropped it and re-created it from the dump (on the production server - when no one was looking). When I re-ran the function it took almost 11 minutes, which was pretty much in line with my results from the dev server.

D e t a i l s

   v 7.4.1
   Debian stable
   1 GB RAM
   shared_buffers = 2048
   sort_mem = 1024
   SHMMAX 36000 (360,000,000)

VACUUM FULL ANALYZE is run every night, and I ran it yesterday between runs of the function and it made no difference in running time. top shows the postmaster using minimal cpu (0-40%) and miniscule memory. vmstat shows a fair amount of IO (bo=1000-4000).
Yesterday on the dev server we upgraded to the 2.6 kernel and unfortunately only noticed a small increase in update time (about one minute). So does anyone have any suggestions for me on speeding this up? Is it the index? The function is run daily during the mid afternoon to early evening and really drags the performance of the server down (it also hosts a web site).

Thanks
Ron
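Whether the index is at fault can be checked directly: EXPLAIN on a single-row version of the inner UPDATE shows whether the empID lookup uses an index scan or falls back to a sequential scan over the 42,000 rows. A sketch; the literal empID value is made up for illustration:

```sql
-- Expect an "Index Scan ... on currentdata" node here; a Seq Scan
-- would explain the slow per-row updates.
EXPLAIN UPDATE currentData
SET updateDate = current_date
WHERE empID = 12345;
```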
Re: [PERFORM] Index Problem?
Tom Lane wrote:
> Josh Berkus [EMAIL PROTECTED] writes:
>> A better way to set this would be to run VACUUM VERBOSE ANALYZE right
>> after doing one of your update batches, and see how many dead pages are
>> being reclaimed, and then set max_fsm_pages to that # + 50% (or more).
> Actually, since he's running 7.4, there's an even better way. Do a
> VACUUM VERBOSE (full-database vacuum --- doesn't matter whether you
> ANALYZE or not). At the end of the very voluminous output, you'll see
> something like
>    INFO: free space map: 240 relations, 490 pages stored; 4080 total pages needed
>    DETAIL: Allocated FSM size: 1000 relations + 2 pages = 178 kB shared memory.
> Here, I would need max_fsm_relations = 240 and max_fsm_pages = 4080 to
> exactly cover the present freespace needs of my system. I concur with
> the suggestion to bump that up a good deal, of course, but that gives
> you a real number to start from. The DETAIL part of the message shows my
> current settings (which are the defaults) and what the FSM is costing me
> in shared memory space.

Okay, after running the function VACUUM VERBOSE is telling me:

   INFO: free space map: 136 relations, 25014 pages stored; 22608 total pages needed
   DETAIL: Allocated FSM size: 1000 relations + 2 pages = 178 kB shared memory.

My max_fsm_pages was set to 20,000 and I reset it to 40,000 on the dev server and the function ran about 20-30% faster, so I'll try the same on the production server. Thanks for the analysis of the VACUUM info.

Ron
[PERFORM] ORDER BY and LIMIT with SubSelects
I need to get 200 sets of the most recent data from a table for further processing, ordered by payDate. My current solution is to use subselects to:
 1 - get a list of unique data
 2 - get the 200 most recent records (first 200 rows, sorted descending)
 3 - sort them in ascending order

  SELECT SSS.* FROM
    (SELECT SS.* FROM
      (SELECT DISTINCT ON (nonUniqField)
              first, second, third, cost, payDate, nonUniqField
       FROM histdata
       WHERE userID = 19048 AND cost IS NOT NULL
      ) SS
     ORDER BY SS.payDate DESC LIMIT 200
    ) SSS
  ORDER BY payDate;

My question is in regards to steps 2 and 3 above. Is there some way that I can combine both steps into one to save some time?

PostgreSQL 7.4beta2 on i686-pc-linux-gnu, compiled by GCC 2.95.4

Thanks
Ron
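Steps 2 and 3 are hard to merge because a single query level has only one ORDER BY, and the newest-200 cut needs the DESC order before the final ASC sort. What may help instead is an index that serves the WHERE clause of the innermost step; a sketch under that assumption (the index name is made up):

```sql
-- Lets the planner find the userID = 19048 rows without scanning
-- the whole of histdata before the DISTINCT ON and LIMIT steps.
CREATE INDEX histdata_userid_paydate_idx
    ON histdata (userID, payDate);
```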