Re: [PERFORM] Strange behavior: pgbench and new Linux kernels
On Fri, 18 Apr 2008, Tom Lane wrote:
> Yeah, it's starting to be obvious that we'd better not ignore sysbench as not our problem. Do you have any roadmap on what needs to be done to it?

Just dug into this code again for a minute, and it goes something like this:

1) Wrap the write statements into transactions properly so the OLTP code works. There's a BEGIN/COMMIT in there, but the last time I tried that test it just deadlocked on me (I got a report of the same from someone else as well). There are some FIXME items in the code for PostgreSQL already that might be related here.

2) Make sure the implementation is keeping statistics current (it creates a table and index, but there's certainly no ANALYZE in there).

3) Implement the parts of the driver wrapper that haven't been done yet.

4) Try to cut down on crashes (I recall a lot of these when I tried to use all the features).

5) Compare performance on some simple operations to pgbench to see if it's competitive. Look into whether there's code in the PG wrapper they use that can be optimized usefully.

There are two performance-related things that jump right out as worth confirming aren't causing issues:
- It's a threaded design.
- The interesting tests look like they use prepared statements.

I think the overall approach sysbench uses is good; it just needs some adjustments to work right against a PG database.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
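Item (1) amounts to wrapping each batch of writes in an explicit transaction and retrying on transient failures (in PostgreSQL, deadlock or serialization errors). A minimal sketch of that shape, using Python's sqlite3 purely as a stand-in driver — the table, statement list, and retry policy here are illustrative, not sysbench's actual code:

```python
import random
import sqlite3
import time

def run_in_transaction(conn, statements, retries=3):
    """Execute a batch of write statements inside one BEGIN/COMMIT,
    retrying with a small random backoff on transient errors."""
    for attempt in range(retries):
        try:
            with conn:  # opens a transaction; commits, or rolls back on exception
                for sql, params in statements:
                    conn.execute(sql, params)
            return True
        except sqlite3.OperationalError:
            # In PostgreSQL this would catch deadlock_detected /
            # serialization_failure and retry the whole batch.
            time.sleep(random.uniform(0, 0.1 * (attempt + 1)))
    return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (bal INTEGER)")
ok = run_in_transaction(conn, [
    ("INSERT INTO accounts VALUES (?)", (100,)),
    ("UPDATE accounts SET bal = bal - 10", ()),
])
```

The key point is that the batch either commits as a whole or is retried as a whole, which is what the OLTP test needs to avoid the deadlock-then-hang behavior described above.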
Re: [PERFORM] Group by more efficient than distinct?
On Friday, 18 April 2008, Francisco Reyes wrote:
| I am trying to get a distinct set of rows from 2 tables.
| After looking at someone else's query I noticed they were doing a group by
| to obtain the unique list.
|
| After comparing on multiple machines with several tables, it seems using
| group by to obtain a distinct list is substantially faster than using
| select distinct.
|
| Is there any disadvantage of using group by to obtain a unique list?

Searching the archives suggests that the code related to GROUP BY is much newer than that related to DISTINCT, and thus might benefit from more optimization paths.

Ciao, Thomas
-- Thomas Pundt [EMAIL PROTECTED] http://rp-online.de/
Re: [PERFORM] Group by more efficient than distinct?
Francisco Reyes [EMAIL PROTECTED] writes:
> Is there any disadvantage of using group by to obtain a unique list?
> On a small dataset the difference was about 20 percent.
>
> Group by
> HashAggregate (cost=369.61..381.12 rows=1151 width=8) (actual time=76.641..85.167 rows=2890 loops=1)

HashAggregate needs to store more values in memory at the same time, so it's not a good plan if you have a lot of distinct values. But the planner knows that, and as long as your work_mem is set to a reasonable size (keeping in mind that each sort or other such operation feels free to use that much itself, even if there are several in the query) and the rows estimate is reasonably accurate -- here it's mediocre but not dangerously bad -- then if the planner is picking it, it's probably a good idea.

I'm not sure, but I think there are cases where the DISTINCT method wins too. This is basically a bug; in an ideal world both queries would generate precisely the same plans, since they're equivalent. It's just not a high priority, since we have so many more interesting improvements competing for time.

-- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
Re: [PERFORM] Strange behavior: pgbench and new Linux kernels
On Thu, 17 Apr 2008, I wrote:
> There is only one central tunable (you have to switch on CONFIG_SCHED_DEBUG):
>     /proc/sys/kernel/sched_granularity_ns
> which can be used to tune the scheduler from 'desktop' (low latencies) to 'server' (good batching) workloads. It defaults to a setting suitable for desktop workloads. SCHED_BATCH is handled by the CFS scheduler module too.
>
> So it'd be worth compiling a kernel with CONFIG_SCHED_DEBUG switched on, trying an increase of that value, and seeing if that fixes the problem. Alternatively, use sched_setscheduler to set SCHED_BATCH, which should increase the timeslice (a Linux-only option).

Looking at the problem a bit closer, it's obvious to me that larger timeslices would not have fixed this problem, so ignore my suggestion. It appears that the problem is caused by inter-process communication blocking and causing processes to be put right to the back of the run queue, therefore causing a very fine-grained round-robin of the runnable processes, which thrashes the CPU caches. You may also be seeing processes forced to switch between CPUs, which breaks the caches even more.

So what happens if you run pgbench on a separate machine to the server? Does the problem still exist in that case?

Matthew

-- X's book explains this very well, but, poor bloke, he did the Cambridge Maths Tripos... -- Computer Science Lecturer
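For reference, the sched_setscheduler route mentioned (and then retracted) above is exposed directly from Python on Linux; a sketch, assuming a Linux kernel with SCHED_BATCH (a process may demote itself to SCHED_BATCH without special privileges; on other platforms this is simply a no-op):

```python
import os

def enter_batch_scheduling() -> bool:
    """Ask the kernel to treat the calling process as a batch job.

    Linux-only: SCHED_BATCH tells CFS the task is non-interactive,
    which gives it a mild latency penalty and better batching.
    Returns True if the policy is now SCHED_BATCH, False otherwise.
    """
    if not hasattr(os, "SCHED_BATCH"):
        return False  # not Linux, or too-old Python
    try:
        os.sched_setscheduler(0, os.SCHED_BATCH, os.sched_param(0))
    except OSError:
        return False  # e.g. restricted container environment
    return os.sched_getscheduler(0) == os.SCHED_BATCH

status = enter_batch_scheduling()
```

As noted in the message, this would only help if short timeslices were the culprit, which turned out not to be the case here.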
Re: [PERFORM] Group by more efficient than distinct?
On Fri, 18 Apr 2008 11:36:02 +0200, Gregory Stark [EMAIL PROTECTED] wrote:
> Francisco Reyes [EMAIL PROTECTED] writes:
>> Is there any disadvantage of using group by to obtain a unique list?
>> On a small dataset the difference was about 20 percent.
>>
>> Group by
>> HashAggregate (cost=369.61..381.12 rows=1151 width=8) (actual time=76.641..85.167 rows=2890 loops=1)

Basically:
- If you process up to some percentage of your RAM worth of data, hashing is going to be a lot faster.
- If the size of the hash grows larger than your RAM, hashing will fail miserably and sorting will be much faster, since PG's disk sort is really good.
- GROUP BY knows this and acts accordingly.
- DISTINCT doesn't know this; it only knows sorting, so it sorts.
- If you need DISTINCT x ORDER BY x, sorting may be faster too (depending on the % of distinct rows).
- If you need DISTINCT ON, well, you're stuck with the Sort.
- So, for the time being, you can replace DISTINCT with GROUP BY...
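That last substitution is easy to sanity-check: the two queries are equivalent row-for-row, and only the plan (hash vs. sort) differs. A minimal sketch using Python's sqlite3 as a stand-in for a real PostgreSQL session (the table and values are made up for illustration; sqlite's planner is not PostgreSQL's, but the result-set equivalence is the point):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
# 1000 rows, only 5 distinct values
conn.executemany("INSERT INTO t VALUES (?)", [(v % 5,) for v in range(1000)])

# The two ways of getting a unique list discussed in the thread:
distinct_rows = conn.execute("SELECT DISTINCT x FROM t ORDER BY x").fetchall()
group_by_rows = conn.execute("SELECT x FROM t GROUP BY x ORDER BY x").fetchall()
# Same answer either way; in PostgreSQL 8.x only GROUP BY could pick
# the HashAggregate plan, which is where the speed difference came from.
```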
Re: [PERFORM] Performance increase with elevator=deadline
Hi,

On 11 Apr 2008, at 20:03, Craig Ringer wrote:
> Speaking of I/O performance with PostgreSQL, has anybody here done any testing to compare results with LVM to results with the same filesystem on a conventionally partitioned or raw volume? I'd probably use LVM even at a performance cost because of its admin benefits, but I'd be curious if there is any known cost for use with Pg. I've never been able to measure one with other workloads.

I performed some tests some time ago. LVM is significantly slower. The disk subsystem is an HP P400/512MB battery-backed controller with 4 disks in RAID 10.

Results from bonnie++ 1.03 (bonnie++ -s 16000 -u 0 -f, run with and without -b), sequential block output/rewrite/input in K/sec and random seeks per second:

              Seq output   Rewrite   Seq input    Seeks
  ext3 (-b)       153637     78895      204124    700.6
  ext3            162223     77277      207055    765.3
  LVM (-b)        153768     53420      177414    699.8
  LVM             161476     54904      171693    774.3

File-creation rates were essentially unchanged (roughly 2200 sequential and random creates/sec on both in the -b runs). In short: sequential rewrite is roughly 30% slower and sequential input about 15% slower on LVM, while raw sequential output and seeks are about the same.

Bye,
e.
[PERFORM] 3-days-long vacuum of 20GB table
This autovacuum has been hammering my server with purely random i/o for half a week. The table is only 20GB and the i/o subsystem is good for 250MB/s sequential and a solid 5k iops. When should I expect it to end (if ever)?

current_query: VACUUM reuters.value
query_start:  2008-04-15 20:12:48.806885-04

think=# select * from pg_class where relname = 'value';
-[ RECORD 1 ]--+------------
relname        | value
relfilenode    | 191425
relpages       | 1643518
reltuples      | 1.37203e+08

# find -name 191425\*
./16579/191425
./16579/191425.1
./16579/191425.10
./16579/191425.11
./16579/191425.12
./16579/191425.13
./16579/191425.14
./16579/191425.15
./16579/191425.16
./16579/191425.17
./16579/191425.18
./16579/191425.19
./16579/191425.2
./16579/191425.3
./16579/191425.4
./16579/191425.5
./16579/191425.6
./16579/191425.7
./16579/191425.8
./16579/191425.9

# vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r b  swpd  free buff   cache  si so  bi  bo  in  cs  us sy id wa
 0 1 30336 46264 60 788235600 250 29911 6 2 87 5
 0 1 30336 47412 60 788130800 289648 944 4861 3 2 71 24
 0 2 30336 46696 60 788218800 816 4 840 5019 1 0 75 24
 0 1 30336 49228 60 787986800 1888 164 971 5687 1 1 74 24
 0 1 30336 49688 60 787891600 264048 1047 5751 1 0 75 23

autovacuum                   | on
autovacuum_vacuum_cost_delay | -1
autovacuum_vacuum_cost_limit | -1
vacuum_cost_delay            | 0
vacuum_cost_limit            | 200
vacuum_cost_page_dirty       | 20
vacuum_cost_page_hit         | 1
vacuum_cost_page_miss        | 10
Re: [PERFORM] 3-days-long vacuum of 20GB table
Jeffrey Baker [EMAIL PROTECTED] writes:
> This autovacuum has been hammering my server with purely random i/o for half a week. The table is only 20GB and the i/o subsystem is good for 250MB/s sequential and a solid 5k iops. When should I expect it to end (if ever)?

What have you got maintenance_work_mem set to? Which PG version exactly?

regards, tom lane
Re: [PERFORM] Strange behavior: pgbench and new Linux kernels
On Fri, 18 Apr 2008, Matthew wrote:
> You may also be seeing processes forced to switch between CPUs, which breaks the caches even more. So what happens if you run pgbench on a separate machine to the server? Does the problem still exist in that case?

I haven't run that test yet, but will before I submit a report. I did, however, try running things with the pgbench executable itself bound to a single CPU, with no improvement.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] 3-days-long vacuum of 20GB table
On Fri, Apr 18, 2008 at 10:03 AM, Tom Lane [EMAIL PROTECTED] wrote:
>> This autovacuum has been hammering my server with purely random i/o for half a week. The table is only 20GB and the i/o subsystem is good for 250MB/s sequential and a solid 5k iops. When should I expect it to end (if ever)?
>
> What have you got maintenance_work_mem set to? Which PG version exactly?

This is 8.1.9 on Linux x86_64.

# show maintenance_work_mem ;
 maintenance_work_mem
----------------------
 16384
Re: [PERFORM] 3-days-long vacuum of 20GB table
On Fri, Apr 18, 2008 at 10:32 AM, Jeffrey Baker [EMAIL PROTECTED] wrote:
>> What have you got maintenance_work_mem set to? Which PG version exactly?
>
> This is 8.1.9 on Linux x86_64.
>
> # show maintenance_work_mem ;
>  maintenance_work_mem
> ----------------------
>  16384

That appears to be the default. I will try increasing this. Can I increase it globally from a single backend, so that all other backends pick up the change, or do I have to restart the instance?

-jwb
[PERFORM] Message queue table..
Hi. I have this message queue table, currently with 8m+ records. Picking the top priority messages seems to take quite long; it is just a matter of searching the index (just as explain analyze tells me it does). Can anyone digest further optimizations out of this output? (All records have funcid=4.)

# explain analyze SELECT job.jobid, job.funcid, job.arg, job.uniqkey, job.insert_time, job.run_after, job.grabbed_until, job.priority, job.coalesce FROM workqueue.job WHERE (job.funcid = 4) AND (job.run_after <= 1208442668) AND (job.grabbed_until <= 1208442668) AND (job.coalesce = 'Efam') ORDER BY funcid, priority ASC LIMIT 1;

                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..0.09 rows=1 width=106) (actual time=245.273..245.274 rows=1 loops=1)
   ->  Index Scan using workqueue_job_funcid_priority_idx on job  (cost=0.00..695291.80 rows=8049405 width=106) (actual time=245.268..245.268 rows=1 loops=1)
         Index Cond: (funcid = 4)
         Filter: ((run_after <= 1208442668) AND (grabbed_until <= 1208442668) AND (coalesce = 'Efam'::text))
 Total runtime: 245.330 ms
(5 rows)

-- Jesper
Re: [PERFORM] 3-days-long vacuum of 20GB table
On Fri, Apr 18, 2008 at 10:34 AM, Jeffrey Baker [EMAIL PROTECTED] wrote:
> That appears to be the default. I will try increasing this. Can I increase it globally from a single backend, so that all other backends pick up the change, or do I have to restart the instance?

I increased it to 1GB, restarted the vacuum, and system performance seems the same. The root of the problem, that an entire CPU is in the iowait state and the storage device is doing random i/o, is unchanged:

 r b  swpd  free buff   cache si so   bi   bo   in   cs us sy id wa
 1 1 30328 53632 60 6914716  0  0  904 2960 1216 4720  1  1 74 23
 0 1 30328 52492 60 6916036  0  0 1152 1380  948 3637  0  0 75 24
 0 1 30328 49600 60 6917680  0  0 1160 1420 1055 4191  1  1 75 24
 0 1 30328 49404 60 6919000  0  0 1048 1308 1133 5054  2  2 73 23
 0 1 30328 47844 60 6921096  0  0 1552 1788 1002 3701  1  1 75 23

At that rate it will take a month. Compare the load generated by create table foo as select * from bar:

 r b  swpd  free buff   cache si so     bi   bo   in    cs us sy id wa
 2 2 30328 46580 60 6911024  0  0 145156  408 2006 10729 52  8 17 23
 3 1 30328 46240 60 6900976  0  0 133312  224 1834 10005 23 12 42 23
 1 3 30328 60700 60 6902056  0  0 121480  172 1538 10629 22 14 32 32
 1 2 30328 49520 60 6914204  0  0 122344  256 1408 14374 13 17 41 28
 1 2 30328 47844 60 6915960  0  0 127752  248 1313  9452 16 15 42 27

That's rather more like it. I guess I always imagined that VACUUM was a sort of linear process, not random, and that it should proceed at sequential-scan speeds.

-jwb
Re: [PERFORM] Message queue table..
Jesper Krogh wrote:
> Hi. I have this message queue table, currently with 8m+ records. Picking the top priority messages seems to take quite long; it is just a matter of searching the index (just as explain analyze tells me it does). Can anyone digest further optimizations out of this output? (All records have funcid=4.)

You mean all records of interest, right, not all records in the table?

What indexes do you have in place? What's the schema? Can you post a \d tablename from psql?

> # explain analyze SELECT job.jobid, job.funcid, job.arg, job.uniqkey, job.insert_time, job.run_after, job.grabbed_until, job.priority, job.coalesce FROM workqueue.job WHERE (job.funcid = 4) AND (job.run_after <= 1208442668) AND (job.grabbed_until <= 1208442668) AND (job.coalesce = 'Efam') ORDER BY funcid, priority ASC LIMIT 1;
>
>  Limit  (cost=0.00..0.09 rows=1 width=106) (actual time=245.273..245.274 rows=1 loops=1)
>    ->  Index Scan using workqueue_job_funcid_priority_idx on job  (cost=0.00..695291.80 rows=8049405 width=106) (actual time=245.268..245.268 rows=1 loops=1)
>          Index Cond: (funcid = 4)
>          Filter: ((run_after <= 1208442668) AND (grabbed_until <= 1208442668) AND (coalesce = 'Efam'::text))
>  Total runtime: 245.330 ms
> (5 rows)

Without seeing the schema and index definitions ... maybe you'd benefit from a multiple-column index. I'd experiment with an index on (funcid, priority) first.

-- Craig Ringer
Re: [PERFORM] 3-days-long vacuum of 20GB table
Jeffrey Baker wrote:
> That's rather more like it. I guess I always imagined that VACUUM was a sort of linear process, not random, and that it should proceed at sequential-scan speeds.

It's linear for the table, but there are passes over the indexes, which are random in 8.1. That code was rewritten by Heikki Linnakangas to do linear passes over the indexes in 8.2, AFAIR.

-- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [PERFORM] 3-days-long vacuum of 20GB table
Jeffrey Baker [EMAIL PROTECTED] writes:
> I increased it to 1GB, restarted the vacuum, and system performance seems the same. The root of the problem, that an entire CPU is in the iowait state and the storage device is doing random i/o, is unchanged:

Yeah, but you just reduced the number of index scans that will be needed by a factor of 1GB/16MB. Hang in there ...

regards, tom lane
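The arithmetic behind that factor: VACUUM collects dead tuple pointers (6 bytes each in this era of PostgreSQL) in maintenance_work_mem, and must make one pass over every index per batch that fits. A back-of-envelope sketch — the dead-row count below is a made-up example, not a measured figure from this table:

```python
import math

TID_BYTES = 6  # size of one dead tuple pointer in VACUUM's in-memory array

def index_scan_passes(dead_tuples: int, maintenance_work_mem_bytes: int) -> int:
    """Each index is scanned once per batch of dead TIDs that fits in memory."""
    tids_per_pass = maintenance_work_mem_bytes // TID_BYTES
    return max(1, math.ceil(dead_tuples / tids_per_pass))

# Hypothetical: 30 million dead rows out of the table's ~137M tuples.
passes_16mb = index_scan_passes(30_000_000, 16 * 1024 * 1024)    # old default
passes_1gb  = index_scan_passes(30_000_000, 1024 * 1024 * 1024)  # after raising it
```

With 16MB only ~2.8M TIDs fit per batch, so every index gets scanned many times; at 1GB the whole set fits in a single batch, hence Tom's "factor of 1GB/16MB".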
Re: [PERFORM] Message queue table..
Craig Ringer wrote:
>> Hi. I have this message queue table, currently with 8m+ records. Picking the top priority messages seems to take quite long; it is just a matter of searching the index (just as explain analyze tells me it does). Can anyone digest further optimizations out of this output? (All records have funcid=4.)
>
> You mean all records of interest, right, not all records in the table?

Actually all the records, since all the other virtual queues currently are empty.

> What indexes do you have in place? What's the schema? Can you post a \d tablename from psql?
>
>> # explain analyze SELECT job.jobid, job.funcid, job.arg, job.uniqkey, job.insert_time, job.run_after, job.grabbed_until, job.priority, job.coalesce FROM workqueue.job WHERE (job.funcid = 4) AND (job.run_after <= 1208442668) AND (job.grabbed_until <= 1208442668) AND (job.coalesce = 'Efam') ORDER BY funcid, priority ASC LIMIT 1

I found that removing the funcid from the ORDER BY made it use a better index: (priority, run_after, grabbed_until). That probably makes sense, since the funcid doesn't give any value in the index at all. Thanks for leading me back on track.

Jesper
Re: [PERFORM] Message queue table..
Jesper Krogh [EMAIL PROTECTED] writes:
> I have this message queue table, currently with 8m+ records. Picking the top priority messages seems to take quite long; it is just a matter of searching the index (just as explain analyze tells me it does).
>
>  Limit  (cost=0.00..0.09 rows=1 width=106) (actual time=245.273..245.274 rows=1 loops=1)
>    ->  Index Scan using workqueue_job_funcid_priority_idx on job  (cost=0.00..695291.80 rows=8049405 width=106) (actual time=245.268..245.268 rows=1 loops=1)
>          Index Cond: (funcid = 4)
>          Filter: ((run_after <= 1208442668) AND (grabbed_until <= 1208442668) AND (coalesce = 'Efam'::text))
>  Total runtime: 245.330 ms

Well, what that's doing in English is: scan all the rows with funcid = 4, in priority order, until we hit the first one satisfying the filter conditions. Apparently there are a lot of low-priority rows that have funcid = 4 but not the other conditions.

If it's the coalesce condition that's the problem, an index on (funcid, coalesce, priority) --- or (coalesce, funcid, priority) --- would probably help. I'm not sure there's a simple fix if it's the other conditions that are really selective.

regards, tom lane
[PERFORM] full_page_write and also compressed logging
Has there ever been any analysis regarding the redundant write overhead of full page writes? I'm wondering if one could regard an 8k page as being 64 off 128-byte paragraphs, or 32 off 256-byte paragraphs, each represented by a bit in a word. Then, when a page is dirtied by changes, some record is kept of this based on the paragraphs affected. You could just incrementally dump the pre-image of newly dirtied paragraphs as you go, and the cost in terms of dirtied pages would be much lower for the case of scattered updates.

(I was also wondering about just doing preimages based on changed byte ranges, but the approach above is probably faster, doesn't dump the same range twice, and may fit the existing flow more directly.)

Also -- has any attempt been made to push log writes through a cheap compressor, such as zlib on its lowest setting, or one like Jeff Bonwick's for ZFS (http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/compress.c)? It would work well for largely textual tables (and I suspect a lot of integer data too).

James
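The paragraph-bitmap idea above can be sketched in a few lines. This is purely an illustration of the proposed bookkeeping — it is not how PostgreSQL's full_page_writes actually works — combining both suggestions: a 64-bit word with one bit per dirtied 128-byte paragraph, and the pre-images squeezed through zlib on its lowest setting:

```python
import zlib

PAGE_SIZE = 8192
PARA_SIZE = 128                    # 64 paragraphs per 8k page
N_PARAS = PAGE_SIZE // PARA_SIZE   # one bit each -> fits in a 64-bit word

def paragraph_preimage(old_page: bytes, new_page: bytes):
    """Return (bitmap, payload): a bit per dirtied 128-byte paragraph,
    plus the zlib-compressed pre-images of just those paragraphs."""
    bitmap = 0
    preimages = []
    for i in range(N_PARAS):
        lo = i * PARA_SIZE
        if old_page[lo:lo + PARA_SIZE] != new_page[lo:lo + PARA_SIZE]:
            bitmap |= 1 << i
            preimages.append(old_page[lo:lo + PARA_SIZE])
    payload = zlib.compress(b"".join(preimages), 1)  # cheap, lowest setting
    return bitmap, payload

# A scattered update: two bytes changed, far apart on the page.
old = bytes(PAGE_SIZE)
new = bytearray(old)
new[0] = 1
new[5000] = 2
bitmap, payload = paragraph_preimage(old, bytes(new))
```

For this scattered two-byte update, only 2 of 64 paragraphs (256 bytes, before compression) would be logged, versus the full 8192-byte page image.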
Re: [PERFORM] Message queue table..
[EMAIL PROTECTED] (Jesper Krogh) writes:
> I have this message queue table, currently with 8m+ records. Picking the top priority messages seems to take quite long; it is just a matter of searching the index (just as explain analyze tells me it does). Can anyone digest further optimizations out of this output? (All records have funcid=4.)
>
> # explain analyze SELECT job.jobid, job.funcid, job.arg, job.uniqkey, job.insert_time, job.run_after, job.grabbed_until, job.priority, job.coalesce FROM workqueue.job WHERE (job.funcid = 4) AND (job.run_after <= 1208442668) AND (job.grabbed_until <= 1208442668) AND (job.coalesce = 'Efam') ORDER BY funcid, priority ASC LIMIT 1;

There might be value in having one or more extra indices... Here are *plausible* candidates:

1. If funcid = 4 is highly significant (e.g. you are always running this query, and funcid is often not 4), then you might add partial indexes such as:

   create index job_funcid_run_after on workqueue.job (run_after) where funcid = 4;
   create index job_funcid_grabbeduntil on workqueue.job (grabbed_until) where funcid = 4;

2. Straight indices like the following:

   create index job_run_after on workqueue.job(run_after);
   create index job_grabbed_until on workqueue.job(grabbed_until);
   create index job_funcid on workqueue.job(funcid);
   create index job_coalesce on workqueue.job(coalesce);

   Note that it is _possible_ (though by no means guaranteed) that several of these might prove useful together, if you're running 8.1+, where PostgreSQL supports bitmap index scans.

Another possibility...

3. You might change your process to handle multiple records per run, so that you instead run the query (perhaps via a cursor?) with LIMIT [something bigger than 1]. It does seem mighty expensive to run a 245ms query to find just one record. It seems quite likely that you could return the top 100 rows (LIMIT 100) without finding that it runs in any more time. Returning 100 tuples in 245ms seems rather more acceptable, no? :-)

-- (format nil [EMAIL PROTECTED] cbbrowne linuxfinances.info) http://linuxdatabases.info/info/linuxdistributions.html
Rules of the Evil Overlord #32. "I will not fly into a rage and kill a messenger who brings me bad news just to illustrate how evil I really am. Good messengers are hard to come by." http://www.eviloverlord.com/
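A quick end-to-end illustration of the partial-index idea discussed in this thread, using Python's sqlite3 (which also supports partial indexes) as a stand-in — the table is a simplified version of workqueue.job, and the rows and timestamps are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE job (
        jobid INTEGER PRIMARY KEY,
        funcid INTEGER,
        priority INTEGER,
        run_after INTEGER,
        grabbed_until INTEGER
    )
""")
# Partial index: only rows with funcid = 4 are indexed, so it stays
# small even if the table fills with jobs from other queues.
conn.execute("""
    CREATE INDEX job_funcid4_priority
        ON job (priority, run_after, grabbed_until)
        WHERE funcid = 4
""")

now = 1208442668
conn.executemany("INSERT INTO job VALUES (?, ?, ?, ?, ?)", [
    (1, 4, 10, now - 100, now - 50),  # eligible, low priority
    (2, 4, 1,  now - 100, now - 50),  # eligible, best priority
    (3, 4, 1,  now + 100, now - 50),  # not runnable yet
    (4, 7, 0,  now - 100, now - 50),  # different queue entirely
])

top = conn.execute(
    """SELECT jobid FROM job
       WHERE funcid = 4 AND run_after <= ? AND grabbed_until <= ?
       ORDER BY priority LIMIT 1""",
    (now, now),
).fetchone()
```

Job 2 is the expected pick: it is the only eligible funcid=4 row with the best priority, and the index already presents funcid=4 rows in the query's sort order.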
Re: [PERFORM] 3-days-long vacuum of 20GB table
Jeffrey Baker wrote:
> On Fri, Apr 18, 2008 at 10:32 AM, Jeffrey Baker [EMAIL PROTECTED] wrote:
>> # show maintenance_work_mem ;
>>  maintenance_work_mem
>> ----------------------
>>  16384
>
> That appears to be the default. I will try increasing this. Can I increase it globally from a single backend, so that all other backends pick up the change, or do I have to restart the instance?

You can change it in the config file and send the postmaster the HUP signal, which tells all backends to reload the file. killall -HUP postmaster or similar.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com