Re: [PERFORM] Increasing pattern index query speed
Andrus wrote:
> Both queries return same result (19) and return same data.
> Pattern query is a much slower (93 sec) than equality check (13 sec).
> How to fix this ?
> Using 8.1.4, utf-8 encoding, et-EE locale

They're different queries. The fact that they return the same results is a coincidence. This:

  WHERE rid.toode = '9910'

is a different condition to this:

  WHERE rid.toode like '9910%'

You aren't going to get the same plans.

Anyway, I think the problem is in the "dok JOIN rid" bit. Look:

 Aggregate  (cost=43.09..43.10 rows=1 width=0) (actual time=12674.675..12674.679 rows=1 loops=1)
   ->  Nested Loop  (cost=29.57..43.08 rows=1 width=0) (actual time=2002.045..12673.645 rows=19 loops=1)
         ->  Nested Loop  (cost=29.57..37.06 rows=1 width=24) (actual time=2001.922..12672.344 rows=19 loops=1)

 Aggregate  (cost=15.52..15.53 rows=1 width=0) (actual time=92966.501..92966.505 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..15.52 rows=1 width=0) (actual time=24082.032..92966.366 rows=19 loops=1)
         ->  Nested Loop  (cost=0.00..9.50 rows=1 width=24) (actual time=24081.919..92965.116 rows=19 loops=1)

These are the same plans, but the times are different. I'd be very surprised if you can reproduce these times reliably.

Can I give you some wider-ranging suggestions, Andrus?
1. Fix the vacuuming issue in your hash-join question.
2. Monitor the system to make sure you know if/when disk activity is high.
3. *Then* start to profile individual queries and look into their plans. Change the queries one at a time and monitor again.

Otherwise, it's very difficult to figure out whether the changes you make are effective.

HTH
-- 
Richard Huxton
Archonet Ltd

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
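[Editor's note: a hedged sketch of the usual remedy for slow left-anchored LIKE under a non-C locale such as et-EE — a pattern-ops index. The index and column names below follow the ones used later in this thread; whether this applies to Andrus's exact schema is an assumption.]

```sql
-- Plain btree index: usable for equality (toode = '9910'),
-- but NOT for LIKE '9910%' when the database locale is not C.
CREATE INDEX rid_toode_idx ON rid (toode);

-- Pattern-ops index: usable for left-anchored LIKE patterns
-- regardless of locale. toode is char(n), so the bpchar opclass
-- is the applicable one (use text_pattern_ops for text columns).
CREATE INDEX rid_toode_pattern_idx ON rid (toode bpchar_pattern_ops);

ANALYZE rid;
```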
Re: [PERFORM] Hash join on int takes 8..114 seconds
Tomas,

> Let's suppose you set a reasonable value (say 8096) instead of 2GB. That
> gives about 160MB. Anyway this depends - if you have a lot of slow
> queries caused by on-disk sorts / hash tables, use a higher value.
> Otherwise leave it as it is.

Probably the product orders table is frequently joined with the products table. Currently there was work_mem = 512 in the conf file. I changed it to work_mem = 8096.

> If it is all cached in memory, you may want to ensure that your
> shared_buffers is a reasonable size so that there is less shuffling of
> data from the kernel to postgres and back. Generally, shared_buffers
> works best between 5% and 25% of system memory.
>
> currently shared_buffers = 15000
>
> That's 120MB, i.e. about 6% of the memory. Might be a little bit higher,
> but seems reasonable.

I changed it to 2

> Given the fact that the performance issues are caused by bloated tables
> and / or slow I/O subsystem, moving to a similar system won't help I
> guess.

I have run VACUUM FULL ANALYZE VERBOSE and set MAX_FSM_PAGES = 15
So there is no bloat except the pg_shdepend indexes, which should not affect query speed.

Andrus.
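[Editor's note: for reference, a postgresql.conf sketch pulling together the settings discussed above, with 8.1-era semantics; the max_fsm_pages value is illustrative, sized to the VACUUM hint quoted later in the thread.]

```
# postgresql.conf sketch (8.1-era settings; values from this thread).
# On 8.1, shared_buffers counts 8 kB buffers and work_mem is in kB.
shared_buffers = 15000   # 15000 x 8 kB = ~120 MB, ~6% of a 2 GB box
work_mem = 8096          # ~8 MB per sort/hash operation, per backend
max_fsm_pages = 100000   # illustrative; VACUUM VERBOSE here asked for
                         # "a value over 89664"
```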
Re: [PERFORM] Perc 3 DC
Glyn Astill wrote:
> --- On Sat, 22/11/08, Scott Marlowe [EMAIL PROTECTED] wrote:
>> You really have two choices. First is to try and use it as a plain
>> SCSI card, maybe with caching turned on, and do the raid in software.
>> Second is to cut it into pieces and make jewelry out of it.
>
> Haha, I'm not really into jewelry, although I had thought of smacking
> it into a pile of dust with a lump hammer, that's much more my thing.
>
>> Anything before the Perc 6 series is seriously brain damaged, and the
>> Perc6 brings the dell raid array line squarely in line with a 5 year
>> old LSI megaraid, give or take. And that's being generous.
>
> Well this card thinks it's a 5 year old lsi megaraid. I've got a pile
> of perc5i megaraid paperweights on my desk at work, so this was kinda
> expected really.
>
> I've tried writeback and write through modes, tried changing cache
> flush times, disabled and enabled multiple PCI delayed transactions,
> all seem to have little effect.
>
>> Yeah, it's like trying to performance tune a yugo.
>
> Did I mention I drive a yugo?
>
> Finally I decided to wave goodbye to Dell's firmware. LSI has it down
> as a MegaRAID 493 elite 1600, so I flashed it with their latest
> firmware. Doesn't seem to have helped either though.
>
>> Does it have a battery backup module? Often you can't really turn on
>> write-back without one. That would certainly slow things down. But you
>> should certainly expect 20 MB/s on a modern RAID controller writing
>> out to a 4 disk RAID10
>
> Yeah the battery's on it, that and the 128Mb is really the only reason
> I thought I'd give it a whirl.

Is the battery functioning? We found that the unit had to be on and charged before write back caching would work.
Re: [PERFORM] Hash join on int takes 8..114 seconds
> Given the fact that the performance issues are caused by bloated tables
> and / or slow I/O subsystem, moving to a similar system won't help I
> guess.
>
> I have ran VACUUM FULL ANALYZE VERBOSE and set MAX_FSM_PAGES = 15
> So there is no any bloat except pg_shdepend indexes which should not
> affect to query speed.

OK, what was the number of unused pointer items in the VACUUM output?

Is the query performance still the same as when the tables were bloated?

What are the outputs of iostat/vmstat/dstat/top when running the query?

regards
Tomas
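[Editor's note: a sketch of the kind of monitoring Tomas asks for, run in a second terminal while the slow query executes; flags are the common Linux ones, and iostat assumes the sysstat package is installed.]

```shell
# Per-device utilization and wait times, 1-second samples
iostat -x 1

# Memory / swap / block-I/O / CPU overview, 1-second samples
vmstat 1
```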
Re: [PERFORM] Perc 3 DC
--- On Mon, 24/11/08, Steve Clark [EMAIL PROTECTED] wrote:
>> Yeah the battery's on it, that and the 128Mb is really the only reason
>> I thought I'd give it a whirl.
>
> Is the battery functioning? We found that the unit had to be on and
> charged before write back caching would work.

Yeah the battery is on there, and in the BIOS it says it's PRESENT and the status is GOOD.
Re: [PERFORM] Perc 3 DC
On Mon, Nov 24, 2008 at 7:49 AM, Glyn Astill [EMAIL PROTECTED] wrote:
> --- On Mon, 24/11/08, Steve Clark [EMAIL PROTECTED] wrote:
>> Is the battery functioning? We found that the unit had to be on and
>> charged before write back caching would work.
>
> Yeah the battery is on there, and in the BIOS it says it's PRESENT and
> the status is GOOD.

If I remember correctly, older LSI cards had pretty poor performance in RAID 1+0 (or any layered RAID really). Have you tried setting up RAID-1 pairs on the card and then striping them with the OS?
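[Editor's note: a sketch of the striped-over-mirrors setup Scott suggests, using Linux md; the device names and mount point are hypothetical and depend on how the controller exports the two hardware RAID-1 pairs.]

```shell
# Controller exports two hardware RAID-1 pairs as /dev/sda and /dev/sdb
# (hypothetical names). Stripe them in software: RAID-0 over mirrors
# gives RAID-10 with the card doing only the mirroring and caching.
mdadm --create /dev/md0 --level=0 --raid-devices=2 \
      --chunk=256 /dev/sda /dev/sdb

mkfs.ext3 /dev/md0
mount /dev/md0 /var/lib/pgsql
```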
Re: [PERFORM] Perc 3 DC
--- Scott Marlowe [EMAIL PROTECTED] wrote:
>> Yeah the battery is on there, and in the BIOS it says it's PRESENT and
>> the status is GOOD.
>
> If I remember correctly, older LSI cards had pretty poor performance in
> RAID 1+0 (or any layered RAID really). Have you tried setting up RAID-1
> pairs on the card and then striping them with the OS?

Not yet no, but that's a good suggestion and I do intend to give it a whirl. I get about 27MB/s from raid 1 (10 is about the same) so hopefully I can up the throughput to the speed of about one disk with sw raid.

For kicks I did try raid 5 on it; 6.9MB/s made it hard to resist going to get the hammer, which is still a very attractive option.
Re: [PERFORM] Perc 3 DC
On Mon, Nov 24, 2008 at 8:06 AM, Glyn Astill [EMAIL PROTECTED] wrote:
> Not yet no, but that's a good suggestion and I do intend to give it a
> whirl. I get about 27MB/s from raid 1 (10 is about the same) so
> hopefully I can up the throughput to the speed of about one disk with
> sw raid.
>
> For kicks I did try raid 5 on it; 6.9MB/s made it hard to resist going
> to get the hammer, which is still a very attractive option.

Well, I prefer making keychain fobs still, but from a technical perspective, I guess either option is a good one. Srsly, also look at running pure sw RAID on it with the controller providing caching only. I don't expect a PERC 3DC to win any awards, but the less you give that card to do the better off you'll be.
Re: [PERFORM] Perc 3 DC
On Monday 24 November 2008 14:49:17 Glyn Astill wrote:
> Yeah the battery is on there, and in the BIOS it says it's PRESENT and
> the status is GOOD.

Sorry, I deleted the beginning of this on getting back from a week off.

Write-back is configurable. You can enable write-back caching when the unit is not charged if you like. It is offered when you create the array (and can be changed later). It is arguably a silly thing to do, but it is an option.

I have some reasonable performance stats for this card, assuming you have a suitably recent version of the driver software. DELL used to ship a Linux kernel that had a broken driver for this card, resulting in very poor performance (i.e. substantially slower than software RAID). I have a note never to use it with Linux before 2.6.22, as the bundled LSI driver had issues; DELL themselves shipped (if you asked why performance was so bad) a Redhat kernel with a later driver for the card than the official Linux kernel.

That said, a couple of weeks back ours corrupted a volume on replacing a dead hard disk, so I'm never touching these cheap and tacky LSI RAID cards ever again. It is supposed to just start rebuilding the array when you insert the replacement drive; if it doesn't just work, schedule some down time and figure out exactly why. Don't (for example) blindly follow the instructions in the manual on what to do if it doesn't just work.
Re: [PERFORM] Perc 3 DC
On Mon, Nov 24, 2008 at 8:41 AM, Simon Waters [EMAIL PROTECTED] wrote:
> That said a couple of weeks back ours corrupted a volume on replacing a
> dead hard disk, so I'm never touching these cheap and tacky LSI RAID
> cards ever again.

Reminds me of a horror story at a company I was at some years ago. Another group was running Oracle on a nice little 4 way Xeon with a gig of ram (back when that was a monster server) and had an LSI card. They unplugged the server to move it into the hosting center, and in the move, the scsi cable came loose. When the machine came up, the LSI RAID marked every drive bad, and the old 4xx series card had no facility for forcing it to take back a drive. All their work on the db was gone; the newest backup was months old. I'm pretty sure they now understand why RAID5 is no replacement for a good backup plan.

I had a 438 in a dual ppro200, and it worked just fine, but I never trusted it to auto rebuild anything, and made backups every night. It was slow (in the 30 meg/second reads on a 6 disk RAID 5, not faster in RAID-10 for reads or writes) but reliable.

Newer LSI cards seem quite nice, but I'm now using an Areca 16xx series and am so far very happy with its reliability and somewhat happy with its performance. Sequential read speed is meh, but random performance is very good, so for a db server, it's a nice unit.
[PERFORM] Monitoring buffercache...
Hi All;

I've installed pg_buffercache and I want to use it to help define the optimal shared_buffers size. Currently I run this each 15min via cron:

insert into buffercache_stats
select now(), isdirty, count(*) as buffers, (count(*) * 8192) as memory
from pg_buffercache
group by 1,2;

and here's its explain plan:

explain insert into buffercache_stats select now(), isdirty, count(*) as buffers, (count(*) * 8192) as memory from pg_buffercache group by 1,2;

                                   QUERY PLAN
---------------------------------------------------------------------------------
 Subquery Scan "*SELECT*"  (cost=65.00..65.23 rows=2 width=25)
   ->  HashAggregate  (cost=65.00..65.12 rows=2 width=1)
         ->  Function Scan on pg_buffercache_pages p  (cost=0.00..55.00 rows=1000 width=1)
(3 rows)

Then once a day I will pull a report from the buffercache_stats table. The buffercache_stats table is our own creation:

\d buffercache_stats
        Table "public.buffercache_stats"
     Column     |            Type             | Modifiers
----------------+-----------------------------+-----------
 snap_timestamp | timestamp without time zone |
 isdirty        | boolean                     |
 buffers        | integer                     |
 memory         | integer                     |

Here's my issue: the server that we'll eventually roll this out to is extremely busy, and the every-15min query above has the potential to have a huge impact on performance.

Does anyone have any suggestions for a better approach or maybe a way to improve the performance of the above query?

Thanks in advance...
Re: [PERFORM] Monitoring buffercache...
On Mon, 2008-11-24 at 11:43 -0700, Kevin Kempter wrote:
> I've installed pg_buffercache and I want to use it to help define the
> optimal shared_buffers size. Currently I run this each 15min via cron:
>
> insert into buffercache_stats
> select now(), isdirty, count(*) as buffers, (count(*) * 8192) as memory
> from pg_buffercache
> group by 1,2;
>
> [snip]
>
> Here's my issue, the server that we'll eventually roll this out to is
> extremely busy and the every 15min query above has the potential to
> have a huge impact on performance.

I wouldn't routinely run pg_buffercache on a busy database. Plus, I don't think that pg_buffercache will answer this question for you. It will tell you what's currently in the buffer pool and the clean/dirty status, but that's not the first place I'd look; what you really need is to figure out the hit ratio on the buffer pool and go from there.

> Does anyone have any suggestions per a better approach or maybe a way
> to improve the performance for the above query ?

You should be able to use the blocks hit vs blocks read data in the pg_stat_database view (for the overall database), and drill down into pg_statio_user_tables/pg_statio_all_tables to get more detailed data if you want.

-- 
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.
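[Editor's note: a sketch of the hit-ratio queries Brad describes, using the column names of the stock pg_stat_database and pg_statio_user_tables views.]

```sql
-- Overall buffer-cache hit ratio for the current database
SELECT datname,
       blks_hit,
       blks_read,
       round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 4)
         AS hit_ratio
FROM pg_stat_database
WHERE datname = current_database();

-- Per-table drill-down: tables doing the most physical reads
SELECT relname,
       heap_blks_hit,
       heap_blks_read
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC
LIMIT 10;
```

Unlike pg_buffercache, these views read accumulated statistics counters, so sampling them takes no locks on the buffer map.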
Re: [PERFORM] Monitoring buffercache...
On Mon, Nov 24, 2008 at 11:43 AM, Kevin Kempter [EMAIL PROTECTED] wrote:
> I've installed pg_buffercache and I want to use it to help define the
> optimal shared_buffers size. Currently I run this each 15min via cron:
>
> insert into buffercache_stats select now(), isdirty, count(*) as
> buffers, (count(*) * 8192) as memory from pg_buffercache group by 1,2;
>
> [snip]
>
> Here's my issue, the server that we'll eventually roll this out to is
> extremely busy and the every 15min query above has the potential to
> have a huge impact on performance. Does anyone have any suggestions per
> a better approach or maybe a way to improve the performance for the
> above query ?

I wouldn't worry about running it every 15 minutes unless it's on a REALLY slow machine. I just ran it in a loop over and over on my 8 core opteron server and it ran the load factor up by almost exactly 1.0. Under our normal daily load, it sits at 1.9 to 2.5, and it climbed to 2.9 under the new load of running that query over and over. So, it doesn't seem to be blocking or anything.
Re: [PERFORM] Monitoring buffercache...
On Mon, 2008-11-24 at 12:46 -0700, Scott Marlowe wrote:
> I wouldn't worry about running it every 15 minutes unless it's on a
> REALLY slow machine. I just ran it in a loop over and over on my 8 core
> opteron server and it ran the load factor up by almost exactly 1.0.
> Under our normal daily load, it sits at 1.9 to 2.5, and it climbed to
> 2.9 under the new load of running that query over and over. So, it
> doesn't seem to be blocking or anything.

The internal docs for pg_buffercache_pages.c state:

"To get a consistent picture of the buffer state, we must lock all
partitions of the buffer map. Needless to say, this is horrible
for concurrency... Must grab locks in increasing order to avoid
possible deadlocks."

I'd be concerned about that running routinely.

-- 
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.
Re: [PERFORM] limit clause produces wrong query plan
Scott,

> And how exactly should it be optimized? If a query is even moderately
> interesting, with a few joins and a where clause, postgresql HAS to
> create the rows that come before your offset in order to assure that
> it's giving you the right rows.

SELECT ... FROM bigtable ORDER BY intprimarykey OFFSET 100 LIMIT 100

It should scan the primary key in index order for the first 200 keys, skipping the first 100 keys.

> SELECT ... FROM bigtable ORDER BY intprimarykey OFFSET 0 LIMIT 100
>
> That should be plenty fast.

The example which I posted shows that

SELECT ... FROM bigtable ORDER BY intprimarykey LIMIT 100

is extremely *slow*: a seq scan is performed over the whole bigtable.

> A standard workaround is to use some kind of sequential, or nearly so,
> id field, and then use between on that field.
>
> select * from table where idfield between x and x+100;

Users can delete and insert any rows in the table. This approach requires updating x in every row in the big table after each insert, delete or order column change, and is thus extremely slow. So I don't understand how this can be used for large tables.

Andrus.
Re: [PERFORM] Perc 3 DC
> Not yet no, but that's a good suggestion and I do intend to give it a
> whirl. I get about 27MB/s from raid 1 (10 is about the same) so
> hopefully I can up the throughput to the speed of about one disk with
> sw raid.

FYI, I get more than 200 MB/s out of a Linux software RAID5 of 3 SATA drives (the new Samsung Spinpoints...) (Intel ICH8 chipset, Core 2 Duo).
Re: [PERFORM] Increasing pattern index query speed
Richard,

> These are the same but the times are different. I'd be very surprised
> if you can reproduce these times reliably.

I re-tried today and got the same results: in the production database the pattern query is many times slower than the equality query. The toode and rid tables contain only a single product starting with 9910, so both queries should scan exactly the same number of rows.

> Can I give you some wider-ranging suggestions Andrus?
> 1. Fix the vacuuming issue in your hash-join question.

I have run VACUUM FULL VERBOSE ANALYSE and set max_fsm_pages=15
So the issue was fixed before those tests.

> 2. Monitor the system to make sure you know if/when disk activity is
> high.

I optimized this system. Now there are short (a few seconds) sales queries about every 5 - 300 seconds, which cause little disk activity and add a few new rows to some tables. I haven't seen this activity affect the test results.

> 3. *Then* start to profile individual queries and look into their
> plans. Change the queries one at a time and monitor again.

How to change the pattern matching query to be faster?

Andrus.

Btw. I tried to reproduce this big difference on a test server in 8.3 using the sample data script below and got a big difference, but in the opposite direction.

explain analyze SELECT sum(1)
FROM orders
JOIN orders_products USING (order_id)
JOIN products USING (product_id)
WHERE orders.order_date > '2006-01-01' and ...

Different where clauses produce different results:

AND orders_products.product_id = '3370'      -- 880 .. 926 ms
AND orders_products.product_id like '3370%'  -- 41 .. 98 ms

So the pattern index is 10 .. 20 times (!) faster always. No idea why.
Test data creation script:

begin;

CREATE OR REPLACE FUNCTION Counter() RETURNS int IMMUTABLE AS $_$
  SELECT 350;
$_$ LANGUAGE SQL;

CREATE TEMP TABLE orders (
  order_id INTEGER NOT NULL,
  order_date DATE NOT NULL );

CREATE TEMP TABLE products (
  product_id CHAR(20) NOT NULL,
  product_name char(70) NOT NULL,
  quantity numeric(12,2) default 1 );

CREATE TEMP TABLE orders_products (
  order_id INTEGER NOT NULL,
  product_id CHAR(20),
  id serial,
  price numeric(12,2) default 1 );

INSERT INTO products
  SELECT (n*power( 10,13))::INT8::CHAR(20),
         'product number ' || n::TEXT
  FROM generate_series(0,13410) AS n;

INSERT INTO orders
  SELECT n, '2005-01-01'::date + (4000.0 * n/Counter() * '1 DAY'::interval)
  FROM generate_series(0, Counter()/3 ) AS n;

SET work_mem TO 2097151;

INSERT INTO orders_products
  SELECT generate_series/3 as order_id,
         ( (1+ (generate_series % 13410))*power( 10,13))::INT8::CHAR(20) AS product_id
  FROM generate_series(1, Counter());

ALTER TABLE orders ADD PRIMARY KEY (order_id);
ALTER TABLE products ADD PRIMARY KEY (product_id);
ALTER TABLE orders_products ADD PRIMARY KEY (id);
ALTER TABLE orders_products ADD FOREIGN KEY (product_id) REFERENCES products(product_id);
ALTER TABLE orders_products ADD FOREIGN KEY (order_id) REFERENCES orders(order_id) ON DELETE CASCADE;

CREATE INDEX orders_date ON orders( order_date );
CREATE INDEX order_product_pattern_idx ON orders_products( product_id bpchar_pattern_ops );

COMMIT;
SET work_mem TO DEFAULT;
ANALYZE;
Re: [PERFORM] Hash join on int takes 8..114 seconds
Tomas,

> OK, what was the number of unused pointer items in the VACUUM output?

I posted it in this thread:

VACUUM FULL ANALYZE VERBOSE;
...
INFO: free space map contains 14353 pages in 314 relations
DETAIL: A total of 2 page slots are in use (including overhead).
89664 page slots are required to track all free space.
Current limits are: 2 page slots, 1000 relations, using 182 KB.
NOTICE: number of page slots needed (89664) exceeds max_fsm_pages (2)
HINT: Consider increasing the configuration parameter "max_fsm_pages" to a value over 89664.

Query returned successfully with no result in 10513335 ms.

> The query performance is still the same as when the tables were
> bloated?

Seems to be the same. However, after the previous message I improved other queries yesterday so that they do not scan the whole product orders (rid) table. Now there is only a little disk activity every 5-300 seconds, which seems not to affect those query results. So the issue in this thread has gone away.

Now this query runs in a constant time of 8 seconds:

explain analyze SELECT sum(1)
FROM dok JOIN rid USING (dokumnr)
JOIN toode USING (toode)
WHERE rid.toode = 'X05' AND dok.kuupaev >= '2008-09-01'

 Aggregate  (cost=182210.77..182210.78 rows=1 width=0) (actual time=8031.600..8031.604 rows=1 loops=1)
   ->  Nested Loop  (cost=74226.74..182149.27 rows=24598 width=0) (actual time=2602.474..7948.121 rows=21711 loops=1)
         ->  Index Scan using toode_pkey on toode  (cost=0.00..6.01 rows=1 width=24) (actual time=0.077..0.089 rows=1 loops=1)
               Index Cond: ('X05'::bpchar = toode)
         ->  Hash Join  (cost=74226.74..181897.28 rows=24598 width=24) (actual time=2602.378..7785.315 rows=21711 loops=1)
               Hash Cond: ("outer".dokumnr = "inner".dokumnr)
               ->  Bitmap Heap Scan on rid  (cost=4084.54..101951.60 rows=270725 width=28) (actual time=1129.925..4686.601 rows=278417 loops=1)
                     Recheck Cond: (toode = 'X05'::bpchar)
                     ->  Bitmap Index Scan on rid_toode_idx  (cost=0.00..4084.54 rows=270725 width=0) (actual time=1123.202..1123.202 rows=278426 loops=1)
                           Index Cond: (toode = 'X05'::bpchar)
               ->  Hash  (cost=69419.29..69419.29 rows=112766 width=4) (actual time=1251.496..1251.496 rows=111088 loops=1)
                     ->  Bitmap Heap Scan on dok  (cost=1492.68..69419.29 rows=112766 width=4) (actual time=70.837..776.249 rows=111088 loops=1)
                           Recheck Cond: (kuupaev >= '2008-09-01'::date)
                           ->  Bitmap Index Scan on dok_kuupaev_idx  (cost=0.00..1492.68 rows=112766 width=0) (actual time=64.177..64.177 rows=111343 loops=1)
                                 Index Cond: (kuupaev >= '2008-09-01'::date)
 Total runtime: 8031.905 ms

Interestingly, using like is 80 times faster:

explain analyze SELECT sum(1)
FROM dok JOIN rid USING (dokumnr)
JOIN toode USING (toode)
WHERE rid.toode like 'X05' AND dok.kuupaev >= '2008-09-01'

 Aggregate  (cost=88178.69..88178.70 rows=1 width=0) (actual time=115.335..115.339 rows=1 loops=1)
   ->  Hash Join  (cost=71136.22..88117.36 rows=24532 width=0) (actual time=115.322..115.322 rows=0 loops=1)
         Hash Cond: ("outer".toode = "inner".toode)
         ->  Hash Join  (cost=70163.36..86253.20 rows=24598 width=24) (actual time=0.046..0.046 rows=0 loops=1)
               Hash Cond: ("outer".dokumnr = "inner".dokumnr)
               ->  Bitmap Heap Scan on rid  (cost=21.16..6307.52 rows=270725 width=28) (actual time=0.037..0.037 rows=0 loops=1)
                     Filter: (toode ~~ 'X05'::text)
                     ->  Bitmap Index Scan on rid_toode_pattern_idx  (cost=0.00..21.16 rows=1760 width=0) (actual time=0.028..0.028 rows=0 loops=1)
                           Index Cond: (toode ~=~ 'X05'::bpchar)
               ->  Hash  (cost=69419.29..69419.29 rows=112766 width=4) (never executed)
                     ->  Bitmap Heap Scan on dok  (cost=1492.68..69419.29 rows=112766 width=4) (never executed)
                           Recheck Cond: (kuupaev >= '2008-09-01'::date)
                           ->  Bitmap Index Scan on dok_kuupaev_idx  (cost=0.00..1492.68 rows=112766 width=0) (never executed)
                                 Index Cond: (kuupaev >= '2008-09-01'::date)
         ->  Hash  (cost=853.29..853.29 rows=13429 width=24) (actual time=114.757..114.757 rows=13412 loops=1)
               ->  Seq Scan on toode  (cost=0.00..853.29 rows=13429 width=24) (actual time=0.014..58.319 rows=13412 loops=1)
 Total runtime: 115.505 ms

I also posted a test script in the other thread which shows that like is a magnitude faster than the equality check.

rid.toode = 'X05' and rid.toode like 'X05' are exactly the same conditions, and there are indexes for both conditions. So I don't understand why the results are so different.

In the other sample which I posted in the thread "Increasing pattern index query speed" like is 4 times slower:

SELECT sum(1) FROM dok JOIN rid USING (dokumnr) JOIN toode USING (toode) WHERE rid.toode = '9910' AND
Re: [PERFORM] Monitoring buffercache...
On Mon, Nov 24, 2008 at 12:52 PM, Brad Nicholson [EMAIL PROTECTED] wrote:
>> I just ran it in a loop over and over on my 8 core opteron server and
>> it ran the load factor up by almost exactly 1.0. Under our normal
>> daily load, it sits at 1.9 to 2.5, and it climbed to 2.9 under the new
>> load of running that query over and over. So, it doesn't seem to be
>> blocking or anything.
>
> The internal docs for pg_buffercache_pages.c state:
>
> "To get a consistent picture of the buffer state, we must lock all
> partitions of the buffer map. Needless to say, this is horrible for
> concurrency... Must grab locks in increasing order to avoid possible
> deadlocks."

Well, the pg hackers tend to take a paranoid view (it's a good thing TM) on things like this. My guess is that the period of time for which pg_buffercache takes locks on the buffer map is short enough that it isn't a real big deal on a fast enough server. On mine, it certainly had no real negative effects for the 5 minutes or so it was running in a loop. None I could see, and we run hundreds of queries per second on our system.

Of course, for certain other types of loads it could be a much bigger issue. But for our load, on our machine, it was virtually unnoticeable.
Re: [PERFORM] limit clause produces wrong query plan
> it was veery fast. To be honest I do not know what is happening?! This
> is really weird.

It seems that PostgreSQL OFFSET / LIMIT are not optimized, and thus typical paging queries

SELECT ... FROM bigtable ORDER BY intprimarykey OFFSET pageno*100 LIMIT 100

or even the first page query

SELECT ... FROM bigtable ORDER BY intprimarykey OFFSET 0 LIMIT 100

cannot be used in PostgreSQL at all for big tables.

Do you have any idea how to fix this?

Andrus.
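[Editor's note: the usual fix, not something PostgreSQL applies automatically, is keyset pagination — remember the last key of the previous page and seek past it instead of using OFFSET. A sketch against the hypothetical bigtable from this thread; :last_seen_id is a placeholder for a value the application keeps between page fetches.]

```sql
-- First page: a simple index scan, stopped after 100 rows.
SELECT * FROM bigtable
ORDER BY intprimarykey
LIMIT 100;

-- Next page: seek past the last intprimarykey of the previous page.
-- This walks the primary-key index directly, so no rows are generated
-- and discarded the way OFFSET requires, and it stays correct even
-- when rows are inserted or deleted between page fetches.
SELECT * FROM bigtable
WHERE intprimarykey > :last_seen_id
ORDER BY intprimarykey
LIMIT 100;
```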