Re: [PERFORM] SELECT ignoring index even though ORDER BY and LIMIT present
On Wed, 2 Jun 2010, Jori Jovanovich wrote:
> (2) Making the query faster by making the string match LESS specific
> (odd, seems like it should be MORE)

No, that's the way round it should be. The LIMIT changes it all.

Consider if you have a huge table, and half of the entries match your WHERE clause. To fetch the ORDER BY ... LIMIT 20 using an index scan would involve accessing on average only 40 entries from the table referenced by the index. Therefore, the index is quick.

However, consider a huge table that has only twenty matching entries. The index scan would need to touch every single row in the table to return the matching rows, so a sequential scan, filter, and sort would be much faster. Of course, if you had an index capable of answering the WHERE clause, that would be even better for that case.

Matthew

--
Don't criticise a man until you have walked a mile in his shoes; and if
you do at least he will be a mile behind you and bare footed.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
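[Editor's note: the arithmetic Matthew describes can be sketched in a few lines of Python. This is a simplified model of the planner's reasoning, not actual PostgreSQL code; the function name and the uniform-distribution assumption are ours.]

```python
def expected_entries_scanned(table_rows, matching_rows, limit):
    """Roughly how many entries an ordered index scan must read before it
    finds `limit` rows passing the filter, assuming matches are spread
    uniformly through the index order (the planner's simple model)."""
    selectivity = matching_rows / table_rows
    # Capped at the table size: in the worst case the scan walks everything.
    return min(table_rows, limit / selectivity)

# Half of a 10M-row table matches: ~40 entries suffice for LIMIT 20.
print(expected_entries_scanned(10_000_000, 5_000_000, 20))
# Only 20 rows match: the ordered scan must walk essentially the whole table,
# so a seq scan + sort wins.
print(expected_entries_scanned(10_000_000, 20, 20))
```

This is why a *less* specific match (higher selectivity) makes the ORDER BY ... LIMIT index scan cheaper.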
Re: [PERFORM] Weird XFS WAL problem
On Wed, Jun 2, 2010 at 7:30 PM, Craig James craig_ja...@emolecules.com wrote:
> I'm testing/tuning a new midsize server and ran into an inexplicable
> problem. With a RAID10 array, when I move the WAL to a separate RAID1
> array, TPS drops from over 1200 to less than 90! I've checked everything
> and can't find a reason. Here are the details.
>
> 8 cores (2x4 Intel Nehalem 2 GHz)
> 12 GB memory
> 12 x 7200 SATA 500 GB disks
> 3WARE 9650SE-12ML RAID controller with BBU
>   2 disks: RAID1 500GB, ext4, blocksize=4096
>   8 disks: RAID10 2TB, stripe size 64K, blocksize=4096 (ext4 or xfs - see below)
>   2 disks: hot swap
> Ubuntu 10.04 LTS (Lucid)
>
> With xfs or ext4 on the RAID10 I got decent bonnie++ and pgbench results
> (this one is for xfs):
>
> Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> argon        24064M 70491  99 288158  25 129918  16 65296  97 428210  23 558.9   1
>                     ------Sequential Create------ --------Random Create--------
>                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                  16 23283  81 +++++ +++ 13775  56 20143  74 +++++ +++ 15152  54
> argon,24064M,70491,99,288158,25,129918,16,65296,97,428210,23,558.9,1,16,23283,81,+++++,+++,13775,56,20143,74,+++++,+++,15152,54
>
> pgbench -i -s 100 -U test
> pgbench -c 10 -t 1 -U test
> scaling factor: 100
> query mode: simple
> number of clients: 10
> number of transactions per client: 1
> number of transactions actually processed: 10/10
> tps = 1046.104635 (including connections establishing)
> tps = 1046.337276 (excluding connections establishing)
>
> Now the mystery: I moved the pg_xlog directory to a RAID1 array (same
> 3WARE controller, two more SATA 7200 disks). Run the same tests and ...
>
> tps = 82.325446 (including connections establishing)
> tps = 82.326874 (excluding connections establishing)
>
> I thought I'd made a mistake, like maybe I moved the whole database to
> the RAID1 array, but I checked and double checked.
> I even watched the lights blink - the WAL was definitely on the RAID1
> and the rest of Postgres on the RAID10. So I moved the WAL back to the
> RAID10 array, and performance jumped right back up to the 1200 TPS range.
>
> Next I checked the RAID1 itself:
>
>   dd if=/dev/zero of=./bigfile bs=8192 count=200
>
> which yielded 98.8 MB/sec - not bad. bonnie++ on the RAID1 pair showed
> good performance too:
>
> Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> argon        24064M 68601  99 110057  18 46534   6 59883  90 123053   7 471.3   1
>                     ------Sequential Create------ --------Random Create--------
>                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                  16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
> argon,24064M,68601,99,110057,18,46534,6,59883,90,123053,7,471.3,1,16,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++
>
> So ... anyone have any idea at all how TPS drops to below 90 when I
> move the WAL to a separate RAID1 disk? Does this make any sense at all?
> It's repeatable. It happens for both ext4 and xfs. It's weird.
>
> You can even watch the disk lights and see it: the RAID10 disks are on
> almost constantly when the WAL is on the RAID10, but when you move the
> WAL over to the RAID1, its lights are dim and flicker a lot, like it's
> barely getting any data, and the RAID10 disks' lights barely go on at all.

*) Is your RAID1 configured with writeback cache on the controller?

*) Have you tried changing wal_sync_method to fdatasync?

merlin
Re: [PERFORM] SELECT ignoring index even though ORDER BY and LIMIT present
hi,

I'm sorry for not posting this first. The server is the following and is being used exclusively for this PostgreSQL instance:

PostgreSQL 8.4.2 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.2.real (GCC) 4.2.4 (Ubuntu 4.2.4-1ubuntu4), 64-bit
Amazon EC2 Large Instance, 7.5GB memory, 64-bit

This is what is set in my postgresql.conf file:

max_connections = 100
ssl = true
shared_buffers = 24MB

ANALYZE VERBOSE events;
INFO: analyzing "public.events"
INFO: "events": scanned 3 of 211312 pages, containing 1725088 live rows and 0 dead rows; 3 rows in sample, 12151060 estimated total rows

Updating statistics did not affect the results -- it's still doing full table scans (I had run statistics before posting here as well, so this was expected).

thank you

On Wed, Jun 2, 2010 at 8:49 PM, Bob Lunney bob_lun...@yahoo.com wrote:
> Jori,
>
> What is the PostgreSQL version/shared_buffers/work_mem/effective_cache_size/default_statistics_target?
> Are the statistics for the table up to date? (Run analyze verbose tablename
> to update them.) Table and index structure would be nice to know, too.
>
> If all else fails you can set enable_seqscan = off for the session, but
> that is a Big Hammer for what is probably a smaller problem.
>
> Bob Lunney
>
> --- On Wed, 6/2/10, Jori Jovanovich j...@dimensiology.com wrote:
>> From: Jori Jovanovich j...@dimensiology.com
>> Subject: [PERFORM] SELECT ignoring index even though ORDER BY and LIMIT present
>> To: pgsql-performance@postgresql.org
>> Date: Wednesday, June 2, 2010, 4:28 PM
>>
>> hi,
>>
>> I have a problem space where the main goal is to search backward in
>> time for events. Time can go back very far into the past, and so the
>> table can get quite large. However, the vast majority of queries are
>> all satisfied by relatively recent data. I have an index on the row
>> creation date and I would like almost all of my queries to have a
>> query plan looking something like:
>>
>> Limit ...
>>   ->  Index Scan Backward using server_timestamp_idx on events
>>       (cost=0.00..623055.91 rows=8695 width=177)
>> ...
However, PostgreSQL frequently tries to do a full table scan. Often what controls whether a scan is performed or not depends on the size of the LIMIT and how detailed the WHERE clause is. In practice, the scan is always the wrong answer for my use cases (where "always" is defined to be 99.9%). Some examples:

(1) A sample query that devolves to a full table scan

EXPLAIN SELECT events.id, events.client_duration, events.message, events.created_by,
  events.source, events.type, events.event, events.environment,
  events.server_timestamp, events.session_id, events.reference, events.client_uuid
FROM events
WHERE client_uuid ~* E'^foo bar so what'
ORDER BY server_timestamp DESC
LIMIT 20;

QUERY PLAN (BAD!)
-----------------
Limit  (cost=363278.56..363278.61 rows=20 width=177)
  ->  Sort  (cost=363278.56..363278.62 rows=24 width=177)
        Sort Key: server_timestamp
        ->  Seq Scan on events  (cost=0.00..363278.01 rows=24 width=177)
              Filter: (client_uuid ~* '^foo bar so what'::text)

(2) Making the query faster by making the string match LESS specific (odd, seems like it should be MORE)

EXPLAIN SELECT events.id, events.client_duration, events.message, events.created_by,
  events.source, events.type, events.event, events.environment,
  events.server_timestamp, events.session_id, events.reference, events.client_uuid
FROM events
WHERE client_uuid ~* E'^foo'
ORDER BY server_timestamp DESC
LIMIT 20;

QUERY PLAN (GOOD!)
------------------
Limit  (cost=0.00..1433.14 rows=20 width=177)
  ->  Index Scan Backward using server_timestamp_idx on events  (cost=0.00..623055.91 rows=8695 width=177)
        Filter: (client_uuid ~* '^foo'::text)

(3) Alternatively making the query faster by using a smaller limit

EXPLAIN SELECT events.id, events.client_duration, events.message, events.created_by,
  events.source, events.type, events.event, events.environment,
  events.server_timestamp, events.session_id, events.reference, events.client_uuid
FROM events
WHERE client_uuid ~* E'^foo bar so what'
ORDER BY server_timestamp DESC
LIMIT 10;

QUERY PLAN (GOOD!)
------------------
Limit  (cost=0.00..259606.63 rows=10 width=177)
  ->  Index Scan Backward using server_timestamp_idx on events  (cost=0.00..623055.91 rows=24 width=177)
        Filter: (client_uuid ~* '^foo bar so what'::text)

I find myself wishing I could just put a SQL HINT on the query to force the index to be used, but I understand that HINTs are
Re: [PERFORM] Weird XFS WAL problem
Craig James wrote:
> I'm testing/tuning a new midsize server and ran into an inexplicable
> problem. With a RAID10 array, when I move the WAL to a separate RAID1
> array, TPS drops from over 1200 to less than 90!

Normally 100 TPS means that the write cache on the WAL drive volume is disabled (or set to write-through instead of write-back). When things in this area get fishy, I will usually download sysbench and have it specifically test how many fsync calls can happen per second. http://projects.2ndquadrant.com/talks , "Database Hardware Benchmarking", page 28 has an example of the right incantation for that.

Also, make sure you run 3ware's utilities and confirm all the disks have finished their initialization and verification stages. If you just adjusted the disk layout and immediately launched into benchmarks, those results are useless until the background cleanup is done.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
Re: [PERFORM] Weird XFS WAL problem
On 6/2/10 4:40 PM, Mark Kirkwood wrote:
> On 03/06/10 11:30, Craig James wrote:
>> I'm testing/tuning a new midsize server and ran into an inexplicable
>> problem. With a RAID10 array, when I move the WAL to a separate RAID1
>> array, TPS drops from over 1200 to less than 90! I've checked
>> everything and can't find a reason.
>
> Are the 2 new RAID1 disks the same make and model as the 12 RAID10 ones?

Yes.

> Also, are barriers *on* on the RAID1 mount and off on the RAID10 one?

It was the barriers. barrier=1 isn't just a bad idea on ext4, it's a disaster.

pgbench -i -s 100 -U test
pgbench -c 10 -t 1 -U test

Change WAL to barrier=0
tps = 1463.264981 (including connections establishing)
tps = 1463.725687 (excluding connections establishing)

Change WAL to noatime, nodiratime, barrier=0
tps = 1479.331476 (including connections establishing)
tps = 1479.810545 (excluding connections establishing)

Change WAL to barrier=1
tps = 82.325446 (including connections establishing)
tps = 82.326874 (excluding connections establishing)

This is really hard to believe, because the bonnie++ numbers and dd(1) numbers look good (see my original post). But it's totally repeatable. It must be some really unfortunate "just missed the next sector going by the write head" problem. So with ext4, bonnie++ and dd aren't the whole story.

BTW, I also learned that if you edit /etc/fstab and use mount -o remount, it WON'T change barrier=0/1 unless it is explicit in the fstab file. That is, if you put barrier=0 into /etc/fstab and remount, it will change to no barriers. But if you then remove it from /etc/fstab, it won't change back to the default; you have to actually put barrier=1 there if you want the default back. This seems like a bug to me, and it made it really hard to track this down. mount -o remount is not the same as umount/mount!
Craig
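[Editor's note: for reference, the commands behind Craig's remount observation, as a sketch only. The device and mount point are hypothetical, and the described remount behavior is taken from his report, not re-verified here.]

```shell
# Hypothetical /etc/fstab entry for a dedicated WAL volume:
#   /dev/sdb1  /var/lib/postgresql/pg_xlog  ext4  noatime,nodiratime,barrier=0  0  0

# Apply mount options without unmounting. Note: options you merely
# delete from /etc/fstab are NOT reset by a remount; you must name
# the value you want explicitly.
mount -o remount,barrier=0 /var/lib/postgresql/pg_xlog

# To get back to the (safe) default, spell it out:
mount -o remount,barrier=1 /var/lib/postgresql/pg_xlog

# Only a full umount/mount cycle re-reads defaults from fstab:
umount /var/lib/postgresql/pg_xlog && mount /var/lib/postgresql/pg_xlog
```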
Re: [PERFORM] Weird XFS WAL problem
On Thu, 3 Jun 2010, Craig James wrote:
>> Also, are barriers *on* on the RAID1 mount and off on the RAID10 one?
>
> It was the barriers. barrier=1 isn't just a bad idea on ext4, it's a
> disaster.

This worries me a little. Does your array have a battery-backed cache? If so, then it should be fast regardless of barriers (although barriers may make a small difference). If it does not, then it is likely that the fast speed you are seeing with barriers off is unsafe. There should never be a "just missed the sector going past for write" problem with a battery-backed cache.

Matthew

--
There once was a limerick .sig
That really was not very big
It was going quite fine
Till it reached the fourth line
Re: [PERFORM] Weird XFS WAL problem
Matthew Wakeling matt...@flymine.org wrote:
> On Thu, 3 Jun 2010, Craig James wrote:
>>> Also, are barriers *on* on the RAID1 mount and off on the RAID10 one?
>>
>> It was the barriers. barrier=1 isn't just a bad idea on ext4, it's a
>> disaster.
>
> This worries me a little. Does your array have a battery-backed
> cache? If so, then it should be fast regardless of barriers (although
> barriers may make a small difference). If it does not, then it is
> likely that the fast speed you are seeing with barriers off is unsafe.

I've seen this, too (with xfs). Our RAID controller, in spite of having BBU cache configured for write-back, waits for actual persistence on disk for write barriers (unlike for fsync). This does strike me as surprising to the point of bordering on qualifying as a bug. It means that you can't take advantage of the BBU cache and get the benefit of write barriers in OS cache behavior. :-(

-Kevin
[PERFORM] slow query performance
I can't seem to pinpoint why this query is slow. No full table scans are being done. The hash join is taking maximum time. The table dev4_act_action has only 3 rows.

The box is a 2-CPU quad-core Intel 5430 with 32G RAM... Postgres 8.4.0

1G work_mem
20G effective_cache_size
random_page_cost=1
default_statistics_target=1000

The largest table in the inner query is dev4_act_dy_fact, which is partitioned into 3 partitions per month. Each partition has about 25 million rows. The rest of the tables are very small (100 to 1000 rows).

explain analyze
select ipconvert(srctest_num),
       CASE targetpt::character varying WHEN NULL::text THEN serv.targetsrv
            ELSE targetpt::character varying END AS targetsrv,
       sesstype, hits as cons, bytes,
       srcz.srcarea as srcz, dstz.dstarea as dstz
from (
  select srctest_num, targetpt, targetsrv_id, sesstype_id,
         sum(total) as hits, sum(bin) + sum(bout) as bts,
         sourcearea_id, destinationarea_id
  from dev4_act_dy_fact a, dev4_act_action act
  where thedate between '2010-05-22' and '2010-05-22'
    and a.action_id = act.action_id
    and action in ('rejected','sess_rejected')
    and guardid_id in (select guardid_id from dev4_act_guardid where guardid like 'cust00%')
    and node_id = (select node_id from dev4_act_node where node = '10.90.100.2')
  group by srctest_num, targetpt, targetsrv_id, sesstype_id, sourcearea_id, destinationarea_id
  order by (sum(bin) + sum(bout)) desc
  limit 1000
) a
left outer join dev4_act_dstarea dstz on a.destinationarea_id = dstz.dstarea_id
left outer join dev4_act_srcarea srcz on a.sourcearea_id = srcz.srcarea_id
left outer join dev4_act_targetsrv serv on a.targetsrv_id = serv.targetsrv_id
left outer join dev4_sesstype proto on a.sesstype_id = proto.sesstype_id
order by bytes desc

Nested Loop Left Join  (cost=95392.32..95496.13 rows=20 width=510) (actual time=164533.831..164533.831 rows=0 loops=1)
  ->  Nested Loop Left Join  (cost=95392.32..95473.43 rows=20 width=396) (actual time=164533.830..164533.830 rows=0 loops=1)
        ->  Nested Loop Left Join  (cost=95392.32..95455.83 rows=20 width=182) (actual time=164533.829..164533.829 rows=0 loops=1)
              ->  Nested Loop Left Join  (cost=95392.32..95410.17 rows=20 width=186) (actual time=164533.829..164533.829 rows=0 loops=1)
                    ->  Limit  (cost=95392.32..95392.37 rows=20 width=52) (actual time=164533.828..164533.828 rows=0 loops=1)
                          InitPlan 1 (returns $0)
                            ->  Index Scan using dev4_act_node_uindx on dev4_act_node  (cost=0.00..2.27 rows=1 width=4) (actual time=0.052..0.052 rows=0 loops=1)
                                  Index Cond: ((node)::text = '10.90.100.2'::text)
                          ->  Sort  (cost=95390.05..95390.10 rows=20 width=52) (actual time=164533.826..164533.826 rows=0 loops=1)
                                Sort Key: ((sum(a.bin) + sum(a.bout)))
                                Sort Method: quicksort  Memory: 17kB
                                ->  HashAggregate  (cost=95389.22..95389.62 rows=20 width=52) (actual time=164533.796..164533.796 rows=0 loops=1)
                                      ->  Nested Loop Semi Join  (cost=7.37..95388.77 rows=20 width=52) (actual time=164533.793..164533.793 rows=0 loops=1)
                                            ->  Hash Join  (cost=7.37..94836.75 rows=2043 width=56) (actual time=164533.792..164533.792 rows=0 loops=1)
                                                  Hash Cond: (a.action_id = act.action_id)
                                                  ->  Append  (cost=2.80..94045.71 rows=204277 width=60) (actual time=164533.790..164533.790 rows=0 loops=1)
                                                        ->  Bitmap Heap Scan on dev4_act_dy_fact a  (cost=2.80..3.82 rows=1 width=60) (actual time=0.064..0.064 rows=0 loops=1)
                                                              Recheck Cond: ((node_id = $0) AND (thedate >= '2010-05-22 00:00:00'::timestamp without time zone) AND (thedate <= '2010-05-22 00:00:00'::timestamp without time zone))
                                                              ->  BitmapAnd  (cost=2.80..2.80 rows=1 width=0) (actual time=0.062..0.062 rows=0 loops=1)
                                                                    ->  Bitmap Index Scan on dev4_act_dy_dm_nd_indx  (cost=0.00..1.27 rows=3 width=0) (actual time=0.062..0.062 rows=0 loops=1)
                                                                          Index Cond: (node_id = $0)
                                                                    ->  Bitmap Index Scan on dev4_act_dy_dm_cd_indx  (cost=0.00..1.28 rows=3 width=0) (never executed)
                                                                          Index Cond: ((thedate >= '2010-05-22 00:00:00'::timestamp without time zone) AND (thedate <= '2010-05-22 00:00:00'::timestamp without time zone))
                                                        ->  Index Scan using dev4_act_dy_fact_2010_05_t3_thedate on dev4_act_dy_fact_2010_05_t3 a  (cost=0.00..94041.89 rows=204276 width=60) (actual time=164533.725..164533.725 rows=0 loops=1)
                                                              Index Cond: ((thedate >= '2010-05-22 00:00:00'::timestamp without time zone) AND (thedate
Re: [PERFORM] Weird XFS WAL problem
Kevin Grittner wrote:
> I've seen this, too (with xfs). Our RAID controller, in spite of
> having BBU cache configured for writeback, waits for actual
> persistence on disk for write barriers (unlike for fsync). This does
> strike me as surprising to the point of bordering on qualifying as a
> bug.

Completely intentional, and documented at http://xfs.org/index.php/XFS_FAQ#Q._Should_barriers_be_enabled_with_storage_which_has_a_persistent_write_cache.3F

The issue is that XFS will actually send the full "flush your cache" call to the controller, rather than just the usual fsync call, and that eliminates the benefit of having a write cache there in the first place. Good controllers respect that and flush their whole write cache out. And ext4 has adopted the same mechanism.

This is very much a good thing for database reliability for people with regular cheap hard drives, which don't have a non-volatile write cache. It allows them to keep the disk's write cache on for other things, while still getting the proper cache flushes when the database commits demand them. It does mean that everyone with a non-volatile battery-backed cache, typically via a RAID card, needs to turn barriers off manually.

I've already warned on this list that PostgreSQL commit performance on ext4 is going to appear really terrible to many people. If you benchmark and don't recognize that ext3 wasn't operating in a reliable mode before, the performance drop now that ext4 is doing the right thing with barriers looks impossibly bad.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
Re: [PERFORM] Weird XFS WAL problem
Craig James wrote:
> This is really hard to believe, because the bonnie++ numbers and dd(1)
> numbers look good (see my original post). But it's totally repeatable.
> It must be some really unfortunate "just missed the next sector going
> by the write head" problem.

Commit performance is a separate number to measure that is not reflected in any benchmark that tests sequential performance. I consider it the fourth axis of disk system performance (seq read, seq write, random IOPS, commit rate), and directly measure it with the sysbench fsync test I recommended already. (You can do it with the right custom pgbench script too.)

You only get one commit per rotation on a drive, which is exactly what you're seeing: a bit under the 120 spins/second @ 7200 RPM. Attempts to time things just right to catch more than one sector per spin are extremely difficult to accomplish; I spent a week on that once without making any good progress. You can easily get 100MB/s on reads and writes but only manage 100 commits/second.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
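[Editor's note: the "one commit per rotation" ceiling Greg describes is simple arithmetic, sketched below. The function name is ours; the model assumes a single spinning disk with no usable write cache.]

```python
def max_commit_rate(rpm):
    """Upper bound on synchronous commits per second for one spinning
    disk with no write cache: at most one WAL flush per platter rotation."""
    return rpm / 60.0  # rotations per second

# 7200 RPM gives a ceiling of 120 commits/sec; Craig's ~82 tps with
# barriers on is comfortably under that bound, consistent with every
# commit waiting for the platter.
print(max_commit_rate(7200))   # 120.0
print(max_commit_rate(15000))  # 250.0
```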
Re: [PERFORM] Weird XFS WAL problem
Greg Smith g...@2ndquadrant.com wrote:
> Kevin Grittner wrote:
>> I've seen this, too (with xfs). Our RAID controller, in spite of
>> having BBU cache configured for writeback, waits for actual
>> persistence on disk for write barriers (unlike for fsync). This
>> does strike me as surprising to the point of bordering on
>> qualifying as a bug.
>
> Completely intentional, and documented at
> http://xfs.org/index.php/XFS_FAQ#Q._Should_barriers_be_enabled_with_storage_which_has_a_persistent_write_cache.3F

Yeah, I read that long ago and I've disabled write barriers because of it; however, it still seems wrong that the RAID controller insists on flushing to the drives in write-back mode. Here are my reasons for wishing it was otherwise:

(1) We've had batteries on our RAID controllers fail occasionally. The controller automatically degrades to write-through, and we get an email from the server and schedule a tech to travel to the site and replace the battery; but until we take action we are exposed to possible database corruption, because barriers don't automatically come on when the controller flips to write-through mode.

(2) It precludes any possibility of moving from fsync techniques to write-barrier techniques for ensuring database integrity. If the OS respected write barriers and the controller considered the write satisfied when it hit BBU cache, write-barrier techniques would work, and checkpoints could be made smoother. Think how nicely that would inter-operate with point (1).

So, while I understand it's Working As Designed, I think the design is surprising and sub-optimal.

-Kevin
Re: [PERFORM] slow query performance
On 6/3/2010 12:47 PM, Anj Adu wrote:
> I can't seem to pinpoint why this query is slow. No full table scans
> are being done. The hash join is taking maximum time. The table
> dev4_act_action has only 3 rows.
>
> [hardware details, settings, and full query quoted in the original post]

Wow, the word wrap on that makes it hard to read... can you paste it here and send us a link?
http://explain.depesz.com
Re: [PERFORM] Weird XFS WAL problem
On Thu, Jun 3, 2010 at 12:40 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
> Yeah, I read that long ago and I've disabled write barriers because
> of it; however, it still seems wrong that the RAID controller insists
> on flushing to the drives in write-back mode. Here are my reasons for
> wishing it was otherwise:

I think it's a case of the quickest, simplest answer to semi-new tech. Not sure what to do with barriers? Just flush the whole cache. I'm guessing that this will get optimized in the future.

BTW, I'll have LSI Megaraid latest and greatest to test on in a month, and older Areca 1680s as well. I'll be updating the firmware on the Arecas, and I'll run some tests on the whole barrier behaviour to see if it's gotten any better lately.
Re: [PERFORM] Weird XFS WAL problem
Scott Marlowe scott.marl...@gmail.com wrote:
> I think it's a case of the quickest, simplest answer to semi-new
> tech. Not sure what to do with barriers? Just flush the whole
> cache. I'm guessing that this will get optimized in the future.

Let's hope so. That reminds me: the write-barrier concept is at least on the horizon as a viable technology; does anyone know if the asynchronous graphs concept in this (one page) paper ever came to anything? (I haven't heard anything about it lately.)

http://www.usenix.org/events/fast05/wips/burnett.pdf

-Kevin
Re: [PERFORM] Weird XFS WAL problem
Scott Marlowe wrote:
> I think it's a case of the quickest, simplest answer to semi-new
> tech. Not sure what to do with barriers? Just flush the whole cache.

Well, that really is the only useful thing you can do with regular SATA drives; the ATA command set isn't any finer grained than that in a way that's useful for this context. And it's also quite reasonable for a RAID controller to respond to that "flush the whole cache" call by flushing its cache. So it's not just the simplest first answer, I believe it's the only answer until a better ATA command set becomes available.

I think this can only be resolved usefully for all of us at the RAID firmware level. If the controller had some logic that said "it's OK to not flush the cache when that call comes in if my battery is working fine", that would make this whole problem go away. I don't expect it's possible to work around the exact set of concerns Kevin listed any other way, because as he pointed out, the right thing to do is very dependent on the battery health, which the OS also doesn't know (again, this would require some new command-set verbiage).

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
Re: [PERFORM] Weird XFS WAL problem
On Thu, Jun 3, 2010 at 1:31 PM, Greg Smith g...@2ndquadrant.com wrote:
> Well, that really is the only useful thing you can do with regular
> SATA drives; the ATA command set isn't any finer grained than that in
> a way that's useful for this context. [...]
>
> I think this can only be resolved usefully for all of us at the RAID
> firmware level. If the controller had some logic that said "it's OK
> to not flush the cache when that call comes in if my battery is
> working fine",

That's what already happens for fsync on a BBU controller, so I don't think the code to do so would be something fancy and new, just a simple change of logic on which code path to take.
Re: [PERFORM] Weird XFS WAL problem
Greg Smith g...@2ndquadrant.com wrote:
> I think this can only be resolved usefully for all of us at the RAID
> firmware level. If the controller had some logic that said "it's OK
> to not flush the cache when that call comes in if my battery is
> working fine", that would make this whole problem go away.

That is exactly what I've been trying to suggest. Sorry for not being more clear about it.

-Kevin
Re: [PERFORM] How to insert a bulk of data with unique-violations very fast
On Thu, Jun 3, 2010 at 11:19 AM, Torsten Zühlsdorff f...@meisterderspiele.de wrote:
> Scott Marlowe schrieb:
>> On Tue, Jun 1, 2010 at 9:03 AM, Torsten Zühlsdorff
>> f...@meisterderspiele.de wrote:
>>> Hello,
>>>
>>> I have a set of unique data of about 150,000,000 rows. Regularly I
>>> get a list of data which contains many times more rows than the
>>> already stored set, often around 2,000,000,000 rows. Within these
>>> rows are many duplicates, and often the set of already stored data.
>>> I want to store just every entry which is not already stored, and I
>>> do not want to store duplicates. Example:
>>
>> The standard method in pgsql is to load the data into a temp table
>> then insert where not exists in old table.
>
> Sorry, I didn't get it. I've googled some examples, but none match my
> case. Every example I found was a single insert which should be done
> or ignored if the row is already stored. But in my case I have a bulk
> of rows with duplicates. Either your tip doesn't match my case or I
> didn't understand it correctly. Can you provide a simple example?

create table main (id int primary key, info text);
create table loader (id int, info text);
insert into main values (1,'abc'),(2,'def'),(3,'ghi');
insert into loader values (1,'abc'),(4,'xyz');

select * from main;
 id | info
----+------
  1 | abc
  2 | def
  3 | ghi
(3 rows)

select * from loader;
 id | info
----+------
  1 | abc
  4 | xyz
(2 rows)

insert into main select * from loader except select * from main;

select * from main;
 id | info
----+------
  1 | abc
  2 | def
  3 | ghi
  4 | xyz
(4 rows)

Note that for the "where not exists" approach to work, the fields would need to be all the same, or you'd need a more complex query. If the info field here was different you'd get an error and no insert / update. For that case you might want to use "where not in":

insert into main select * from loader where id not in (select id from main);

If you wanted the new rows to update pre-existing rows, then you could run an update first where the ids matched, then the insert where no id matches.
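[Editor's note: the set semantics of Scott's EXCEPT-based load can be mimicked in plain Python, which may make the "bulk with duplicates" case clearer. This is an analogy to the SQL above, not PostgreSQL code; variable names are ours, and rows are compared on all fields, just as EXCEPT does.]

```python
# Rows already stored (unique), as full-row tuples.
main = {(1, 'abc'), (2, 'def'), (3, 'ghi')}

# Incoming bulk feed: contains rows already in main AND internal duplicates.
loader = [(1, 'abc'), (4, 'xyz'), (4, 'xyz')]

# set(loader) - main mirrors "SELECT * FROM loader EXCEPT SELECT * FROM main":
# it drops rows already stored and collapses duplicates within the feed.
new_rows = set(loader) - main

# Mirrors "INSERT INTO main SELECT ...": only genuinely new rows land.
main |= new_rows
print(sorted(main))  # [(1, 'abc'), (2, 'def'), (3, 'ghi'), (4, 'xyz')]
```

One duplicate-laden feed of three rows thus yields exactly one insert, which is the behavior Torsten asked for.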
-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] slow query performance
Link to plan: http://explain.depesz.com/s/kHa

On Thu, Jun 3, 2010 at 11:43 AM, Andy Colson <a...@squeakycode.net> wrote:
> On 6/3/2010 12:47 PM, Anj Adu wrote:
>> I can't seem to pinpoint why this query is slow. No full table scans are being done. The hash join is taking maximum time. The table dev4_act_action has only 3 rows.
>>
>> Box is a 2-CPU quad-core Intel 5430 with 32G RAM. Postgres 8.4.0.
>> work_mem = 1G
>> effective_cache_size = 20G
>> random_page_cost = 1
>> default_statistics_target = 1000
>>
>> The largest table in the inner query is dev4_act_dy_fact, which is partitioned into 3 partitions per month. Each partition has about 25 million rows. The rest of the tables are very small (100-1000 rows).
>>
>> explain analyze
>> select ipconvert(srctest_num),
>>        CASE targetpt::character varying WHEN NULL::text THEN serv.targetsrv ELSE targetpt::character varying END AS targetsrv,
>>        sesstype, hits as cons, bytes, srcz.srcarea as srcz, dstz.dstarea as dstz
>> from (
>>     select srctest_num, targetpt, targetsrv_id, sesstype_id,
>>            sum(total) as hits,
>>            sum(bin) + sum(bout) as bts,
>>            sourcearea_id, destinationarea_id
>>     from dev4_act_dy_fact a, dev4_act_action act
>>     where thedate between '2010-05-22' and '2010-05-22'
>>       and a.action_id = act.action_id
>>       and action in ('rejected','sess_rejected')
>>       and guardid_id in (select guardid_id from dev4_act_guardid where guardid like 'cust00%')
>>       and node_id = (select node_id from dev4_act_node where node='10.90.100.2')
>>     group by srctest_num, targetpt, targetsrv_id, sesstype_id, sourcearea_id, destinationarea_id
>>     order by (sum(bin) + sum(bout)) desc
>>     limit 1000
>> ) a
>> left outer join dev4_act_dstarea dstz on a.destinationarea_id = dstz.dstarea_id
>> left outer join dev4_act_srcarea srcz on a.sourcearea_id = srcz.srcarea_id
>> left outer join dev4_act_targetsrv serv on a.targetsrv_id = serv.targetsrv_id
>> left outer join dev4_sesstype proto on a.sesstype_id = proto.sesstype_id
>> order by bytes desc
>
> Wow, the word wrap on that makes it hard to read... can you paste it here and send us a link?
http://explain.depesz.com
Re: [PERFORM] How to insert a bulk of data with unique-violations very fast
2010/6/1 Torsten Zühlsdorff <f...@meisterderspiele.de>:
> Hello,
> I have a set of unique data of about 150,000,000 rows. Regularly I get a list of data containing many times more rows than the already-stored set — often around 2,000,000,000 rows. Within these rows are many duplicates, and often the whole already-stored set. I want to store every entry that is not already stored, and no duplicates. Example:
>
> Already stored set: a,b,c
> Given set: a,b,a,c,d,a,c,d,b
> Expected set after import: a,b,c,d
>
> I am now looking for a faster way to do the import. At the moment I import the new data with COPY into a table 'import', then I remove the duplicates and insert every row which is not already known; after that, 'import' is truncated. Is there a faster way? Should I just insert every row and ignore it if the unique constraint fails?
>
> Here is the simplified table schema; in real life it is partitioned:
>
> test=# \d urls
>                              Tabelle »public.urls«
>  Spalte |   Typ   | Attribute
> --------+---------+--------------------------------------------------------
>  url_id | integer | not null default nextval('urls_url_id_seq'::regclass)
>  url    | text    | not null
> Indexe:
>     »urls_url« UNIQUE, btree (url)
>     »urls_url_id« btree (url_id)
>
> Thanks for every hint or advice! :)

I think you need to have a look at pgloader. It does COPY with error handling. Very effective. http://pgloader.projects.postgresql.org/

> Greetings from Germany,
> Torsten
> --
> http://www.dddbl.de - a database layer that abstracts work with 8 different database systems, separates queries from applications, and can automatically evaluate query results.
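Torsten's staging workflow (bulk-load the raw batch into an import table, insert only the unknown rows, then truncate the staging table) can be sketched end-to-end. This hypothetical demo uses SQLite via Python so it is self-contained; in PostgreSQL the bulk load would be COPY and the final step TRUNCATE:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Simplified version of the urls schema from the message.
cur.execute(
    "CREATE TABLE urls (url_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " url TEXT NOT NULL UNIQUE)"
)
cur.execute("CREATE TABLE import (url TEXT)")

# Already-stored set: a, b, c.
cur.executemany("INSERT INTO urls (url) VALUES (?)", [("a",), ("b",), ("c",)])

# Step 1: bulk-load the raw batch, duplicates and all (COPY in PostgreSQL).
batch = ["a", "b", "a", "c", "d", "a", "c", "d", "b"]
cur.executemany("INSERT INTO import (url) VALUES (?)", [(u,) for u in batch])

# Step 2: one pass inserts each distinct unknown value exactly once,
# so the UNIQUE constraint on urls.url is never violated.
cur.execute(
    """
    INSERT INTO urls (url)
    SELECT DISTINCT url FROM import
    WHERE url NOT IN (SELECT url FROM urls)
    """
)

# Step 3: clear the staging table (TRUNCATE in PostgreSQL).
cur.execute("DELETE FROM import")

print([r[0] for r in cur.execute("SELECT url FROM urls ORDER BY url")])
# ['a', 'b', 'c', 'd']
```

On modern PostgreSQL (9.5 and later), INSERT ... ON CONFLICT DO NOTHING handles this case directly, but that was not available at the time of this thread, which is why staging-table tricks and tools like pgloader were the usual answers.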
--
Cédric Villemain
2ndQuadrant
http://2ndQuadrant.fr/
PostgreSQL : Expertise, Formation et Support
[PERFORM] slow query
I am reposting as my original query was mangled. The link to the explain plan is here, as it does not paste well into the email body:

http://explain.depesz.com/s/kHa

The machine is a 2-CPU quad-core 5430 with 32G RAM and a 6x450G 15K single RAID-10 array.
work_mem = 1G
default_statistics_target = 1000
random_page_cost = 1

I am curious why the hash join takes so long. The main table dev4_act_dy_fact_2010_05_t has 25 million rows. The table is partitioned into 3 parts per month. The remaining tables are very small (< 1000 rows).
[PERFORM] Performance tuning for postgres
Hi,

I am performing DB insertion and update for 3000+ records, and while doing so CPU utilization goes to 100%, with 67% of CPU used by postgres. I have also done optimization on the queries. Is there any way to reduce the CPU utilization of postgres? I am currently using postgres version 8.3.

Help will be appreciated.

Regards,
Yogesh Naik