Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tory M Blue
On Wed, Nov 16, 2011 at 9:19 PM, Josh Berkus wrote: > Tory, > > A seq scan across 83GB in 4 minutes is pretty good.   That's over > 300MB/s.  Even if you assume that 1/3 of the table was already cached, > that's still over 240mb/s.  Good disk array. > > Either you need an index, or you need to not

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tory M Blue
On Wed, Nov 16, 2011 at 7:47 PM, Tomas Vondra wrote: > On 17 Listopad 2011, 4:16, Tory M Blue wrote: >> On Wed, Nov 16, 2011 at 7:02 PM, Scott Marlowe >> wrote: >>> On Wed, Nov 16, 2011 at 7:42 PM, Tory M Blue wrote: On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra wrote: > On 17 Listopad

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Josh Berkus
Tory, A seq scan across 83GB in 4 minutes is pretty good. That's over 300MB/s. Even if you assume that 1/3 of the table was already cached, that's still over 240mb/s. Good disk array. Either you need an index, or you need to not do this query at user request time. Or a LOT more RAM. -- Jos

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tomas Vondra
On 17 Listopad 2011, 4:16, Tory M Blue wrote: > On Wed, Nov 16, 2011 at 7:02 PM, Scott Marlowe > wrote: >> On Wed, Nov 16, 2011 at 7:42 PM, Tory M Blue wrote: >>> On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra wrote: On 17 Listopad 2011, 2:57, Scott Marlowe wrote: > On Wed, Nov 16, 2011

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tory M Blue
On Wed, Nov 16, 2011 at 7:02 PM, Scott Marlowe wrote: > On Wed, Nov 16, 2011 at 7:42 PM, Tory M Blue wrote: >> On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra wrote: >>> On 17 Listopad 2011, 2:57, Scott Marlowe wrote: On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra wrote: > But you're

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Scott Marlowe
On Wed, Nov 16, 2011 at 8:02 PM, Scott Marlowe wrote: > On Wed, Nov 16, 2011 at 7:42 PM, Tory M Blue wrote: >> On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra wrote: >>> On 17 Listopad 2011, 2:57, Scott Marlowe wrote: On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra wrote: > But you're

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Scott Marlowe
On Wed, Nov 16, 2011 at 7:42 PM, Tory M Blue wrote: > On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra wrote: >> On 17 Listopad 2011, 2:57, Scott Marlowe wrote: >>> On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra wrote: >>> But you're right - you're not bound by I/O (although I don't know what >

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tory M Blue
On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra wrote: > On 17 Listopad 2011, 2:57, Scott Marlowe wrote: >> On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra wrote: >> >>> But you're right - you're not bound by I/O (although I don't know what >>> are >>> those 15% - iowait, util or what?). The COUNT(DI

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tomas Vondra
On 17 Listopad 2011, 2:57, Scott Marlowe wrote: > On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra wrote: > >> But you're right - you're not bound by I/O (although I don't know what >> are >> those 15% - iowait, util or what?). The COUNT(DISTINCT) has to actually >> keep all the distinct values to de

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Andy Colson
On 11/16/2011 04:53 PM, Tory M Blue wrote: Linux F12 64bit Postgres 8.4.4 16 proc / 32GB 8 disk 15KRPM SAS/Raid 5 (I know!) shared_buffers = 6000MB #temp_buffers = 8MB max_prepared_transactions = 0 work_mem = 250MB

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Scott Marlowe
On Wed, Nov 16, 2011 at 4:52 PM, Samuel Gendler wrote: > Could you elaborate on this a bit, or point me at some docs?  I manage a > 600GB db which does almost nothing but aggregates on partitioned tables - > the largest of which has approx 600 million rows across all partitions. >  grouping in the

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Scott Marlowe
On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra wrote: > But you're right - you're not bound by I/O (although I don't know what are > those 15% - iowait, util or what?). The COUNT(DISTINCT) has to actually > keep all the distinct values to determine which are actually distinct. Actually I meant to

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tory M Blue
Thanks all, I misspoke on our use of the index. We do have an index on log_date and it is being used here is the explain analyze plan. 'Aggregate (cost=7266186.16..7266186.17 rows=1 width=8) (actual time=127575.030..127575.030 rows=1 loops=1)' ' -> Bitmap Heap Scan on userstats (cost=13518

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tomas Vondra
Hi. On 16 Listopad 2011, 23:53, Tory M Blue wrote: > > We now have about 180mill records in that table. The database size is > about 580GB and the userstats table which is the biggest one and the > one we query the most is 83GB. > > Just a basic query takes 4 minutes: > > For e.g. select count(dis

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Samuel Gendler
On Wed, Nov 16, 2011 at 3:32 PM, Scott Marlowe wrote: > > If the OP's considering partitioning, they should really consider > upgrading to 9.1 which has much better performance of things like > aggregates against partition tables. > > Could you elaborate on this a bit, or point me at some docs? I

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Scott Marlowe
On Wed, Nov 16, 2011 at 4:27 PM, Alan Hodgson wrote: > On November 16, 2011 02:53:17 PM Tory M Blue wrote: >> We now have about 180mill records in that table. The database size is >> about 580GB and the userstats table which is the biggest one and the >> one we query the most is 83GB. >> >> Just a

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Alan Hodgson
On November 16, 2011 02:53:17 PM Tory M Blue wrote: > We now have about 180mill records in that table. The database size is > about 580GB and the userstats table which is the biggest one and the > one we query the most is 83GB. > > Just a basic query takes 4 minutes: > > For e.g. select count(dis

[PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tory M Blue
Linux F12 64bit Postgres 8.4.4 16 proc / 32GB 8 disk 15KRPM SAS/Raid 5 (I know!) shared_buffers = 6000MB #temp_buffers = 8MB max_prepared_transactions = 0 work_mem = 250MB maintenance_work_mem = 1000MB

Re: [PERFORM] Slow queries / commits, mis-configuration or hardware issues?

2011-11-16 Thread Tomas Vondra
On 16 Listopad 2011, 18:31, Cody Caughlan wrote: > > On Nov 16, 2011, at 8:52 AM, Tomas Vondra wrote: > >> On 16 Listopad 2011, 2:21, Cody Caughlan wrote: >>> How did you build your RAID array? Maybe I have a fundamental flaw / >>> misconfiguration. I am doing it via: >>> >>> $ yes | mdadm --create

Re: [PERFORM] Slow queries / commits, mis-configuration or hardware issues?

2011-11-16 Thread Cody Caughlan
On Nov 16, 2011, at 8:52 AM, Tomas Vondra wrote: > On 16 Listopad 2011, 2:21, Cody Caughlan wrote: >> How did you build your RAID array? Maybe I have a fundamental flaw / >> misconfiguration. I am doing it via: >> >> $ yes | mdadm --create /dev/md0 --level=10 -c256 --raid-devices=4 >> /dev/xvdb

Re: [PERFORM] Slow queries / commits, mis-configuration or hardware issues?

2011-11-16 Thread Tomas Vondra
On 16 Listopad 2011, 2:21, Cody Caughlan wrote: > How did you build your RAID array? Maybe I have a fundamental flaw / > misconfiguration. I am doing it via: > > $ yes | mdadm --create /dev/md0 --level=10 -c256 --raid-devices=4 > /dev/xvdb /dev/xvdc /dev/xvdd /dev/xvde > $ pvcreate /dev/md0 > $ vgc

Re: [PERFORM] Slow queries / commits, mis-configuration or hardware issues?

2011-11-16 Thread Tomas Vondra
On 16 Listopad 2011, 5:27, Greg Smith wrote: > On 11/14/2011 01:16 PM, Cody Caughlan wrote: >> We're starting to see some slow queries, especially COMMITs that are >> happening more frequently. The slow queries are against seemingly >> well-indexed tables. >> Slow commits like: >> >> 2011-11-14 17:

Re: [PERFORM] Subquery in a JOIN not getting restricted?

2011-11-16 Thread Jay Levitt
Tom Lane wrote: Jay Levitt writes: If the query was more like select questions.id from questions join ( select sum(u.id) from users as u group by u.id ) as s on s.id = questions.user_id where questions.id = 1; would you no longer be surprised that it scanned all user rows? I