On Thu, Dec 12, 2013 at 6:16 PM, Bryce Covert < br...@brycecovertoperations.com> wrote:
> Thanks a lot for the help. I'm not familiar with explain buffers, but
> here's the results:
>
>  Limit  (cost=0.00..648.71 rows=50 width=8) (actual time=653.681..52328.707 rows=50 loops=1)
>    Buffers: shared hit=7875 read=9870
>    ->  GroupAggregate  (cost=0.00..55672.36 rows=4291 width=8) (actual time=653.671..52328.480 rows=50 loops=1)
>          Filter: ((sum(electricusage.usage) >= 3284::numeric) AND (sum(electricusage.usage) <= 3769::numeric))
>          Buffers: shared hit=7875 read=9870
>          ->  Nested Loop  (cost=0.00..55262.93 rows=34506 width=8) (actual time=432.129..52200.465 rows=30711 loops=1)
>                Buffers: shared hit=7875 read=9870
>                ->  Index Scan using premiseaccount_bucket_58c70392619aa36f on premiseaccount  (cost=0.00..15433.71 rows=4291 width=4) (actual time=338.160..10014.780 rows=3437 loops=1)
>                      Index Cond: ((bucket)::text = '85349_single-family'::text)
>                      Buffers: shared hit=744 read=2692
>                ->  Index Scan using electricusage_premise_account_id on electricusage  (cost=0.00..9.17 rows=9 width=8) (actual time=11.430..12.235 rows=9 loops=3437)
>                      Index Cond: (premise_account_id = premiseaccount.id)
>                      Filter: (from_date >= '2012-11-20 00:00:00+00'::timestamp with time zone)
>                      Buffers: shared hit=7131 read=7178
>  Total runtime: 52329.028 ms
> (15 rows)
>
> and the iostat results...
>
> Linux 3.11.6-x86_64-linode35 (preview-aps-new)  12/12/2013  _x86_64_  (8 CPU)
>
> ...
> Device:  rrqm/s  wrqm/s    r/s   w/s   rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
> xvda       0.10   47.10  152.40  7.80   2.20   0.21     30.92      1.36   8.50     8.03    17.53   6.14  98.33
> xvdb       0.00    0.00    0.00  0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00

This means it's doing random I/O, and that your disk is a single 7200RPM drive (152 r/s is typical for that hardware). You can improve this by either:

1 - turning that random I/O pattern into sequential, or
2 - getting better I/O.
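As a rough sanity check (my arithmetic, not figures from the thread beyond the plan and iostat output above, and assuming each "shared read" buffer is one random 8 kB page fetched from disk):

```sql
-- Back-of-the-envelope latency per random read, two ways:
SELECT 52329.0 / 9870  AS ms_per_read_from_plan,    -- ~5.3 ms (runtime / pages read)
       1000.0 / 152.4  AS ms_per_read_from_iostat;  -- ~6.6 ms (1 second / r/s)
-- Both land in the 5-10 ms seek range of a single 7200 RPM spindle,
-- which is why the estimate above points at one rotating drive.
```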
I'll assume 2 isn't available to you on Linode, so for 1, you could try lowering effective_cache_size substantially: it seems you're not getting nearly as much caching as you think (i.e., 2GB). However, I doubt there's a plan that can get you significantly better performance given your hardware. You may shave a few seconds, though, if you increase work_mem. The planner should probably have used a bitmap index scan for at least one of those index scans, and a low work_mem could be what's limiting its options. What are your settings in that area?
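For reference, you can inspect the current settings and try a bigger work_mem per session before touching postgresql.conf. The values below are hypothetical examples to experiment with, not recommendations from this thread:

```sql
-- Inspect the planner-relevant settings
SHOW effective_cache_size;
SHOW work_mem;
SHOW shared_buffers;

-- Try a larger work_mem for this session only (hypothetical value),
-- then re-run the query and see whether the planner switches to a
-- bitmap index scan:
SET work_mem = '64MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;  -- the original query goes here
```

SET only affects the current session, so it's a safe way to test whether the plan changes before making anything permanent.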