Re: [PERFORM] Hash join gets slower as work_mem increases?
Tomas Vondra wrote:
> Yes, that's clearly the culprit here. In both cases we estimate there are
> only ~4000 tuples in the hash, and 9.3 sizes the hash table to have at
> most ~10 tuples per bucket (in a linked list).
>
> However we actually get ~3M rows, so there will be ~3000 tuples per
> bucket, and that's extremely expensive to walk. The reason why 100MB is
> faster is that it's using 2 batches, thus making the lists "just" ~1500
> tuples long.
>
> This is pretty much exactly the reason why I reworked hash joins in 9.5.
> I'd bet it's going to be ~20x faster on that version.

Thank you for the explanation!

Yours,
Laurenz Albe

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hash join gets slower as work_mem increases?
Tomas Vondra wrote:
> On 01/29/2016 04:17 PM, Albe Laurenz wrote:
>> I have a query that runs *slower* if I increase work_mem.
>>
>> The execution plans are identical in both cases, except that a temp file
>> is used when work_mem is smaller.
>> What could be an explanation for this?
>> Is this known behaviour?
>
> There is a bunch of possible causes for such behavior, but it's quite
> impossible to say if this is an example of one of them, as you have not
> posted the interesting parts of the explain plan. Also, knowing the
> PostgreSQL version would be useful.
>
> I don't think the example you posted is due to exceeding the on-CPU cache,
> as that's just a few MBs per socket, so even the smaller work_mem is
> significantly larger.
>
> What I'd expect to be the issue here is an under-estimate of the hash table
> size, resulting in too few buckets and thus long chains of tuples that
> need to be searched sequentially. Smaller work_mem values usually limit
> the length of those chains in favor of batching.
>
> Please post the whole explain plan - especially the info about the number
> of buckets/batches and the Hash node details.

Thanks for looking at this.

Sorry, I forgot to mention that this is PostgreSQL 9.3.10.

I didn't post the whole plan since it is awfully long; I'll include
hyperlinks to the whole plan instead.

work_mem = '100MB' (http://explain.depesz.com/s/7b6a):

->  Hash Join  (cost=46738.74..285400.61 rows=292 width=8) (actual time=4296.986..106087.683 rows=187222 loops=1)
      Hash Cond: ("*SELECT* 1_2".postadresse_id = p.postadresse_id)
      Buffers: shared hit=1181177 dirtied=1, temp read=7232 written=7230
[...]
      ->  Hash  (cost=18044.92..18044.92 rows=4014 width=8) (actual time=4206.892..4206.892 rows=3096362 loops=1)
            Buckets: 1024  Batches: 2 (originally 1)  Memory Usage: 102401kB
            Buffers: shared hit=1134522 dirtied=1, temp written=5296

work_mem = '500MB' (http://explain.depesz.com/s/Cgkl):

->  Hash Join  (cost=46738.74..285400.61 rows=292 width=8) (actual time=3802.849..245970.049 rows=187222 loops=1)
      Hash Cond: ("*SELECT* 1_2".postadresse_id = p.postadresse_id)
      Buffers: shared hit=1181175 dirtied=111
[...]
      ->  Hash  (cost=18044.92..18044.92 rows=4014 width=8) (actual time=3709.584..3709.584 rows=3096360 loops=1)
            Buckets: 1024  Batches: 1  Memory Usage: 120952kB
            Buffers: shared hit=1134520 dirtied=111

Does that support your theory?

There is clearly an underestimate here, caused by correlated attributes,
but is that the cause of the bad performance with increased work_mem?

Yours,
Laurenz Albe
Re: [PERFORM] Hash join gets slower as work_mem increases?
On 02/01/2016 10:38 AM, Albe Laurenz wrote:
> Tomas Vondra wrote:
> ...
> I didn't post the whole plan since it is awfully long; I'll include
> hyperlinks to the whole plan instead.
>
> work_mem = '100MB' (http://explain.depesz.com/s/7b6a):
>
> ->  Hash Join  (cost=46738.74..285400.61 rows=292 width=8) (actual time=4296.986..106087.683 rows=187222 loops=1)
>       Hash Cond: ("*SELECT* 1_2".postadresse_id = p.postadresse_id)
>       Buffers: shared hit=1181177 dirtied=1, temp read=7232 written=7230
> [...]
>       ->  Hash  (cost=18044.92..18044.92 rows=4014 width=8) (actual time=4206.892..4206.892 rows=3096362 loops=1)
>             Buckets: 1024  Batches: 2 (originally 1)  Memory Usage: 102401kB
>             Buffers: shared hit=1134522 dirtied=1, temp written=5296
>
> work_mem = '500MB' (http://explain.depesz.com/s/Cgkl):
>
> ->  Hash Join  (cost=46738.74..285400.61 rows=292 width=8) (actual time=3802.849..245970.049 rows=187222 loops=1)
>       Hash Cond: ("*SELECT* 1_2".postadresse_id = p.postadresse_id)
>       Buffers: shared hit=1181175 dirtied=111
> [...]
>       ->  Hash  (cost=18044.92..18044.92 rows=4014 width=8) (actual time=3709.584..3709.584 rows=3096360 loops=1)
>             Buckets: 1024  Batches: 1  Memory Usage: 120952kB
>             Buffers: shared hit=1134520 dirtied=111
>
> Does that support your theory?
>
> There is clearly an underestimate here, caused by correlated attributes,
> but is that the cause of the bad performance with increased work_mem?

Yes, that's clearly the culprit here. In both cases we estimate there are
only ~4000 tuples in the hash, and 9.3 sizes the hash table to have at
most ~10 tuples per bucket (in a linked list).

However we actually get ~3M rows, so there will be ~3000 tuples per
bucket, and that's extremely expensive to walk. The reason why 100MB is
faster is that it's using 2 batches, thus making the lists "just" ~1500
tuples long.

This is pretty much exactly the reason why I reworked hash joins in 9.5.
I'd bet it's going to be ~20x faster on that version.
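To make the chain-walking cost concrete, here is a minimal sketch (plain Python, illustrative only, not PostgreSQL internals) of a hash table with chained buckets, sized like the 9.3 plan: ~3M build rows forced into 1024 buckets, so every probe walks a chain of roughly 3000 entries.

```python
# Minimal sketch of a chained hash table (illustrative, not PostgreSQL code).
# With ~3M entries and only 1024 buckets, every probe must walk a ~3000-entry
# chain -- the work that ExecScanHashBucket is doing in the profile.

class ChainedHashTable:
    def __init__(self, nbuckets):
        self.buckets = [[] for _ in range(nbuckets)]

    def insert(self, key, value):
        self.buckets[hash(key) % len(self.buckets)].append((key, value))

    def probe(self, key):
        """Return matching values and the number of tuples compared."""
        chain = self.buckets[hash(key) % len(self.buckets)]
        return [v for k, v in chain if k == key], len(chain)

ht = ChainedHashTable(1024)         # Buckets: 1024, as in the plan
for i in range(3_000_000):          # ~3M build rows instead of the estimated 4014
    ht.insert(i, i)

_, compared = ht.probe(42)
print(compared)                     # each probe walks a chain of ~3000 entries
```

Splitting the build side into two batches, as the 100MB plan does, halves each in-memory chain to ~1500 entries, which fits the roughly 2x speedup of the lower work_mem setting.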
regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
[PERFORM] Hash join gets slower as work_mem increases?
I have a query that runs *slower* if I increase work_mem.

The execution plans are identical in both cases, except that a temp file
is used when work_mem is smaller.

The relevant lines of EXPLAIN ANALYZE output are:

With work_mem='100MB':
->  Hash Join  (cost=46738.74..285400.61 rows=292 width=8) (actual time=4296.986..106087.683 rows=187222 loops=1)
      Hash Cond: ("*SELECT* 1_2".postadresse_id = p.postadresse_id)
      Buffers: shared hit=1181177 dirtied=1, temp read=7232 written=7230

With work_mem='500MB':
->  Hash Join  (cost=46738.74..285400.61 rows=292 width=8) (actual time=3802.849..245970.049 rows=187222 loops=1)
      Hash Cond: ("*SELECT* 1_2".postadresse_id = p.postadresse_id)
      Buffers: shared hit=1181175 dirtied=111

I ran operf on both backends, and they look quite similar, except that the
number of samples is different (this is "opreport -c" output):

CPU: Intel Sandy Bridge microarchitecture, speed 2899.8 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
mask of 0x00 (No unit mask) count 9
samples  %        image name  symbol name
-------------------------------------------------------------------------------
  112      0.0019  postgres  ExecProcNode
  3020116 49.9904  postgres  ExecScanHashBucket
  3021162 50.0077  postgres  ExecHashJoin
3020116  92.8440  postgres  ExecScanHashBucket
  3020116 49.9207  postgres  ExecScanHashBucket [self]
  3020116 49.9207  postgres  ExecScanHashBucket
8190      0.1354  vmlinux   apic_timer_interrupt

What could be an explanation for this?
Is this known behaviour?

Yours,
Laurenz Albe
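Back-of-the-envelope arithmetic (my own rough model, assuming every probe walks the full average chain; real costs depend on the key distribution) shows why ExecScanHashBucket dominates this profile. The row counts come from the Hash node quoted earlier in the thread:

```python
# Rough cost model from the plan numbers (assumed model, not measured):
# every outer-row probe walks the full average bucket chain.
build_rows = 3_096_360   # actual rows in the Hash node (500MB plan)
buckets = 1024           # "Buckets: 1024" in the plan
probes = 187_222         # join output rows, a lower bound on probe count

avg_chain = build_rows / buckets
comparisons = probes * avg_chain
print(round(avg_chain))       # ~3024 tuples per bucket
print(round(comparisons))     # hundreds of millions of tuple comparisons
```

Hundreds of millions of linked-list hops through a cold hash table is exactly the kind of workload that shows up as one hot symbol in a CPU-cycle profile.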
Re: [PERFORM] Hash join gets slower as work_mem increases?
Hi

> I ran operf on both backends, and they look quite similar, except that the
> number of samples is different (this is "opreport -c" output):
>
> CPU: Intel Sandy Bridge microarchitecture, speed 2899.8 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
> mask of 0x00 (No unit mask) count 9
> samples  %        image name  symbol name
> -------------------------------------------------------------------------------
>   112      0.0019  postgres  ExecProcNode
>   3020116 49.9904  postgres  ExecScanHashBucket
>   3021162 50.0077  postgres  ExecHashJoin
> 3020116  92.8440  postgres  ExecScanHashBucket
>   3020116 49.9207  postgres  ExecScanHashBucket [self]
>   3020116 49.9207  postgres  ExecScanHashBucket
> 8190      0.1354  vmlinux   apic_timer_interrupt
>
> What could be an explanation for this?
> Is this known behaviour?

One issue was fixed in 9.5: a large hash table can cause a lot of evictions
from the L1 and L2 caches.

Pavel
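Pavel's cache point can be roughly quantified from the "Memory Usage: 120952kB" line in the 500MB plan (approximate arithmetic; the exact per-tuple overhead varies by version and tuple width):

```python
# Rough footprint arithmetic from "Memory Usage: 120952kB" in the 500MB plan
# (assumed arithmetic, not measured). A ~118 MiB hash table dwarfs the few MB
# of L2/L3 cache, so each hop down a 3000-tuple chain is likely a cache miss.
hash_mem_bytes = 120_952 * 1024
tuples = 3_096_360

print(hash_mem_bytes // tuples)   # ~40 bytes per hashed tuple
print(hash_mem_bytes // 2**20)    # ~118 MiB total hash table
```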