Re: [PERFORM] View containing a recursive function

2016-02-01 Thread Tom Lane
Mathieu De Zutter  writes:
> I have a recursive part in my database logic that I want to isolate and
> reuse as a view. I found a blog post that explained how to move a function
> parameter into a view. The SQL is attached.
> When I write a query based on that view with a fixed value (or values) for
> the (input) parameter, the planner does fine and only evaluates the
> function once.
> However, when the value of the parameter should be deduced from something
> else, the planner doesn't understand that and will evaluate the function
> for each possible value.

I do not think this has anything to do with whether the query inside the
function is recursive.  Rather, the problem is that the view has a
set-returning function in its targetlist, which prevents the view from
being flattened into the outer query, per this bit in is_simple_subquery():

/*
 * Don't pull up a subquery that has any set-returning functions in its
 * targetlist.  Otherwise we might well wind up inserting set-returning
 * functions into places where they mustn't go, such as quals of higher
 * queries.  This also ensures deletion of an empty jointree is valid.
 */
if (expression_returns_set((Node *) subquery->targetList))
    return false;

Lack of flattening disables a lot of join optimizations, including the
one you want.
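
(For reference, the attached view presumably had the set-returning function
right in its SELECT list, something like this -- a guess at the shape, not
the actual attachment:

CREATE VIEW covering_works_r AS
  SELECT
    w.id                    AS work_id,
    fn_covering_works(w.id) AS covering_work_id
  FROM work w;

and it is that targetlist SRF that trips the check above.)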

Assuming you have a reasonably late-model PG, you could rewrite the
view with a lateral function call:

CREATE OR REPLACE VIEW covering_works_r AS
  SELECT
w.id AS work_id,
fn.f AS covering_work_id
  FROM work w, fn_covering_works(w.id) as fn(f);

which puts the SRF into FROM where the planner can deal with it much
better.
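
With that change, a query that derives the parameter from something else --
something along these lines, guessing at your actual query from the plan
further down -- can be joined through the view:

SELECT cw.covering_work_id
FROM covering_works_r cw
JOIN work w ON w.id = cw.work_id
WHERE w.first_release_id = 4249;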

Another problem is that you let the function default to being VOLATILE,
which would have disabled view flattening even if the SRF in the
targetlist hadn't.  I see no reason for this function not to be marked
STABLE.
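
For illustration, a minimal sketch of what the function could look like with
that marking -- the recursive query and column names are guessed from the
plan below; your actual definition is in the attachment:

CREATE OR REPLACE FUNCTION fn_covering_works(integer)
  RETURNS SETOF integer
  LANGUAGE sql STABLE
AS $$
  WITH RECURSIVE func(work_id) AS (
    SELECT $1
    UNION ALL
    SELECT ad.work_id            -- guessed column name on adaptation
    FROM adaptation ad
    JOIN func f ON ad.original_id = f.work_id
  )
  SELECT work_id FROM func;
$$;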

Doing both of those things gives me a plan like this:

 Nested Loop  (cost=448.24..509.53 rows=1131 width=4)
   ->  Nested Loop  (cost=0.31..16.36 rows=1 width=8)
         ->  Index Scan using work_first_release_id_idx on work w  (cost=0.15..8.17 rows=1 width=4)
               Index Cond: (first_release_id = 4249)
         ->  Index Only Scan using work_pkey on work w_1  (cost=0.15..8.17 rows=1 width=4)
               Index Cond: (id = w.id)
   ->  CTE Scan on func  (cost=447.93..470.55 rows=1131 width=0)
         CTE func
           ->  Recursive Union  (cost=0.00..447.93 rows=1131 width=4)
                 ->  Result  (cost=0.00..0.01 rows=1 width=0)
                 ->  Hash Join  (cost=0.33..42.53 rows=113 width=4)
                       Hash Cond: (ad.original_id = f.work_id)
                       ->  Seq Scan on adaptation ad  (cost=0.00..32.60 rows=2260 width=8)
                       ->  Hash  (cost=0.20..0.20 rows=10 width=4)
                             ->  WorkTable Scan on func f  (cost=0.00..0.20 rows=10 width=4)

which looks hairier, but that's because the function has been inlined,
which is usually what you want for a SQL-language function.  The join
is happening the way you want.

regards, tom lane




Re: [PERFORM] Hash join gets slower as work_mem increases?

2016-02-01 Thread Albe Laurenz
Tomas Vondra wrote:
> Yes, that's clearly the culprit here. In both cases we estimate there are
> only ~4000 tuples in the hash, and 9.3 sizes the hash table to have at
> most ~10 tuples per bucket (in a linked list).
> 
> However we actually get ~3M rows, so there will be ~3000 tuples per
> bucket, and that's extremely expensive to walk. The reason why 100MB is
> faster is that it's using 2 batches, thus making the lists "just" ~1500
> tuples long.
> 
> This is pretty much exactly the reason why I reworked hash joins in 9.5.
> I'd bet it's going to be ~20x faster on that version.

Thank you for the explanation!

Yours,
Laurenz Albe



Re: [PERFORM] Hash join gets slower as work_mem increases?

2016-02-01 Thread Albe Laurenz
Tomas Vondra wrote:
> On 01/29/2016 04:17 PM, Albe Laurenz wrote:
>> I have a query that runs *slower* if I increase work_mem.
>>
>> The execution plans are identical in both cases, except that a temp file
>> is used when work_mem is smaller.

>> What could be an explanation for this?
>> Is this known behaviour?
> 
> There are a number of possible causes for such behavior, but it's quite
> impossible to say whether this is an example of one of them, as you have not
> posted the interesting parts of the explain plan. Also, knowing the
> PostgreSQL version would be useful.
> 
> I don't think the example you posted is due to exceeding the on-CPU cache,
> as that's just a few MBs per socket, so even the smaller work_mem setting is
> significantly larger than that.
> 
> What I'd expect to be the issue here is an underestimate of the hash table
> size, resulting in too few buckets and thus long chains of tuples that
> need to be searched sequentially. Smaller work_mem values usually limit
> the length of those chains in favor of batching.
> 
> Please post the whole explain plan - especially the info about the number
> of buckets/batches and the Hash node details.

Thanks for looking at this.
Sorry, I forgot to mention that this is PostgreSQL 9.3.10.

I didn't post the whole plan since it is awfully long; here are hyperlinks
to the full plans instead.

work_mem = '100MB' (http://explain.depesz.com/s/7b6a):

->  Hash Join  (cost=46738.74..285400.61 rows=292 width=8) (actual time=4296.986..106087.683 rows=187222 loops=1)
      Hash Cond: ("*SELECT* 1_2".postadresse_id = p.postadresse_id)
      Buffers: shared hit=1181177 dirtied=1, temp read=7232 written=7230
[...]
      ->  Hash  (cost=18044.92..18044.92 rows=4014 width=8) (actual time=4206.892..4206.892 rows=3096362 loops=1)
            Buckets: 1024  Batches: 2 (originally 1)  Memory Usage: 102401kB
            Buffers: shared hit=1134522 dirtied=1, temp written=5296

work_mem = '500MB' (http://explain.depesz.com/s/Cgkl):

->  Hash Join  (cost=46738.74..285400.61 rows=292 width=8) (actual time=3802.849..245970.049 rows=187222 loops=1)
      Hash Cond: ("*SELECT* 1_2".postadresse_id = p.postadresse_id)
      Buffers: shared hit=1181175 dirtied=111
[...]
      ->  Hash  (cost=18044.92..18044.92 rows=4014 width=8) (actual time=3709.584..3709.584 rows=3096360 loops=1)
            Buckets: 1024  Batches: 1  Memory Usage: 120952kB
            Buffers: shared hit=1134520 dirtied=111

Does that support your theory?

There is clearly an underestimate here, caused by correlated attributes, but
is that the cause for the bad performance with increased work_mem?
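
As an aside, the kind of underestimate that correlated attributes produce is
easy to reproduce with a toy table (made up purely for illustration,
unrelated to the actual query):

CREATE TABLE corr_demo AS
  SELECT i % 100 AS a, i % 100 AS b
  FROM generate_series(1, 100000) AS i;
ANALYZE corr_demo;
EXPLAIN ANALYZE SELECT * FROM corr_demo WHERE a = 1 AND b = 1;

The planner treats a and b as independent and estimates about 10 rows, while
1000 rows actually match, since a and b are always equal here.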

Yours,
Laurenz Albe



Re: [PERFORM] Hash join gets slower as work_mem increases?

2016-02-01 Thread Tomas Vondra

On 02/01/2016 10:38 AM, Albe Laurenz wrote:

> Tomas Vondra wrote:
> ...
>
> I didn't post the whole plan since it is awfully long; here are hyperlinks
> to the full plans instead.
>
> work_mem = '100MB' (http://explain.depesz.com/s/7b6a):
>
> ->  Hash Join  (cost=46738.74..285400.61 rows=292 width=8) (actual time=4296.986..106087.683 rows=187222 loops=1)
>       Hash Cond: ("*SELECT* 1_2".postadresse_id = p.postadresse_id)
>       Buffers: shared hit=1181177 dirtied=1, temp read=7232 written=7230
> [...]
>       ->  Hash  (cost=18044.92..18044.92 rows=4014 width=8) (actual time=4206.892..4206.892 rows=3096362 loops=1)
>             Buckets: 1024  Batches: 2 (originally 1)  Memory Usage: 102401kB
>             Buffers: shared hit=1134522 dirtied=1, temp written=5296
>
> work_mem = '500MB' (http://explain.depesz.com/s/Cgkl):
>
> ->  Hash Join  (cost=46738.74..285400.61 rows=292 width=8) (actual time=3802.849..245970.049 rows=187222 loops=1)
>       Hash Cond: ("*SELECT* 1_2".postadresse_id = p.postadresse_id)
>       Buffers: shared hit=1181175 dirtied=111
> [...]
>       ->  Hash  (cost=18044.92..18044.92 rows=4014 width=8) (actual time=3709.584..3709.584 rows=3096360 loops=1)
>             Buckets: 1024  Batches: 1  Memory Usage: 120952kB
>             Buffers: shared hit=1134520 dirtied=111
>
> Does that support your theory?
>
> There is clearly an underestimate here, caused by correlated attributes, but
> is that the cause for the bad performance with increased work_mem?


Yes, that's clearly the culprit here. In both cases we estimate there are 
only ~4000 tuples in the hash, and 9.3 sizes the hash table to have at 
most ~10 tuples per bucket (in a linked list).


However we actually get ~3M rows, so there will be ~3000 tuples per 
bucket, and that's extremely expensive to walk. The reason why 100MB is 
faster is that it's using 2 batches, thus making the lists "just" ~1500 
tuples long.
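
To put numbers on that, using the row counts from the Hash nodes in the
plans above (nothing more than integer division):

SELECT 3096360 / 1024     AS tuples_per_bucket_1_batch,
       3096362 / 2 / 1024 AS tuples_per_bucket_2_batches;

That's roughly 3023 tuples to walk per bucket with a single batch versus
roughly 1511 with two batches.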


This is pretty much exactly the reason why I reworked hash joins in 9.5. 
I'd bet it's going to be ~20x faster on that version.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

