On 23 May 2018 at 21:31, Heikki Linnakangas <hlinn...@iki.fi> wrote:
> I've been working on a patch to add a little cache to SubPlans, to speed up
> queries with correlated subqueries, where the same subquery is currently
> executed multiple times with the same parameters. The idea is to cache the
> result of the subplan, with the correlation vars as the cache key.

Hi,

This seems like an interesting area to make improvements, so I've
signed up to review the patch.

>From looking at the code I see that the caching is being done inside
nodeSubplan.c. I don't think this is the right approach to the
problem. The problem exists for any parameterized path, so I think a
more general approach would be much better.

We already have Materialize nodes to cache the results of an entire
subplan, and this seems to have quite a bit in common with that, only
we'd want to cache multiple results with a key to determine which
result set should be returned. Due to the similarities with
Materialize, I think that the cache should be a node itself and not
bury the cache logic in some other node type that's meant for some
other purpose.

"LazyMaterialize" seems like a good option for a name. It seems better
than "LazyHash" since you may not want to restrict it to a hash table
based cache in the future.  A binary search tree may be a good option
for types that cannot be hashed.

Materialize nodes are injected above the inner side node of MergeJoins
based on cost, so I think this node type could just do the same. Maybe
something like estimate_num_groups(<exprs being compared to params>) /
path->rows is below some defined constant, perhaps something like 0.5.
Although experimentation would be required. It might be good to take
into account some other cost factors too.

I imagine we'd want to only allow this optimisation for hashjoinable
types. This seems pretty natural since your cache implementation is a
hash table, so, of course, we're going to need a hash function.

Wondering your thoughts on this idea.

I'll mark as waiting on author in the meantime.

It's great to see someone working on this.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply via email to