Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

Tom Lane Tue, 14 Jul 2015 15:05:34 -0700

Robert Haas <[email protected]> writes:
> Both you and Andres have articulated the concern that CustomScan isn't
> actually useful, but I still don't really understand why not.


> I'm curious, for example, whether CustomScan would have been
> sufficient to build TABLESAMPLE, and if not, why not.  Obviously the
> syntax has to be in core,

... so you just made the point ...

> but why couldn't the syntax just call an
> extension-provided callback that returns a custom scan, instead of
> having a node just for TABLESAMPLE?

Because that only works for small values of "work".  As far as TABLESAMPLE
goes, I intend to hold Simon's feet to the fire until there's a less
cheesy answer to the problem of scan reproducibility.  Assuming we're
going to allow sample methods that can't meet the reproducibility
requirement, we need something to prevent them from producing visibly
broken query results.  Ideally, the planner would avoid putting such a
scan on the inside of a nestloop.  A CustomScan-based implementation could
not possibly arrange such a thing; we'd have to teach the core planner
about the concern.
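
To make the nestloop concern concrete, here is a minimal sketch, not
PostgreSQL code.  It assumes a hypothetical SampleScan struct and a toy
random-number generator, and simulates a nestloop that rescans the
sampled inner relation once per outer row.  A method that restarts from
its seed on rescan returns the same sample every time; one that does not
hands each outer row a different inner relation.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INNER_ROWS 8

/* Hypothetical sample-scan state; none of these names exist in PostgreSQL. */
typedef struct SampleScan
{
    uint32_t seed;        /* seed chosen when the scan starts */
    uint32_t state;       /* current generator state */
    bool     repeatable;  /* does a rescan restart from the seed? */
} SampleScan;

/* Toy linear congruential generator, used only to keep the demo deterministic. */
static uint32_t
demo_next_random(uint32_t *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fff;
}

static void
sample_begin(SampleScan *scan, uint32_t seed, bool repeatable)
{
    scan->seed = seed;
    scan->state = seed;
    scan->repeatable = repeatable;
}

/* A repeatable method resets its state on rescan; a non-repeatable one does not. */
static void
sample_rescan(SampleScan *scan)
{
    if (scan->repeatable)
        scan->state = scan->seed;
}

/* Scan the inner relation once; bit i of the result is set if row i was sampled. */
static uint32_t
sample_scan_rows(SampleScan *scan)
{
    uint32_t mask = 0;

    for (int i = 0; i < DEMO_INNER_ROWS; i++)
    {
        if (demo_next_random(&scan->state) % 2 == 0)
            mask |= (uint32_t) 1 << i;
    }
    return mask;
}

/*
 * Simulate the inner side of a nestloop: rescan once per outer row and
 * report whether every rescan produced the same set of sampled rows.
 */
static bool
nestloop_inner_is_stable(SampleScan *inner, int outer_rows)
{
    uint32_t first = 0;
    bool     stable = true;

    for (int o = 0; o < outer_rows; o++)
    {
        uint32_t mask;

        sample_rescan(inner);
        mask = sample_scan_rows(inner);
        if (o == 0)
            first = mask;
        else if (mask != first)
            stable = false;
    }
    return stable;
}
```

Calling nestloop_inner_is_stable() on a scan started with repeatable =
true reports a stable inner side; with repeatable = false the sampled
row set drifts between rescans, which is the visibly broken join result
described above.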

Or, taking the example of a GpuScan node, it's essentially impossible
to persuade the planner to delegate any expensive function calculations,
aggregates, etc. to such a node, much less teach it that doing so is cheaper
than doing such things the usual way.  So yeah, KaiGai-san may have a
module that does a few things with a GPU, but it's far from doing all or
even very much of what one would want.

Now, as part of the upper-planner-rewrite business that I keep hoping to
get to when I'm not riding herd on bad patches, we may soon have enough
new infrastructure that that particular problem could be solved.  But
there would just be another problem after that; a likely
example is not having adequate statistics, or sufficiently fine-grained
function cost estimates, to be able to make valid choices once there's
more than one way to do such calculations.  (I'm not really impressed by
"the GPU is *always* faster" approaches.)  Significant improvements of
that sort are going to take core-code changes.
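
As a rough illustration of why "always faster" cannot be assumed, the
sketch below compares two ways of doing the same calculation under a
simple startup-plus-per-row cost model.  The DemoCostModel struct, the
function names, and any numbers fed into it are hypothetical; nothing
here is the planner's actual costing code.

```c
#include <stdbool.h>

/* Hypothetical cost model: a fixed startup charge plus a per-row charge. */
typedef struct DemoCostModel
{
    double startup_cost;   /* e.g. device setup and data transfer */
    double per_row_cost;   /* cost of evaluating the function on one row */
} DemoCostModel;

/* Total estimated cost of processing the given number of rows. */
static double
demo_total_cost(const DemoCostModel *model, double rows)
{
    return model->startup_cost + model->per_row_cost * rows;
}

/*
 * Offloading wins only when its total cost is lower, which depends on
 * the row estimate and on how accurate the per-row costs are.
 */
static bool
demo_offload_is_cheaper(const DemoCostModel *cpu,
                        const DemoCostModel *gpu,
                        double rows)
{
    return demo_total_cost(gpu, rows) < demo_total_cost(cpu, rows);
}
```

With a made-up CPU model of {0.0, 0.0025} and a made-up GPU model of
{1000.0, 0.0005}, the CPU wins at 1000 rows and the GPU wins at ten
million.  The answer flips on the row count and the per-row costs, which
are exactly the estimates that are currently too coarse.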

Even worse, if there do get to be any popular custom-scan extensions,
we'll break them anytime we make any nontrivial planner changes, because
there is no arms-length API there.  A trivial example is that even adding
or changing any fields in struct Path will necessarily break custom scan
providers, because they build Paths for themselves with no interposed API.
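
One way this can go wrong, sketched with a hypothetical DemoPath struct
rather than the real struct Path: suppose core later adds a field that
must be set to a sensible value.  A provider that fills in the struct by
hand knows nothing about the new field, while a constructor function
owned by core could be updated in one place.  All names below are
invented for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a planner path node; not PostgreSQL's struct Path. */
typedef struct DemoPath
{
    double startup_cost;
    double total_cost;
    double rows;
    int    new_field;   /* assumed to be added later; core expects >= 1 */
} DemoPath;

/*
 * An arms-length constructor: core owns this function, so adding a
 * field only requires updating it here.
 */
static DemoPath
demo_make_path(double startup_cost, double total_cost, double rows)
{
    DemoPath path;

    memset(&path, 0, sizeof(path));
    path.startup_cost = startup_cost;
    path.total_cost = total_cost;
    path.rows = rows;
    path.new_field = 1;   /* default supplied by core */
    return path;
}

/*
 * A provider that builds the node by hand.  Its code predates new_field,
 * so the field is left at zero.
 */
static DemoPath
demo_provider_build_by_hand(double rows)
{
    DemoPath path;

    memset(&path, 0, sizeof(path));
    path.startup_cost = 0.0;
    path.total_cost = rows * 0.01;
    path.rows = rows;
    return path;
}

/* The check core would apply once it starts relying on the new field. */
static bool
demo_path_is_valid(const DemoPath *path)
{
    return path->new_field >= 1 && path->total_cost >= path->startup_cost;
}
```

A path from demo_make_path() passes demo_path_is_valid(); one from
demo_provider_build_by_hand() does not, even though the provider's own
code never changed.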

In large part this is the same as my core concern about the TABLESAMPLE
patch: exposing dubiously-designed APIs is soon going to force us to
choose between breaking those APIs and not being able to make changes we
need to make.  In the case of custom scans, I will not be particularly
sad when (not if) we break custom scan providers; but in other cases such
tradeoffs are going to be harder to make.

                        regards, tom lane


