Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-15 Thread Robert Haas
On Wed, Jul 15, 2015 at 2:28 AM, Michael Paquier
 wrote:
> On Wed, Jul 15, 2015 at 10:12 AM, Kouhei Kaigai  wrote:
>> We never guarantee the interface compatibility between major version up.
>> If we add/modify interface on v9.6, it is duty for developer of extensions
>> to follow the new version, even not specific to custom-scan provider.
>> If v9.6 adds support fine-grained function cost estimation, I also have
>> to follow the feature, but it is natural.
>
> Maintaining compatibility across major versions is a best-effort and
> even if we sometimes break things across major versions, and sometimes
> even silently (take the example of 9.3's background worker that do not
> start with 9.4 as long as bgw_notify_pid is not set to 0), the
> approach is usually taken to have APIs stable and convenient able to
> cover a maximum set of cases for a given set of plugins, and this
> serves well in the long term. Now it is true that we cannot assume
> either that the version of a plugin API will be perfect forever and
> will be able to cover all the imagined test cases at first shot, still
> I'd like to think that things are broken only when necessary and with
> good reasons. A set of APIs changed at each major release tends to be
> proof that research lacked in the first version, and would surely
> demotivate its adoption by extension developers.

Maybe.  If those changes are demanded by extension developers who are
trying to do useful stuff with the APIs and finding problems, then
changing things will probably make those extension developers happy.
If the changes are necessitated by core improvements, then we might
get some complaints if those core improvements are perceived to have
small value compared to the API break.  But I think complaints of that
kind are very rare.  The only thing I can think of that falls into
approximately that category is a gripe about the replacement of
relistemp by relpersistence.  But I think that was more motivated by
the way it broke SQL monitoring queries than by the way it broke C
code.  There may be other more recent examples; but that's the only
one I can think of offhand.  I just don't see it as a big issue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-15 Thread Robert Haas
On Tue, Jul 14, 2015 at 6:04 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Both you and Andres have articulated the concern that CustomScan isn't
>> actually useful, but I still don't really understand why not.
>
>> I'm curious, for example, whether CustomScan would have been
>> sufficient to build TABLESAMPLE, and if not, why not.  Obviously the
>> syntax has to be in core,
>
> ... so you just made the point ...

If you're going to set the bar that high, there's no point in anyone
ever trying to add extensibility anywhere, because there will always
be some other place where there isn't enough extensibility someplace
else to do everything that someone might want.  The fact that the
parser isn't extensible doesn't make custom scans useless any more
than the fact that the non-extensibility of WAL-logging makes pg_am
useless.

>> but why couldn't the syntax just call an
>> extension-provided callback that returns a custom scan, instead of
>> having a node just for TABLESAMPLE?
>
> Because that only works for small values of "work".  As far as TABLESAMPLE
> goes, I intend to hold Simon's feet to the fire until there's a less
> cheesy answer to the problem of scan reproducibility.  Assuming we're
> going to allow sample methods that can't meet the reproducibility
> requirement, we need something to prevent them from producing visibly
> broken query results.  Ideally, the planner would avoid putting such a
> scan on the inside of a nestloop.  A CustomScan-based implementation could
> not possibly arrange such a thing; we'd have to teach the core planner
> about the concern.

Well, I think it would be better to change the tablesample methods as
you previously proposed, so that they are actually stable.  But if
that's not possible, then when you (or someone) makes the core planner
able to consider such matters, you could add a CUSTOM_PATH_UNSTABLE
flag that a custom path can use to notify the core system about the
problem.  Then tablesample methods that are unstable can set the flag
and those that are stable are not penalized.  Presumably we'll end up
with such a flag in the tablesample-path anyway, and a separate flag
in every kind of future scan that needs similar consideration.  If we
put it in some piece of infrastructure (like the custom-path stuff)
that is reusable, we could avoid reinventing the wheel every time.

> Or, taking the example of a GpuScan node, it's essentially impossible
> to persuade the planner to delegate any expensive function calculations,
> aggregates, etc to such a node; much less teach it that that way is cheaper
> than doing such things the usual way.  So yeah, KaiGai-san may have a
> module that does a few things with a GPU, but it's far from doing all or
> even very much of what one would want.

It's true that there are things he can't do this way, but without the
custom-scan stuff, he'd be able to do even less.

> Now, as part of the upper-planner-rewrite business that I keep hoping to
> get to when I'm not riding herd on bad patches, it's likely that we might
> have enough new infrastructure soon that that particular problem could
> be solved.  But there would just be another problem after that; a likely
> example is not having adequate statistics, or sufficiently fine-grained
> function cost estimates, to be able to make valid choices once there's
> more than one way to do such calculations.  (I'm not really impressed by
> "the GPU is *always* faster" approaches.)  Significant improvements of
> that sort are going to take core-code changes.

Probably, but giving people the ability to experiment outside core,
even if it's limited, often leads to good things.  And when it
doesn't, it doesn't really cost us anything.

> Even worse, if there do get to be any popular custom-scan extensions,
> we'll break them anytime we make any nontrivial planner changes, because
> there is no arms-length API there.  A trivial example is that even adding
> or changing any fields in struct Path will necessarily break custom scan
> providers, because they build Paths for themselves with no interposed API.

I agree that will happen, but I don't care.  This happens all the
time, and every person other than yourself who has commented on this
issue has said that they'd rather have an API exposed that will later
get whacked around without warning than have no API exposed at all,
not just with respect to custom-paths, but with a whole variety of
other things as well, like sticking PGDLLIMPORT on all of the
variables that back GUCs.  It is really far more tedious to try to
work around the lack of a PGDLLIMPORT marking (which causes a huge
annoyance now) than to adjust the code if and when somebody changes
something about the GUC (which causes a small annoyance later, and
only if there actually are changes).

> In large part this is the same as my core concern about the TABLESAMPLE
> patch: exposing dubiously-designed APIs is soon going to force us to make
> choices between breaking those APIs or not being ab

Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-15 Thread Kouhei Kaigai
> -Original Message-
> From: Michael Paquier [mailto:michael.paqu...@gmail.com]
> Sent: Wednesday, July 15, 2015 3:29 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Tom Lane; Robert Haas; Alvaro Herrera; hlinnaka; Jim Nasby; Kohei KaiGai;
> PgHacker; Simon Riggs
> Subject: Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] 
> Custom
> Plan API)
> 
> On Wed, Jul 15, 2015 at 10:12 AM, Kouhei Kaigai  wrote:
> > We never guarantee the interface compatibility between major version up.
> > If we add/modify interface on v9.6, it is duty for developer of extensions
> > to follow the new version, even not specific to custom-scan provider.
> > If v9.6 adds support fine-grained function cost estimation, I also have
> > to follow the feature, but it is natural.
> 
> Maintaining compatibility across major versions is a best-effort and
> even if we sometimes break things across major versions, and sometimes
> even silently (take the example of 9.3's background worker that do not
> start with 9.4 as long as bgw_notify_pid is not set to 0), the
> approach is usually taken to have APIs stable and convenient able to
> cover a maximum set of cases for a given set of plugins, and this
> serves well in the long term. Now it is true that we cannot assume
> either that the version of a plugin API will be perfect forever and
> will be able to cover all the imagined test cases at first shot, still
> I'd like to think that things are broken only when necessary and with
> good reasons. A set of APIs changed at each major release tends to be
> proof that research lacked in the first version, and would surely
> demotivate its adoption by extension developers.
>
Yep, I also don't want to break the interface compatibility without
something reasonable, and also think following-up new core features
is a good reason to enhance the interface in the future version.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Michael Paquier
On Wed, Jul 15, 2015 at 10:12 AM, Kouhei Kaigai  wrote:
> We never guarantee the interface compatibility between major version up.
> If we add/modify interface on v9.6, it is duty for developer of extensions
> to follow the new version, even not specific to custom-scan provider.
> If v9.6 adds support fine-grained function cost estimation, I also have
> to follow the feature, but it is natural.

Maintaining compatibility across major versions is a best-effort and
even if we sometimes break things across major versions, and sometimes
even silently (take the example of 9.3's background worker that do not
start with 9.4 as long as bgw_notify_pid is not set to 0), the
approach is usually taken to have APIs stable and convenient able to
cover a maximum set of cases for a given set of plugins, and this
serves well in the long term. Now it is true that we cannot assume
either that the version of a plugin API will be perfect forever and
will be able to cover all the imagined test cases at first shot, still
I'd like to think that things are broken only when necessary and with
good reasons. A set of APIs changed at each major release tends to be
proof that research lacked in the first version, and would surely
demotivate its adoption by extension developers.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Kouhei Kaigai
> Or, taking the example of a GpuScan node, it's essentially impossible
> to persuade the planner to delegate any expensive function calculations,
> aggregates, etc to such a node; much less teach it that that way is cheaper
> than doing such things the usual way.  So yeah, KaiGai-san may have a
> module that does a few things with a GPU, but it's far from doing all or
> even very much of what one would want.
>
Why do we need to run all the functions on GPU device? PG-Strom simply
gives up to inject CustomPath if required qualifier contains unsupported
functions or data types, thus, these workloads are executed as usual.
I don't intend 100% coverage by GPU. That's no matter. People who use
massive numerical/mathematical workload will love GPU.

> Now, as part of the upper-planner-rewrite business that I keep hoping to
> get to when I'm not riding herd on bad patches, it's likely that we might
> have enough new infrastructure soon that that particular problem could
> be solved.  But there would just be another problem after that; a likely
> example is not having adequate statistics, or sufficiently fine-grained
> function cost estimates, to be able to make valid choices once there's
> more than one way to do such calculations.  (I'm not really impressed by
> "the GPU is *always* faster" approaches.)  Significant improvements of
> that sort are going to take core-code changes.
>
We never guarantee the interface compatibility between major version up.
If we add/modify interface on v9.6, it is duty for developer of extensions
to follow the new version, even not specific to custom-scan provider.
If v9.6 adds support fine-grained function cost estimation, I also have
to follow the feature, but it is natural.

> Even worse, if there do get to be any popular custom-scan extensions,
> we'll break them anytime we make any nontrivial planner changes, because
> there is no arms-length API there.  A trivial example is that even adding
> or changing any fields in struct Path will necessarily break custom scan
> providers, because they build Paths for themselves with no interposed API.
>
I cannot understand... If Path field gets changed, it is duty of extension
to follow the core change. We usually add/modify API specifications on
major version up. For example, I remember ProcessUtility_hook has been
changed a few times in the last several years.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Kouhei Kaigai
> -Original Message-
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> Sent: Wednesday, July 15, 2015 5:47 AM
> To: Robert Haas
> Cc: Alvaro Herrera; hlinnaka; Kaigai Kouhei(海外 浩平); Michael Paquier; Jim
> Nasby; Kohei KaiGai; PgHacker; Simon Riggs
> Subject: Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] 
> Custom
> Plan API)
> 
> Robert Haas  writes:
> > On Tue, Jul 14, 2015 at 3:07 PM, Alvaro Herrera
> >  wrote:
> >> As a general principle, I think it's a good idea to have a module that's
> >> mostly just a skeleton that guides people into writing something real to
> >> use whatever API is being tested.  It needs to be simple enough that not
> >> much need to be deleted when writing the real thing, and complex enough
> >> to cover the parts that need covering.  If whatever replaces ctidscan is
> >> too complex, it will not serve that purpose.
> >>
> >> My guess is that something whose only purpose is to test the custom scan
> >> interface for coverage purposes can be simpler than this module.
> 
> > See, I actually think the opposite: I think we've been accumulating a
> > reasonable amount of test code that actually serves no really useful
> > purpose and is just cruft.  Stuff like test_shm_mq and test_decoding
> > seem like they actually catches bugs, so I like that, but I think
> > stuff like worker_spi is actually TOO simple to be useful in building
> > anything real, and it provides no useful test coverage, either.  But
> > this is all a matter of opinion, of course, and I'll defer to whatever
> > the consensus is.
> 
> I think this ties into my core unhappiness with the customscan stuff,
> which is that I don't believe it's *possible* to do anything of very
> great interest with it.  I think anything really useful will require
> core code modifications and/or hooks that don't exist now.  So a finger
> exercise like ctidscan, even though it might have some marginal use,
> doesn't do much to alleviate that concern.  It certainly doesn't seem
> like it's a suitable placeholder proving that we aren't breaking any
> actual use cases for the feature.
>
The ctidscan is originally designed to validate the behavior of custom-scan
interface, so it is natural this module is valuable in limited cased.

However, I don't think that anything valuable usually takes core code
enhancement and/or new hooks, because we already have various extensions
around core code that utilizes existing infrastructures (even though its
specifications may be changed on major version up).
At least, custom-scan enables to implement edge-features which are not
easy to merge the core because of various reasons; like dependency to
proprietary library, too experimental features, too large code to review
as minimum valuable portion and so on.

> (BTW, if we care about the use cases this has, such as data recovery from
> partially-corrupt tables, it would make way more sense and take way less
> net new code to just teach TidScan about it.)
>
What I discussed with Hanada-san before was, a custom-scan provider that
replaces a particular relations join by simple scan of materialized-view
transparently. It is probably one other use case. Its design is in my
brain, but time for development is missing piece for me.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Tom Lane
Robert Haas  writes:
> Both you and Andres have articulated the concern that CustomScan isn't
> actually useful, but I still don't really understand why not.

> I'm curious, for example, whether CustomScan would have been
> sufficient to build TABLESAMPLE, and if not, why not.  Obviously the
> syntax has to be in core,

... so you just made the point ...

> but why couldn't the syntax just call an
> extension-provided callback that returns a custom scan, instead of
> having a node just for TABLESAMPLE?

Because that only works for small values of "work".  As far as TABLESAMPLE
goes, I intend to hold Simon's feet to the fire until there's a less
cheesy answer to the problem of scan reproducibility.  Assuming we're
going to allow sample methods that can't meet the reproducibility
requirement, we need something to prevent them from producing visibly
broken query results.  Ideally, the planner would avoid putting such a
scan on the inside of a nestloop.  A CustomScan-based implementation could
not possibly arrange such a thing; we'd have to teach the core planner
about the concern.

Or, taking the example of a GpuScan node, it's essentially impossible
to persuade the planner to delegate any expensive function calculations,
aggregates, etc to such a node; much less teach it that that way is cheaper
than doing such things the usual way.  So yeah, KaiGai-san may have a
module that does a few things with a GPU, but it's far from doing all or
even very much of what one would want.

Now, as part of the upper-planner-rewrite business that I keep hoping to
get to when I'm not riding herd on bad patches, it's likely that we might
have enough new infrastructure soon that that particular problem could
be solved.  But there would just be another problem after that; a likely
example is not having adequate statistics, or sufficiently fine-grained
function cost estimates, to be able to make valid choices once there's
more than one way to do such calculations.  (I'm not really impressed by
"the GPU is *always* faster" approaches.)  Significant improvements of
that sort are going to take core-code changes.

Even worse, if there do get to be any popular custom-scan extensions,
we'll break them anytime we make any nontrivial planner changes, because
there is no arms-length API there.  A trivial example is that even adding
or changing any fields in struct Path will necessarily break custom scan
providers, because they build Paths for themselves with no interposed API.

In large part this is the same as my core concern about the TABLESAMPLE
patch: exposing dubiously-designed APIs is soon going to force us to make
choices between breaking those APIs or not being able to make changes we
need to make.  In the case of custom scans, I will not be particularly
sad when (not if) we break custom scan providers; but in other cases such
tradeoffs are going to be harder to make.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Robert Haas
On Tue, Jul 14, 2015 at 4:47 PM, Tom Lane  wrote:
> I think this ties into my core unhappiness with the customscan stuff,
> which is that I don't believe it's *possible* to do anything of very
> great interest with it.  I think anything really useful will require
> core code modifications and/or hooks that don't exist now.  So a finger
> exercise like ctidscan, even though it might have some marginal use,
> doesn't do much to alleviate that concern.  It certainly doesn't seem
> like it's a suitable placeholder proving that we aren't breaking any
> actual use cases for the feature.

Both you and Andres have articulated the concern that CustomScan isn't
actually useful, but I still don't really understand why not.  KaiGai
has got working code (under a GPL license) that shows that it can be
used for what he calls GpuScan and GpuJoin, and the speedups are
actually pretty cool if you happen to have the right sort of query to
take advantage of it.  That code may be buggy and the license
precludes us using it anyway, but FWICT it does seem to work. I'd be
entirely amenable to improving the infrastructure such that it could
be used for a broader array of purposes, and I'm also amenable to
adding more hooks if we need more hooks to make it useful, but I'm not
clear at all on what you think is missing.

I'm curious, for example, whether CustomScan would have been
sufficient to build TABLESAMPLE, and if not, why not.  Obviously the
syntax has to be in core, but why couldn't the syntax just call an
extension-provided callback that returns a custom scan, instead of
having a node just for TABLESAMPLE?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jul 14, 2015 at 3:07 PM, Alvaro Herrera
>  wrote:
>> As a general principle, I think it's a good idea to have a module that's
>> mostly just a skeleton that guides people into writing something real to
>> use whatever API is being tested.  It needs to be simple enough that not
>> much need to be deleted when writing the real thing, and complex enough
>> to cover the parts that need covering.  If whatever replaces ctidscan is
>> too complex, it will not serve that purpose.
>> 
>> My guess is that something whose only purpose is to test the custom scan
>> interface for coverage purposes can be simpler than this module.

> See, I actually think the opposite: I think we've been accumulating a
> reasonable amount of test code that actually serves no really useful
> purpose and is just cruft.  Stuff like test_shm_mq and test_decoding
> seem like they actually catches bugs, so I like that, but I think
> stuff like worker_spi is actually TOO simple to be useful in building
> anything real, and it provides no useful test coverage, either.  But
> this is all a matter of opinion, of course, and I'll defer to whatever
> the consensus is.

I think this ties into my core unhappiness with the customscan stuff,
which is that I don't believe it's *possible* to do anything of very
great interest with it.  I think anything really useful will require
core code modifications and/or hooks that don't exist now.  So a finger
exercise like ctidscan, even though it might have some marginal use,
doesn't do much to alleviate that concern.  It certainly doesn't seem
like it's a suitable placeholder proving that we aren't breaking any
actual use cases for the feature.

(BTW, if we care about the use cases this has, such as data recovery from
partially-corrupt tables, it would make way more sense and take way less
net new code to just teach TidScan about it.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Robert Haas
On Tue, Jul 14, 2015 at 3:07 PM, Alvaro Herrera
 wrote:
>> We don't have anything that currently tests the Custom Scan interface
>> in the tree.  The question is how important that is, and whether it's
>> worth having what's basically a toy implementation just to demonstrate
>> that the feature can work.  If so, I think ctidscan is as good a toy
>> example as any; in the interest of full disclosure, I was the one who
>> suggested it in the first place.  But I am not entirely sure it's a
>> good idea to saddle ourselves with that maintenance effort.  It would
>> be a lot more interesting if we had an example that figured to be
>> generally useful.
>
> As a general principle, I think it's a good idea to have a module that's
> mostly just a skeleton that guides people into writing something real to
> use whatever API is being tested.  It needs to be simple enough that not
> much need to be deleted when writing the real thing, and complex enough
> to cover the parts that need covering.  If whatever replaces ctidscan is
> too complex, it will not serve that purpose.
>
> My guess is that something whose only purpose is to test the custom scan
> interface for coverage purposes can be simpler than this module.

See, I actually think the opposite: I think we've been accumulating a
reasonable amount of test code that actually serves no really useful
purpose and is just cruft.  Stuff like test_shm_mq and test_decoding
seem like they actually catches bugs, so I like that, but I think
stuff like worker_spi is actually TOO simple to be useful in building
anything real, and it provides no useful test coverage, either.  But
this is all a matter of opinion, of course, and I'll defer to whatever
the consensus is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Alvaro Herrera
Robert Haas wrote:

> We don't have anything that currently tests the Custom Scan interface
> in the tree.  The question is how important that is, and whether it's
> worth having what's basically a toy implementation just to demonstrate
> that the feature can work.  If so, I think ctidscan is as good a toy
> example as any; in the interest of full disclosure, I was the one who
> suggested it in the first place.  But I am not entirely sure it's a
> good idea to saddle ourselves with that maintenance effort.  It would
> be a lot more interesting if we had an example that figured to be
> generally useful.

As a general principle, I think it's a good idea to have a module that's
mostly just a skeleton that guides people into writing something real to
use whatever API is being tested.  It needs to be simple enough that not
much need to be deleted when writing the real thing, and complex enough
to cover the parts that need covering.  If whatever replaces ctidscan is
too complex, it will not serve that purpose.

My guess is that something whose only purpose is to test the custom scan
interface for coverage purposes can be simpler than this module.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Robert Haas
On Fri, Jul 10, 2015 at 8:55 AM, Heikki Linnakangas  wrote:
>> If somebody still needs it, I'll rebase and adjust the patch towards
>> the latest custom-scan interface. However, I cannot be motivated for
>> the feature nobody wants.
>
> Robert, can you weigh in on this? Do we currently have anything in the
> tree that tests the Custom Scan interface? If not, would this be helpful
> for that purpose?

We don't have anything that currently tests the Custom Scan interface
in the tree.  The question is how important that is, and whether it's
worth having what's basically a toy implementation just to demonstrate
that the feature can work.  If so, I think ctidscan is as good a toy
example as any; in the interest of full disclosure, I was the one who
suggested it in the first place.  But I am not entirely sure it's a
good idea to saddle ourselves with that maintenance effort.  It would
be a lot more interesting if we had an example that figured to be
generally useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-10 Thread Heikki Linnakangas
On 07/02/2015 08:21 AM, Kouhei Kaigai wrote:
> Folks,
> 
>> Moved this patch to next CF 2015-02 because of lack of review(ers).
>>
> Do we still need this patch as contrib module?
> It was originally required it as example of custom-scan interface last
> summer, however, here was no strong requirement after that, then, it
> was bumped to v9.6 development cycle.
> 
> If somebody still needs it, I'll rebase and adjust the patch towards
> the latest custom-scan interface. However, I cannot be motivated for
> the feature nobody wants.

Robert, can you weigh in on this? Do we currently have anything in the
tree that tests the Custom Scan interface? If not, would this be helpful
for that purpose?

- Heikki



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-01 Thread Kouhei Kaigai
Folks,

> Moved this patch to next CF 2015-02 because of lack of review(ers).
>
Do we still need this patch as contrib module?
It was originally required it as example of custom-scan interface last
summer, however, here was no strong requirement after that, then, it
was bumped to v9.6 development cycle.

If somebody still needs it, I'll rebase and adjust the patch towards
the latest custom-scan interface. However, I cannot be motivated for
the feature nobody wants.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier
> Sent: Friday, February 13, 2015 4:41 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Jim Nasby; Robert Haas; Simon Riggs; Kohei KaiGai; PgHacker
> Subject: Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] 
> Custom
> Plan API)
> 
> 
> 
> On Tue, Jan 6, 2015 at 10:51 PM, Kouhei Kaigai  wrote:
> 
> 
>   Jim, Thanks for your reviewing the patch.
> 
>   The attached patch is revised one according to your suggestion,
>   and also includes bug fix I could found.
> 
>   * Definitions of TIDOperator was moved to pg_operator.h
> as other operator doing.
>   * Support the case of "ctid (operator) Param" expression.
>   * Add checks if commutator of TID was not found.
>   * Qualifiers gets extracted on plan stage, not path stage.
>   * Adjust cost estimation logic to fit SeqScan manner.
>   * Add some new test cases for regression test.
> 
>   > On 12/24/14, 7:54 PM, Kouhei Kaigai wrote:
>   > >>> On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai
>   > >>> 
>   > >> wrote:
>   > >>>> I'm not certain whether we should have this functionality in
>   > >>>> contrib from the perspective of workload that can help, but its
>   > >>>> major worth is for an example of custom-scan interface.
>   > >>> worker_spi is now in src/test/modules. We may add it there as 
> well,
>   > no?
>   > >>>
>   > >> Hmm, it makes sense for me. Does it also make sense to add a
>   > >> test-case to the core regression test cases?
>   > >>
>   > > The attached patch adds ctidscan module at test/module instead of
> contrib.
>   > > Basic portion is not changed from the previous post, but file
>   > > locations and test cases in regression test are changed.
>   >
>   > First, I'm glad for this. It will be VERY valuable for anyone trying
> to
>   > clean up the end of a majorly bloated table.
>   >
>   > Here's a partial review...
>   >
>   > +++ b/src/test/modules/ctidscan/ctidscan.c
>   >
>   > +/* missing declaration in pg_proc.h */
>   > +#ifndef TIDGreaterOperator
>   > +#define TIDGreaterOperator   2800
>   > ...
>   > If we're calling that out here, should we have a corresponding comment
> in
>   > pg_proc.h, in case these ever get renumbered?
>   >
>   It was a quick hack when I moved this module out of the tree.
>   Yep, I should revert when I submit this patch again.
> 
>   > +CTidQualFromExpr(Node *expr, int varno)
>   > ...
>   > + if (IsCTIDVar(arg1, varno))
>   > + other = arg2;
>   > + else if (IsCTIDVar(arg2, varno))
>   > + other = arg1;
>   > + else
>   > + return NULL;
>   > + if (exprType(other) != TIDOID)
>   > + return NULL;/* should not happen */
>   > + /* The other argument must be a pseudoconstant */
>   > + if (!is_pseudo_constant_clause(other))
>   > + return NULL;
>   >
>   > I think this needs some additional blank lines...
>   >
>   OK. And, I also noticed the coding style around this function is
>   different from other built-in plans, so I redefined the role of
>   this function just to check whether the supplied RestrictInfo is
>   OpExpr that involves TID inequality operator, or not.
> 
>   Expression node shall be extracted when Plan node is created
>   from Path node.
> 
>   > + if (IsCTIDVar(arg1, varno))
>   > + other = arg2;
>   > + else if (IsCTIDVar(arg2, v

Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-02-12 Thread Michael Paquier
On Tue, Jan 6, 2015 at 10:51 PM, Kouhei Kaigai  wrote:

> Jim, Thanks for your reviewing the patch.
>
> The attached patch is revised one according to your suggestion,
> and also includes bug fix I could found.
>
> * Definitions of TIDOperator was moved to pg_operator.h
>   as other operator doing.
> * Support the case of "ctid (operator) Param" expression.
> * Add checks if commutator of TID was not found.
> * Qualifiers gets extracted on plan stage, not path stage.
> * Adjust cost estimation logic to fit SeqScan manner.
> * Add some new test cases for regression test.
>
> > On 12/24/14, 7:54 PM, Kouhei Kaigai wrote:
> > >>> On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai
> > >>> 
> > >> wrote:
> >  I'm not certain whether we should have this functionality in
> >  contrib from the perspective of workload that can help, but its
> >  major worth is for an example of custom-scan interface.
> > >>> worker_spi is now in src/test/modules. We may add it there as well,
> > no?
> > >>>
> > >> Hmm, it makes sense for me. Does it also make sense to add a
> > >> test-case to the core regression test cases?
> > >>
> > > The attached patch adds ctidscan module at test/module instead of
> contrib.
> > > Basic portion is not changed from the previous post, but file
> > > locations and test cases in regression test are changed.
> >
> > First, I'm glad for this. It will be VERY valuable for anyone trying to
> > clean up the end of a majorly bloated table.
> >
> > Here's a partial review...
> >
> > +++ b/src/test/modules/ctidscan/ctidscan.c
> >
> > +/* missing declaration in pg_proc.h */
> > +#ifndef TIDGreaterOperator
> > +#define TIDGreaterOperator   2800
> > ...
> > If we're calling that out here, should we have a corresponding comment in
> > pg_proc.h, in case these ever get renumbered?
> >
> It was a quick hack when I moved this module out of the tree.
> Yep, I should revert when I submit this patch again.
>
> > +CTidQualFromExpr(Node *expr, int varno)
> > ...
> > + if (IsCTIDVar(arg1, varno))
> > + other = arg2;
> > + else if (IsCTIDVar(arg2, varno))
> > + other = arg1;
> > + else
> > + return NULL;
> > + if (exprType(other) != TIDOID)
> > + return NULL;/* should not happen */
> > + /* The other argument must be a pseudoconstant */
> > + if (!is_pseudo_constant_clause(other))
> > + return NULL;
> >
> > I think this needs some additional blank lines...
> >
> OK. And, I also noticed the coding style around this function is
> different from other built-in plans, so I redefined the role of
> this function just to check whether the supplied RestrictInfo is
> OpExpr that involves TID inequality operator, or not.
>
> Expression node shall be extracted when Plan node is created
> from Path node.
>
> > + if (IsCTIDVar(arg1, varno))
> > + other = arg2;
> > + else if (IsCTIDVar(arg2, varno))
> > + other = arg1;
> > + else
> > + return NULL;
> > +
> > + if (exprType(other) != TIDOID)
> > + return NULL;/* should not happen */
> > +
> > + /* The other argument must be a pseudoconstant */
> > + if (!is_pseudo_constant_clause(other))
> > + return NULL;
> >
> > + * CTidEstimateCosts
> > + *
> > + * It estimates cost to scan the target relation according to the given
> > + * restriction clauses. Its logic to scan relations are almost same as
> > + * SeqScan doing, because it uses regular heap_getnext(), except for
> > + * the number of tuples to be scanned if restriction clauses work well.
> >
> > That should read "same as what SeqScan is doing"... however,
> what
> > actual function are you talking about? I couldn't find
> SeqScanEstimateCosts
> > (or anything ending EstimateCosts).
> >
> It is cost_seqscan(). But I don't put a raw function name in the source
> code comments in other portion, because nobody can guarantee it is up to
> date in the future...
>
> > BTW, there's other grammar issues but it'd be best to handle those all at
> > once after all the code stuff is done.
> >
> Yep. Help by native English speaker is very helpful for us.
>
> > + opno = get_commutator(op->opno);
> > What happens if there's no commutator? Perhaps best to Assert(opno !=
> > InvalidOid) at the end of that if block.
> >
> Usually, commutator operator of TID is defined on initdb time, however,
> nobody can guarantee a mad superuser doesn't drop it.
> So, I added elog(ERROR,...) here.
>
>
> > Though, it seems things will fall apart anyway if ctid_quals isn't
> exactly
> > what we're expecting; I don't know if that's OK or not.
> >
> No worry, it was already checked on planning stage.
>
> > + /* disk costs --- assume each tuple on a different page */
> > + run

Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-01-06 Thread Kouhei Kaigai
Jim, Thanks for your reviewing the patch.

The attached patch is revised one according to your suggestion,
and also includes bug fix I could found.

* Definitions of TIDOperator was moved to pg_operator.h
  as other operator doing.
* Support the case of "ctid (operator) Param" expression.
* Add checks if commutator of TID was not found.
* Qualifiers gets extracted on plan stage, not path stage.
* Adjust cost estimation logic to fit SeqScan manner.
* Add some new test cases for regression test.

> On 12/24/14, 7:54 PM, Kouhei Kaigai wrote:
> >>> On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai
> >>> 
> >> wrote:
>  I'm not certain whether we should have this functionality in
>  contrib from the perspective of workload that can help, but its
>  major worth is for an example of custom-scan interface.
> >>> worker_spi is now in src/test/modules. We may add it there as well,
> no?
> >>>
> >> Hmm, it makes sense for me. Does it also make sense to add a
> >> test-case to the core regression test cases?
> >>
> > The attached patch adds ctidscan module at test/module instead of contrib.
> > Basic portion is not changed from the previous post, but file
> > locations and test cases in regression test are changed.
> 
> First, I'm glad for this. It will be VERY valuable for anyone trying to
> clean up the end of a majorly bloated table.
> 
> Here's a partial review...
> 
> +++ b/src/test/modules/ctidscan/ctidscan.c
> 
> +/* missing declaration in pg_proc.h */
> +#ifndef TIDGreaterOperator
> +#define TIDGreaterOperator   2800
> ...
> If we're calling that out here, should we have a corresponding comment in
> pg_proc.h, in case these ever get renumbered?
> 
It was a quick hack when I moved this module out of the tree.
Yep, I should revert when I submit this patch again.

> +CTidQualFromExpr(Node *expr, int varno)
> ...
> + if (IsCTIDVar(arg1, varno))
> + other = arg2;
> + else if (IsCTIDVar(arg2, varno))
> + other = arg1;
> + else
> + return NULL;
> + if (exprType(other) != TIDOID)
> + return NULL;/* should not happen */
> + /* The other argument must be a pseudoconstant */
> + if (!is_pseudo_constant_clause(other))
> + return NULL;
> 
> I think this needs some additional blank lines...
> 
OK. And, I also noticed the coding style around this function is
different from other built-in plans, so I redefined the role of
this function just to check whether the supplied RestrictInfo is
OpExpr that involves TID inequality operator, or not.

Expression node shall be extracted when Plan node is created
from Path node.

> + if (IsCTIDVar(arg1, varno))
> + other = arg2;
> + else if (IsCTIDVar(arg2, varno))
> + other = arg1;
> + else
> + return NULL;
> +
> + if (exprType(other) != TIDOID)
> + return NULL;/* should not happen */
> +
> + /* The other argument must be a pseudoconstant */
> + if (!is_pseudo_constant_clause(other))
> + return NULL;
> 
> + * CTidEstimateCosts
> + *
> + * It estimates cost to scan the target relation according to the given
> + * restriction clauses. Its logic to scan relations are almost same as
> + * SeqScan doing, because it uses regular heap_getnext(), except for
> + * the number of tuples to be scanned if restriction clauses work well.
> 
> That should read "same as what SeqScan is doing"... however, what
> actual function are you talking about? I couldn't find SeqScanEstimateCosts
> (or anything ending EstimateCosts).
> 
It is cost_seqscan(). But I don't put a raw function name in the source
code comments in other portion, because nobody can guarantee it is up to
date in the future...

> BTW, there's other grammar issues but it'd be best to handle those all at
> once after all the code stuff is done.
> 
Yep. Help by native English speaker is very helpful for us.

> + opno = get_commutator(op->opno);
> What happens if there's no commutator? Perhaps best to Assert(opno !=
> InvalidOid) at the end of that if block.
> 
Usually, commutator operator of TID is defined on initdb time, however,
nobody can guarantee a mad superuser doesn't drop it.
So, I added elog(ERROR,...) here.


> Though, it seems things will fall apart anyway if ctid_quals isn't exactly
> what we're expecting; I don't know if that's OK or not.
> 
No worry, it was already checked on planning stage.

> + /* disk costs --- assume each tuple on a different page */
> + run_cost += spc_random_page_cost * ntuples;
> Isn't that extremely pessimistic?
> 
Probably. I follow the manner of SeqScan.

> I'm not familiar enough with the custom-scan stuff to really comment past
> this point, and I could certainly be missing some details about planning
> and ex

Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2014-12-29 Thread Jim Nasby
On 12/24/14, 7:54 PM, Kouhei Kaigai wrote:
>>> On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai 
>> wrote:
 I'm not certain whether we should have this functionality in contrib
 from the perspective of workload that can help, but its major worth
 is for an example of custom-scan interface.
>>> worker_spi is now in src/test/modules. We may add it there as well, no?
>>>
>> Hmm, it makes sense for me. Does it also make sense to add a test-case to
>> the core regression test cases?
>>
> The attached patch adds ctidscan module at test/module instead of contrib.
> Basic portion is not changed from the previous post, but file locations
> and test cases in regression test are changed.

First, I'm glad for this. It will be VERY valuable for anyone trying to clean 
up the end of a majorly bloated table.

Here's a partial review...

+++ b/src/test/modules/ctidscan/ctidscan.c

+/* missing declaration in pg_proc.h */
+#ifndef TIDGreaterOperator
+#define TIDGreaterOperator 2800
...
If we're calling that out here, should we have a corresponding comment in 
pg_proc.h, in case these ever get renumbered?

+CTidQualFromExpr(Node *expr, int varno)
...
+   if (IsCTIDVar(arg1, varno))
+   other = arg2;
+   else if (IsCTIDVar(arg2, varno))
+   other = arg1;
+   else
+   return NULL;
+   if (exprType(other) != TIDOID)
+   return NULL;/* should not happen */
+   /* The other argument must be a pseudoconstant */
+   if (!is_pseudo_constant_clause(other))
+   return NULL;

I think this needs some additional blank lines...

+   if (IsCTIDVar(arg1, varno))
+   other = arg2;
+   else if (IsCTIDVar(arg2, varno))
+   other = arg1;
+   else
+   return NULL;
+
+   if (exprType(other) != TIDOID)
+   return NULL;/* should not happen */
+
+   /* The other argument must be a pseudoconstant */
+   if (!is_pseudo_constant_clause(other))
+   return NULL;

+ * CTidEstimateCosts
+ *
+ * It estimates cost to scan the target relation according to the given
+ * restriction clauses. Its logic to scan relations are almost same as
+ * SeqScan doing, because it uses regular heap_getnext(), except for
+ * the number of tuples to be scanned if restriction clauses work well.

That should read "same as what SeqScan is doing"... however, what 
actual function are you talking about? I couldn't find SeqScanEstimateCosts (or 
anything ending EstimateCosts).

BTW, there's other grammar issues but it'd be best to handle those all at once 
after all the code stuff is done.

+   opno = get_commutator(op->opno);
What happens if there's no commutator? Perhaps best to Assert(opno != 
InvalidOid) at the end of that if block.

Though, it seems things will fall apart anyway if ctid_quals isn't exactly what 
we're expecting; I don't know if that's OK or not.

+   /* disk costs --- assume each tuple on a different page */
+   run_cost += spc_random_page_cost * ntuples;
Isn't that extremely pessimistic?

I'm not familiar enough with the custom-scan stuff to really comment past this 
point, and I could certainly be missing some details about planning and 
execution.

I do have some concerns about the regression test, but perhaps I'm just being 
paranoid:

- The EXPLAIN tests don't cover >. They do cover <= and >= via BETWEEN.
- Nothing tests the case of '...'::tid op ctid; only lvalue cases are tested.

Also, it seems that we don't handle joining on a ctid qual... is that 
intentional? I know that sounds silly, but you'd probably want to do that if 
you're trying to move tuples off the end of a bloated table. You could work 
around it by constructing a dynamic SQL string, but it'd be easier to do 
something like:

UPDATE table1 SET ...
  WHERE ctid >= (SELECT '(' || relpages || ',0)' FROM pg_class WHERE oid = 
'table1'::regclass)
;

in some kind of loop.

Obviously better to only handle what you already are then not get this in at 
all though... :)
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2014-12-24 Thread Kouhei Kaigai
> > On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai 
> wrote:
> > > I'm not certain whether we should have this functionality in contrib
> > > from the perspective of workload that can help, but its major worth
> > > is for an example of custom-scan interface.
> > worker_spi is now in src/test/modules. We may add it there as well, no?
> >
> Hmm, it makes sense for me. Does it also make sense to add a test-case to
> the core regression test cases?
> 
The attached patch adds ctidscan module at test/module instead of contrib.
Basic portion is not changed from the previous post, but file locations
and test cases in regression test are changed.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei 


pgsql-v9.5-test-ctidscan.v2.patch
Description: pgsql-v9.5-test-ctidscan.v2.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2014-12-15 Thread Kouhei Kaigai
> On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai  wrote:
> > I'm not certain whether we should have this functionality in contrib
> > from the perspective of workload that can help, but its major worth is
> > for an example of custom-scan interface.
> worker_spi is now in src/test/modules. We may add it there as well, no?
>
Hmm, it makes sense for me. Does it also make sense to add a test-case to
the core regression test cases?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2014-12-15 Thread Michael Paquier
On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai  wrote:
> I'm not certain whether we should have this functionality in contrib
> from the perspective of workload that can help, but its major worth is
> for an example of custom-scan interface.
worker_spi is now in src/test/modules. We may add it there as well, no?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers