Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-15 Thread Kouhei Kaigai
 -Original Message-
 From: Michael Paquier [mailto:michael.paqu...@gmail.com]
 Sent: Wednesday, July 15, 2015 3:29 PM
 To: Kaigai Kouhei(海外 浩平)
 Cc: Tom Lane; Robert Haas; Alvaro Herrera; hlinnaka; Jim Nasby; Kohei KaiGai;
 PgHacker; Simon Riggs
 Subject: Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] 
 Custom
 Plan API)
 
 On Wed, Jul 15, 2015 at 10:12 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
  We never guarantee the interface compatibility between major version up.
  If we add/modify interface on v9.6, it is duty for developer of extensions
  to follow the new version, even not specific to custom-scan provider.
  If v9.6 adds support fine-grained function cost estimation, I also have
  to follow the feature, but it is natural.
 
 Maintaining compatibility across major versions is a best-effort and
 even if we sometimes break things across major versions, and sometimes
 even silently (take the example of 9.3's background worker that do not
 start with 9.4 as long as bgw_notify_pid is not set to 0), the
 approach is usually taken to have APIs stable and convenient able to
 cover a maximum set of cases for a given set of plugins, and this
 serves well in the long term. Now it is true that we cannot assume
 either that the version of a plugin API will be perfect forever and
 will be able to cover all the imagined test cases at first shot, still
 I'd like to think that things are broken only when necessary and with
 good reasons. A set of APIs changed at each major release tends to be
 proof that research lacked in the first version, and would surely
 demotivate its adoption by extension developers.

Yep, I also don't want to break the interface compatibility without
something reasonable, and also think following-up new core features
is a good reason to enhance the interface in the future version.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-15 Thread Michael Paquier
On Wed, Jul 15, 2015 at 10:12 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 We never guarantee the interface compatibility between major version up.
 If we add/modify interface on v9.6, it is duty for developer of extensions
 to follow the new version, even not specific to custom-scan provider.
 If v9.6 adds support fine-grained function cost estimation, I also have
 to follow the feature, but it is natural.

Maintaining compatibility across major versions is a best-effort and
even if we sometimes break things across major versions, and sometimes
even silently (take the example of 9.3's background worker that do not
start with 9.4 as long as bgw_notify_pid is not set to 0), the
approach is usually taken to have APIs stable and convenient able to
cover a maximum set of cases for a given set of plugins, and this
serves well in the long term. Now it is true that we cannot assume
either that the version of a plugin API will be perfect forever and
will be able to cover all the imagined test cases at first shot, still
I'd like to think that things are broken only when necessary and with
good reasons. A set of APIs changed at each major release tends to be
proof that research lacked in the first version, and would surely
demotivate its adoption by extension developers.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-15 Thread Robert Haas
On Tue, Jul 14, 2015 at 6:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Both you and Andres have articulated the concern that CustomScan isn't
 actually useful, but I still don't really understand why not.

 I'm curious, for example, whether CustomScan would have been
 sufficient to build TABLESAMPLE, and if not, why not.  Obviously the
 syntax has to be in core,

 ... so you just made the point ...

If you're going to set the bar that high, there's no point in anyone
ever trying to add extensibility anywhere, because there will always
be some other place where there isn't enough extensibility someplace
else to do everything that someone might want.  The fact that the
parser isn't extensible doesn't make custom scans useless any more
than the fact that the non-extensibility of WAL-logging makes pg_am
useless.

 but why couldn't the syntax just call an
 extension-provided callback that returns a custom scan, instead of
 having a node just for TABLESAMPLE?

 Because that only works for small values of work.  As far as TABLESAMPLE
 goes, I intend to hold Simon's feet to the fire until there's a less
 cheesy answer to the problem of scan reproducibility.  Assuming we're
 going to allow sample methods that can't meet the reproducibility
 requirement, we need something to prevent them from producing visibly
 broken query results.  Ideally, the planner would avoid putting such a
 scan on the inside of a nestloop.  A CustomScan-based implementation could
 not possibly arrange such a thing; we'd have to teach the core planner
 about the concern.

Well, I think it would be better to change the tablesample methods as
you previously proposed, so that they are actually stable.  But if
that's not possible, then when you (or someone) makes the core planner
able to consider such matters, you could add a CUSTOM_PATH_UNSTABLE
flag that a custom path can use to notify the core system about the
problem.  Then tablesample methods that are unstable can set the flag
and those that are stable are not penalized.  Presumably we'll end up
with such a flag in the tablesample-path anyway, and a separate flag
in every kind of future scan that needs similar consideration.  If we
put it in some piece of infrastructure (like the custom-path stuff)
that is reusable, we could avoid reinventing the wheel every time.

 Or, taking the example of a GpuScan node, it's essentially impossible
 to persuade the planner to delegate any expensive function calculations,
 aggregates, etc to such a node; much less teach it that that way is cheaper
 than doing such things the usual way.  So yeah, KaiGai-san may have a
 module that does a few things with a GPU, but it's far from doing all or
 even very much of what one would want.

It's true that there are things he can't do this way, but without the
custom-scan stuff, he'd be able to do even less.

 Now, as part of the upper-planner-rewrite business that I keep hoping to
 get to when I'm not riding herd on bad patches, it's likely that we might
 have enough new infrastructure soon that that particular problem could
 be solved.  But there would just be another problem after that; a likely
 example is not having adequate statistics, or sufficiently fine-grained
 function cost estimates, to be able to make valid choices once there's
 more than one way to do such calculations.  (I'm not really impressed by
 the GPU is *always* faster approaches.)  Significant improvements of
 that sort are going to take core-code changes.

Probably, but giving people the ability to experiment outside core,
even if it's limited, often leads to good things.  And when it
doesn't, it doesn't really cost us anything.

 Even worse, if there do get to be any popular custom-scan extensions,
 we'll break them anytime we make any nontrivial planner changes, because
 there is no arms-length API there.  A trivial example is that even adding
 or changing any fields in struct Path will necessarily break custom scan
 providers, because they build Paths for themselves with no interposed API.

I agree that will happen, but I don't care.  This happens all the
time, and every person other than yourself who has commented on this
issue has said that they'd rather have an API exposed that will later
get whacked around without warning than have no API exposed at all,
not just with respect to custom-paths, but with a whole variety of
other things as well, like sticking PGDLLIMPORT on all of the
variables that back GUCs.  It is really far more tedious to try to
work around the lack of a PGDLLIMPORT marking (which causes a huge
annoyance now) than to adjust the code if and when somebody changes
something about the GUC (which causes a small annoyance later, and
only if there actually are changes).

 In large part this is the same as my core concern about the TABLESAMPLE
 patch: exposing dubiously-designed APIs is soon going to force us to make
 choices between breaking those APIs or not being able to make 

Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-15 Thread Robert Haas
On Wed, Jul 15, 2015 at 2:28 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Wed, Jul 15, 2015 at 10:12 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 We never guarantee the interface compatibility between major version up.
 If we add/modify interface on v9.6, it is duty for developer of extensions
 to follow the new version, even not specific to custom-scan provider.
 If v9.6 adds support fine-grained function cost estimation, I also have
 to follow the feature, but it is natural.

 Maintaining compatibility across major versions is a best-effort and
 even if we sometimes break things across major versions, and sometimes
 even silently (take the example of 9.3's background worker that do not
 start with 9.4 as long as bgw_notify_pid is not set to 0), the
 approach is usually taken to have APIs stable and convenient able to
 cover a maximum set of cases for a given set of plugins, and this
 serves well in the long term. Now it is true that we cannot assume
 either that the version of a plugin API will be perfect forever and
 will be able to cover all the imagined test cases at first shot, still
 I'd like to think that things are broken only when necessary and with
 good reasons. A set of APIs changed at each major release tends to be
 proof that research lacked in the first version, and would surely
 demotivate its adoption by extension developers.

Maybe.  If those changes are demanded by extension developers who are
trying to do useful stuff with the APIs and finding problems, then
changing things will probably make those extension developers happy.
If the changes are necessitated by core improvements, then we might
get some complaints if those core improvements are perceived to have
small value compared to the API break.  But I think complaints of that
kind are very rare.  The only thing I can think of that falls into
approximately that category is a gripe about the replacement of
relistemp by relpersistence.  But I think that was more motivated by
the way it broke SQL monitoring queries than by the way it broke C
code.  There may be other more recent examples; but that's the only
one I can think of offhand.  I just don't see it as a big issue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Robert Haas
On Fri, Jul 10, 2015 at 8:55 AM, Heikki Linnakangas hlinn...@iki.fi wrote:
 If somebody still needs it, I'll rebase and adjust the patch towards
 the latest custom-scan interface. However, I cannot be motivated for
 the feature nobody wants.

 Robert, can you weigh in on this? Do we currently have anything in the
 tree that tests the Custom Scan interface? If not, would this be helpful
 for that purpose?

We don't have anything that currently tests the Custom Scan interface
in the tree.  The question is how important that is, and whether it's
worth having what's basically a toy implementation just to demonstrate
that the feature can work.  If so, I think ctidscan is as good a toy
example as any; in the interest of full disclosure, I was the one who
suggested it in the first place.  But I am not entirely sure it's a
good idea to saddle ourselves with that maintenance effort.  It would
be a lot more interesting if we had an example that figured to be
generally useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Robert Haas
On Tue, Jul 14, 2015 at 3:07 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 We don't have anything that currently tests the Custom Scan interface
 in the tree.  The question is how important that is, and whether it's
 worth having what's basically a toy implementation just to demonstrate
 that the feature can work.  If so, I think ctidscan is as good a toy
 example as any; in the interest of full disclosure, I was the one who
 suggested it in the first place.  But I am not entirely sure it's a
 good idea to saddle ourselves with that maintenance effort.  It would
 be a lot more interesting if we had an example that figured to be
 generally useful.

 As a general principle, I think it's a good idea to have a module that's
 mostly just a skeleton that guides people into writing something real to
 use whatever API is being tested.  It needs to be simple enough that not
 much need to be deleted when writing the real thing, and complex enough
 to cover the parts that need covering.  If whatever replaces ctidscan is
 too complex, it will not serve that purpose.

 My guess is that something whose only purpose is to test the custom scan
 interface for coverage purposes can be simpler than this module.

See, I actually think the opposite: I think we've been accumulating a
reasonable amount of test code that actually serves no really useful
purpose and is just cruft.  Stuff like test_shm_mq and test_decoding
seem like they actually catches bugs, so I like that, but I think
stuff like worker_spi is actually TOO simple to be useful in building
anything real, and it provides no useful test coverage, either.  But
this is all a matter of opinion, of course, and I'll defer to whatever
the consensus is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Both you and Andres have articulated the concern that CustomScan isn't
 actually useful, but I still don't really understand why not.

 I'm curious, for example, whether CustomScan would have been
 sufficient to build TABLESAMPLE, and if not, why not.  Obviously the
 syntax has to be in core,

... so you just made the point ...

 but why couldn't the syntax just call an
 extension-provided callback that returns a custom scan, instead of
 having a node just for TABLESAMPLE?

Because that only works for small values of work.  As far as TABLESAMPLE
goes, I intend to hold Simon's feet to the fire until there's a less
cheesy answer to the problem of scan reproducibility.  Assuming we're
going to allow sample methods that can't meet the reproducibility
requirement, we need something to prevent them from producing visibly
broken query results.  Ideally, the planner would avoid putting such a
scan on the inside of a nestloop.  A CustomScan-based implementation could
not possibly arrange such a thing; we'd have to teach the core planner
about the concern.

Or, taking the example of a GpuScan node, it's essentially impossible
to persuade the planner to delegate any expensive function calculations,
aggregates, etc to such a node; much less teach it that that way is cheaper
than doing such things the usual way.  So yeah, KaiGai-san may have a
module that does a few things with a GPU, but it's far from doing all or
even very much of what one would want.

Now, as part of the upper-planner-rewrite business that I keep hoping to
get to when I'm not riding herd on bad patches, it's likely that we might
have enough new infrastructure soon that that particular problem could
be solved.  But there would just be another problem after that; a likely
example is not having adequate statistics, or sufficiently fine-grained
function cost estimates, to be able to make valid choices once there's
more than one way to do such calculations.  (I'm not really impressed by
the GPU is *always* faster approaches.)  Significant improvements of
that sort are going to take core-code changes.

Even worse, if there do get to be any popular custom-scan extensions,
we'll break them anytime we make any nontrivial planner changes, because
there is no arms-length API there.  A trivial example is that even adding
or changing any fields in struct Path will necessarily break custom scan
providers, because they build Paths for themselves with no interposed API.

In large part this is the same as my core concern about the TABLESAMPLE
patch: exposing dubiously-designed APIs is soon going to force us to make
choices between breaking those APIs or not being able to make changes we
need to make.  In the case of custom scans, I will not be particularly
sad when (not if) we break custom scan providers; but in other cases such
tradeoffs are going to be harder to make.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Robert Haas
On Tue, Jul 14, 2015 at 4:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I think this ties into my core unhappiness with the customscan stuff,
 which is that I don't believe it's *possible* to do anything of very
 great interest with it.  I think anything really useful will require
 core code modifications and/or hooks that don't exist now.  So a finger
 exercise like ctidscan, even though it might have some marginal use,
 doesn't do much to alleviate that concern.  It certainly doesn't seem
 like it's a suitable placeholder proving that we aren't breaking any
 actual use cases for the feature.

Both you and Andres have articulated the concern that CustomScan isn't
actually useful, but I still don't really understand why not.  KaiGai
has got working code (under a GPL license) that shows that it can be
used for what he calls GpuScan and GpuJoin, and the speedups are
actually pretty cool if you happen to have the right sort of query to
take advantage of it.  That code may be buggy and the license
precludes us using it anyway, but FWICT it does seem to work. I'd be
entirely amenable to improving the infrastructure such that it could
be used for a broader array of purposes, and I'm also amenable to
adding more hooks if we need more hooks to make it useful, but I'm not
clear at all on what you think is missing.

I'm curious, for example, whether CustomScan would have been
sufficient to build TABLESAMPLE, and if not, why not.  Obviously the
syntax has to be in core, but why couldn't the syntax just call an
extension-provided callback that returns a custom scan, instead of
having a node just for TABLESAMPLE?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Alvaro Herrera
Robert Haas wrote:

 We don't have anything that currently tests the Custom Scan interface
 in the tree.  The question is how important that is, and whether it's
 worth having what's basically a toy implementation just to demonstrate
 that the feature can work.  If so, I think ctidscan is as good a toy
 example as any; in the interest of full disclosure, I was the one who
 suggested it in the first place.  But I am not entirely sure it's a
 good idea to saddle ourselves with that maintenance effort.  It would
 be a lot more interesting if we had an example that figured to be
 generally useful.

As a general principle, I think it's a good idea to have a module that's
mostly just a skeleton that guides people into writing something real to
use whatever API is being tested.  It needs to be simple enough that not
much need to be deleted when writing the real thing, and complex enough
to cover the parts that need covering.  If whatever replaces ctidscan is
too complex, it will not serve that purpose.

My guess is that something whose only purpose is to test the custom scan
interface for coverage purposes can be simpler than this module.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jul 14, 2015 at 3:07 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 As a general principle, I think it's a good idea to have a module that's
 mostly just a skeleton that guides people into writing something real to
 use whatever API is being tested.  It needs to be simple enough that not
 much need to be deleted when writing the real thing, and complex enough
 to cover the parts that need covering.  If whatever replaces ctidscan is
 too complex, it will not serve that purpose.
 
 My guess is that something whose only purpose is to test the custom scan
 interface for coverage purposes can be simpler than this module.

 See, I actually think the opposite: I think we've been accumulating a
 reasonable amount of test code that actually serves no really useful
 purpose and is just cruft.  Stuff like test_shm_mq and test_decoding
 seem like they actually catches bugs, so I like that, but I think
 stuff like worker_spi is actually TOO simple to be useful in building
 anything real, and it provides no useful test coverage, either.  But
 this is all a matter of opinion, of course, and I'll defer to whatever
 the consensus is.

I think this ties into my core unhappiness with the customscan stuff,
which is that I don't believe it's *possible* to do anything of very
great interest with it.  I think anything really useful will require
core code modifications and/or hooks that don't exist now.  So a finger
exercise like ctidscan, even though it might have some marginal use,
doesn't do much to alleviate that concern.  It certainly doesn't seem
like it's a suitable placeholder proving that we aren't breaking any
actual use cases for the feature.

(BTW, if we care about the use cases this has, such as data recovery from
partially-corrupt tables, it would make way more sense and take way less
net new code to just teach TidScan about it.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Kouhei Kaigai
 -Original Message-
 From: Tom Lane [mailto:t...@sss.pgh.pa.us]
 Sent: Wednesday, July 15, 2015 5:47 AM
 To: Robert Haas
 Cc: Alvaro Herrera; hlinnaka; Kaigai Kouhei(海外 浩平); Michael Paquier; Jim
 Nasby; Kohei KaiGai; PgHacker; Simon Riggs
 Subject: Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] 
 Custom
 Plan API)
 
 Robert Haas robertmh...@gmail.com writes:
  On Tue, Jul 14, 2015 at 3:07 PM, Alvaro Herrera
  alvhe...@2ndquadrant.com wrote:
  As a general principle, I think it's a good idea to have a module that's
  mostly just a skeleton that guides people into writing something real to
  use whatever API is being tested.  It needs to be simple enough that not
  much need to be deleted when writing the real thing, and complex enough
  to cover the parts that need covering.  If whatever replaces ctidscan is
  too complex, it will not serve that purpose.
 
  My guess is that something whose only purpose is to test the custom scan
  interface for coverage purposes can be simpler than this module.
 
  See, I actually think the opposite: I think we've been accumulating a
  reasonable amount of test code that actually serves no really useful
  purpose and is just cruft.  Stuff like test_shm_mq and test_decoding
  seem like they actually catches bugs, so I like that, but I think
  stuff like worker_spi is actually TOO simple to be useful in building
  anything real, and it provides no useful test coverage, either.  But
  this is all a matter of opinion, of course, and I'll defer to whatever
  the consensus is.
 
 I think this ties into my core unhappiness with the customscan stuff,
 which is that I don't believe it's *possible* to do anything of very
 great interest with it.  I think anything really useful will require
 core code modifications and/or hooks that don't exist now.  So a finger
 exercise like ctidscan, even though it might have some marginal use,
 doesn't do much to alleviate that concern.  It certainly doesn't seem
 like it's a suitable placeholder proving that we aren't breaking any
 actual use cases for the feature.

The ctidscan is originally designed to validate the behavior of custom-scan
interface, so it is natural this module is valuable in limited cased.

However, I don't think that anything valuable usually takes core code
enhancement and/or new hooks, because we already have various extensions
around core code that utilizes existing infrastructures (even though its
specifications may be changed on major version up).
At least, custom-scan enables to implement edge-features which are not
easy to merge the core because of various reasons; like dependency to
proprietary library, too experimental features, too large code to review
as minimum valuable portion and so on.

 (BTW, if we care about the use cases this has, such as data recovery from
 partially-corrupt tables, it would make way more sense and take way less
 net new code to just teach TidScan about it.)

What I discussed with Hanada-san before was, a custom-scan provider that
replaces a particular relations join by simple scan of materialized-view
transparently. It is probably one other use case. Its design is in my
brain, but time for development is missing piece for me.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-14 Thread Kouhei Kaigai
 Or, taking the example of a GpuScan node, it's essentially impossible
 to persuade the planner to delegate any expensive function calculations,
 aggregates, etc to such a node; much less teach it that that way is cheaper
 than doing such things the usual way.  So yeah, KaiGai-san may have a
 module that does a few things with a GPU, but it's far from doing all or
 even very much of what one would want.

Why do we need to run all the functions on GPU device? PG-Strom simply
gives up to inject CustomPath if required qualifier contains unsupported
functions or data types, thus, these workloads are executed as usual.
I don't intend 100% coverage by GPU. That's no matter. People who use
massive numerical/mathematical workload will love GPU.

 Now, as part of the upper-planner-rewrite business that I keep hoping to
 get to when I'm not riding herd on bad patches, it's likely that we might
 have enough new infrastructure soon that that particular problem could
 be solved.  But there would just be another problem after that; a likely
 example is not having adequate statistics, or sufficiently fine-grained
 function cost estimates, to be able to make valid choices once there's
 more than one way to do such calculations.  (I'm not really impressed by
 the GPU is *always* faster approaches.)  Significant improvements of
 that sort are going to take core-code changes.

We never guarantee the interface compatibility between major version up.
If we add/modify interface on v9.6, it is duty for developer of extensions
to follow the new version, even not specific to custom-scan provider.
If v9.6 adds support fine-grained function cost estimation, I also have
to follow the feature, but it is natural.

 Even worse, if there do get to be any popular custom-scan extensions,
 we'll break them anytime we make any nontrivial planner changes, because
 there is no arms-length API there.  A trivial example is that even adding
 or changing any fields in struct Path will necessarily break custom scan
 providers, because they build Paths for themselves with no interposed API.

I cannot understand... If Path field gets changed, it is duty of extension
to follow the core change. We usually add/modify API specifications on
major version up. For example, I remember ProcessUtility_hook has been
changed a few times in the last several years.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-10 Thread Heikki Linnakangas
On 07/02/2015 08:21 AM, Kouhei Kaigai wrote:
 Folks,
 
 Moved this patch to next CF 2015-02 because of lack of review(ers).

 Do we still need this patch as contrib module?
 It was originally required it as example of custom-scan interface last
 summer, however, here was no strong requirement after that, then, it
 was bumped to v9.6 development cycle.
 
 If somebody still needs it, I'll rebase and adjust the patch towards
 the latest custom-scan interface. However, I cannot be motivated for
 the feature nobody wants.

Robert, can you weigh in on this? Do we currently have anything in the
tree that tests the Custom Scan interface? If not, would this be helpful
for that purpose?

- Heikki



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-07-01 Thread Kouhei Kaigai
Folks,

 Moved this patch to next CF 2015-02 because of lack of review(ers).

Do we still need this patch as contrib module?
It was originally required it as example of custom-scan interface last
summer, however, here was no strong requirement after that, then, it
was bumped to v9.6 development cycle.

If somebody still needs it, I'll rebase and adjust the patch towards
the latest custom-scan interface. However, I cannot be motivated for
the feature nobody wants.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org
 [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier
 Sent: Friday, February 13, 2015 4:41 PM
 To: Kaigai Kouhei(海外 浩平)
 Cc: Jim Nasby; Robert Haas; Simon Riggs; Kohei KaiGai; PgHacker
 Subject: Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] 
 Custom
 Plan API)
 
 
 
 On Tue, Jan 6, 2015 at 10:51 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 
 
   Jim, Thanks for your reviewing the patch.
 
   The attached patch is revised one according to your suggestion,
   and also includes bug fix I could found.
 
   * Definitions of TIDOperator was moved to pg_operator.h
 as other operator doing.
   * Support the case of ctid (operator) Param expression.
   * Add checks if commutator of TID was not found.
   * Qualifiers gets extracted on plan stage, not path stage.
   * Adjust cost estimation logic to fit SeqScan manner.
   * Add some new test cases for regression test.
 
On 12/24/14, 7:54 PM, Kouhei Kaigai wrote:
 On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai
 kai...@ak.jp.nec.com
 wrote:
 I'm not certain whether we should have this functionality in
 contrib from the perspective of workload that can help, but its
 major worth is for an example of custom-scan interface.
 worker_spi is now in src/test/modules. We may add it there as 
 well,
no?

 Hmm, it makes sense for me. Does it also make sense to add a
 test-case to the core regression test cases?

 The attached patch adds ctidscan module at test/module instead of
 contrib.
 Basic portion is not changed from the previous post, but file
 locations and test cases in regression test are changed.
   
First, I'm glad for this. It will be VERY valuable for anyone trying
 to
clean up the end of a majorly bloated table.
   
Here's a partial review...
   
+++ b/src/test/modules/ctidscan/ctidscan.c
   
+/* missing declaration in pg_proc.h */
+#ifndef TIDGreaterOperator
+#define TIDGreaterOperator   2800
...
If we're calling that out here, should we have a corresponding comment
 in
pg_proc.h, in case these ever get renumbered?
   
   It was a quick hack when I moved this module out of the tree.
   Yep, I should revert when I submit this patch again.
 
+CTidQualFromExpr(Node *expr, int varno)
...
+ if (IsCTIDVar(arg1, varno))
+ other = arg2;
+ else if (IsCTIDVar(arg2, varno))
+ other = arg1;
+ else
+ return NULL;
+ if (exprType(other) != TIDOID)
+ return NULL;/* should not happen */
+ /* The other argument must be a pseudoconstant */
+ if (!is_pseudo_constant_clause(other))
+ return NULL;
   
I think this needs some additional blank lines...
   
   OK. And, I also noticed the coding style around this function is
   different from other built-in plans, so I redefined the role of
   this function just to check whether the supplied RestrictInfo is
   OpExpr that involves TID inequality operator, or not.
 
   Expression node shall be extracted when Plan node is created
   from Path node.
 
+ if (IsCTIDVar(arg1, varno))
+ other = arg2;
+ else if (IsCTIDVar(arg2, varno))
+ other = arg1;
+ else
+ return NULL;
+
+ if (exprType(other) != TIDOID)
+ return NULL;/* should not happen */
+
+ /* The other argument must be a pseudoconstant */
+ if (!is_pseudo_constant_clause(other))
+ return NULL;
   
+ * CTidEstimateCosts
+ *
+ * It estimates cost to scan the target relation according to the 
 given
+ * restriction clauses. Its logic to scan relations are almost same
 as
+ * SeqScan doing, because it uses

Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2015-02-12 Thread Michael Paquier
On Tue, Jan 6, 2015 at 10:51 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

 Jim, Thanks for your reviewing the patch.

 The attached patch is revised one according to your suggestion,
 and also includes bug fix I could found.

 * Definitions of TIDOperator was moved to pg_operator.h
   as other operator doing.
 * Support the case of ctid (operator) Param expression.
 * Add checks if commutator of TID was not found.
 * Qualifiers gets extracted on plan stage, not path stage.
 * Adjust cost estimation logic to fit SeqScan manner.
 * Add some new test cases for regression test.

  On 12/24/14, 7:54 PM, Kouhei Kaigai wrote:
   On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai
   kai...@ak.jp.nec.com
   wrote:
   I'm not certain whether we should have this functionality in
   contrib from the perspective of workload that can help, but its
   major worth is for an example of custom-scan interface.
   worker_spi is now in src/test/modules. We may add it there as well,
  no?
  
   Hmm, it makes sense for me. Does it also make sense to add a
   test-case to the core regression test cases?
  
   The attached patch adds ctidscan module at test/module instead of
 contrib.
   Basic portion is not changed from the previous post, but file
   locations and test cases in regression test are changed.
 
  First, I'm glad for this. It will be VERY valuable for anyone trying to
  clean up the end of a majorly bloated table.
 
  Here's a partial review...
 
  +++ b/src/test/modules/ctidscan/ctidscan.c
 
  +/* missing declaration in pg_proc.h */
  +#ifndef TIDGreaterOperator
  +#define TIDGreaterOperator   2800
  ...
  If we're calling that out here, should we have a corresponding comment in
  pg_proc.h, in case these ever get renumbered?
 
 It was a quick hack when I moved this module out of the tree.
 Yep, I should revert when I submit this patch again.

  +CTidQualFromExpr(Node *expr, int varno)
  ...
  + if (IsCTIDVar(arg1, varno))
  + other = arg2;
  + else if (IsCTIDVar(arg2, varno))
  + other = arg1;
  + else
  + return NULL;
  + if (exprType(other) != TIDOID)
  + return NULL;/* should not happen */
  + /* The other argument must be a pseudoconstant */
  + if (!is_pseudo_constant_clause(other))
  + return NULL;
 
  I think this needs some additional blank lines...
 
 OK. And, I also noticed the coding style around this function is
 different from other built-in plans, so I redefined the role of
 this function just to check whether the supplied RestrictInfo is
 OpExpr that involves TID inequality operator, or not.

 Expression node shall be extracted when Plan node is created
 from Path node.

  + if (IsCTIDVar(arg1, varno))
  + other = arg2;
  + else if (IsCTIDVar(arg2, varno))
  + other = arg1;
  + else
  + return NULL;
  +
  + if (exprType(other) != TIDOID)
  + return NULL;/* should not happen */
  +
  + /* The other argument must be a pseudoconstant */
  + if (!is_pseudo_constant_clause(other))
  + return NULL;
 
  + * CTidEstimateCosts
  + *
  + * It estimates cost to scan the target relation according to the given
  + * restriction clauses. Its logic to scan relations are almost same as
  + * SeqScan doing, because it uses regular heap_getnext(), except for
  + * the number of tuples to be scanned if restriction clauses work well.
 
  grammarThat should read same as what SeqScan is doing... however,
 what
  actual function are you talking about? I couldn't find
 SeqScanEstimateCosts
  (or anything ending EstimateCosts).
 
 It is cost_seqscan(). But I don't put a raw function name in the source
 code comments in other portion, because nobody can guarantee it is up to
 date in the future...

  BTW, there's other grammar issues but it'd be best to handle those all at
  once after all the code stuff is done.
 
 Yep. Help by native English speaker is very helpful for us.

  + opno = get_commutator(op-opno);
  What happens if there's no commutator? Perhaps best to Assert(opno !=
  InvalidOid) at the end of that if block.
 
 Usually, commutator operator of TID is defined on initdb time, however,
 nobody can guarantee a mad superuser doesn't drop it.
 So, I added elog(ERROR,...) here.


  Though, it seems things will fall apart anyway if ctid_quals isn't
 exactly
  what we're expecting; I don't know if that's OK or not.
 
 No worry, it was already checked on planning stage.

  + /* disk costs --- assume each tuple on a different page */
  + run_cost += spc_random_page_cost * ntuples;
  Isn't that extremely pessimistic?
 
 Probably. I follow the manner of SeqScan.

  I'm not familiar enough with the custom-scan stuff to really comment past
  

Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2014-12-24 Thread Kouhei Kaigai
  On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai kai...@ak.jp.nec.com
 wrote:
   I'm not certain whether we should have this functionality in contrib
   from the perspective of workload that can help, but its major worth
   is for an example of custom-scan interface.
  worker_spi is now in src/test/modules. We may add it there as well, no?
 
 Hmm, it makes sense for me. Does it also make sense to add a test-case to
 the core regression test cases?
 
The attached patch adds ctidscan module at test/module instead of contrib.
Basic portion is not changed from the previous post, but file locations
and test cases in regression test are changed.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


pgsql-v9.5-test-ctidscan.v2.patch
Description: pgsql-v9.5-test-ctidscan.v2.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2014-12-15 Thread Michael Paquier
On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 I'm not certain whether we should have this functionality in contrib
 from the perspective of workload that can help, but its major worth is
 for an example of custom-scan interface.
worker_spi is now in src/test/modules. We may add it there as well, no?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2014-12-15 Thread Kouhei Kaigai
 On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
  I'm not certain whether we should have this functionality in contrib
  from the perspective of workload that can help, but its major worth is
  for an example of custom-scan interface.
 worker_spi is now in src/test/modules. We may add it there as well, no?

Hmm, it makes sense for me. Does it also make sense to add a test-case to
the core regression test cases?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2014-12-14 Thread Kouhei Kaigai
This attached patch adds a code example of custom-scan interface.

This custom-scan provider (ctidscan) performs almost same as
built-in SeqScan plan, but can produce same results with less
page scan in case when qualifier of relation scan has inequality
operators (, =,  or =) on ctid system column, therefore,
range to be scanned is less than case of full-table scan.

Below is an example:

postgres=# EXPLAIN SELECT * FROM t1 WHERE ctid  '(10,0)'::tid AND b like 
'%abc%';
 QUERY PLAN
-
 Seq Scan on t1  (cost=0.00..10.00 rows=1 width=37)
   Filter: ((ctid  '(10,0)'::tid) AND (b ~~ '%abc%'::text))
(2 rows)

Once ctidscan is loaded, it can provide cheaper cost to scan
the t1 table than SeqScan, so planner chooses the custom logic.

postgres=# LOAD 'ctidscan';
LOAD
postgres=# EXPLAIN SELECT * FROM t1 WHERE ctid  '(10,0)'::tid AND b like 
'%abc%';
   QUERY PLAN
-
 Custom Scan (ctidscan) on t1  (cost=0.00..5.00 rows=1 width=37)
   Filter: ((ctid  '(10,0)'::tid) AND (b ~~ '%abc%'::text))
   ctid quals: (ctid  '(10,0)'::tid)
(3 rows)

Like other query execution logic, it also provides enable_ctidscan
parameter to turn on/off this feature.

I'm not certain whether we should have this functionality in contrib
from the perspective of workload that can help, but its major worth is
for an example of custom-scan interface.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


 -Original Message-
 From: Robert Haas [mailto:robertmh...@gmail.com]
 Sent: Thursday, December 11, 2014 11:58 AM
 To: Simon Riggs
 Cc: Kaigai Kouhei(海外 浩平); Thom Brown; Kohei KaiGai; Tom Lane; Alvaro
 Herrera; Shigeru Hanada; Stephen Frost; Andres Freund; PgHacker; Jim
 Mlodgenski; Peter Eisentraut
 Subject: ##freemail## Re: [HACKERS] [v9.5] Custom Plan API
 
 On Tue, Dec 9, 2014 at 3:24 AM, Simon Riggs si...@2ndquadrant.com wrote:
  Feedback I am receiving is that the API is unusable. That could be
  because it is impenetrable, or because it is unusable. I'm not sure it
  matters which.
 
 It would be nice to here what someone is trying to use it for and what 
 problems
 that person is encountering.  Without that, it's pretty much impossible
 for anyone to fix anything.
 
 As for sample code, KaiGai had a working example, which of course got broken
 when Tom changed the API, but it didn't look to me like Tom's changes would
 have made anything impossible that was possible before.
 I'm frankly kind of astonished by the tenor of this entire conversation;
 there is certainly plenty of code in the backend that is less
 self-documenting than this is; and KaiGai did already put up a wiki page
 with documentation as you requested.  From his response, it sounds like
 he has updated the ctidscan code, too.
 
 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL
 Company


pgsql-v9.5-contrib-ctidscan.v1.patch
Description: pgsql-v9.5-contrib-ctidscan.v1.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers