2014-03-04 23:10 GMT+09:00 Stephen Frost <sfr...@snowman.net>:
>> The "cache_scan" module that I and Haribabu are discussing in another
>> thread also might be a good demonstration for custom-scan interface,
>> however, its code scale is a bit larger than ctidscan.
> That does sound interesting though I'm curious about the specifics...
This module caches only a subset of a table's columns, not all of them,
so it can hold a much larger number of records in a given amount of RAM
than the standard buffer cache.
It is built on top of the custom-scan node, and it also uses a new hook
that calls back on page vacuuming so the module can invalidate its cache
entries.
(By the way, I originally designed this module to demonstrate the
on-vacuum hook, because I had already written ctidscan and the
postgres_fdw enhancement for the custom-scan node.)
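
As a rough illustration of the idea (a toy model, not the actual
cache_scan code), the sketch below keeps only two columns per row, keys
each entry by its ctid, and drops every entry on a block when that block
is vacuumed. All names here (ToyCtid, ToyCache, toy_cache_*) are
hypothetical, and the fixed-size linear-probe table is an assumption made
only to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS  64   /* hypothetical fixed capacity */
#define CACHED_NCOLS 2    /* only a subset of the row's columns is kept */

/* Simplified stand-in for PostgreSQL's item pointer (block + offset). */
typedef struct {
    uint32_t block;
    uint16_t offset;
} ToyCtid;

typedef struct {
    bool    used;
    ToyCtid ctid;
    int64_t cols[CACHED_NCOLS];
} ToyCacheEntry;

typedef struct {
    ToyCacheEntry slots[CACHE_SLOTS];
} ToyCache;

static bool toy_ctid_equal(ToyCtid a, ToyCtid b)
{
    return a.block == b.block && a.offset == b.offset;
}

static void toy_cache_init(ToyCache *cache)
{
    memset(cache, 0, sizeof(*cache));
}

/* Look up an entry by ctid; returns NULL when it is not cached. */
static ToyCacheEntry *toy_cache_get(ToyCache *cache, ToyCtid ctid)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (cache->slots[i].used && toy_ctid_equal(cache->slots[i].ctid, ctid))
            return &cache->slots[i];
    return NULL;
}

/* Insert or overwrite an entry; returns false when the cache is full. */
static bool toy_cache_put(ToyCache *cache, ToyCtid ctid, const int64_t *cols)
{
    assert(cols != NULL);
    ToyCacheEntry *entry = toy_cache_get(cache, ctid);
    for (int i = 0; entry == NULL && i < CACHE_SLOTS; i++)
        if (!cache->slots[i].used)
            entry = &cache->slots[i];
    if (entry == NULL)
        return false;
    entry->used = true;
    entry->ctid = ctid;
    memcpy(entry->cols, cols, sizeof(entry->cols));
    return true;
}

/* What an on-vacuum callback might do: drop all entries of one block.
 * Returns the number of entries invalidated. */
static int toy_cache_invalidate_block(ToyCache *cache, uint32_t block)
{
    int dropped = 0;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache->slots[i].used && cache->slots[i].ctid.block == block) {
            cache->slots[i].used = false;
            dropped++;
        }
    }
    return dropped;
}
```

In a real module the invalidation function would be called from the
vacuum hook rather than directly, and the table would need proper
hashing and locking.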

>> > For one thing, an example where you could have this CustomScan node calling
>> > other nodes underneath would be interesting.  I realize the CTID scan can't
>> > do that directly but I would think your GPU-based system could; after all,
>> > if you're running a join or an aggregate with the GPU, the rows could come
>> > from nearly anything.  Have you considered that, or is the expectation that
>> > users will just go off and access the heap and/or whatever indexes directly,
>> > like ctidscan does?  How would such a requirement be handled?
>> >
>> When a custom-scan node has underlying nodes, it invokes them using
>> ExecProcNode, as built-in nodes do, and can then fetch the tuples that
>> come from those underlying nodes. Of course, a custom-scan provider can
>> also return tuples that come from somewhere else as if they came from
>> the underlying relation. That is the responsibility of the extension
>> module. In some cases, it will be required to return a junk system
>> attribute, like ctid, for row-level locks or table updating.
>> That is also the responsibility of the extension module (or, it should
>> not add a custom path if the custom-scan provider cannot perform as
>> required).
> Right, tons of work to do to make it all fit together and play nice-
> what I was trying to get at is: has this actually been done?  Is the GPU
> extension that you're talking about as the use-case for this been
> written?
It's a chicken-and-egg problem, because the implementation of the extension
module fully depends on the interface provided by the backend. Unlike the
commit-fest, there is no deadline for my extension module, so I put a higher
priority on submitting the custom-scan node than on the extension.
However, the GPU extension is not purely theoretical. I implemented a
prototype using the FDW APIs, and it was able to accelerate sequential scans
when the query has sufficiently complicated qualifiers.

See the movie (from 2:45). The table t1 is a regular table, and t2 is a
foreign table. Both have the same contents; however, the response time of
the query is much faster when GPU acceleration is working.
So, I'm confident that GPU acceleration will give a performance gain once it
can run on regular tables, not only foreign tables.

> How does it handle all of the above?  Or are we going through
> all these gyrations in vain hope that it'll actually all work when
> someone tries to use it for something real?
I'm not talking about anything difficult. If a junk attribute requires
returning the "ctid" of the tuple, the custom-scan provider reads a tuple from
the underlying relation and then includes the correct item pointer. If the
custom scan is designed to run on the cache, all it needs to do is reconstruct
a tuple with the correct item pointer (thus the cache needs to hold the ctid
as well). That is all I did in the cache_scan module.
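
To make the ctid point concrete, here is a minimal sketch, assuming a
cache that stores each row's item pointer next to its cached columns. It
is a toy model and not the cache_scan implementation: ToyCtid,
ToyCachedRow, ToyOutTuple and the toy_scan_* functions are hypothetical
names, and a real provider would build a proper tuple in an executor
slot instead of filling a plain struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHED_NCOLS 2   /* hypothetical number of cached columns */

/* Simplified stand-in for PostgreSQL's item pointer (block + offset). */
typedef struct {
    uint32_t block;
    uint16_t offset;
} ToyCtid;

/* What the cache stores: the row's ctid plus the cached column values. */
typedef struct {
    ToyCtid ctid;
    int64_t cols[CACHED_NCOLS];
} ToyCachedRow;

/* What the scan returns: column values plus ctid as a junk attribute. */
typedef struct {
    int64_t values[CACHED_NCOLS];
    ToyCtid junk_ctid;
} ToyOutTuple;

/* Scan state: a cursor over an array of cached rows. */
typedef struct {
    const ToyCachedRow *rows;
    size_t              nrows;
    size_t              pos;
} ToyCacheScan;

static void toy_scan_begin(ToyCacheScan *scan,
                           const ToyCachedRow *rows, size_t nrows)
{
    assert(rows != NULL || nrows == 0);
    scan->rows = rows;
    scan->nrows = nrows;
    scan->pos = 0;
}

/* Fetch the next row: returns true and fills *out while rows remain,
 * false once the cache is exhausted. The ctid is copied from the cache,
 * so the heap is never read. */
static bool toy_scan_next(ToyCacheScan *scan, ToyOutTuple *out)
{
    if (scan->pos >= scan->nrows)
        return false;

    const ToyCachedRow *row = &scan->rows[scan->pos++];
    for (int i = 0; i < CACHED_NCOLS; i++)
        out->values[i] = row->cols[i];
    out->junk_ctid = row->ctid;
    return true;
}
```

The only design point the sketch is meant to show is that the item
pointer travels with the cached columns, so a later row-level lock or
update can still locate the original heap tuple.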

KaiGai Kohei <kai...@kaigai.gr.jp>
