Hi Robert and others,
First, I currently don't know the PostgreSQL code well enough yet, but I still
hope my thoughts are useful.
Robert Haas wrote:
> It is unclear to me how useful this is beyond ForeignScan, Gather, and
> Append. MergeAppend's ordering constraint makes it less useful; we
> can asynchronously kick off the request for the next tuple before
> returning the previous one, but we're going to need to have that tuple
> before we can return the next one. But it could be done. It could
> potentially even be applied to seq scans or index scans using some set
> of asynchronous I/O interfaces, but I don't see how it could be
> applied to joins or aggregates, which typically can't really proceed
> until they get the next tuple. They could be plugged into this
> interface easily enough but it would only help to the extent that it
> enabled asynchrony elsewhere in the plan tree to be pulled up towards
> the root.
As far as I understand, this comes down to a number of subplans. A subplan
can be asked to return a tuple either directly or at some later point in time
(asynchronously). Conceptually, the subplan enters a "tuples wanted" state
and starts the work needed to produce that tuple. Later, it either returns
the tuple or reports that there are no more tuples.
I see opportunities to use this feature to make some query plans less memory-
intensive without increasing the total amount of work to be done. I think the
same subplan could be placed at several points in the execution plan. If the
subplan ends in a tuple store, then when a row is requested it can either be
returned from the store or generated by the subplan. If both outputs of the
subplan request tuples at roughly the same rate, the tuple store stays small,
whereas currently we either have to store all the tuples or generate them
multiple times. I think this can potentially be beneficial.
I also think the planner can predict the chances that a subplan can be
reused. If the two consumers of a subplan sit on different sides of a hash
join, it knows that one output will be exhausted before the other requests
its first row, which means everything would need to be stored. Also, not
every callback needs to be invoked at the same rate: tuple storage might be
avoided by choosing which callback to invoke wisely.
I am a little bit worried about the performance impact of context switching.
I think it is a good idea to only register the callback when execution
actually hits a point where a tuple can't be generated directly.
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)