Thank you for the thought.

This is still at the PoC level, so I'm grateful for this kind of
fundamental comment.

At Wed, 28 Jun 2017 20:22:24 +0200, Antonin Houska <> wrote in 
> Kyotaro HORIGUCHI <> wrote:
> > The patch got conflicted. This is a new version just rebased to
> > the current master. Further amendments will be made later.
> Just one idea that I had while reading the code.
> In ExecAsyncEventLoop you iterate estate->es_pending_async, then move the
> complete requests to the end and finally adjust estate->es_num_pending_async so
> that the array no longer contains the complete requests. I think the point is
> that then you can add new requests to the end of the array.
> I wonder if a set (Bitmapset) of incomplete requests would make the code more
> efficient. The set would contain position of each incomplete request in
> estate->es_num_pending_async (I think it's the myindex field of
> PendingAsyncRequest). If ExecAsyncEventLoop used this set to retrieve the
> requests subject to ExecAsyncNotify etc, then the compaction of
> estate->es_pending_async wouldn't be necessary.
> ExecAsyncRequest would use the set to look for space for new requests by
> iterating it and trying to find the first gap (which corresponds to completed
> request).
> And finally, item would be removed from the set at the moment the request
> state is being set to ASYNCREQ_COMPLETE.

Effectively, the array is a waiting queue followed by a list of
completed requests. The point of the compaction is to keep the
order of the waiting (not-yet-completed) requests, which is
crucial to avoid a kind of precedence inversion. We cannot keep
that order if a Bitmapset is used in that way, since new requests
would fill the gaps left by completed ones.
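
To make the ordering point concrete, here is a minimal sketch of an
order-preserving compaction. It is not the code from the patch:
DemoRequest and demo_compact_pending are hypothetical names, and the
"complete" flag only stands in for a request whose state is
ASYNCREQ_COMPLETE. Pending requests keep their relative order at the
front of the array; the order of the completed tail is not preserved,
on the assumption that it does not matter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PendingAsyncRequest; not the patch's struct. */
typedef struct DemoRequest
{
    int  id;        /* arrival order, for illustration only */
    bool complete;  /* stands in for state == ASYNCREQ_COMPLETE */
} DemoRequest;

/*
 * Move pending requests to the front of reqs[] without changing their
 * relative order, leaving completed requests behind them.
 * Returns the number of requests that are still pending.
 */
static size_t
demo_compact_pending(DemoRequest **reqs, size_t n)
{
    size_t npending = 0;
    size_t i;

    assert(reqs != NULL || n == 0);

    for (i = 0; i < n; i++)
    {
        if (reqs[i]->complete)
            continue;

        /*
         * Everything in [npending, i) is complete here, so this swap
         * exchanges a completed request with the next pending one.
         */
        if (i != npending)
        {
            DemoRequest *tmp = reqs[npending];

            reqs[npending] = reqs[i];
            reqs[i] = tmp;
        }
        npending++;
    }

    return npending;
}
```

With such a layout a new request can simply be appended after the
pending ones, so the oldest waiter always stays first. Reusing the
first gap, as in the Bitmapset idea, would put a newer request ahead
of older ones.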

The current code waits on all waiters at once and processes all
fired events at once, so the order in the waiting queue does not
matter in that case. On the other hand, I suppose that waiting on
several tens to nearly a hundred remote hosts is within a
realistic target range. Keeping the order could become crucial if
we process only a part of the queue at a time in such a case.

If we put weight on the variation in response time among the
remotes, process-all-at-once is effective. In return, we should
consider how efficient the lifecycle of the larger wait event set
would be.

Sorry for the rambling discussion, but in short, I have realized
that I have a lot to consider on this :p Thanks!


Kyotaro Horiguchi
NTT Open Source Software Center

Sent via pgsql-hackers mailing list