Hello,

At Tue, 11 Jul 2017 10:28:51 +0200, Antonin Houska <a...@cybertec.at> wrote in 
<6448.1499761731@localhost>
> Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote:
> > Effectively it is a waiting-queue followed by a
> > completed-list. The point of the compaction is keeping the order
> > of waiting or not-yet-completed requests, which is crucial to
> > avoid kind-a precedence inversion. We cannot keep the order by
> > using bitmapset in such way.
> 
> > The current code waits all waiters at once and processes all
> > fired events at once. The order in the waiting-queue is
> > inessential in the case. On the other hand I suppoese waiting on
> > several-tens to near-hundred remote hosts is in a realistic
> > target range. Keeping the order could be crucial if we process a
> > part of the queue at once in the case.
> > 
> > Putting siginificance on the deviation of response time of
> > remotes, process-all-at-once is effective. In turn we should
> > consider the effectiveness of the lifecycle of the larger wait
> > event set.
> 
> ok, I missed the fact that the order of es_pending_async entries is
> important. I think this is worth adding a comment.

I'll put an upper limit to the number of waiters processed at
once. Then add a comment like that.

> Actually the reason I thought of simplification was that I noticed small
> inefficiency in the way you do the compaction. In particular, I think it's not
> always necessary to swap the tail and head entries. Would something like this
> make sense?

I'm not sure, but I suppose that it is rare that all of the first
many elements in the array are not COMPLETE. In most cases the
first element gets a response first.

> 
>               /* If any node completed, compact the array. */
>               if (any_node_done)
>               {
...
>                       for (tidx = 0; tidx < estate->es_num_pending_async; 
> ++tidx)
>                       {
...
>                               if (tail->state == ASYNCREQ_COMPLETE)
>                                       continue;
> 
>                               /*
>                                * If the array starts with one or more 
> incomplete requests,
>                                * both head and tail point at the same item, 
> so there's no
>                                * point in swapping.
>                                */
>                               if (tidx > hidx)
>                               {

This works to skip the first several elements when all of them
are ASYNCREQ_COMPLETE. I think it makes sense as long as it
doesn't harm the loop. The optimization is more effective by
putting out of the loop like this.

|  for (tidx = 0; tidx < estate->es_num_pending_async &&
                  estate->es_pending_async[tidx] == ASYNCREQ_COMPLETE; ++tidx);
|  for (; tidx < estate->es_num_pending_async; ++tidx)
...


> And besides that, I think it'd be more intuitive if the meaning of "head" and
> "tail" was reversed: if the array is iterated from lower to higher positions,
> then I'd consider head to be at higher position, not tail.

Yeah, but maybe the "head" is still confusing even if reversed
because it is still not a head of something.  It might be less
confusing by rewriting it in more verbose-but-straightforwad way.


|      int npending = 0;
| 
|      /* Skip over not-completed items at the beginning */
|      while (npending < estate->es_num_pending_async &&
|             estate->es_pending_async[npending] != ASYNCREQ_COMPLETE)
|        npending++;
| 
|      /* Scan over the rest for not-completed items */
|      for (i = npending + 1 ; i < estate->es_num_pending_async; ++i)
|      {
|        PendingAsyncRequest *tmp;
|        PendingAsyncRequest *curr = estate->es_pending_async[i];
|
|        if (curr->state == ASYNCREQ_COMPLETE)
|          continue;
|
|        /* Move the not-completed item to the tail of the first chunk */
|        tmp = estate->es_pending_async[i];
|        estate->es_pending_async[nepending] = tmp;
|        estate->es_pending_async[i] = tmp;
|        ++npending;
|      }


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to