bmi_tcp: bmi-tcp.c sockio.c

Rob Ross Tue, 11 Dec 2007 09:23:14 -0800

i was more concerned that we not lose track of awaiting data because of not retrying, but i'm woefully uninformed on how poll()/epoll do their thing at this point.


the solution you chose (in later email) seems good to me.


rob

On Dec 11, 2007, at 10:52 AM, Sam Lang wrote:

That seems possible. I did some reading and couldn't find any obvious reasons the kernel does this, but I think the basic answer is that it doesn't break the semantics of recv, as there aren't any semantic guarantees between results from poll and calls to recv. To investigate further I think would require looking at the kernel code, and while I'm interested in what's going on there, it's not something I want to dig into right now. The answer isn't going to change the behavior of the functions in any case. :-)
I think Kevin's and maybe Rob's concerns are that recv would loop forever returning EAGAIN, and just from empirical evidence, it doesn't appear to, so I would go with that as the current solution.
-sam


On Dec 11, 2007, at 10:46 AM, Walter B. Ligon III wrote:
Shooting from the hip here, but is it possible EAGAIN might indicate there is a structure locked in the socket - say if a packet receive handler is running - which would block the call, even though there ARE bytes in the socket?
Walt

Sam Lang wrote:
I agree Pete -- it's messy. Just by the names of errnos, it seems appropriate to return what's been completed if we get EWOULDBLOCK, while EAGAIN suggests we can just call recv again and get what we want. But as you point out they're the same value. According to the opengroup, impls _may_ assign the same value to both:
http://www.opengroup.org/pubs/online/7908799/xsh/errors.html
Strictly from a Linux implementation perspective, epoll/poll tell us that the bytes are on the socket, so even when EAGAIN is returned, we can call recv again and get what we wanted. I've tested this a bunch, and when EAGAIN is returned (which is infrequent), the next call invariably returns successfully. There were two instances where the code looped up to around 200 times on EAGAIN under heavy load. But looping does turn nbrecv into more of a brecv, although we avoid all the fcntl calls to turn the socket into a blocking one just for the recv call.
With the socket in non-blocking mode, the conditional:
      if (ret == -1 && errno == EWOULDBLOCK)
      {
          return (len - comp);        /* return amount completed */
      }
Just doesn't work. It causes the caller to error and close the socket. Not what we want.
I think we can get away with doing:
      if (!ret)       /* socket closed */
      {
          errno = EPIPE;
          return (-1);
      }
      if (ret == -1 && (errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK))
      {
          goto nbrecv_restart;
      }
      else if (ret == -1)
      {
          return (-1);
      }
From a practical perspective, this seems to work, and an implementation that has poll telling us that bytes are ready, but recv returning EWOULDBLOCK because of anything other than small timing issues in the kernel seems broken anyway.
The alternative is to return the bytes received with the errno, and on EAGAIN, we would have to add the operation back onto the op queue with a state variable of how much was received. The code is designed to avoid doing this in the first place by polling until the bytes we need are ready, so doing this would probably be messy.
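
For illustration only, here is a minimal self-contained sketch of the retry approach described above, with the three "try again" errnos folded into one branch and the goto replaced by a continue. The name sketch_nbrecv is hypothetical, and a flags value of 0 stands in for DEFAULT_MSG_FLAGS, whose definition does not appear in this thread. Like the snippet it is modelled on, it spins on recv() with no upper bound, so it relies on the data actually arriving.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical name; mirrors the shape of the nbrecv loop quoted in this thread. */
int sketch_nbrecv(int s, void *buf, int len)
{
    int comp = len;                       /* bytes still outstanding */

    /* The caller is expected to have put the socket in non-blocking mode. */
    assert(fcntl(s, F_GETFL, 0) & O_NONBLOCK);

    while (comp > 0)
    {
        /* Assumption: 0 is used here in place of DEFAULT_MSG_FLAGS. */
        ssize_t ret = recv(s, buf, (size_t)comp, 0);
        if (ret == 0)                     /* peer closed the socket */
        {
            errno = EPIPE;
            return -1;
        }
        if (ret == -1)
        {
            /* Interrupted, or nothing to read yet: retry immediately. */
            if (errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK)
                continue;
            return -1;                    /* any other error is fatal */
        }
        comp -= (int)ret;
        buf = (char *)buf + ret;
    }
    return len - comp;                    /* equals len once the loop exits */
}
```

On success the function always returns len, because it only leaves the loop once everything requested has arrived; the caller never sees a short count.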
-sam
On Dec 10, 2007, at 11:47 PM, Pete Wyckoff wrote:
[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]> wrote on Mon, 10 Dec 2007 21:19 -0600:
while a loop will fix it, it would be really nice to understand how we get
EAGAIN when we think that there are bytes there...
[..]
On Dec 7, 2007, at 4:55 PM, Sam Lang wrote:
I'm seeing recv on a socket in non-blocking mode returning EAGAIN
occasionally, even though epoll has just told us there's bytes waiting. I guess that's why the call was initially a blocking recv. I can add a loop around the non-blocking recv while it returns EAGAIN, unless someone can
think of a better work around.
The function is getting a bit messy.  I'm all for looping on E* and
thought Sam's original mail made sense.  But on second glance:

int BMI_sockio_nbrecv(int s,
         void *buf,
         int len)
{
  int ret, comp = len;

  assert(fcntl(s, F_GETFL, 0) & O_NONBLOCK);

  while (comp)
  {
    nbrecv_restart:
      ret = recv(s, buf, comp, DEFAULT_MSG_FLAGS);
      if (!ret)       /* socket closed */
      {
          errno = EPIPE;
          return (-1);
      }
      if (ret == -1 && errno == EWOULDBLOCK)
      {
          return (len - comp);        /* return amount completed */
      }
      if (ret == -1 && (errno == EINTR || errno == EAGAIN))
      {
          goto nbrecv_restart;
      }
      else if (ret == -1)
      {
          return (-1);
      }
      comp -= ret;
      buf = (char *)buf + ret;
  }
  return (len - comp);
}

Note that we get from standard headers:
/usr/include/asm-generic/errno.h:#define EWOULDBLOCK EAGAIN /* Operation would block */
But maybe there are some systems where this is not true?  Not ones
that use glibc, apparently.
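
Since POSIX allows, but does not require, the two errnos to share a value, one way to avoid depending on either outcome is a small predicate that covers both cases. The sketch below is only an illustration; the helper name sketch_would_block is hypothetical and not part of the BMI code.

```c
#include <assert.h>   /* only needed for the checks in the note below */
#include <errno.h>

/*
 * Hypothetical helper: nonzero if err means "nothing available right now".
 * The #if picks the single comparison where the two macros are the same
 * number (as on Linux), and tests both where a platform keeps them distinct.
 */
int sketch_would_block(int err)
{
#if EAGAIN == EWOULDBLOCK
    return err == EAGAIN;
#else
    return err == EAGAIN || err == EWOULDBLOCK;
#endif
}
```

With this helper, assert(sketch_would_block(EAGAIN)) and assert(sketch_would_block(EWOULDBLOCK)) both hold whichever branch is compiled, while EINTR is reported as not blocking.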

Anyway, the first use of EWOULDBLOCK runs us back to the poll
loop, which is the right thing to do.  The second use of EAGAIN
would lead to a busy loop on recv()->EAGAIN that isn't quite so
nice.  But that code never gets hit.
I'm not sure that a poll readable result necessarily means we'll get
any bytes on the socket.  There are numerous ways in which things
can get messy.
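
If a readable result from poll is treated as a hint rather than a guarantee, one option is to go back to sleep in poll() whenever recv() reports EAGAIN, instead of spinning on recv(). The sketch below illustrates that idea in a single function. It is not how the BMI code is structured (there the polling happens in the caller), and the name sketch_recv_polled and the timeout_ms parameter are assumptions made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: receive up to len bytes on a non-blocking socket.
 * When recv() has nothing to give, wait in poll() rather than busy-looping.
 * Returns the bytes received (short if poll() timed out), or -1 on error.
 */
int sketch_recv_polled(int s, void *buf, int len, int timeout_ms)
{
    int done = 0;

    assert(fcntl(s, F_GETFL, 0) & O_NONBLOCK);

    while (done < len)
    {
        ssize_t ret = recv(s, (char *)buf + done, (size_t)(len - done), 0);
        if (ret > 0)
        {
            done += (int)ret;
            continue;
        }
        if (ret == 0)                      /* peer closed the socket */
        {
            errno = EPIPE;
            return -1;
        }
        if (errno == EINTR)
            continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;

        /* Nothing available: sleep until readable, or give up on timeout. */
        struct pollfd pfd = { .fd = s, .events = POLLIN, .revents = 0 };
        int n = poll(&pfd, 1, timeout_ms);
        if (n == 0)
            break;                         /* timed out: report partial progress */
        if (n == -1 && errno != EINTR)
            return -1;
    }
    return done;
}
```

A caller that gets a short count back has to remember how much was received and resume later.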

-- Pete
------------------------------------------------------------------------
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University

Re: [Pvfs2-developers] Re: [Pvfs2-cvs] commit by slang in pvfs2/src/io/bmi/bmi_tcp: bmi-tcp.c sockio.c
