[PATCH/RFC] Handle EFAULT in partial recvmmsg was Re: [PATCH/RFC] Re: recvmmsg() timeout behavior strangeness [RESEND]

2014-05-29 Thread 'Arnaldo Carvalho de Melo'
Em Thu, May 29, 2014 at 11:17:05AM -0300, 'Arnaldo Carvalho de Melo' escreveu:
> Em Thu, May 29, 2014 at 02:06:04PM +, David Laight escreveu:
> > From: 'Arnaldo Carvalho de Melo'
> > ...
> > > > I remember some discussions from an XNET standards meeting (I've 
> > > > forgotten
> > > > exactly which errors on which calls were being discussed).
> > > > My recollection is that you return success with a partial transfer
> > > > count for ANY error that happens after some data has been transferred.
> > > > The actual error will be returned when it happens again on the next
> > > > system call - Note the AGAIN, not a saved error.
 
> > > A saved error, for the right entity, in the recvmmsg case, that
> > > basically is batching multiple recvmsg syscalls, doesn't sound like a
> > > problem, i.e. the idea is to, as much as possible, mimic what multiple
> > > recvmsg calls would do, but reduce its in/out kernel (and inside kernel
> > > subsystems) overhead.
 
> > > Perhaps we can have something in between, i.e. for things like EFAULT,
> > > we should report straight away, effectively dropping whatever datagrams
> > > successfully received in the current batch, do you agree?
  
> > Not unreasonable - EFAULT shouldn't happen unless the application
> > is buggy.
 
> Ok.

So the patch below should handle it, and record that the packets were
dropped, not at the transport level, like UDP_MIB_INERRORS, for
instance, would indicate, but at the batching, recvmmsg level, so
perhaps we'll need a MIB variable for that.

Also a counterpart to the trace_kfree_skb(skb, udp_recvmsg) tracepoint
for dropwatch and similar tools to use, Neil?

I'm keeping this separate from the timeout update patch.

- Arnaldo

diff --git a/net/socket.c b/net/socket.c
index abf56b2a14f9..63491f015912 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -2415,13 +2415,17 @@ out_put:
return datagrams;
 
if (datagrams != 0) {
+   if (err == -EFAULT) {
+   atomic_add(datagrams, >sk->sk_drops);
+   return -EFAULT;
+   }
/*
 * We may return less entries than requested (vlen) if the
 * sock is non block and there aren't enough datagrams...
 */
if (err != -EAGAIN) {
/*
-* ... or  if recvmsg returns an error after we
+* ... or if recvmsg returns a socket error after we
 * received some datagrams, where we record the
 * error to return on the next call or if the
 * app asks about it using getsockopt(SO_ERROR).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH/RFC] Handle EFAULT in partial recvmmsg was Re: [PATCH/RFC] Re: recvmmsg() timeout behavior strangeness [RESEND]

2014-05-29 Thread 'Arnaldo Carvalho de Melo'
Em Thu, May 29, 2014 at 11:17:05AM -0300, 'Arnaldo Carvalho de Melo' escreveu:
 Em Thu, May 29, 2014 at 02:06:04PM +, David Laight escreveu:
  From: 'Arnaldo Carvalho de Melo'
  ...
I remember some discussions from an XNET standards meeting (I've 
forgotten
exactly which errors on which calls were being discussed).
My recollection is that you return success with a partial transfer
count for ANY error that happens after some data has been transferred.
The actual error will be returned when it happens again on the next
system call - Note the AGAIN, not a saved error.
 
   A saved error, for the right entity, in the recvmmsg case, that
   basically is batching multiple recvmsg syscalls, doesn't sound like a
   problem, i.e. the idea is to, as much as possible, mimic what multiple
   recvmsg calls would do, but reduce its in/out kernel (and inside kernel
   subsystems) overhead.
 
   Perhaps we can have something in between, i.e. for things like EFAULT,
   we should report straight away, effectively dropping whatever datagrams
   successfully received in the current batch, do you agree?
  
  Not unreasonable - EFAULT shouldn't happen unless the application
  is buggy.
 
 Ok.

So the patch below should handle it, and record that the packets were
dropped, not at the transport level, like UDP_MIB_INERRORS, for
instance, would indicate, but at the batching, recvmmsg level, so
perhaps we'll need a MIB variable for that.

Also a counterpart to the trace_kfree_skb(skb, udp_recvmsg) tracepoint
for dropwatch and similar tools to use, Neil?

I'm keeping this separate from the timeout update patch.

- Arnaldo

diff --git a/net/socket.c b/net/socket.c
index abf56b2a14f9..63491f015912 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -2415,13 +2415,17 @@ out_put:
return datagrams;
 
if (datagrams != 0) {
+   if (err == -EFAULT) {
+   atomic_add(datagrams, sock-sk-sk_drops);
+   return -EFAULT;
+   }
/*
 * We may return less entries than requested (vlen) if the
 * sock is non block and there aren't enough datagrams...
 */
if (err != -EAGAIN) {
/*
-* ... or  if recvmsg returns an error after we
+* ... or if recvmsg returns a socket error after we
 * received some datagrams, where we record the
 * error to return on the next call or if the
 * app asks about it using getsockopt(SO_ERROR).
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/