Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled

2007-06-27 Thread Oleg Nesterov
On 06/27, Satyam Sharma wrote:

 Thanks for your comments, I'm still not convinced, however.

An perhaps you are right. I don't have a very strong opinion on that.
Still I can't understand why it is better if kthread_stop() sends a
signal as well. Contrary, I believe we should avoid signals when it
comes to kernel threads.

One can always use force_sig() or allow_signal() + send_sig() when
it is really needed, like cifs does.

 On 6/26/07, Oleg Nesterov [EMAIL PROTECTED] wrote:
 
 Personally, I don't think we should do this.
 
 kthread_stop() doesn't always mean kill this thread asap. Suppose that
 CPU_DOWN does kthread_stop(workqueue-thread) but doesn't flush the queue
 before that (we did so before 2.6.22 and perhaps we will do again). Now
 work_struct-func() doing tcp_recvmsg() or wait_event_interruptible()
 fails,
 but this is probably not that we want.

 Anyway, I think _all_ usages of kthread_stop() in the kernel *do* want
 the thread to stop *right then*. After all, kthread_stop() doesn't even
 return (gets blocked on wait_for_completion()) till it knows the target
 kthread *has* exited completely.

Yes, kthread_stop(k) means that k should exit eventually, but I don't
think that kthread_stop() should try to force the exit.

 And if a workqueue is blocked on tcp_recvmsg() or skb_recv_datagram()
 or some such, I don't see how that flush_workqueue (if that is what you
 meant) would succeed anyway (unless you do send the signal too),

timeout, but this was just a silly example. I am talking about the case
when wait_event_interruptible() should not fail (unless something bad
happens) inside the while (!kthread_should_stop()) loop.

Note also that kthread could use TASK_INTERRUPTIBLE sleep because it
doesn't want to contribute to loadavg, and because it knows that all
signals are ignored.

 Note that the exact scenario you're talking about wouldn't mean the
 kthread getting killed before it's supposed to be stopped anyway.

Yes sure, we can't kill the kernel thread via signal. I meant we can have
some unexpected failure.

 (offtopic)
 
 cifs_mount:
 
 send_sig(SIGKILL,srvTcp-tsk,1);
 tsk = srvTcp-tsk;
 if(tsk)
 kthread_stop(tsk);
 
 This if(tsk) looks wrong to me.

 I think it's bogus myself. [ Added [EMAIL PROTECTED] to Cc:
 ]

 Can srvTcp-tsk be NULL? If yes, send_sig()
 is not safe. Can srvTcp-tsk become NULL after send_sig() ? If yes, this
 check is racy, and kthread_stop() is not safe.

 That's again something the atomicity I proposed above could avoid?

I think this if(tsk) is just bogus, and should be killed.

Oleg.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled

2007-06-27 Thread Satyam Sharma

Hi Oleg,

On 6/27/07, Oleg Nesterov [EMAIL PROTECTED] wrote:

On 06/27, Satyam Sharma wrote:

 Thanks for your comments, I'm still not convinced, however.

An perhaps you are right. I don't have a very strong opinion on that.
Still I can't understand why it is better if kthread_stop() sends a
signal as well.
[...]
One can always use force_sig() or allow_signal() + send_sig() when
it is really needed, like cifs does.


The way I look at it, this is about an API being the one with least surprise.
So when one writes a kthread using its API, one would expect
kthread_stop() to dtrt and just work, which it will not, if the kthread has one
of such functions (which are just like any other, after all). Also, imagine
such a function being added to a kthread that didn't have it previously ...
the solution to which (sending a signal before kthread_stop) is un-intuitive.
Hence, why not make it part of the API itself ... It's not like the signal
would do any harm to other kthreads (that don't use such function) either.


Contrary, I believe we should avoid signals when it
comes to kernel threads.


And I agree, but there's quite a subtle difference between signals being
used like they normally are, and this case here. Fact is, there is simply
no other way to break out of certain functions (if there was, I would've
preferred that myself).

In fact, what I'm proposing is that kthreads should *not* be tinkering
with (flushing, handling, dequeueing, whatever) signals at all, because
like you mentioned, if they do that, it's possible that the TIF_SIGPENDING
could get lost.


 Anyway, I think _all_ usages of kthread_stop() in the kernel *do* want
 the thread to stop *right then*. After all, kthread_stop() doesn't even
 return (gets blocked on wait_for_completion()) till it knows the target
 kthread *has* exited completely.

Yes, kthread_stop(k) means that k should exit eventually, but I don't
think that kthread_stop() should try to force the exit.


Well, it's just an additional notice (apart from setting kthread_stop_info)
sent to the target kthread that its time has come ... I'm not sure how a
kthread would have exit forced upon it just by sending it a signal.
Also, note that _not_ using a signal would in fact mean that the kthread
_never_ exits at all (forget asap).


I am talking about the case
when wait_event_interruptible() should not fail (unless something bad
happens) inside the while (!kthread_should_stop()) loop.
Note also that kthread could use TASK_INTERRUPTIBLE sleep
[...] and because it knows that all signals are ignored.


Ok, I think you're saying that if a kthread used wait_event_interruptible
(and was not expecting signals, because it ignores them), then bad
things (say exit in inconsistent state, etc) will happen if we break that
wait_event_interruptible unexpectedly.

First, such an error ought to be handled gracefully by kthread itself
anyway -- anybody who doesn't check the return of _interruptible()
functions is just asking for trouble.

Second, the kthread must expect that the stop notification _could_
have come during that sleep (in fact all good kthreads I've seen do
always put a kthread_should_stop() after all such blocking functions,
and not only in the while loop's condition) -- but even if it doesn't check
explicitly, what do we lose? If the kthread code _after_ the
wait_event_interruptible is written such that it assumes that the wait
condition has become true (as I mentioned above), then that code is
inherently buggy.

And thirdly, what I'm proposing is putting the check for checking the
SIGKILL in kthread_should_stop itself, in /addition/ to the
kthread_stop_info.k == current check. So the kthread will check
should_stop(), and if true, then exiting cleanly -- this is something that
all existing kthreads would do already (if some kthread out there exits
_uncleanly_ even after seeing seeing kthread_should_stop == true,
then it needs fixing anyway).

Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled

2007-06-26 Thread Satyam Sharma

Hi Oleg,

Thanks for your comments, I'm still not convinced, however.

On 6/26/07, Oleg Nesterov [EMAIL PROTECTED] wrote:

On 06/26, Satyam Sharma wrote:

 Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process()
 in kthread_stop() itself?

Personally, I don't think we should do this.

kthread_stop() doesn't always mean kill this thread asap. Suppose that
CPU_DOWN does kthread_stop(workqueue-thread) but doesn't flush the queue
before that (we did so before 2.6.22 and perhaps we will do again). Now
work_struct-func() doing tcp_recvmsg() or wait_event_interruptible() fails,
but this is probably not that we want.


[ Well, first of all, anybody who sends a possibly-blocking-forever
function like tcp_recvmsg() to a *workqueue* needs to get his head
checked. ]

Anyway, I think _all_ usages of kthread_stop() in the kernel *do* want
the thread to stop *right then*. After all, kthread_stop() doesn't even
return (gets blocked on wait_for_completion()) till it knows the target
kthread *has* exited completely.

And if a workqueue is blocked on tcp_recvmsg() or skb_recv_datagram()
or some such, I don't see how that flush_workqueue (if that is what you
meant) would succeed anyway (unless you do send the signal too),
and we'll actually end up having a nice little situation on our hands if
we make the mistake of calling flush_workqueue on such a wq.

Note that the exact scenario you're talking about wouldn't mean the
kthread getting killed before it's supposed to be stopped anyway.
force_sig is not a synchronous wakeup, and also note that tcp_recvmsg()
or skb_recv_datagram() etc will exit (and are supposed to exit) cleanly on
seeing a signal.


 So could we have signals in _addition_ to kthread_stop_info and change
 kthread_should_stop() to check for both:

 kthread_stop_info.k == current  signal_pending(current)

No, this can't work in general. Some kthreads do flush_signals/dequeue_signal,
so TIF_SIGPENDING can be lost anyway.


Yup, I had thought of precisely this issue yesterday as well. The mental note
I made to myself was that the force_sig(SIGKILL) and wake_up_process() in
kthread_stop() must be atomic so that the following race is not possible:

Say:
#1 - thread that invokes kthread_stop()
#2 - kthread to be stopped, (may be) currently in wait_event_interruptible(),
 such that there is a bigger loop over the wait_event_interruptible()
 itself, which puts task back to sleep if this was a spurious wake up
 (if _not_ due to a signal).

Thread #1   Thread #2
=   =

skb_recv_datagram() -
wait_for_packet()
sleeping

...
force_sig(SIGKILL)
scheduled out

wakes up, sees the pending signal,
breaks out of wait_for_packet()
and skb_recv_datagram() back out to
our kthread code itself, but there
we see that kthread_should_stop() is
NOT yet true, we also see this
spurious signal, flush it, and call
skb_recv_datagram() all over again
...
skb_recv_datagram() -
wait_for_packet()
sleeping

scheduled in
kthread_stop() - wake_up_process()

this time we don't even break out of
the skb_recv_datagram() either, as
no signals are pending any more

i.e. thread #2 still does not exit cleanly. The root of the problem is that
functions such as skb_recv_datagram() - wait_for_packet() handle spurious
wakeups *internally* by themselves, so our kthread does not get a chance to
check for kthread_should_stop().

Of course, above race is true only for kthreads that do flush signals on
seeing spurious ones periodically. If it did not, then skb_recv_datagram()
called second time above would again have broken out because of
signal_pending() and we wouldn't have gone back to sleep. But we have to
be on safer side and avoid races *irrespective* of what the kthread might
or might not do, so let's _not_ depend on _assumed kthread behaviour_.

I suspect the above race be avoided by making force_sig() and
wake_up_process() atomic in kthread_stop() itself, please correct me
if I'm horribly wrong.


I personally think Jeff's idea to use force_sig() is right. kthread_create()
doesn't use CLONE_SIGHAND, so it is safe to change -sighand-actionp[].

(offtopic)

cifs_mount:

send_sig(SIGKILL,srvTcp-tsk,1);
tsk = 

Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled

2007-06-25 Thread Satyam Sharma

Hi,

On 6/9/07, Jeff Layton [EMAIL PROTECTED] wrote:

On Sat, 09 Jun 2007 11:30:04 +1000
Herbert Xu [EMAIL PROTECTED] wrote:

 Please cc networking patches to [EMAIL PROTECTED]

 Jeff Layton [EMAIL PROTECTED] wrote:
 
  The following patch is a first stab at removing this need. It makes it
  so that in tcp_recvmsg() we also check kthread_should_stop() at any
  point where we currently check to see if the task was signalled. If
  that returns true, then it acts as if it were signalled and returns to
  the calling function.


I got bit by the same thing when I was implementing a kthread that sleeps
on skb_recv_datagram() (= wait_for_packet) with noblock = 0 over a netlink
socket. I need to use the same SIGKILL hack before kthread_stop() to ensure
the kthread does wake up *and* unblock from skb_recv_datagram() when the
module is being unloaded. Searched hard, but just couldn't find a prettier
solution (if someone knows, please let me know).


 This just doesn't seem to fit.  Why should networking care about kthreads?


I agree.


 Perhaps you can get kthread_stop to send a signal instead?


Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process()
in kthread_stop() itself?

Looking at some happily out-of-date comments in the kthread code, I can
guess that at some point of time (perhaps very early drafts) Rusty actually
*did* implement the whole kthread_stop() functionality using signals, but
I suspect it might've been discarded and the kthread_stop_info approach
used instead to protect from spurious signals from userspace. (?)

So could we have signals in _addition_ to kthread_stop_info and change
kthread_should_stop() to check for both:

kthread_stop_info.k == current  signal_pending(current)

If !kthread_should_stop()  signal_pending(current) = spurious signal,
so just flush and discard (in the kthread).


The problem there is that we still have to make the kthread let signals
through. The nice thing about this approach is that we can make the
kthread ignore signals, but still allow it to break out of kernel_recvmsg
when a kthread_stop is done.


Why is it wrong for kthreads to let signals through? We can ignore out
all signals we're not interested in, and flush the spurious ones ...
otherwise there really isn't much those kthreads can do that get blocked
in such functions, is there?

Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled

2007-06-25 Thread Jeff Layton
On Tue, 26 Jun 2007 01:11:20 +0530
Satyam Sharma [EMAIL PROTECTED] wrote:

 Hi,
 
 On 6/9/07, Jeff Layton [EMAIL PROTECTED] wrote:
  On Sat, 09 Jun 2007 11:30:04 +1000
  Herbert Xu [EMAIL PROTECTED] wrote:
 
   Please cc networking patches to [EMAIL PROTECTED]
  
   Jeff Layton [EMAIL PROTECTED] wrote:
   
The following patch is a first stab at removing this need. It makes it
so that in tcp_recvmsg() we also check kthread_should_stop() at any
point where we currently check to see if the task was signalled. If
that returns true, then it acts as if it were signalled and returns to
the calling function.
 
 I got bit by the same thing when I was implementing a kthread that sleeps
 on skb_recv_datagram() (= wait_for_packet) with noblock = 0 over a netlink
 socket. I need to use the same SIGKILL hack before kthread_stop() to ensure
 the kthread does wake up *and* unblock from skb_recv_datagram() when the
 module is being unloaded. Searched hard, but just couldn't find a prettier
 solution (if someone knows, please let me know).
 
   This just doesn't seem to fit.  Why should networking care about kthreads?
 
 I agree.
 
   Perhaps you can get kthread_stop to send a signal instead?
 
 Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process()
 in kthread_stop() itself?
 
 Looking at some happily out-of-date comments in the kthread code, I can
 guess that at some point of time (perhaps very early drafts) Rusty actually
 *did* implement the whole kthread_stop() functionality using signals, but
 I suspect it might've been discarded and the kthread_stop_info approach
 used instead to protect from spurious signals from userspace. (?)
 
 So could we have signals in _addition_ to kthread_stop_info and change
 kthread_should_stop() to check for both:
 
 kthread_stop_info.k == current  signal_pending(current)
 
 If !kthread_should_stop()  signal_pending(current) = spurious signal,
 so just flush and discard (in the kthread).
 
  The problem there is that we still have to make the kthread let signals
  through. The nice thing about this approach is that we can make the
  kthread ignore signals, but still allow it to break out of kernel_recvmsg
  when a kthread_stop is done.
 
 Why is it wrong for kthreads to let signals through? We can ignore out
 all signals we're not interested in, and flush the spurious ones ...
 otherwise there really isn't much those kthreads can do that get blocked
 in such functions, is there?
 
 Satyam

Yes, after I wrote that I began to question that assumption too. I was
pretty much going on a statement by Christoph Hellwig on an earlier
patch that I did:

-[snip]--
The right way to fix this is to stop sending signals at all and have
a kernel-internal way to get out of kernel_recvmsg.  Uses of signals by
kernel thread generally are bugs.
-[snip]--

Though this makes no sense to me. I don't see any reason why kthreads
can't use signals, and hacking support for breaking out of sleeping
functions seems redundant.

My latest patch for cifsd has it block all signals from userspace
and uses force_sig() instead of send_sig() when trying to stop the
thread. This seems to work pretty well and  still insulates the thread
from userspace signals.

-- 
Jeff Layton [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled

2007-06-09 Thread Jeff Layton
On Sat, 09 Jun 2007 11:30:04 +1000
Herbert Xu [EMAIL PROTECTED] wrote:

 Please cc networking patches to [EMAIL PROTECTED]
 
 Jeff Layton [EMAIL PROTECTED] wrote:
  
  The following patch is a first stab at removing this need. It makes it
  so that in tcp_recvmsg() we also check kthread_should_stop() at any
  point where we currently check to see if the task was signalled. If
  that returns true, then it acts as if it were signalled and returns to
  the calling function.
 
 This just doesn't seem to fit.  Why should networking care about kthreads?
 
 Perhaps you can get kthread_stop to send a signal instead?
 

The problem there is that we still have to make the kthread let signals
through. The nice thing about this approach is that we can make the
kthread ignore signals, but still allow it to break out of kernel_recvmsg
when a kthread_stop is done.

Though I will confess that you have a point about this feeling like a
layering violation...

-- 
Jeff Layton [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled

2007-06-08 Thread Jeff Layton
Already sent this to several lists, but forgot netdev ;-)...

This one's sort of outside my normal area of expertise so sending this
as an RFC to gather feedback on the idea.

Some background:

The cifs_mount() and cifs_umount() functions currently send a signal to
the cifsd kthread prior to calling kthread_stop on it. The reasoning is
apparently that it's likely that cifsd will have called kernel_recvmsg()
and if it doesn't do this there can be a rather long delay when a
filesystem is unmounted.

The following patch is a first stab at removing this need. It makes it
so that in tcp_recvmsg() we also check kthread_should_stop() at any
point where we currently check to see if the task was signalled. If
that returns true, then it acts as if it were signalled.

I've tested this on a fairly recent kernel with a cifs module that
doesn't send signals on unmount and it seems to work as expected. I'm
just not clear on whether it will have any adverse side-effects.

Obviously if this approach is OK then we'll probably also want to fix
up other recvmsg functions (udp_recvmsg, etc).

Anyone care to comment?

Thanks,

Signed-off-by: Jeff Layton [EMAIL PROTECTED]

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index bd4c295..1ad91fa 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -258,6 +258,7 @@
 #include linux/cache.h
 #include linux/err.h
 #include linux/crypto.h
+#include linux/kthread.h
 
 #include net/icmp.h
 #include net/tcp.h
@@ -1154,7 +1155,7 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, 
struct msghdr *msg,
if (tp-urg_data  tp-urg_seq == *seq) {
if (copied)
break;
-   if (signal_pending(current)) {
+   if (signal_pending(current) || kthread_should_stop()) {
copied = timeo ? sock_intr_errno(timeo) : 
-EAGAIN;
break;
}
@@ -1197,6 +1198,7 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, 
struct msghdr *msg,
(sk-sk_shutdown  RCV_SHUTDOWN) ||
!timeo ||
signal_pending(current) ||
+   kthread_should_stop() ||
(flags  MSG_PEEK))
break;
} else {
@@ -1227,7 +1229,7 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, 
struct msghdr *msg,
break;
}
 
-   if (signal_pending(current)) {
+   if (signal_pending(current) || kthread_should_stop()) {
copied = sock_intr_errno(timeo);
break;
}
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled

2007-06-08 Thread Herbert Xu
Please cc networking patches to [EMAIL PROTECTED]

Jeff Layton [EMAIL PROTECTED] wrote:
 
 The following patch is a first stab at removing this need. It makes it
 so that in tcp_recvmsg() we also check kthread_should_stop() at any
 point where we currently check to see if the task was signalled. If
 that returns true, then it acts as if it were signalled and returns to
 the calling function.

This just doesn't seem to fit.  Why should networking care about kthreads?

Perhaps you can get kthread_stop to send a signal instead?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html