Re: [PATCH] task_work: add a scheduling point in task_work_run()
On Wed, Aug 22, 2012 at 01:27:21PM +0800, Michael Wang wrote: > And can we make sure that it is safe to sleep(schedule) at this point? > It may need some totally testing to cover all the situation... task_work callback can bloody well block, so yes, it is safe. Hell, we are doing final close from that; that can lead to any amount of IO, up to and including on-disk file freeing and, in case of vfsmount kept alive by an opened file after we'd done umount -l, actual final unmount of a filesystem. That can more than just block, that can block for a long time if that's a network filesystem... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] task_work: add a scheduling point in task_work_run()
Hi, Eric On 08/21/2012 09:05 PM, Eric Dumazet wrote: > From: Eric Dumazet > > It seems commit 4a9d4b02 (switch fput to task_work_add) reintroduced > the problem addressed in commit 944be0b2 (close_files(): add scheduling > point) > > If a server process with a lot of files (say 2 million tcp sockets) > is killed, we can spend a lot of time in task_work_run() and trigger > a soft lockup. The thread will be rescheduled if we support kernel preempt, so this change may only help the case we haven't enabled CONFIG_PREEMPT, isn't it? What about using ifndef? And can we make sure that it is safe to sleep(schedule) at this point? It may need some totally testing to cover all the situation... Regards, Michael Wang > > Signed-off-by: Eric Dumazet > --- > kernel/task_work.c |1 + > 1 file changed, 1 insertion(+) > > diff --git a/kernel/task_work.c b/kernel/task_work.c > index 91d4e17..d320d44 100644 > --- a/kernel/task_work.c > +++ b/kernel/task_work.c > @@ -75,6 +75,7 @@ void task_work_run(void) > p = q->next; > q->func(q); > q = p; > + cond_resched(); > } > } > } > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] task_work: add a scheduling point in task_work_run()
On Tue, 2012-08-21 at 23:32 +0200, Eric Dumazet wrote: > On Tue, 2012-08-21 at 16:37 -0400, Mimi Zohar wrote: > > > We're here, because fput() called schedule_work() to delay the last > > fput(). The execution needs to take place before the syscall returns to > > userspace. Need to read __schedule()... Do you know if cond_resched() > > can guarantee that it will be executed before the return to userspace? > > Some clarifications : > > - fput() does not call schedule_work() in this case but task_work_add() > > - cond_resched() wont return to userspace. Thanks for the clarification. Mimi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] task_work: add a scheduling point in task_work_run()
On Tue, 2012-08-21 at 16:37 -0400, Mimi Zohar wrote: > We're here, because fput() called schedule_work() to delay the last > fput(). The execution needs to take place before the syscall returns to > userspace. Need to read __schedule()... Do you know if cond_resched() > can guarantee that it will be executed before the return to userspace? Some clarifications : - fput() does not call schedule_work() in this case but task_work_add() - cond_resched() wont return to userspace. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] task_work: add a scheduling point in task_work_run()
On Tue, 2012-08-21 at 15:05 +0200, Eric Dumazet wrote: > From: Eric Dumazet > > It seems commit 4a9d4b02 (switch fput to task_work_add) reintroduced > the problem addressed in commit 944be0b2 (close_files(): add scheduling > point) > > If a server process with a lot of files (say 2 million tcp sockets) > is killed, we can spend a lot of time in task_work_run() and trigger > a soft lockup. > > Signed-off-by: Eric Dumazet > --- > kernel/task_work.c |1 + > 1 file changed, 1 insertion(+) > > diff --git a/kernel/task_work.c b/kernel/task_work.c > index 91d4e17..d320d44 100644 > --- a/kernel/task_work.c > +++ b/kernel/task_work.c > @@ -75,6 +75,7 @@ void task_work_run(void) > p = q->next; > q->func(q); > q = p; > + cond_resched(); > } > } > } We're here, because fput() called schedule_work() to delay the last fput(). The execution needs to take place before the syscall returns to userspace. Need to read __schedule()... Do you know if cond_resched() can guarantee that it will be executed before the return to userspace? thanks, Mimi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] task_work: add a scheduling point in task_work_run()
From: Eric Dumazet It seems commit 4a9d4b02 (switch fput to task_work_add) reintroduced the problem addressed in commit 944be0b2 (close_files(): add scheduling point) If a server process with a lot of files (say 2 million tcp sockets) is killed, we can spend a lot of time in task_work_run() and trigger a soft lockup. Signed-off-by: Eric Dumazet --- kernel/task_work.c |1 + 1 file changed, 1 insertion(+) diff --git a/kernel/task_work.c b/kernel/task_work.c index 91d4e17..d320d44 100644 --- a/kernel/task_work.c +++ b/kernel/task_work.c @@ -75,6 +75,7 @@ void task_work_run(void) p = q->next; q->func(q); q = p; + cond_resched(); } } } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] task_work: add a scheduling point in task_work_run()
On Tue, 2012-08-21 at 16:37 -0400, Mimi Zohar wrote: We're here, because fput() called schedule_work() to delay the last fput(). The execution needs to take place before the syscall returns to userspace. Need to read __schedule()... Do you know if cond_resched() can guarantee that it will be executed before the return to userspace? Some clarifications : - fput() does not call schedule_work() in this case but task_work_add() - cond_resched() wont return to userspace. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] task_work: add a scheduling point in task_work_run()
On Tue, 2012-08-21 at 23:32 +0200, Eric Dumazet wrote: On Tue, 2012-08-21 at 16:37 -0400, Mimi Zohar wrote: We're here, because fput() called schedule_work() to delay the last fput(). The execution needs to take place before the syscall returns to userspace. Need to read __schedule()... Do you know if cond_resched() can guarantee that it will be executed before the return to userspace? Some clarifications : - fput() does not call schedule_work() in this case but task_work_add() - cond_resched() wont return to userspace. Thanks for the clarification. Mimi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] task_work: add a scheduling point in task_work_run()
Hi, Eric On 08/21/2012 09:05 PM, Eric Dumazet wrote: From: Eric Dumazet eduma...@google.com It seems commit 4a9d4b02 (switch fput to task_work_add) reintroduced the problem addressed in commit 944be0b2 (close_files(): add scheduling point) If a server process with a lot of files (say 2 million tcp sockets) is killed, we can spend a lot of time in task_work_run() and trigger a soft lockup. The thread will be rescheduled if we support kernel preempt, so this change may only help the case we haven't enabled CONFIG_PREEMPT, isn't it? What about using ifndef? And can we make sure that it is safe to sleep(schedule) at this point? It may need some totally testing to cover all the situation... Regards, Michael Wang Signed-off-by: Eric Dumazet eduma...@google.com --- kernel/task_work.c |1 + 1 file changed, 1 insertion(+) diff --git a/kernel/task_work.c b/kernel/task_work.c index 91d4e17..d320d44 100644 --- a/kernel/task_work.c +++ b/kernel/task_work.c @@ -75,6 +75,7 @@ void task_work_run(void) p = q-next; q-func(q); q = p; + cond_resched(); } } } -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] task_work: add a scheduling point in task_work_run()
On Wed, Aug 22, 2012 at 01:27:21PM +0800, Michael Wang wrote: And can we make sure that it is safe to sleep(schedule) at this point? It may need some totally testing to cover all the situation... task_work callback can bloody well block, so yes, it is safe. Hell, we are doing final close from that; that can lead to any amount of IO, up to and including on-disk file freeing and, in case of vfsmount kept alive by an opened file after we'd done umount -l, actual final unmount of a filesystem. That can more than just block, that can block for a long time if that's a network filesystem... -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] task_work: add a scheduling point in task_work_run()
From: Eric Dumazet eduma...@google.com It seems commit 4a9d4b02 (switch fput to task_work_add) reintroduced the problem addressed in commit 944be0b2 (close_files(): add scheduling point) If a server process with a lot of files (say 2 million tcp sockets) is killed, we can spend a lot of time in task_work_run() and trigger a soft lockup. Signed-off-by: Eric Dumazet eduma...@google.com --- kernel/task_work.c |1 + 1 file changed, 1 insertion(+) diff --git a/kernel/task_work.c b/kernel/task_work.c index 91d4e17..d320d44 100644 --- a/kernel/task_work.c +++ b/kernel/task_work.c @@ -75,6 +75,7 @@ void task_work_run(void) p = q-next; q-func(q); q = p; + cond_resched(); } } } -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] task_work: add a scheduling point in task_work_run()
On Tue, 2012-08-21 at 15:05 +0200, Eric Dumazet wrote: From: Eric Dumazet eduma...@google.com It seems commit 4a9d4b02 (switch fput to task_work_add) reintroduced the problem addressed in commit 944be0b2 (close_files(): add scheduling point) If a server process with a lot of files (say 2 million tcp sockets) is killed, we can spend a lot of time in task_work_run() and trigger a soft lockup. Signed-off-by: Eric Dumazet eduma...@google.com --- kernel/task_work.c |1 + 1 file changed, 1 insertion(+) diff --git a/kernel/task_work.c b/kernel/task_work.c index 91d4e17..d320d44 100644 --- a/kernel/task_work.c +++ b/kernel/task_work.c @@ -75,6 +75,7 @@ void task_work_run(void) p = q-next; q-func(q); q = p; + cond_resched(); } } } We're here, because fput() called schedule_work() to delay the last fput(). The execution needs to take place before the syscall returns to userspace. Need to read __schedule()... Do you know if cond_resched() can guarantee that it will be executed before the return to userspace? thanks, Mimi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/