Re: [PATCH] task_work: add a scheduling point in task_work_run()

2012-08-21 Thread Al Viro
On Wed, Aug 22, 2012 at 01:27:21PM +0800, Michael Wang wrote:

> And can we make sure that it is safe to sleep (schedule) at this point?
> It may need some thorough testing to cover all the situations...

task_work callback can bloody well block, so yes, it is safe.  Hell,
we are doing final close from that; that can lead to any amount of
IO, up to and including on-disk file freeing and, in case of vfsmount
kept alive by an opened file after we'd done umount -l, actual final
unmount of a filesystem.  That can more than just block, that can block
for a long time if that's a network filesystem...
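
For readers skimming the thread, here is a heavily simplified sketch of the
final-close path described above; it paraphrases the 2012-era VFS code from
memory rather than quoting it, so the structure and omissions are
approximate:

/* Sketch: the callback queued by fput() eventually reaches __fput(),
 * and each of these steps may block. */
static void __fput(struct file *file)
{
	struct dentry *dentry = file->f_path.dentry;
	struct vfsmount *mnt = file->f_path.mnt;
	struct inode *inode = dentry->d_inode;

	if (file->f_op && file->f_op->release)
		file->f_op->release(inode, file);	/* fs/driver specific, may do IO */

	/* fasync, file locks, security hooks etc. omitted */

	dput(dentry);	/* may drop the last reference -> on-disk freeing */
	mntput(mnt);	/* may be the final unmount after umount -l */
}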


Re: [PATCH] task_work: add a scheduling point in task_work_run()

2012-08-21 Thread Michael Wang
Hi, Eric

On 08/21/2012 09:05 PM, Eric Dumazet wrote:
> From: Eric Dumazet 
> 
> It seems commit 4a9d4b02 (switch fput to task_work_add) reintroduced
> the problem addressed in commit 944be0b2 (close_files(): add scheduling
> point)
> 
> If a server process with a lot of files (say 2 million tcp sockets)
> is killed, we can spend a lot of time in task_work_run() and trigger
> a soft lockup.

The thread will be rescheduled anyway if kernel preemption is enabled, so
this change may only help the case where CONFIG_PREEMPT is not enabled,
won't it? What about using an ifndef (roughly as sketched below)?
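
Just to make the question concrete, something along these lines is meant
(illustration only, not a tested patch):

while (q) {
	p = q->next;
	q->func(q);
	q = p;
#ifndef CONFIG_PREEMPT
	cond_resched();
#endif
}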

And can we make sure that it is safe to sleep (schedule) at this point?
It may need some thorough testing to cover all the situations...

Regards,
Michael Wang

> 
> Signed-off-by: Eric Dumazet 
> ---
>  kernel/task_work.c |1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/task_work.c b/kernel/task_work.c
> index 91d4e17..d320d44 100644
> --- a/kernel/task_work.c
> +++ b/kernel/task_work.c
> @@ -75,6 +75,7 @@ void task_work_run(void)
>  			p = q->next;
>  			q->func(q);
>  			q = p;
> +			cond_resched();
>  		}
>  	}
>  }


Re: [PATCH] task_work: add a scheduling point in task_work_run()

2012-08-21 Thread Mimi Zohar
On Tue, 2012-08-21 at 23:32 +0200, Eric Dumazet wrote:
> On Tue, 2012-08-21 at 16:37 -0400, Mimi Zohar wrote:
> 
> > We're here, because fput() called schedule_work() to delay the last
> > fput().  The execution needs to take place before the syscall returns to
> > userspace.  Need to read __schedule()...  Do you know if cond_resched()
> > can guarantee that it will be executed before the return to userspace? 
> 
> Some clarifications:
> 
> - fput() does not call schedule_work() in this case but task_work_add()
> 
> - cond_resched() won't return to userspace.

Thanks for the clarification.

Mimi



Re: [PATCH] task_work: add a scheduling point in task_work_run()

2012-08-21 Thread Eric Dumazet
On Tue, 2012-08-21 at 16:37 -0400, Mimi Zohar wrote:

> We're here, because fput() called schedule_work() to delay the last
> fput().  The execution needs to take place before the syscall returns to
> userspace.  Need to read __schedule()...  Do you know if cond_resched()
> can guarantee that it will be executed before the return to userspace? 

Some clarifications:

- fput() does not call schedule_work() in this case but task_work_add()
  (rough sketch below)

- cond_resched() won't return to userspace.
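
To make the first point concrete, the dispatch in fput() after commit
4a9d4b02 looks roughly like this; it is a sketch from memory, so field
names are approximate and queue_delayed_fput() is only a placeholder name
for the workqueue fallback, not the real helper:

void fput(struct file *file)
{
	if (atomic_long_dec_and_test(&file->f_count)) {
		struct task_struct *task = current;

		if (!in_interrupt() && !(task->flags & PF_KTHREAD)) {
			/*
			 * Normal process context: queue the final __fput()
			 * (via the ____fput task_work callback) so it runs
			 * from task_work_run() before the syscall returns
			 * to userspace, or from do_exit().
			 */
			init_task_work(&file->f_u.fu_rcuhead, ____fput);
			task_work_add(task, &file->f_u.fu_rcuhead, true);
			return;
		}
		/*
		 * Interrupt or kernel-thread context: no suitable task to
		 * piggy-back on, so fall back to schedule_work() and a
		 * workqueue-based delayed fput.
		 */
		queue_delayed_fput(file);
	}
}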




Re: [PATCH] task_work: add a scheduling point in task_work_run()

2012-08-21 Thread Mimi Zohar
On Tue, 2012-08-21 at 15:05 +0200, Eric Dumazet wrote:
> From: Eric Dumazet 
> 
> It seems commit 4a9d4b02 (switch fput to task_work_add) reintroduced
> the problem addressed in commit 944be0b2 (close_files(): add scheduling
> point)
> 
> If a server process with a lot of files (say 2 million tcp sockets)
> is killed, we can spend a lot of time in task_work_run() and trigger
> a soft lockup.
> 
> Signed-off-by: Eric Dumazet 
> ---
>  kernel/task_work.c |1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/task_work.c b/kernel/task_work.c
> index 91d4e17..d320d44 100644
> --- a/kernel/task_work.c
> +++ b/kernel/task_work.c
> @@ -75,6 +75,7 @@ void task_work_run(void)
>  			p = q->next;
>  			q->func(q);
>  			q = p;
> +			cond_resched();
>  		}
>  	}
>  }

We're here, because fput() called schedule_work() to delay the last
fput().  The execution needs to take place before the syscall returns to
userspace.  Need to read __schedule()...  Do you know if cond_resched()
can guarantee that it will be executed before the return to userspace? 

thanks,

Mimi



[PATCH] task_work: add a scheduling point in task_work_run()

2012-08-21 Thread Eric Dumazet
From: Eric Dumazet 

It seems commit 4a9d4b02 (switch fput to task_work_add) reintroduced
the problem addressed in commit 944be0b2 (close_files(): add scheduling
point)

If a server process with a lot of files (say 2 million tcp sockets)
is killed, we can spend a lot of time in task_work_run() and trigger
a soft lockup.

Signed-off-by: Eric Dumazet 
---
 kernel/task_work.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/task_work.c b/kernel/task_work.c
index 91d4e17..d320d44 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -75,6 +75,7 @@ void task_work_run(void)
 			p = q->next;
 			q->func(q);
 			q = p;
+			cond_resched();
 		}
 	}
 }
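
For context on where this loop runs: task_work_run() is drained both on the
way back to userspace and from do_exit() when the task is killed, and it has
no scheduling point of its own, so a couple of million queued final fput()s
can keep the CPU busy long enough to trip the soft-lockup watchdog. Roughly
(call sites simplified and paraphrased rather than quoted):

void do_exit(long code)
{
	/* ... */
	exit_task_work(current);	/* -> task_work_run(): one final
					 * __fput() per still-open file,
					 * back to back */
	/* ... */
}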



