Re: [PATCH] TASK_KILLABLE version 2

2007-12-06 Thread Matthew Wilcox
On Mon, Sep 24, 2007 at 04:16:48PM -0400, Bob Bell wrote:
> I've been testing this patch on my systems.  It's working for me when
> I read() a file.  Asynchronous write()s seem okay, too.  However,
> synchronous writes (caused by either calling fsync() or fcntl() to
> release a lock) prevent the process from being killed when the NFS
> server goes down.
> 
> When the process is sent SIGKILL, it's waiting with the following call
> tree:
>do_fsync
>nfs_fsync
>nfs_wb_all
>nfs_sync_mapping_wait
>nfs_wait_on_requests_locks (I believe)
>nfs_wait_on_request
>out_of_line_wait_on_bit
>__wait_on_bit
>nfs_wait_bit_interruptible
>schedule
> 
> When the process is later viewed after being deemed "stuck", it's
> waiting with the following call tree:
>do_fsync
>filemap_fdatawait
>wait_on_page_writeback_range
>wait_on_page_writeback
>wait_on_page_bit
>__wait_on_bit
>sync_page
>io_schedule
>schedule
> 
> If I hazard a guess as to what might be wrong here, I believe that when
> the processes catches SIGKILL, nfs_wait_bit_interruptible is returning
> -ERESTARTSYS.  That error bubbles back up to nfs_fsync.  However,
> nfs_fsync returns ctx->error, not -ERESTARTSYS, and ctx->error is 0.
> do_fsync proceeds to call filemap_fdatawait.  I question whether 
> nfs_sync should return an error, and if do_fsync should skip 
> filemap_fdatawait if the fsync op returned an error.
> 
> I did try replacing the call to sync_page in __wait_on_bit with 
> sync_page_killable and replacing TASK_UNINTERRUPTIBLE with 
> TASK_KILLABLE.  That seemed to work once, but then really screwed things 
> up on subsequent attempts.

Hi Bob,

The patch I posted earlier today (a somewhat cleaned-up version of a
patch originally from Liam Howlett) converts NFS to use TASK_KILLABLE.
Could you have another shot at breaking it?  You can find it here:

http://marc.info/?l=linux-fsdevel=119698206729969=2

(It has some prerequisites that are in the git tree, so probably the
easiest thing is just to pull that git tree).

Liam notes that there's still problems with programs that call *stat*(),
but I haven't looked into that issue yet.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] TASK_KILLABLE version 2

2007-12-06 Thread Matthew Wilcox
On Mon, Sep 24, 2007 at 04:16:48PM -0400, Bob Bell wrote:
 I've been testing this patch on my systems.  It's working for me when
 I read() a file.  Asynchronous write()s seem okay, too.  However,
 synchronous writes (caused by either calling fsync() or fcntl() to
 release a lock) prevent the process from being killed when the NFS
 server goes down.
 
 When the process is sent SIGKILL, it's waiting with the following call
 tree:
do_fsync
nfs_fsync
nfs_wb_all
nfs_sync_mapping_wait
nfs_wait_on_requests_locks (I believe)
nfs_wait_on_request
out_of_line_wait_on_bit
__wait_on_bit
nfs_wait_bit_interruptible
schedule
 
 When the process is later viewed after being deemed stuck, it's
 waiting with the following call tree:
do_fsync
filemap_fdatawait
wait_on_page_writeback_range
wait_on_page_writeback
wait_on_page_bit
__wait_on_bit
sync_page
io_schedule
schedule
 
 If I hazard a guess as to what might be wrong here, I believe that when
 the processes catches SIGKILL, nfs_wait_bit_interruptible is returning
 -ERESTARTSYS.  That error bubbles back up to nfs_fsync.  However,
 nfs_fsync returns ctx-error, not -ERESTARTSYS, and ctx-error is 0.
 do_fsync proceeds to call filemap_fdatawait.  I question whether 
 nfs_sync should return an error, and if do_fsync should skip 
 filemap_fdatawait if the fsync op returned an error.
 
 I did try replacing the call to sync_page in __wait_on_bit with 
 sync_page_killable and replacing TASK_UNINTERRUPTIBLE with 
 TASK_KILLABLE.  That seemed to work once, but then really screwed things 
 up on subsequent attempts.

Hi Bob,

The patch I posted earlier today (a somewhat cleaned-up version of a
patch originally from Liam Howlett) converts NFS to use TASK_KILLABLE.
Could you have another shot at breaking it?  You can find it here:

http://marc.info/?l=linux-fsdevelm=119698206729969w=2

(It has some prerequisites that are in the git tree, so probably the
easiest thing is just to pull that git tree).

Liam notes that there's still problems with programs that call *stat*(),
but I haven't looked into that issue yet.

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] TASK_KILLABLE version 2

2007-09-28 Thread Nick Piggin
On Wednesday 26 September 2007 21:57, Ric Wheeler wrote:
> Bob Bell wrote:
> > On Sat, Sep 01, 2007 at 08:43:49PM -0600, Matthew Wilcox wrote:
> >> Here's the second version of TASK_KILLABLE.  A few changes since
> >> version 1:
> >
> > 
> >
> >> I obviously haven't covered every place that can result in a process
> >> sleeping uninterruptibly while attempting an operation.  But sync_page
> >> (patch 4/5) covers about 90% of the times I've attempted to kill cat,
> >> and I hope that by providing the two examples, I can help other people
> >> to fix the cases that they find interesting.
> >
> > I've been testing this patch on my systems.  It's working for me when
> > I read() a file.  Asynchronous write()s seem okay, too.  However,
> > synchronous writes (caused by either calling fsync() or fcntl() to
> > release a lock) prevent the process from being killed when the NFS
> > server goes down.
>
> After hearing again last month about how few people actually read every
> lkml thread, I  wanted to point you all at this thread explicitly since
> it seems that we are getting somewhat close to having a forced unmount
> that actually is usable by real applications, something that we seem to
> have been talking about for many years ;-)
>
> With Matthew's original TASK_KILLABLE patch, we have a solution for a
> task read, but still have some holes (fsync & fcntl, others?) that need
> fixed as well for NFS clients.
>
> Is this patch going in the right direction?

FWIW, I do think it seems like a good idea to work towards better
interruptibility in various potentially long-blocking paths like these.
I think Andrea's recent work to solve some oom killer deadlocks
probably has some requirements in common with this patch.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] TASK_KILLABLE version 2

2007-09-28 Thread Nick Piggin
On Wednesday 26 September 2007 21:57, Ric Wheeler wrote:
 Bob Bell wrote:
  On Sat, Sep 01, 2007 at 08:43:49PM -0600, Matthew Wilcox wrote:
  Here's the second version of TASK_KILLABLE.  A few changes since
  version 1:
 
  snip
 
  I obviously haven't covered every place that can result in a process
  sleeping uninterruptibly while attempting an operation.  But sync_page
  (patch 4/5) covers about 90% of the times I've attempted to kill cat,
  and I hope that by providing the two examples, I can help other people
  to fix the cases that they find interesting.
 
  I've been testing this patch on my systems.  It's working for me when
  I read() a file.  Asynchronous write()s seem okay, too.  However,
  synchronous writes (caused by either calling fsync() or fcntl() to
  release a lock) prevent the process from being killed when the NFS
  server goes down.

 After hearing again last month about how few people actually read every
 lkml thread, I  wanted to point you all at this thread explicitly since
 it seems that we are getting somewhat close to having a forced unmount
 that actually is usable by real applications, something that we seem to
 have been talking about for many years ;-)

 With Matthew's original TASK_KILLABLE patch, we have a solution for a
 task read, but still have some holes (fsync  fcntl, others?) that need
 fixed as well for NFS clients.

 Is this patch going in the right direction?

FWIW, I do think it seems like a good idea to work towards better
interruptibility in various potentially long-blocking paths like these.
I think Andrea's recent work to solve some oom killer deadlocks
probably has some requirements in common with this patch.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] TASK_KILLABLE version 2

2007-09-26 Thread Ric Wheeler

Bob Bell wrote:

On Sat, Sep 01, 2007 at 08:43:49PM -0600, Matthew Wilcox wrote:
Here's the second version of TASK_KILLABLE.  A few changes since 
version 1:



I obviously haven't covered every place that can result in a process
sleeping uninterruptibly while attempting an operation.  But sync_page
(patch 4/5) covers about 90% of the times I've attempted to kill cat,
and I hope that by providing the two examples, I can help other people
to fix the cases that they find interesting.


I've been testing this patch on my systems.  It's working for me when
I read() a file.  Asynchronous write()s seem okay, too.  However,
synchronous writes (caused by either calling fsync() or fcntl() to
release a lock) prevent the process from being killed when the NFS
server goes down.


After hearing again last month about how few people actually read every 
lkml thread, I  wanted to point you all at this thread explicitly since 
it seems that we are getting somewhat close to having a forced unmount 
that actually is usable by real applications, something that we seem to 
have been talking about for many years ;-)


With Matthew's original TASK_KILLABLE patch, we have a solution for a 
task read, but still have some holes (fsync & fcntl, others?) that need 
fixed as well for NFS clients.


Is this patch going in the right direction?

ric



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] TASK_KILLABLE version 2

2007-09-26 Thread Ric Wheeler

Bob Bell wrote:

On Sat, Sep 01, 2007 at 08:43:49PM -0600, Matthew Wilcox wrote:
Here's the second version of TASK_KILLABLE.  A few changes since 
version 1:

snip

I obviously haven't covered every place that can result in a process
sleeping uninterruptibly while attempting an operation.  But sync_page
(patch 4/5) covers about 90% of the times I've attempted to kill cat,
and I hope that by providing the two examples, I can help other people
to fix the cases that they find interesting.


I've been testing this patch on my systems.  It's working for me when
I read() a file.  Asynchronous write()s seem okay, too.  However,
synchronous writes (caused by either calling fsync() or fcntl() to
release a lock) prevent the process from being killed when the NFS
server goes down.


After hearing again last month about how few people actually read every 
lkml thread, I  wanted to point you all at this thread explicitly since 
it seems that we are getting somewhat close to having a forced unmount 
that actually is usable by real applications, something that we seem to 
have been talking about for many years ;-)


With Matthew's original TASK_KILLABLE patch, we have a solution for a 
task read, but still have some holes (fsync  fcntl, others?) that need 
fixed as well for NFS clients.


Is this patch going in the right direction?

ric



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] TASK_KILLABLE version 2

2007-09-24 Thread Bob Bell

On Sat, Sep 01, 2007 at 08:43:49PM -0600, Matthew Wilcox wrote:

Here's the second version of TASK_KILLABLE.  A few changes since version 1:



I obviously haven't covered every place that can result in a process
sleeping uninterruptibly while attempting an operation.  But sync_page
(patch 4/5) covers about 90% of the times I've attempted to kill cat,
and I hope that by providing the two examples, I can help other people
to fix the cases that they find interesting.


I've been testing this patch on my systems.  It's working for me when
I read() a file.  Asynchronous write()s seem okay, too.  However,
synchronous writes (caused by either calling fsync() or fcntl() to
release a lock) prevent the process from being killed when the NFS
server goes down.

When the process is sent SIGKILL, it's waiting with the following call
tree:
   do_fsync
   nfs_fsync
   nfs_wb_all
   nfs_sync_mapping_wait
   nfs_wait_on_requests_locks (I believe)
   nfs_wait_on_request
   out_of_line_wait_on_bit
   __wait_on_bit
   nfs_wait_bit_interruptible
   schedule

When the process is later viewed after being deemed "stuck", it's
waiting with the following call tree:
   do_fsync
   filemap_fdatawait
   wait_on_page_writeback_range
   wait_on_page_writeback
   wait_on_page_bit
   __wait_on_bit
   sync_page
   io_schedule
   schedule

If I hazard a guess as to what might be wrong here, I believe that when
the processes catches SIGKILL, nfs_wait_bit_interruptible is returning
-ERESTARTSYS.  That error bubbles back up to nfs_fsync.  However,
nfs_fsync returns ctx->error, not -ERESTARTSYS, and ctx->error is 0.
do_fsync proceeds to call filemap_fdatawait.  I question whether 
nfs_sync should return an error, and if do_fsync should skip 
filemap_fdatawait if the fsync op returned an error.


I did try replacing the call to sync_page in __wait_on_bit with 
sync_page_killable and replacing TASK_UNINTERRUPTIBLE with 
TASK_KILLABLE.  That seemed to work once, but then really screwed things 
up on subsequent attempts.


--
Bob Bell
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] TASK_KILLABLE version 2

2007-09-24 Thread Bob Bell

On Sat, Sep 01, 2007 at 08:43:49PM -0600, Matthew Wilcox wrote:

Here's the second version of TASK_KILLABLE.  A few changes since version 1:

snip

I obviously haven't covered every place that can result in a process
sleeping uninterruptibly while attempting an operation.  But sync_page
(patch 4/5) covers about 90% of the times I've attempted to kill cat,
and I hope that by providing the two examples, I can help other people
to fix the cases that they find interesting.


I've been testing this patch on my systems.  It's working for me when
I read() a file.  Asynchronous write()s seem okay, too.  However,
synchronous writes (caused by either calling fsync() or fcntl() to
release a lock) prevent the process from being killed when the NFS
server goes down.

When the process is sent SIGKILL, it's waiting with the following call
tree:
   do_fsync
   nfs_fsync
   nfs_wb_all
   nfs_sync_mapping_wait
   nfs_wait_on_requests_locks (I believe)
   nfs_wait_on_request
   out_of_line_wait_on_bit
   __wait_on_bit
   nfs_wait_bit_interruptible
   schedule

When the process is later viewed after being deemed stuck, it's
waiting with the following call tree:
   do_fsync
   filemap_fdatawait
   wait_on_page_writeback_range
   wait_on_page_writeback
   wait_on_page_bit
   __wait_on_bit
   sync_page
   io_schedule
   schedule

If I hazard a guess as to what might be wrong here, I believe that when
the processes catches SIGKILL, nfs_wait_bit_interruptible is returning
-ERESTARTSYS.  That error bubbles back up to nfs_fsync.  However,
nfs_fsync returns ctx-error, not -ERESTARTSYS, and ctx-error is 0.
do_fsync proceeds to call filemap_fdatawait.  I question whether 
nfs_sync should return an error, and if do_fsync should skip 
filemap_fdatawait if the fsync op returned an error.


I did try replacing the call to sync_page in __wait_on_bit with 
sync_page_killable and replacing TASK_UNINTERRUPTIBLE with 
TASK_KILLABLE.  That seemed to work once, but then really screwed things 
up on subsequent attempts.


--
Bob Bell
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] TASK_KILLABLE version 2

2007-09-01 Thread Matthew Wilcox

Here's the second version of TASK_KILLABLE.  A few changes since version 1:

 - Don't split up TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE.
   TASK_WAKESIGNAL and TASK_LOADAVG were pretty much equivalent, and since
   we had to keep __TASK_{UN,}INTERRUPTIBLE anyway, splitting them made
   little sense.
 - Instead, I've added some is_task_*() predicates.  I think they
   achieve what Linus wanted without consuming flag bits and without
   checking a metric tonne of code to see whether it wanted
   TASK_UNINTERRUPTIBLE or __TASK_UNINTERRUPTIBLE.
 - Renamed try_lock_page() to lock_page_killable() and change the sense.
   I had envisioned uses being like spin_trylock(), but that turned out
   not to make sense.
 - Fix the bug Trond noticed in wait_on_retry_sync_kiocb() by having the
   callers handle it returning -EINTR
 - Fix the bug I noticed in sync_pages_killable() by adding the new
   function fatal_signal_pending()
 - Audit a lot of uses of TASK_*.  A few seemed dubious and were fixed.

I'd like to see this spend a bit of time in the -mm tree.  I've wasted
a couple of days trying to track down why my machine no longer does
much after starting init, so clearly I'm tampering with stuff I don't
quite understand.

Having said that, I ditched that version of the code and started again
with a much more minimalist approach.  I've also split the patch into
five independent pieces, each of which accomplishes a logical step,
for ease of bisection.

I think patch 1/5 is clearly Right and should be merged ASAP.  Patch 2/5
and 3/5 carry some risk.  4/5 and 5/5 seem less risky to me (and are
independent of each other; both rely on 1-3 being applied first)

I obviously haven't covered every place that can result in a process
sleeping uninterruptibly while attempting an operation.  But sync_page
(patch 4/5) covers about 90% of the times I've attempted to kill cat,
and I hope that by providing the two examples, I can help other people
to fix the cases that they find interesting.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] TASK_KILLABLE version 2

2007-09-01 Thread Matthew Wilcox

Here's the second version of TASK_KILLABLE.  A few changes since version 1:

 - Don't split up TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE.
   TASK_WAKESIGNAL and TASK_LOADAVG were pretty much equivalent, and since
   we had to keep __TASK_{UN,}INTERRUPTIBLE anyway, splitting them made
   little sense.
 - Instead, I've added some is_task_*() predicates.  I think they
   achieve what Linus wanted without consuming flag bits and without
   checking a metric tonne of code to see whether it wanted
   TASK_UNINTERRUPTIBLE or __TASK_UNINTERRUPTIBLE.
 - Renamed try_lock_page() to lock_page_killable() and change the sense.
   I had envisioned uses being like spin_trylock(), but that turned out
   not to make sense.
 - Fix the bug Trond noticed in wait_on_retry_sync_kiocb() by having the
   callers handle it returning -EINTR
 - Fix the bug I noticed in sync_pages_killable() by adding the new
   function fatal_signal_pending()
 - Audit a lot of uses of TASK_*.  A few seemed dubious and were fixed.

I'd like to see this spend a bit of time in the -mm tree.  I've wasted
a couple of days trying to track down why my machine no longer does
much after starting init, so clearly I'm tampering with stuff I don't
quite understand.

Having said that, I ditched that version of the code and started again
with a much more minimalist approach.  I've also split the patch into
five independent pieces, each of which accomplishes a logical step,
for ease of bisection.

I think patch 1/5 is clearly Right and should be merged ASAP.  Patch 2/5
and 3/5 carry some risk.  4/5 and 5/5 seem less risky to me (and are
independent of each other; both rely on 1-3 being applied first)

I obviously haven't covered every place that can result in a process
sleeping uninterruptibly while attempting an operation.  But sync_page
(patch 4/5) covers about 90% of the times I've attempted to kill cat,
and I hope that by providing the two examples, I can help other people
to fix the cases that they find interesting.

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/