Re: [PATCH] TASK_KILLABLE version 2
On Mon, Sep 24, 2007 at 04:16:48PM -0400, Bob Bell wrote: > I've been testing this patch on my systems. It's working for me when > I read() a file. Asynchronous write()s seem okay, too. However, > synchronous writes (caused by either calling fsync() or fcntl() to > release a lock) prevent the process from being killed when the NFS > server goes down. > > When the process is sent SIGKILL, it's waiting with the following call > tree: >do_fsync >nfs_fsync >nfs_wb_all >nfs_sync_mapping_wait >nfs_wait_on_requests_locks (I believe) >nfs_wait_on_request >out_of_line_wait_on_bit >__wait_on_bit >nfs_wait_bit_interruptible >schedule > > When the process is later viewed after being deemed "stuck", it's > waiting with the following call tree: >do_fsync >filemap_fdatawait >wait_on_page_writeback_range >wait_on_page_writeback >wait_on_page_bit >__wait_on_bit >sync_page >io_schedule >schedule > > If I hazard a guess as to what might be wrong here, I believe that when > the processes catches SIGKILL, nfs_wait_bit_interruptible is returning > -ERESTARTSYS. That error bubbles back up to nfs_fsync. However, > nfs_fsync returns ctx->error, not -ERESTARTSYS, and ctx->error is 0. > do_fsync proceeds to call filemap_fdatawait. I question whether > nfs_sync should return an error, and if do_fsync should skip > filemap_fdatawait if the fsync op returned an error. > > I did try replacing the call to sync_page in __wait_on_bit with > sync_page_killable and replacing TASK_UNINTERRUPTIBLE with > TASK_KILLABLE. That seemed to work once, but then really screwed things > up on subsequent attempts. Hi Bob, The patch I posted earlier today (a somewhat cleaned-up version of a patch originally from Liam Howlett) converts NFS to use TASK_KILLABLE. Could you have another shot at breaking it? You can find it here: http://marc.info/?l=linux-fsdevel=119698206729969=2 (It has some prerequisites that are in the git tree, so probably the easiest thing is just to pull that git tree). Liam notes that there's still problems with programs that call *stat*(), but I haven't looked into that issue yet. -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] TASK_KILLABLE version 2
On Mon, Sep 24, 2007 at 04:16:48PM -0400, Bob Bell wrote: I've been testing this patch on my systems. It's working for me when I read() a file. Asynchronous write()s seem okay, too. However, synchronous writes (caused by either calling fsync() or fcntl() to release a lock) prevent the process from being killed when the NFS server goes down. When the process is sent SIGKILL, it's waiting with the following call tree: do_fsync nfs_fsync nfs_wb_all nfs_sync_mapping_wait nfs_wait_on_requests_locks (I believe) nfs_wait_on_request out_of_line_wait_on_bit __wait_on_bit nfs_wait_bit_interruptible schedule When the process is later viewed after being deemed stuck, it's waiting with the following call tree: do_fsync filemap_fdatawait wait_on_page_writeback_range wait_on_page_writeback wait_on_page_bit __wait_on_bit sync_page io_schedule schedule If I hazard a guess as to what might be wrong here, I believe that when the processes catches SIGKILL, nfs_wait_bit_interruptible is returning -ERESTARTSYS. That error bubbles back up to nfs_fsync. However, nfs_fsync returns ctx-error, not -ERESTARTSYS, and ctx-error is 0. do_fsync proceeds to call filemap_fdatawait. I question whether nfs_sync should return an error, and if do_fsync should skip filemap_fdatawait if the fsync op returned an error. I did try replacing the call to sync_page in __wait_on_bit with sync_page_killable and replacing TASK_UNINTERRUPTIBLE with TASK_KILLABLE. That seemed to work once, but then really screwed things up on subsequent attempts. Hi Bob, The patch I posted earlier today (a somewhat cleaned-up version of a patch originally from Liam Howlett) converts NFS to use TASK_KILLABLE. Could you have another shot at breaking it? You can find it here: http://marc.info/?l=linux-fsdevelm=119698206729969w=2 (It has some prerequisites that are in the git tree, so probably the easiest thing is just to pull that git tree). Liam notes that there's still problems with programs that call *stat*(), but I haven't looked into that issue yet. -- Intel are signing my paycheques ... these opinions are still mine Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] TASK_KILLABLE version 2
On Wednesday 26 September 2007 21:57, Ric Wheeler wrote: > Bob Bell wrote: > > On Sat, Sep 01, 2007 at 08:43:49PM -0600, Matthew Wilcox wrote: > >> Here's the second version of TASK_KILLABLE. A few changes since > >> version 1: > > > > > > > >> I obviously haven't covered every place that can result in a process > >> sleeping uninterruptibly while attempting an operation. But sync_page > >> (patch 4/5) covers about 90% of the times I've attempted to kill cat, > >> and I hope that by providing the two examples, I can help other people > >> to fix the cases that they find interesting. > > > > I've been testing this patch on my systems. It's working for me when > > I read() a file. Asynchronous write()s seem okay, too. However, > > synchronous writes (caused by either calling fsync() or fcntl() to > > release a lock) prevent the process from being killed when the NFS > > server goes down. > > After hearing again last month about how few people actually read every > lkml thread, I wanted to point you all at this thread explicitly since > it seems that we are getting somewhat close to having a forced unmount > that actually is usable by real applications, something that we seem to > have been talking about for many years ;-) > > With Matthew's original TASK_KILLABLE patch, we have a solution for a > task read, but still have some holes (fsync & fcntl, others?) that need > fixed as well for NFS clients. > > Is this patch going in the right direction? FWIW, I do think it seems like a good idea to work towards better interruptibility in various potentially long-blocking paths like these. I think Andrea's recent work to solve some oom killer deadlocks probably has some requirements in common with this patch. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] TASK_KILLABLE version 2
On Wednesday 26 September 2007 21:57, Ric Wheeler wrote: Bob Bell wrote: On Sat, Sep 01, 2007 at 08:43:49PM -0600, Matthew Wilcox wrote: Here's the second version of TASK_KILLABLE. A few changes since version 1: snip I obviously haven't covered every place that can result in a process sleeping uninterruptibly while attempting an operation. But sync_page (patch 4/5) covers about 90% of the times I've attempted to kill cat, and I hope that by providing the two examples, I can help other people to fix the cases that they find interesting. I've been testing this patch on my systems. It's working for me when I read() a file. Asynchronous write()s seem okay, too. However, synchronous writes (caused by either calling fsync() or fcntl() to release a lock) prevent the process from being killed when the NFS server goes down. After hearing again last month about how few people actually read every lkml thread, I wanted to point you all at this thread explicitly since it seems that we are getting somewhat close to having a forced unmount that actually is usable by real applications, something that we seem to have been talking about for many years ;-) With Matthew's original TASK_KILLABLE patch, we have a solution for a task read, but still have some holes (fsync fcntl, others?) that need fixed as well for NFS clients. Is this patch going in the right direction? FWIW, I do think it seems like a good idea to work towards better interruptibility in various potentially long-blocking paths like these. I think Andrea's recent work to solve some oom killer deadlocks probably has some requirements in common with this patch. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] TASK_KILLABLE version 2
Bob Bell wrote: On Sat, Sep 01, 2007 at 08:43:49PM -0600, Matthew Wilcox wrote: Here's the second version of TASK_KILLABLE. A few changes since version 1: I obviously haven't covered every place that can result in a process sleeping uninterruptibly while attempting an operation. But sync_page (patch 4/5) covers about 90% of the times I've attempted to kill cat, and I hope that by providing the two examples, I can help other people to fix the cases that they find interesting. I've been testing this patch on my systems. It's working for me when I read() a file. Asynchronous write()s seem okay, too. However, synchronous writes (caused by either calling fsync() or fcntl() to release a lock) prevent the process from being killed when the NFS server goes down. After hearing again last month about how few people actually read every lkml thread, I wanted to point you all at this thread explicitly since it seems that we are getting somewhat close to having a forced unmount that actually is usable by real applications, something that we seem to have been talking about for many years ;-) With Matthew's original TASK_KILLABLE patch, we have a solution for a task read, but still have some holes (fsync & fcntl, others?) that need fixed as well for NFS clients. Is this patch going in the right direction? ric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] TASK_KILLABLE version 2
Bob Bell wrote: On Sat, Sep 01, 2007 at 08:43:49PM -0600, Matthew Wilcox wrote: Here's the second version of TASK_KILLABLE. A few changes since version 1: snip I obviously haven't covered every place that can result in a process sleeping uninterruptibly while attempting an operation. But sync_page (patch 4/5) covers about 90% of the times I've attempted to kill cat, and I hope that by providing the two examples, I can help other people to fix the cases that they find interesting. I've been testing this patch on my systems. It's working for me when I read() a file. Asynchronous write()s seem okay, too. However, synchronous writes (caused by either calling fsync() or fcntl() to release a lock) prevent the process from being killed when the NFS server goes down. After hearing again last month about how few people actually read every lkml thread, I wanted to point you all at this thread explicitly since it seems that we are getting somewhat close to having a forced unmount that actually is usable by real applications, something that we seem to have been talking about for many years ;-) With Matthew's original TASK_KILLABLE patch, we have a solution for a task read, but still have some holes (fsync fcntl, others?) that need fixed as well for NFS clients. Is this patch going in the right direction? ric - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] TASK_KILLABLE version 2
On Sat, Sep 01, 2007 at 08:43:49PM -0600, Matthew Wilcox wrote: Here's the second version of TASK_KILLABLE. A few changes since version 1: I obviously haven't covered every place that can result in a process sleeping uninterruptibly while attempting an operation. But sync_page (patch 4/5) covers about 90% of the times I've attempted to kill cat, and I hope that by providing the two examples, I can help other people to fix the cases that they find interesting. I've been testing this patch on my systems. It's working for me when I read() a file. Asynchronous write()s seem okay, too. However, synchronous writes (caused by either calling fsync() or fcntl() to release a lock) prevent the process from being killed when the NFS server goes down. When the process is sent SIGKILL, it's waiting with the following call tree: do_fsync nfs_fsync nfs_wb_all nfs_sync_mapping_wait nfs_wait_on_requests_locks (I believe) nfs_wait_on_request out_of_line_wait_on_bit __wait_on_bit nfs_wait_bit_interruptible schedule When the process is later viewed after being deemed "stuck", it's waiting with the following call tree: do_fsync filemap_fdatawait wait_on_page_writeback_range wait_on_page_writeback wait_on_page_bit __wait_on_bit sync_page io_schedule schedule If I hazard a guess as to what might be wrong here, I believe that when the processes catches SIGKILL, nfs_wait_bit_interruptible is returning -ERESTARTSYS. That error bubbles back up to nfs_fsync. However, nfs_fsync returns ctx->error, not -ERESTARTSYS, and ctx->error is 0. do_fsync proceeds to call filemap_fdatawait. I question whether nfs_sync should return an error, and if do_fsync should skip filemap_fdatawait if the fsync op returned an error. I did try replacing the call to sync_page in __wait_on_bit with sync_page_killable and replacing TASK_UNINTERRUPTIBLE with TASK_KILLABLE. That seemed to work once, but then really screwed things up on subsequent attempts. -- Bob Bell - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] TASK_KILLABLE version 2
On Sat, Sep 01, 2007 at 08:43:49PM -0600, Matthew Wilcox wrote: Here's the second version of TASK_KILLABLE. A few changes since version 1: snip I obviously haven't covered every place that can result in a process sleeping uninterruptibly while attempting an operation. But sync_page (patch 4/5) covers about 90% of the times I've attempted to kill cat, and I hope that by providing the two examples, I can help other people to fix the cases that they find interesting. I've been testing this patch on my systems. It's working for me when I read() a file. Asynchronous write()s seem okay, too. However, synchronous writes (caused by either calling fsync() or fcntl() to release a lock) prevent the process from being killed when the NFS server goes down. When the process is sent SIGKILL, it's waiting with the following call tree: do_fsync nfs_fsync nfs_wb_all nfs_sync_mapping_wait nfs_wait_on_requests_locks (I believe) nfs_wait_on_request out_of_line_wait_on_bit __wait_on_bit nfs_wait_bit_interruptible schedule When the process is later viewed after being deemed stuck, it's waiting with the following call tree: do_fsync filemap_fdatawait wait_on_page_writeback_range wait_on_page_writeback wait_on_page_bit __wait_on_bit sync_page io_schedule schedule If I hazard a guess as to what might be wrong here, I believe that when the processes catches SIGKILL, nfs_wait_bit_interruptible is returning -ERESTARTSYS. That error bubbles back up to nfs_fsync. However, nfs_fsync returns ctx-error, not -ERESTARTSYS, and ctx-error is 0. do_fsync proceeds to call filemap_fdatawait. I question whether nfs_sync should return an error, and if do_fsync should skip filemap_fdatawait if the fsync op returned an error. I did try replacing the call to sync_page in __wait_on_bit with sync_page_killable and replacing TASK_UNINTERRUPTIBLE with TASK_KILLABLE. That seemed to work once, but then really screwed things up on subsequent attempts. -- Bob Bell - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] TASK_KILLABLE version 2
Here's the second version of TASK_KILLABLE. A few changes since version 1: - Don't split up TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE. TASK_WAKESIGNAL and TASK_LOADAVG were pretty much equivalent, and since we had to keep __TASK_{UN,}INTERRUPTIBLE anyway, splitting them made little sense. - Instead, I've added some is_task_*() predicates. I think they achieve what Linus wanted without consuming flag bits and without checking a metric tonne of code to see whether it wanted TASK_UNINTERRUPTIBLE or __TASK_UNINTERRUPTIBLE. - Renamed try_lock_page() to lock_page_killable() and change the sense. I had envisioned uses being like spin_trylock(), but that turned out not to make sense. - Fix the bug Trond noticed in wait_on_retry_sync_kiocb() by having the callers handle it returning -EINTR - Fix the bug I noticed in sync_pages_killable() by adding the new function fatal_signal_pending() - Audit a lot of uses of TASK_*. A few seemed dubious and were fixed. I'd like to see this spend a bit of time in the -mm tree. I've wasted a couple of days trying to track down why my machine no longer does much after starting init, so clearly I'm tampering with stuff I don't quite understand. Having said that, I ditched that version of the code and started again with a much more minimalist approach. I've also split the patch into five independent pieces, each of which accomplishes a logical step, for ease of bisection. I think patch 1/5 is clearly Right and should be merged ASAP. Patch 2/5 and 3/5 carry some risk. 4/5 and 5/5 seem less risky to me (and are independent of each other; both rely on 1-3 being applied first) I obviously haven't covered every place that can result in a process sleeping uninterruptibly while attempting an operation. But sync_page (patch 4/5) covers about 90% of the times I've attempted to kill cat, and I hope that by providing the two examples, I can help other people to fix the cases that they find interesting. -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] TASK_KILLABLE version 2
Here's the second version of TASK_KILLABLE. A few changes since version 1: - Don't split up TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE. TASK_WAKESIGNAL and TASK_LOADAVG were pretty much equivalent, and since we had to keep __TASK_{UN,}INTERRUPTIBLE anyway, splitting them made little sense. - Instead, I've added some is_task_*() predicates. I think they achieve what Linus wanted without consuming flag bits and without checking a metric tonne of code to see whether it wanted TASK_UNINTERRUPTIBLE or __TASK_UNINTERRUPTIBLE. - Renamed try_lock_page() to lock_page_killable() and change the sense. I had envisioned uses being like spin_trylock(), but that turned out not to make sense. - Fix the bug Trond noticed in wait_on_retry_sync_kiocb() by having the callers handle it returning -EINTR - Fix the bug I noticed in sync_pages_killable() by adding the new function fatal_signal_pending() - Audit a lot of uses of TASK_*. A few seemed dubious and were fixed. I'd like to see this spend a bit of time in the -mm tree. I've wasted a couple of days trying to track down why my machine no longer does much after starting init, so clearly I'm tampering with stuff I don't quite understand. Having said that, I ditched that version of the code and started again with a much more minimalist approach. I've also split the patch into five independent pieces, each of which accomplishes a logical step, for ease of bisection. I think patch 1/5 is clearly Right and should be merged ASAP. Patch 2/5 and 3/5 carry some risk. 4/5 and 5/5 seem less risky to me (and are independent of each other; both rely on 1-3 being applied first) I obviously haven't covered every place that can result in a process sleeping uninterruptibly while attempting an operation. But sync_page (patch 4/5) covers about 90% of the times I've attempted to kill cat, and I hope that by providing the two examples, I can help other people to fix the cases that they find interesting. -- Intel are signing my paycheques ... these opinions are still mine Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/