Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-30 Thread Paul Walmsley
On Wed, 13 Mar 2013, Jeff Layton wrote:
> Of course, this is all a lot of work, and not something we can shove
> into the kernel for 3.9 at this point. In the meantime, while Mandeep's
> warning is correctly pointing out a problem, I think we ought to back
> it out until we can fix this properly.

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-13 Thread Jeff Layton
On Wed, 6 Mar 2013 13:40:16 -0800 Tejun Heo wrote:
> On Wed, Mar 06, 2013 at 01:36:36PM -0800, Tejun Heo wrote:
> > On Wed, Mar 06, 2013 at 01:31:10PM -0800, Linus Torvalds wrote:
> > > So I do agree that we probably have *too* many of the stupid "let's
> > > check if we can freeze", and I

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-08 Thread Ingo Molnar
* Linus Torvalds wrote:
> - the "freezer for suspend/resume on a laptop"
>
> [...]
>
> The second one is unlikely to really use NFS anyway. [...]
me raises a hand
Incidentally I use NFS to a file server on my laptop, over wifi, and I close the lid for the night. It's NFS mounted soft. If it was

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-07 Thread Jeff Layton
On Thu, 7 Mar 2013 17:16:12 +0000 "Myklebust, Trond" wrote:
> On Thu, 2013-03-07 at 09:03 -0800, Linus Torvalds wrote:
> > On Thu, Mar 7, 2013 at 8:45 AM, Myklebust, Trond wrote:
> > >
> > > The problem there is that we get into the whole 'hard' vs 'soft' mount
> > > problem. We're supposed

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-07 Thread Rafael J. Wysocki
On Thursday, March 07, 2013 08:25:10 AM Linus Torvalds wrote:
> On Thu, Mar 7, 2013 at 7:59 AM, Myklebust, Trond wrote:
> >
> > It _shouldn't_ be an interruption unless the filesystem can't make
> > progress.
>
> So how can we tell? Calling "freezable_schedule()" if you're not ready
> to be

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-07 Thread Myklebust, Trond
On Thu, 2013-03-07 at 09:03 -0800, Linus Torvalds wrote:
> On Thu, Mar 7, 2013 at 8:45 AM, Myklebust, Trond wrote:
> >
> > The problem there is that we get into the whole 'hard' vs 'soft' mount
> > problem. We're supposed to guarantee data integrity for 'hard' mounts,
> > so no funny business

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-07 Thread Linus Torvalds
On Thu, Mar 7, 2013 at 8:45 AM, Myklebust, Trond wrote:
>
> The problem there is that we get into the whole 'hard' vs 'soft' mount
> problem. We're supposed to guarantee data integrity for 'hard' mounts,
> so no funny business is allowed. OTOH, 'soft' mounts time out and return
> EIO to the

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-07 Thread Myklebust, Trond
On Thu, 2013-03-07 at 08:25 -0800, Linus Torvalds wrote:
> On Thu, Mar 7, 2013 at 7:59 AM, Myklebust, Trond wrote:
> >
> > It _shouldn't_ be an interruption unless the filesystem can't make
> > progress.
>
> So how can we tell? Calling "freezable_schedule()" if you're not ready
> to be frozen

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-07 Thread Linus Torvalds
On Thu, Mar 7, 2013 at 7:59 AM, Myklebust, Trond wrote:
>
> It _shouldn't_ be an interruption unless the filesystem can't make
> progress.
So how can we tell? Calling "freezable_schedule()" if you're not ready to be frozen is not good. And nobody but the NFS code can know. You might want to

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-07 Thread Tejun Heo
Hello, Linus.
On Thu, Mar 07, 2013 at 07:55:39AM -0800, Linus Torvalds wrote:
> In other words, that suggestion not only introduces new problems (a),
> it's fundamentally broken anyway (b) *AND* it doesn't even solve
> anything, it just moves it around.
I don't think it's gonna solve the problems

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-07 Thread Myklebust, Trond
On Thu, 2013-03-07 at 07:55 -0800, Linus Torvalds wrote:
> On Thu, Mar 7, 2013 at 3:41 AM, Jeff Layton wrote:
> >
> > I think Trond may be on the right track. We probably need some
> > mechanism to quiesce the filesystem ahead of any sort of freezer
> > event.
>
> No, guys. That cannot work.

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-07 Thread Linus Torvalds
On Thu, Mar 7, 2013 at 3:41 AM, Jeff Layton wrote:
>
> I think Trond may be on the right track. We probably need some
> mechanism to quiesce the filesystem ahead of any sort of freezer
> event.
No, guys. That cannot work. It's a completely moronic idea. Let me count the ways: (a) it's just

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-07 Thread Tejun Heo
Hello, Jeff.
On Thu, Mar 07, 2013 at 06:41:40AM -0500, Jeff Layton wrote:
> Suppose I call unlink("somefile"); on an NFS mount. We take all of the
> VFS locks, go down into the NFS layer. That marshals up the UNLINK
> call, sends it off to the server, and waits for the reply. While we're
> waiting, a

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-07 Thread Jeff Layton
On Wed, 6 Mar 2013 13:36:36 -0800 Tejun Heo wrote:
> On Wed, Mar 06, 2013 at 01:31:10PM -0800, Linus Torvalds wrote:
> > So I do agree that we probably have *too* many of the stupid "let's
> > check if we can freeze", and I suspect that the NFS code should get
> > rid of the

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Tejun Heo
On Wed, Mar 06, 2013 at 01:36:36PM -0800, Tejun Heo wrote:
> On Wed, Mar 06, 2013 at 01:31:10PM -0800, Linus Torvalds wrote:
> > So I do agree that we probably have *too* many of the stupid "let's
> > check if we can freeze", and I suspect that the NFS code should get
> > rid of the "freezable_schedule()"

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Tejun Heo
On Wed, Mar 06, 2013 at 01:31:10PM -0800, Linus Torvalds wrote:
> So I do agree that we probably have *too* many of the stupid "let's
> check if we can freeze", and I suspect that the NFS code should get
> rid of the "freezable_schedule()" that is causing this warning
> (because I also agree that you

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Linus Torvalds
On Wed, Mar 6, 2013 at 1:24 PM, Tejun Heo wrote:
>
> With syscall paths out of the way, the surface is reduced a lot.
So the issue is syscalls that don't react to signals, and that can potentially wait a long time. Like NFS with a network hiccup. Which is not at all unlikely. Think wireless

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Tejun Heo
Hello, Linus.
On Wed, Mar 06, 2013 at 01:00:02PM -0800, Linus Torvalds wrote:
> > Oh yeah, we don't need another signal. We just need sigpending state
> > and a wakeup. I wasn't really going into details. The important
> > point is that for code paths outside signal/ptrace, freezing could
> > look and

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Linus Torvalds
On Wed, Mar 6, 2013 at 10:53 AM, Tejun Heo wrote:
> Hello, Oleg.
>
> On Wed, Mar 06, 2013 at 07:16:08PM +0100, Oleg Nesterov wrote:
>> And how SIGFREEZE can help? If we want to interrupt the sleeps in NFS/RPC
>> layer we can simply add TASK_WAKEFREEZE (can be used with TASK_KILLABLE)
>> and

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Mandeep Singh Baines
On Wed, Mar 6, 2013 at 10:37 AM, Myklebust, Trond wrote:
> On Wed, 2013-03-06 at 13:23 -0500, Jeff Layton wrote:
>> On Wed, 6 Mar 2013 07:59:01 -0800
>> Mandeep Singh Baines wrote:
>> > In general, holding a lock and freezing can cause a deadlock if:
>> >
>> > 1) you froze via the cgroup_freezer

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Tejun Heo
Hello, Oleg.
On Wed, Mar 06, 2013 at 07:16:08PM +0100, Oleg Nesterov wrote:
> And how SIGFREEZE can help? If we want to interrupt the sleeps in NFS/RPC
> layer we can simply add TASK_WAKEFREEZE (can be used with TASK_KILLABLE)
> and change freeze_task() to do signal_wake_up_state(TASK_WAKEFREEZE).

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Tejun Heo
Hello,
On Wed, Mar 06, 2013 at 01:40:04PM -0500, Jeff Layton wrote:
> Though when I said that, it was before Tejun mentioned hooking this up
> to ptrace. I'll confess that I don't fully understand what he's
> proposing either though...
Oh, they're all just pretty closely related. All signal and

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Jeff Layton
On Wed, 6 Mar 2013 19:17:35 +0100 Oleg Nesterov wrote:
> On 03/05, Jeff Layton wrote:
> >
> > Anyone up for working out how to handle a freeze event on a process
> > that already has a pending signal, while it's being ptraced?
>
> Could you explain the problem?
>
Not very well. I was just

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Myklebust, Trond
On Wed, 2013-03-06 at 13:23 -0500, Jeff Layton wrote:
> On Wed, 6 Mar 2013 07:59:01 -0800
> Mandeep Singh Baines wrote:
> > In general, holding a lock and freezing can cause a deadlock if:
> >
> > 1) you froze via the cgroup_freezer subsystem and a task in another
> > cgroup tried to acquire the

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Jeff Layton
On Wed, 6 Mar 2013 07:59:01 -0800 Mandeep Singh Baines wrote:
> On Wed, Mar 6, 2013 at 4:06 AM, Jeff Layton wrote:
> > On Wed, 6 Mar 2013 10:09:14 +0100
> > Ingo Molnar wrote:
> >
> >>
> >> * Mandeep Singh Baines wrote:
> >>
> >> > On Tue, Mar 5, 2013 at 5:16 PM, Tejun Heo wrote:
> >> > > On

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Oleg Nesterov
On 03/05, Jeff Layton wrote:
>
> Anyone up for working out how to handle a freeze event on a process
> that already has a pending signal, while it's being ptraced?
Could you explain the problem?
Oleg.

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Oleg Nesterov
On 03/05, Tejun Heo wrote:
>
> Oleg, are you still opposed to the idea of making freezer share trap
> points with ptrace?
My memory can fool me, but iirc I wasn't actually opposed... I guess you mean the previous discussion about vfork/ptrace/etc which I forgot completely. But I can recall the main

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Mandeep Singh Baines
On Wed, Mar 6, 2013 at 4:06 AM, Jeff Layton wrote:
> On Wed, 6 Mar 2013 10:09:14 +0100
> Ingo Molnar wrote:
>
>>
>> * Mandeep Singh Baines wrote:
>>
>> > On Tue, Mar 5, 2013 at 5:16 PM, Tejun Heo wrote:
>> > > On Tue, Mar 05, 2013 at 08:05:07PM -0500, J. Bruce Fields wrote:
>> > >> If it's

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Jeff Layton
On Wed, 6 Mar 2013 10:09:14 +0100 Ingo Molnar wrote:
>
> * Mandeep Singh Baines wrote:
>
> > On Tue, Mar 5, 2013 at 5:16 PM, Tejun Heo wrote:
> > > On Tue, Mar 05, 2013 at 08:05:07PM -0500, J. Bruce Fields wrote:
> > >> If it's really just a 2-line patch to try_to_freeze(), could it just be

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Jeff Layton
On Wed, 6 Mar 2013 01:10:07 +0000 "Myklebust, Trond" wrote:
> On Tue, 2013-03-05 at 14:03 -0500, Jeff Layton wrote:
> > On Tue, 5 Mar 2013 09:49:54 -0800
> > Tejun Heo wrote:
> >
> > > On Tue, Mar 05, 2013 at 09:46:48AM -0800, Tejun Heo wrote:
> > > > So, I think this is why implementing

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-06 Thread Ingo Molnar
* Mandeep Singh Baines wrote:
> On Tue, Mar 5, 2013 at 5:16 PM, Tejun Heo wrote:
> > On Tue, Mar 05, 2013 at 08:05:07PM -0500, J. Bruce Fields wrote:
> >> If it's really just a 2-line patch to try_to_freeze(), could it just be
> >> carried out-of-tree by people that are specifically working on

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Mandeep Singh Baines
On Tue, Mar 5, 2013 at 5:16 PM, Tejun Heo wrote:
> On Tue, Mar 05, 2013 at 08:05:07PM -0500, J. Bruce Fields wrote:
>> If it's really just a 2-line patch to try_to_freeze(), could it just be
>> carried out-of-tree by people that are specifically working on tracking
>> down these problems?
>>

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Tejun Heo
On Tue, Mar 05, 2013 at 05:14:23PM -0800, Tejun Heo wrote:
> Then, the operation simply isn't freezable while in progress and
> should be on the receiving end of failed-to-freeze error message and
> users who depend on suspend/hibernation working properly should be
> advised away from using nfs.
>

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Tejun Heo
On Tue, Mar 05, 2013 at 08:05:07PM -0500, J. Bruce Fields wrote:
> If it's really just a 2-line patch to try_to_freeze(), could it just be
> carried out-of-tree by people that are specifically working on tracking
> down these problems?
>
> But I don't have strong feelings about it--as long as it

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Tejun Heo
Hello, Trond.
On Wed, Mar 06, 2013 at 01:10:07AM +0000, Myklebust, Trond wrote:
> Not all RPC calls can just be interrupted and restarted. Something like
> an exclusive file create, unlink, file locking attempt, etc may give
> rise to different results when replayed in the above scenario.
>

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Myklebust, Trond
On Tue, 2013-03-05 at 14:03 -0500, Jeff Layton wrote:
> On Tue, 5 Mar 2013 09:49:54 -0800
> Tejun Heo wrote:
>
> > On Tue, Mar 05, 2013 at 09:46:48AM -0800, Tejun Heo wrote:
> > > So, I think this is why implementing freezer as a separate blocking
> > > mechanism isn't such a good idea. We're

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread J. Bruce Fields
On Tue, Mar 05, 2013 at 04:59:00PM -0800, Mandeep Singh Baines wrote:
> On Tue, Mar 5, 2013 at 3:11 PM, J. Bruce Fields wrote:
> > On Tue, Mar 05, 2013 at 09:49:54AM -0800, Tejun Heo wrote:
> >> On Tue, Mar 05, 2013 at 09:46:48AM -0800, Tejun Heo wrote:
> >> > So, I think this is why implementing

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Mandeep Singh Baines
On Tue, Mar 5, 2013 at 3:11 PM, J. Bruce Fields wrote:
> On Tue, Mar 05, 2013 at 09:49:54AM -0800, Tejun Heo wrote:
>> On Tue, Mar 05, 2013 at 09:46:48AM -0800, Tejun Heo wrote:
>> > So, I think this is why implementing freezer as a separate blocking
>> > mechanism isn't such a good idea. We're

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Rafael J. Wysocki
On Tuesday, March 05, 2013 06:11:10 PM J. Bruce Fields wrote:
> On Tue, Mar 05, 2013 at 09:49:54AM -0800, Tejun Heo wrote:
> > On Tue, Mar 05, 2013 at 09:46:48AM -0800, Tejun Heo wrote:
> > > So, I think this is why implementing freezer as a separate blocking
> > > mechanism isn't such a good

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Tejun Heo
Hello, Jeff.
On Tue, Mar 05, 2013 at 06:39:41PM -0500, Jeff Layton wrote:
> Al was in the middle of his signal handling/execve rework though and I
> ran the idea past him. He pointedly told me that I was crazy for even
> considering it. This is rather non-trivial to handle since it means
> mucking

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Jeff Layton
On Tue, 5 Mar 2013 11:09:23 -0800 Tejun Heo wrote:
> Hello, Jeff.
>
> On Tue, Mar 05, 2013 at 02:03:12PM -0500, Jeff Layton wrote:
> > Sounds intriguing...
> >
> > I'm not sure what this really means for something like NFS though. How
> > would you envision this working when we have long

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread J. Bruce Fields
On Tue, Mar 05, 2013 at 09:49:54AM -0800, Tejun Heo wrote:
> On Tue, Mar 05, 2013 at 09:46:48AM -0800, Tejun Heo wrote:
> > So, I think this is why implementing freezer as a separate blocking
> > mechanism isn't such a good idea. We're effectively introducing a
> > completely new waiting state to a lot

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Tejun Heo
Hello, Jeff.
On Tue, Mar 05, 2013 at 02:03:12PM -0500, Jeff Layton wrote:
> Sounds intriguing...
>
> I'm not sure what this really means for something like NFS though. How
> would you envision this working when we have long running syscalls that
> might sit waiting in the kernel indefinitely?
I

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Jeff Layton
On Tue, 5 Mar 2013 09:49:54 -0800 Tejun Heo wrote:
> On Tue, Mar 05, 2013 at 09:46:48AM -0800, Tejun Heo wrote:
> > So, I think this is why implementing freezer as a separate blocking
> > mechanism isn't such a good idea. We're effectively introducing a
> > completely new waiting state to a lot

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Tejun Heo
On Tue, Mar 05, 2013 at 09:46:48AM -0800, Tejun Heo wrote:
> So, I think this is why implementing freezer as a separate blocking
> mechanism isn't such a good idea. We're effectively introducing a
> completely new waiting state to a lot of unsuspecting paths which
> generates a lot of risks and

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Tejun Heo
Hello, guys.
On Tue, Mar 05, 2013 at 08:23:08AM -0500, Jeff Layton wrote:
> So, not a deadlock per-se in this case but it does prevent the freezer
> from running to completion. I don't see any way to solve it though w/o
> making all mutexes freezable. Note that I don't think this is really
> limited

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Jeff Layton
On Mon, 4 Mar 2013 22:08:34 +0000 "Myklebust, Trond" wrote:
> On Mon, 2013-03-04 at 21:53 +0100, Oleg Nesterov wrote:
> > On 03/04, Mandeep Singh Baines wrote:
> > >
> > > The problem is that freezer_count() calls try_to_freeze(). In this
> > > case, try_to_freeze() is not really adding any

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Myklebust, Trond
On Tue, 2013-03-05 at 14:03 -0500, Jeff Layton wrote: On Tue, 5 Mar 2013 09:49:54 -0800 Tejun Heo t...@kernel.org wrote: On Tue, Mar 05, 2013 at 09:46:48AM -0800, Tejun Heo wrote: So, I think this is why implementing freezer as a separate blocking mechanism isn't such a good idea.

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Tejun Heo
Hello, Trond. On Wed, Mar 06, 2013 at 01:10:07AM +, Myklebust, Trond wrote: Not all RPC calls can just be interrupted and restarted. Something like an exclusive file create, unlink, file locking attempt, etc may give rise to different results when replayed in the above scenario.

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Tejun Heo
On Tue, Mar 05, 2013 at 08:05:07PM -0500, J. Bruce Fields wrote: If it's really just a 2-line patch to try_to_freeze(), could it just be carried out-of-tree by people that are specifically working on tracking down these problems? But I don't have strong feelings about it--as long as it

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Tejun Heo
On Tue, Mar 05, 2013 at 05:14:23PM -0800, Tejun Heo wrote: Then, the operation simply isn't freezable while in progress and should be on the receiving end of failed-to-freeze error message and users who depend on suspend/hibernation working properly should be advised away from using nfs. It

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-05 Thread Mandeep Singh Baines
On Tue, Mar 5, 2013 at 5:16 PM, Tejun Heo t...@kernel.org wrote: On Tue, Mar 05, 2013 at 08:05:07PM -0500, J. Bruce Fields wrote: If it's really just a 2-line patch to try_to_freeze(), could it just be carried out-of-tree by people that are specifically working on tracking down these problems?

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-04 Thread Myklebust, Trond
On Mon, 2013-03-04 at 21:53 +0100, Oleg Nesterov wrote: > On 03/04, Mandeep Singh Baines wrote: > > > > The problem is that freezer_count() calls try_to_freeze(). In this > > case, try_to_freeze() is not really adding any value. > > Well, I tend to agree. > > If a task calls __refrigerator()

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-04 Thread Oleg Nesterov
On 03/04, Mandeep Singh Baines wrote: > > The problem is that freezer_count() calls try_to_freeze(). In this > case, try_to_freeze() is not really adding any value. Well, I tend to agree. If a task calls __refrigerator() holding a lock which another freezable task can wait for, this is not

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-04 Thread Mandeep Singh Baines
On Mon, Mar 4, 2013 at 12:09 PM, Mandeep Singh Baines wrote: > On Mon, Mar 4, 2013 at 7:53 AM, Myklebust, Trond > wrote: >> On Mon, 2013-03-04 at 23:33 +0800, Ming Lei wrote: >>> Hi, >>> >>> CC guys who introduced the lockdep change. >>> >>> On Mon, Mar 4, 2013 at 11:04 PM, Jeff Layton wrote:

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-04 Thread Mandeep Singh Baines
On Mon, Mar 4, 2013 at 7:53 AM, Myklebust, Trond wrote: > On Mon, 2013-03-04 at 23:33 +0800, Ming Lei wrote: >> Hi, >> >> CC guys who introduced the lockdep change. >> >> On Mon, Mar 4, 2013 at 11:04 PM, Jeff Layton wrote: >> >> > >> > I don't get it -- why is it bad to hold a lock across a

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-04 Thread Mandeep Singh Baines
+ rjw, akpm, tejun, mingo, oleg On Mon, Mar 4, 2013 at 6:23 AM, Jeff Layton wrote: > On Mon, 4 Mar 2013 14:14:23 + > "Myklebust, Trond" wrote: > >> On Mon, 2013-03-04 at 21:57 +0800, Ming Lei wrote: >> > Hi, >> > >> > The below warning can be triggered each time when mount.nfs is >> >

Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!

2013-03-04 Thread Myklebust, Trond
On Mon, 2013-03-04 at 23:33 +0800, Ming Lei wrote: > Hi, > > CC guys who introduced the lockdep change. > > On Mon, Mar 4, 2013 at 11:04 PM, Jeff Layton wrote: > > > > > I don't get it -- why is it bad to hold a lock across a freeze event? > > At least this may deadlock another mount.nfs
