Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Jamie Lokier
Olof Johansson wrote: > How's this? I went with get_val_no_fault(), since it isn't really a > get_user.*() any more (ptr being passed in), and no_paging is a little > misleading (not all faults are due to paging). How ironic: I deliberately didn't choose "no_fault" because that function *does*

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Linus Torvalds
On Wed, 23 Feb 2005, Olof Johansson wrote: > > How's this? I went with get_val_no_fault(), since it isn't really a > get_user.*() any more (ptr being passed in), and no_paging is a little > misleading (not all faults are due to paging). Applied with minor cosmetic changes. I'm like a dog who

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Olof Johansson
On Wed, Feb 23, 2005 at 06:49:46PM +, Jamie Lokier wrote: > Linus Torvalds wrote: > > > I suggest putting it into futex.c, and make it an inline function > > > which takes "u32 __user *". > > > > Agreed, except we've traditionally just made it "int __user *". > > The type signatures in

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Jamie Lokier
Linus Torvalds wrote: > > I suggest putting it into futex.c, and make it an inline function > > which takes "u32 __user *". > > Agreed, except we've traditionally just made it "int __user *". The type signatures in futex.c are a bit mixed up - most places say "int __user *" but sys_futex() says

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread David Howells
Olof Johansson <[EMAIL PROTECTED]> wrote: > > Alternately, you could just have do_page_fault() do: > > > > while (!down_read_trylock(>mm->mmap_sem)) > > continue; > > > > However, note that this can suffer from starvation due to a never ending > > flow of mixed write-locks and

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Olof Johansson
On Wed, Feb 23, 2005 at 06:22:04PM +, Jamie Lokier wrote: > Olof Johansson wrote: > > On Wed, Feb 23, 2005 at 07:54:06AM -0800, Linus Torvalds wrote: > > > > > > Otherwise, a preempt attempt in get_user would not be seen > > > > until some future preempt_enable was executed. > > > > > >

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Linus Torvalds
On Wed, 23 Feb 2005, Jamie Lokier wrote: > > I suggest putting it into futex.c, and make it an inline function > which takes "u32 __user *". Agreed, except we've traditionally just made it "int __user *". Linus - To unsubscribe from this list: send the line "unsubscribe

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Jamie Lokier
Olof Johansson wrote: > On Wed, Feb 23, 2005 at 07:54:06AM -0800, Linus Torvalds wrote: > > > > Otherwise, a preempt attempt in get_user would not be seen > > > until some future preempt_enable was executed. > > > > True. I guess we should have a "preempt_check_resched()" there too. That's > >

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Arjan van de Ven
On Wed, 2005-02-23 at 11:10 -0600, Olof Johansson wrote: > On Wed, Feb 23, 2005 at 07:54:06AM -0800, Linus Torvalds wrote: > > > > Otherwise, a preempt attempt in get_user would not be seen > > > until some future preempt_enable was executed. > > > > True. I guess we should have a

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Olof Johansson
On Wed, Feb 23, 2005 at 07:54:06AM -0800, Linus Torvalds wrote: > > Otherwise, a preempt attempt in get_user would not be seen > > until some future preempt_enable was executed. > > True. I guess we should have a "preempt_check_resched()" there too. That's > what "kunmap_atomic()" does too

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Olof Johansson
On Wed, Feb 23, 2005 at 11:39:08AM +, David Howells wrote: > Alternately, you could just have do_page_fault() do: > > while (!down_read_trylock(>mm->mmap_sem)) > continue; > > However, note that this can suffer from starvation due to a never ending flow > of mixed

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Linus Torvalds
On Wed, 23 Feb 2005, Joe Korty wrote: > > Perhaps this should be preempt_disable preempt_enable. No, the problem with preempt_disable/enable is that they go away if preemption is not enabled. So you really do have to do it by hand with the "inc_preempt_count". > Otherwise, a preempt

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Joe Korty
On Tue, Feb 22, 2005 at 01:30:27PM -0800, Linus Torvalds wrote: > > We really have this already, and it's called "current->preempt". It > handles any lock at all, and doesn't add yet another special case to all > the architectures. > > Just do > > repeat: >

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread David Howells
Andrew Morton <[EMAIL PROTECTED]> wrote: > wrt this down_read/down_write/down_read deadlock: iirc, the reason why > down_write() takes precedence over down_read() is to avoid the permanent > writer starvation which would occur if there is heavy down_read() traffic. down_write() doesn't actually

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread David Howells
Linus Torvalds <[EMAIL PROTECTED]> wrote: > That is uglee. True. You could just wrap it up in inline functions and hide it in a header file as I suggested in the email I've just sent. > We really have this already, and it's called "current->preempt". It > handles any lock at all, and doesn't

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread David Howells
Linus Torvalds <[EMAIL PROTECTED]> wrote: > It shouldn't be. If one read writer is active, another should be able to > come in, regardless of any pending writer trying to access it. At least > that's always been the rule for the rw-spinlocks _exactly_ for this > reaseon. But not with

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread David Howells
Linus Torvalds [EMAIL PROTECTED] wrote: It shouldn't be. If one read writer is active, another should be able to come in, regardless of any pending writer trying to access it. At least that's always been the rule for the rw-spinlocks _exactly_ for this reaseon. But not with rw-semaphores.

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread David Howells
Linus Torvalds [EMAIL PROTECTED] wrote: That is uglee. True. You could just wrap it up in inline functions and hide it in a header file as I suggested in the email I've just sent. We really have this already, and it's called current-preempt. It handles any lock at all, and doesn't add yet

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread David Howells
Andrew Morton [EMAIL PROTECTED] wrote: wrt this down_read/down_write/down_read deadlock: iirc, the reason why down_write() takes precedence over down_read() is to avoid the permanent writer starvation which would occur if there is heavy down_read() traffic. down_write() doesn't actually take

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Joe Korty
On Tue, Feb 22, 2005 at 01:30:27PM -0800, Linus Torvalds wrote: We really have this already, and it's called current-preempt. It handles any lock at all, and doesn't add yet another special case to all the architectures. Just do repeat:

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Linus Torvalds
On Wed, 23 Feb 2005, Joe Korty wrote: Perhaps this should be preempt_disable preempt_enable. No, the problem with preempt_disable/enable is that they go away if preemption is not enabled. So you really do have to do it by hand with the inc_preempt_count. Otherwise, a preempt attempt

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Olof Johansson
On Wed, Feb 23, 2005 at 11:39:08AM +, David Howells wrote: Alternately, you could just have do_page_fault() do: while (!down_read_trylock(current-mm-mmap_sem)) continue; However, note that this can suffer from starvation due to a never ending flow of mixed

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Olof Johansson
On Wed, Feb 23, 2005 at 07:54:06AM -0800, Linus Torvalds wrote: Otherwise, a preempt attempt in get_user would not be seen until some future preempt_enable was executed. True. I guess we should have a preempt_check_resched() there too. That's what kunmap_atomic() does too (which is what

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Arjan van de Ven
On Wed, 2005-02-23 at 11:10 -0600, Olof Johansson wrote: On Wed, Feb 23, 2005 at 07:54:06AM -0800, Linus Torvalds wrote: Otherwise, a preempt attempt in get_user would not be seen until some future preempt_enable was executed. True. I guess we should have a preempt_check_resched()

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Jamie Lokier
Olof Johansson wrote: On Wed, Feb 23, 2005 at 07:54:06AM -0800, Linus Torvalds wrote: Otherwise, a preempt attempt in get_user would not be seen until some future preempt_enable was executed. True. I guess we should have a preempt_check_resched() there too. That's what

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Linus Torvalds
On Wed, 23 Feb 2005, Jamie Lokier wrote: I suggest putting it into futex.c, and make it an inline function which takes u32 __user *. Agreed, except we've traditionally just made it int __user *. Linus - To unsubscribe from this list: send the line unsubscribe linux-kernel

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Olof Johansson
On Wed, Feb 23, 2005 at 06:22:04PM +, Jamie Lokier wrote: Olof Johansson wrote: On Wed, Feb 23, 2005 at 07:54:06AM -0800, Linus Torvalds wrote: Otherwise, a preempt attempt in get_user would not be seen until some future preempt_enable was executed. True. I guess we should

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread David Howells
Olof Johansson [EMAIL PROTECTED] wrote: Alternately, you could just have do_page_fault() do: while (!down_read_trylock(current-mm-mmap_sem)) continue; However, note that this can suffer from starvation due to a never ending flow of mixed write-locks and read-locks

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Jamie Lokier
Linus Torvalds wrote: I suggest putting it into futex.c, and make it an inline function which takes u32 __user *. Agreed, except we've traditionally just made it int __user *. The type signatures in futex.c are a bit mixed up - most places say int __user * but sys_futex() says u32 __user

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Olof Johansson
On Wed, Feb 23, 2005 at 06:49:46PM +, Jamie Lokier wrote: Linus Torvalds wrote: I suggest putting it into futex.c, and make it an inline function which takes u32 __user *. Agreed, except we've traditionally just made it int __user *. The type signatures in futex.c are a bit

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Linus Torvalds
On Wed, 23 Feb 2005, Olof Johansson wrote: How's this? I went with get_val_no_fault(), since it isn't really a get_user.*() any more (ptr being passed in), and no_paging is a little misleading (not all faults are due to paging). Applied with minor cosmetic changes. I'm like a dog who likes

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-23 Thread Jamie Lokier
Olof Johansson wrote: How's this? I went with get_val_no_fault(), since it isn't really a get_user.*() any more (ptr being passed in), and no_paging is a little misleading (not all faults are due to paging). How ironic: I deliberately didn't choose no_fault because that function *does* take

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Olof Johansson
On Tue, Feb 22, 2005 at 03:20:27PM -0800, Andrew Morton wrote: > [EMAIL PROTECTED] (Olof Johansson) wrote: > > > > + inc_preempt_count(); > > + ret = get_user(curval, (int __user *)uaddr1); > > + dec_preempt_count(); > > That _should_ generate a might_sleep()

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Andrew Morton
[EMAIL PROTECTED] (Olof Johansson) wrote: > > + inc_preempt_count(); > + ret = get_user(curval, (int __user *)uaddr1); > + dec_preempt_count(); That _should_ generate a might_sleep() warning, except it looks like we forgot to add a check to get_user(). It

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Greg KH
On Tue, Feb 22, 2005 at 02:10:58PM -0800, Linus Torvalds wrote: > > Oh, well. The reason I hate the rwsem behaviour is exactly because it > results in this very subtle class of deadlocks. This one case is certainly > solvable several ways, but do we have other issues somewhere else? Things > like

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Olof Johansson
On Tue, Feb 22, 2005 at 10:34:57PM +, Jamie Lokier wrote: > There is one small but important error: the "return ret" mustn't just > return. It must call unqueue_me() just like the code at out_unqueue, > _including_ the conditional "ret = 0", but _excluding_ the up_read(). Not only that, but

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Jamie Lokier
Linus Torvalds wrote: > > queue_me(...) etc. > > current->flags |= PF_MMAP_SEM; <- new > > ret = get_user(...); > > current->flags &= PF_MMAP_SEM; <- new > > /* the rest */ > > That is uglee. > > We really have this already, and it's called

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Benjamin Herrenschmidt
On Tue, 2005-02-22 at 14:10 -0800, Linus Torvalds wrote: > Oh, well. The reason I hate the rwsem behaviour is exactly because it > results in this very subtle class of deadlocks. This one case is certainly > solvable several ways, but do we have other issues somewhere else? Things > like kobject

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Jamie Lokier
Olof Johansson wrote: > > That won't work because the vma lock must be help between key > > calculation and get_user() - otherwise futex is not reliable. It > > would work if the futex key calculation was inside the loop. > > Sure, but that's still true: It's just that the get_user() is done

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Linus Torvalds
On Wed, 23 Feb 2005, Benjamin Herrenschmidt wrote: > > Yours is probably the most efficient too. Note sure what is best for > rwsems tho, there seem to be some interest preventing readers from > starving writers for ever, this has been debated endlessly iirc, > though I have no personal opinion

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Linus Torvalds
On Tue, 22 Feb 2005, Andrew Morton wrote: > > However the pte can get unmapped by memory reclaim so we could still take a > minor fault, and hit the same deadlock, yes? You _could_ fix that by getting the pagetable spinlock, I guess. Which check_user_page_readable() assumes you'd be holding

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Benjamin Herrenschmidt
On Tue, 2005-02-22 at 13:31 -0800, Linus Torvalds wrote: > > On Wed, 23 Feb 2005, Benjamin Herrenschmidt wrote: > > > > Isn't Olof scheme racy ? Can't the stuff get swapped out between the > > first get_user() and the "real" one ? > > Yes. But see my suggested modification (which I still think

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Andrew Morton
Jamie Lokier <[EMAIL PROTECTED]> wrote: > > ... > > > > One attempt to fix this is included below. It works, but I'm not entirely > > > happy with the fact that it's a bit messy solution. If anyone has a > > > better idea for how to solve it I'd be all ears. > > > > It's fairly sane. Style-wise

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Linus Torvalds
On Tue, 22 Feb 2005, Jamie Lokier wrote: > > A much simpler solution (and sorry for not offering it earlier, > because Andrew Morton pointed out this bug long ago, but I was busy), is: > > In futex.c: > > down_read(>mm->mmap_sem); > get_futex_key(...) etc. > queue_me(...)

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Linus Torvalds
On Wed, 23 Feb 2005, Benjamin Herrenschmidt wrote: > > Isn't Olof scheme racy ? Can't the stuff get swapped out between the > first get_user() and the "real" one ? Yes. But see my suggested modification (which I still think is "the thing that Olof does", except it's more efficient and avoids

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Jamie Lokier
Chris Friesen wrote: > > down_read(>mm->mmap_sem); > > get_futex_key(...) etc. > > queue_me(...) etc. > > current->flags |= PF_MMAP_SEM; <- new > > ret = get_user(...); > > current->flags &= PF_MMAP_SEM; <- new > > /* the rest */ > > Should the

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Olof Johansson
On Tue, Feb 22, 2005 at 09:07:52PM +, Jamie Lokier wrote: > > That won't work because the vma lock must be help between key > calculation and get_user() - otherwise futex is not reliable. It > would work if the futex key calculation was inside the loop. Sure, but that's still true: It's

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Benjamin Herrenschmidt
On Wed, 2005-02-23 at 08:16 +1100, Benjamin Herrenschmidt wrote: > On Tue, 2005-02-22 at 11:36 -0800, Linus Torvalds wrote: > > > DavidH - what's the word on nested read-semaphores like this? Are they > > supposed to work (like nested read-spinlocks), or do we need to do the > > things Olof

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Chris Friesen
Jamie Lokier wrote: In futex.c: down_read(>mm->mmap_sem); get_futex_key(...) etc. queue_me(...) etc. current->flags |= PF_MMAP_SEM; <- new ret = get_user(...); current->flags &= PF_MMAP_SEM; <- new /* the rest */

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Benjamin Herrenschmidt
On Tue, 2005-02-22 at 11:36 -0800, Linus Torvalds wrote: > DavidH - what's the word on nested read-semaphores like this? Are they > supposed to work (like nested read-spinlocks), or do we need to do the > things Olof does? Isn't Olof scheme racy ? Can't the stuff get swapped out between the

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Jamie Lokier
Andrew Morton wrote: > > This will quickly lock up, since the futex_wait code dows a > > down_read(mmap_sem), then a get_user(). > > > > The do_page_fault code on ppc64 (as well as other architectures) needs > > to take the same semaphore for reading. This is all good until the > > second thread

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Andrew Morton
[EMAIL PROTECTED] (Olof Johansson) wrote: > > Hi, > > Consider a small testcase that spawns off two threads, either thread > doing a loop of: > > buf = mmap /dev/zero MAP_SHARED for 0x10 bytes > call sys_futex (buf+page, FUTEX_WAIT, 1, NULL, NULL) for each page in > said mmap >

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Linus Torvalds
On Tue, 22 Feb 2005, Olof Johansson wrote: > > Consider a small testcase that spawns off two threads, either thread > doing a loop of: > > buf = mmap /dev/zero MAP_SHARED for 0x10 bytes > call sys_futex (buf+page, FUTEX_WAIT, 1, NULL, NULL) for each page in > said mmap >

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Linus Torvalds
On Tue, 22 Feb 2005, Olof Johansson wrote: Consider a small testcase that spawns off two threads, either thread doing a loop of: buf = mmap /dev/zero MAP_SHARED for 0x10 bytes call sys_futex (buf+page, FUTEX_WAIT, 1, NULL, NULL) for each page in said mmap

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Andrew Morton
[EMAIL PROTECTED] (Olof Johansson) wrote: Hi, Consider a small testcase that spawns off two threads, either thread doing a loop of: buf = mmap /dev/zero MAP_SHARED for 0x10 bytes call sys_futex (buf+page, FUTEX_WAIT, 1, NULL, NULL) for each page in said mmap

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Jamie Lokier
Andrew Morton wrote: This will quickly lock up, since the futex_wait code dows a down_read(mmap_sem), then a get_user(). The do_page_fault code on ppc64 (as well as other architectures) needs to take the same semaphore for reading. This is all good until the second thread comes into

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Benjamin Herrenschmidt
On Tue, 2005-02-22 at 11:36 -0800, Linus Torvalds wrote: DavidH - what's the word on nested read-semaphores like this? Are they supposed to work (like nested read-spinlocks), or do we need to do the things Olof does? Isn't Olof scheme racy ? Can't the stuff get swapped out between the first

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Chris Friesen
Jamie Lokier wrote: In futex.c: down_read(current-mm-mmap_sem); get_futex_key(...) etc. queue_me(...) etc. current-flags |= PF_MMAP_SEM; - new ret = get_user(...); current-flags = PF_MMAP_SEM; - new /* the rest */

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Benjamin Herrenschmidt
On Wed, 2005-02-23 at 08:16 +1100, Benjamin Herrenschmidt wrote: On Tue, 2005-02-22 at 11:36 -0800, Linus Torvalds wrote: DavidH - what's the word on nested read-semaphores like this? Are they supposed to work (like nested read-spinlocks), or do we need to do the things Olof does?

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Olof Johansson
On Tue, Feb 22, 2005 at 09:07:52PM +, Jamie Lokier wrote: That won't work because the vma lock must be help between key calculation and get_user() - otherwise futex is not reliable. It would work if the futex key calculation was inside the loop. Sure, but that's still true: It's just

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Jamie Lokier
Chris Friesen wrote: down_read(current-mm-mmap_sem); get_futex_key(...) etc. queue_me(...) etc. current-flags |= PF_MMAP_SEM; - new ret = get_user(...); current-flags = PF_MMAP_SEM; - new /* the rest */ Should the second new line

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Linus Torvalds
On Wed, 23 Feb 2005, Benjamin Herrenschmidt wrote: Isn't Olof scheme racy ? Can't the stuff get swapped out between the first get_user() and the real one ? Yes. But see my suggested modification (which I still think is the thing that Olof does, except it's more efficient and avoids the

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Linus Torvalds
On Tue, 22 Feb 2005, Jamie Lokier wrote: A much simpler solution (and sorry for not offering it earlier, because Andrew Morton pointed out this bug long ago, but I was busy), is: In futex.c: down_read(current-mm-mmap_sem); get_futex_key(...) etc. queue_me(...) etc.

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Andrew Morton
Jamie Lokier [EMAIL PROTECTED] wrote: ... One attempt to fix this is included below. It works, but I'm not entirely happy with the fact that it's a bit messy solution. If anyone has a better idea for how to solve it I'd be all ears. It's fairly sane. Style-wise I'd be inclined

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Benjamin Herrenschmidt
On Tue, 2005-02-22 at 13:31 -0800, Linus Torvalds wrote: On Wed, 23 Feb 2005, Benjamin Herrenschmidt wrote: Isn't Olof scheme racy ? Can't the stuff get swapped out between the first get_user() and the real one ? Yes. But see my suggested modification (which I still think is the thing

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Linus Torvalds
On Tue, 22 Feb 2005, Andrew Morton wrote: However the pte can get unmapped by memory reclaim so we could still take a minor fault, and hit the same deadlock, yes? You _could_ fix that by getting the pagetable spinlock, I guess. Which check_user_page_readable() assumes you'd be holding

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Jamie Lokier
Olof Johansson wrote: That won't work because the vma lock must be help between key calculation and get_user() - otherwise futex is not reliable. It would work if the futex key calculation was inside the loop. Sure, but that's still true: It's just that the get_user() is done twice

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Linus Torvalds
On Wed, 23 Feb 2005, Benjamin Herrenschmidt wrote: Yours is probably the most efficient too. Note sure what is best for rwsems tho, there seem to be some interest preventing readers from starving writers for ever, this has been debated endlessly iirc, though I have no personal opinion

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Benjamin Herrenschmidt
On Tue, 2005-02-22 at 14:10 -0800, Linus Torvalds wrote: Oh, well. The reason I hate the rwsem behaviour is exactly because it results in this very subtle class of deadlocks. This one case is certainly solvable several ways, but do we have other issues somewhere else? Things like kobject

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Jamie Lokier
Linus Torvalds wrote: queue_me(...) etc. current-flags |= PF_MMAP_SEM; - new ret = get_user(...); current-flags = PF_MMAP_SEM; - new /* the rest */ That is uglee. We really have this already, and it's called current-preempt. It handles

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Olof Johansson
On Tue, Feb 22, 2005 at 10:34:57PM +, Jamie Lokier wrote: There is one small but important error: the return ret mustn't just return. It must call unqueue_me(q) just like the code at out_unqueue, _including_ the conditional ret = 0, but _excluding_ the up_read(). Not only that, but

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Greg KH
On Tue, Feb 22, 2005 at 02:10:58PM -0800, Linus Torvalds wrote: Oh, well. The reason I hate the rwsem behaviour is exactly because it results in this very subtle class of deadlocks. This one case is certainly solvable several ways, but do we have other issues somewhere else? Things like

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Andrew Morton
[EMAIL PROTECTED] (Olof Johansson) wrote: + inc_preempt_count(); + ret = get_user(curval, (int __user *)uaddr1); + dec_preempt_count(); That _should_ generate a might_sleep() warning, except it looks like we forgot to add a check to get_user(). It would be

Re: [PATCH/RFC] Futex mmap_sem deadlock

2005-02-22 Thread Olof Johansson
On Tue, Feb 22, 2005 at 03:20:27PM -0800, Andrew Morton wrote: [EMAIL PROTECTED] (Olof Johansson) wrote: + inc_preempt_count(); + ret = get_user(curval, (int __user *)uaddr1); + dec_preempt_count(); That _should_ generate a might_sleep() warning, except it