Re: [PATCH 2.6.21-rc3-mm2 3/4] futex_requeue_pi optimization
Peter Zijlstra wrote:
> Unfortunately not, nonlinear vmas don't have a linear relation between
> address and offset. What you would need to do is do a linear walk of the
> page tables. But even that might not suffice if nonlinear vmas may form a
> non-injective, surjective mapping.
>
> /me checks..
>
> Hmm, yes that seems valid, so in general, this reverse mapping does not
> uniquely exist for non-linear vmas. :-(
>
> What to do... disallow futexes in nonlinear mappings, store the address
> in the key?

<< That seems to be the only solution... :-/

> the vma_prio_tree would be able to give all vmas associated with a
> mapping.

Thanks for your help.

-- 
Pierre
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.21-rc3-mm2 3/4] futex_requeue_pi optimization
On Tue, 2007-03-20 at 16:32 +0100, Pierre Peiffer wrote:
> Peter Zijlstra wrote:
> >> +static void *get_futex_address(union futex_key *key)
> >> +{
> >> +	void *uaddr;
> >> +
> >> +	if (key->both.offset & 1) {
> >> +		/* shared mapping */
> >> +		uaddr = (void*)((key->shared.pgoff << PAGE_SHIFT)
> >> +				+ key->shared.offset - 1);
> >> +	} else {
> >> +		/* private mapping */
> >> +		uaddr = (void*)(key->private.address + key->private.offset);
> >> +	}
> >> +
> >> +	return uaddr;
> >> +}
> >
> > This will not work for nonlinear vmas, granted, not a lot of ppl stick
> > futexes in nonlinear vmas, but the futex_key stuff handles it, this
> > doesn't.
>
> Indeed ! Thanks for pointing me to this.
>
> Since I'm not familiar with vmm, does this code look more correct to you ?

Unfortunately not, nonlinear vmas don't have a linear relation between
address and offset. What you would need to do is do a linear walk of the
page tables. But even that might not suffice if nonlinear vmas may form a
non-injective, surjective mapping.

/me checks..

Hmm, yes that seems valid, so in general, this reverse mapping does not
uniquely exist for non-linear vmas. :-(

What to do... disallow futexes in nonlinear mappings, store the address
in the key?

> static void *get_futex_address(union futex_key *key)
> {
> 	void *uaddr;
> 	struct vm_area_struct *vma = current->mm->mmap;
>
> 	if (key->both.offset & 1) {
> 		/* shared mapping */
> 		struct file * vmf;
>
> 		do {
> 			if ((vmf = vma->vm_file)
> 			    && (key->shared.inode == vmf->f_dentry->d_inode))
> 				break;
> 			vma = vma->vm_next;
> 		} while (vma);
>
> 		if (likely(!(vma->vm_flags & VM_NONLINEAR)))
> 			uaddr = (void*)((key->shared.pgoff << PAGE_SHIFT)
> 					+ key->shared.offset - 1);
> 		else
> 			uaddr = (void*) vma->vm_start
> 				+ ((key->shared.pgoff - vma->vm_pgoff)
> 				   << PAGE_SHIFT)
> 				+ key->shared.offset - 1;
> 	} else {
> 		/* private mapping */
> 		uaddr = (void*)(key->private.address + key->private.offset);
> 	}
>
> 	return uaddr;
> }
>
> Or is there a more direct way to retrieve the vma corresponding to the given
> inode ?

the vma_prio_tree would be able to give all vmas associated with a
mapping.
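The point about non-injective mappings can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `model_vma`, `reverse_lookup` and the two page layouts are hypothetical names invented for illustration, not kernel structures. Each virtual page of the model records which file page it shows; a linear vma shows consecutive file pages, while a nonlinear one (as can be set up with remap_file_pages()) may show the same file page at several addresses.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12

/* Hypothetical userspace model of a vma (not the kernel's vm_area_struct):
 * virtual page i of the mapping shows file page pgoff_of_page[i]. */
struct model_vma {
	unsigned long vm_start;             /* first virtual address */
	size_t npages;                      /* number of pages covered */
	const unsigned long *pgoff_of_page; /* file page shown by each page */
};

/* Linear walk over the model's "page table": count the virtual pages that
 * show file page pgoff and report the first matching address in *first. */
static size_t reverse_lookup(const struct model_vma *vma, unsigned long pgoff,
			     unsigned long *first)
{
	size_t hits = 0;

	assert(vma != NULL);
	for (size_t i = 0; i < vma->npages; i++) {
		if (vma->pgoff_of_page[i] != pgoff)
			continue;
		if (hits == 0 && first)
			*first = vma->vm_start +
				 ((unsigned long)i << MODEL_PAGE_SHIFT);
		hits++;
	}
	return hits;
}

/* Linear layout: consecutive file pages, each appears exactly once. */
static const unsigned long linear_layout[] = { 4, 5, 6, 7 };
/* Nonlinear layout: file page 7 appears twice, file page 6 not at all. */
static const unsigned long nonlinear_layout[] = { 7, 4, 7, 5 };

static const struct model_vma linear_vma = {
	0x10000000UL, 4, linear_layout
};
static const struct model_vma nonlinear_vma = {
	0x10000000UL, 4, nonlinear_layout
};
```

With these layouts, file page 7 is found once in `linear_vma` but twice in `nonlinear_vma`, so a key that only records (inode, pgoff, offset) cannot be turned back into a single user address there, even after a full walk.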
Re: [PATCH 2.6.21-rc3-mm2 3/4] futex_requeue_pi optimization
Peter Zijlstra wrote:
>> +static void *get_futex_address(union futex_key *key)
>> +{
>> +	void *uaddr;
>> +
>> +	if (key->both.offset & 1) {
>> +		/* shared mapping */
>> +		uaddr = (void*)((key->shared.pgoff << PAGE_SHIFT)
>> +				+ key->shared.offset - 1);
>> +	} else {
>> +		/* private mapping */
>> +		uaddr = (void*)(key->private.address + key->private.offset);
>> +	}
>> +
>> +	return uaddr;
>> +}
>
> This will not work for nonlinear vmas, granted, not a lot of ppl stick
> futexes in nonlinear vmas, but the futex_key stuff handles it, this
> doesn't.

Indeed ! Thanks for pointing me to this.

Since I'm not familiar with vmm, does this code look more correct to you ?

static void *get_futex_address(union futex_key *key)
{
	void *uaddr;
	struct vm_area_struct *vma = current->mm->mmap;

	if (key->both.offset & 1) {
		/* shared mapping */
		struct file * vmf;

		do {
			if ((vmf = vma->vm_file)
			    && (key->shared.inode == vmf->f_dentry->d_inode))
				break;
			vma = vma->vm_next;
		} while (vma);

		if (likely(!(vma->vm_flags & VM_NONLINEAR)))
			uaddr = (void*)((key->shared.pgoff << PAGE_SHIFT)
					+ key->shared.offset - 1);
		else
			uaddr = (void*) vma->vm_start
				+ ((key->shared.pgoff - vma->vm_pgoff)
				   << PAGE_SHIFT)
				+ key->shared.offset - 1;
	} else {
		/* private mapping */
		uaddr = (void*)(key->private.address + key->private.offset);
	}

	return uaddr;
}

Or is there a more direct way to retrieve the vma corresponding to the given
inode ?

Thanks,

-- 
Pierre
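One detail of the vma walk proposed in this message: when no vma is backed by the inode, the `do { } while (vma)` loop ends with `vma == NULL`, and the `vma->vm_flags` test that follows would dereference it. A minimal sketch of a NULL-safe search, using a hypothetical userspace model of the list (`model_inode`, `model_file`, `model_vma` and `find_vma_by_inode` are made up for illustration and are not kernel APIs):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects involved in the walk. */
struct model_inode { int id; };
struct model_file { const struct model_inode *inode; };
struct model_vma {
	const struct model_file *vm_file; /* NULL for anonymous memory */
	struct model_vma *vm_next;        /* next vma in the list, or NULL */
};

/* Return the first vma backed by inode, or NULL when there is none.
 * The caller must check the result before dereferencing it. */
static struct model_vma *find_vma_by_inode(struct model_vma *head,
					   const struct model_inode *inode)
{
	assert(inode != NULL);
	for (struct model_vma *vma = head; vma; vma = vma->vm_next) {
		if (vma->vm_file && vma->vm_file->inode == inode)
			return vma;
	}
	return NULL;
}
```

Even with the NULL check, the first match is not necessarily the right vma, since one inode may be mapped more than once in the same address space.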
Re: [PATCH 2.6.21-rc3-mm2 3/4] futex_requeue_pi optimization
On Tue, 2007-03-13 at 10:52 +0100, [EMAIL PROTECTED] wrote:
> plain text document attachment (futex-requeue-pi.diff)
> This patch provides the futex_requeue_pi functionality.
>
> This provides an optimization, already used for (normal) futexes, to be used
> for PI-futexes.
>
> This optimization is currently used by the glibc in pthread_broadcast, when
> using "normal" mutexes. With futex_requeue_pi, it can be used with
> PRIO_INHERIT mutexes too.
>
> Signed-off-by: Pierre Peiffer <[EMAIL PROTECTED]>
>
> ---

>  /*
> + * Retrieve the original address used to compute this key
> + */
> +static void *get_futex_address(union futex_key *key)
> +{
> +	void *uaddr;
> +
> +	if (key->both.offset & 1) {
> +		/* shared mapping */
> +		uaddr = (void*)((key->shared.pgoff << PAGE_SHIFT)
> +				+ key->shared.offset - 1);
> +	} else {
> +		/* private mapping */
> +		uaddr = (void*)(key->private.address + key->private.offset);
> +	}
> +
> +	return uaddr;
> +}

This will not work for nonlinear vmas, granted, not a lot of ppl stick
futexes in nonlinear vmas, but the futex_key stuff handles it, this
doesn't.
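To make the key arithmetic in the quoted function easier to follow, here is a simplified userspace model of how a key could be built and then inverted. It is a sketch under assumptions: `model_key`, `make_private_key`, `make_shared_key` and `key_to_address` are hypothetical names, the single `word` field stands in for both `private.address` and `shared.pgoff`, and the encoding (bit 0 of `offset` marking a shared key, `pgoff` computed from `vm_start` and `vm_pgoff`) is modelled on what the quoted code implies rather than copied from the kernel.

```c
#include <assert.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/* Hypothetical, flattened stand-in for union futex_key. */
struct model_key {
	unsigned long word;  /* private: page address; shared: file page index */
	unsigned int offset; /* offset within the page; bit 0 set => shared */
};

/* Private mapping: remember the page-aligned address and the offset. */
static struct model_key make_private_key(unsigned long uaddr)
{
	struct model_key k;

	assert(uaddr % 4 == 0); /* 4-byte aligned, so bit 0 is free */
	k.offset = (unsigned int)(uaddr % MODEL_PAGE_SIZE);
	k.word = uaddr - k.offset;
	return k;
}

/* Shared, linear mapping: remember the file page index and tag bit 0. */
static struct model_key make_shared_key(unsigned long uaddr,
					unsigned long vm_start,
					unsigned long vm_pgoff)
{
	struct model_key k;

	assert(uaddr % 4 == 0);
	k.offset = (unsigned int)(uaddr % MODEL_PAGE_SIZE) + 1;
	k.word = ((uaddr - vm_start) >> MODEL_PAGE_SHIFT) + vm_pgoff;
	return k;
}

/* Same arithmetic as the posted get_futex_address(). */
static unsigned long key_to_address(const struct model_key *k)
{
	if (k->offset & 1) /* shared mapping */
		return (k->word << MODEL_PAGE_SHIFT) + k->offset - 1;
	return k->word + k->offset; /* private mapping */
}
```

In this model a private key round-trips to the original address, while a shared key comes back as a byte offset into the file: the `vm_start` and `vm_pgoff` of the mapping are no longer available once only the key is kept.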
[PATCH 2.6.21-rc3-mm2 3/4] futex_requeue_pi optimization
This patch provides the futex_requeue_pi functionality.

This provides an optimization, already used for (normal) futexes, to be used for
PI-futexes.

This optimization is currently used by the glibc in pthread_broadcast, when
using "normal" mutexes. With futex_requeue_pi, it can be used with PRIO_INHERIT
mutexes too.

Signed-off-by: Pierre Peiffer <[EMAIL PROTECTED]>

---
 include/linux/futex.h   |    8
 kernel/futex.c          |  557 +++-
 kernel/futex_compat.c   |    3
 kernel/rtmutex.c        |   41 ---
 kernel/rtmutex_common.h |   34 ++
 5 files changed, 555 insertions(+), 88 deletions(-)

Index: b/include/linux/futex.h
===
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -15,6 +15,7 @@
 #define FUTEX_LOCK_PI		6
 #define FUTEX_UNLOCK_PI		7
 #define FUTEX_TRYLOCK_PI	8
+#define FUTEX_CMP_REQUEUE_PI	9
 
 /*
  * Support for robust futexes: the kernel cleans up held futexes at
@@ -83,9 +84,14 @@ struct robust_list_head {
 #define FUTEX_OWNER_DIED	0x40000000
 
 /*
+ * Some processes have been requeued on this PI-futex
+ */
+#define FUTEX_WAITER_REQUEUED	0x20000000
+
+/*
  * The rest of the robust-futex field is for the TID:
  */
-#define FUTEX_TID_MASK		0x3fffffff
+#define FUTEX_TID_MASK		0x0fffffff
 
 /*
  * This limit protects against a deliberately circular list.
Index: b/kernel/futex.c
===
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -53,6 +53,12 @@
 
 #include "rtmutex_common.h"
 
+#ifdef CONFIG_DEBUG_RT_MUTEXES
+# include "rtmutex-debug.h"
+#else
+# include "rtmutex.h"
+#endif
+
 #define FUTEX_HASHBITS (CONFIG_BASE_SMALL ? 4 : 8)
 
 /*
@@ -102,6 +108,12 @@ struct futex_q {
 	/* Optional priority inheritance state: */
 	struct futex_pi_state *pi_state;
 	struct task_struct *task;
+
+	/*
+	 * This waiter is used in case of requeue from a
+	 * normal futex to a PI-futex
+	 */
+	struct rt_mutex_waiter waiter;
 };
 
 /*
@@ -224,6 +236,25 @@ int get_futex_key(u32 __user *uaddr, uni
 EXPORT_SYMBOL_GPL(get_futex_key);
 
 /*
+ * Retrieve the original address used to compute this key
+ */
+static void *get_futex_address(union futex_key *key)
+{
+	void *uaddr;
+
+	if (key->both.offset & 1) {
+		/* shared mapping */
+		uaddr = (void*)((key->shared.pgoff << PAGE_SHIFT)
+				+ key->shared.offset - 1);
+	} else {
+		/* private mapping */
+		uaddr = (void*)(key->private.address + key->private.offset);
+	}
+
+	return uaddr;
+}
+
+/*
  * Take a reference to the resource addressed by a key.
  * Can be called while holding spinlocks.
  *
@@ -439,7 +470,8 @@ void exit_pi_state_list(struct task_stru
 }
 
 static int
-lookup_pi_state(u32 uval, struct futex_hash_bucket *hb, struct futex_q *me)
+lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
+		union futex_key *key, struct futex_pi_state **ps)
 {
 	struct futex_pi_state *pi_state = NULL;
 	struct futex_q *this, *next;
@@ -450,7 +482,7 @@ lookup_pi_state(u32 uval, struct futex_h
 	head = &hb->chain;
 
 	plist_for_each_entry_safe(this, next, head, list) {
-		if (match_futex(&this->key, &me->key)) {
+		if (match_futex(&this->key, key)) {
 			/*
 			 * Another waiter already exists - bump up
 			 * the refcount and return its pi_state:
@@ -465,7 +497,7 @@ lookup_pi_state(u32 uval, struct futex_h
 			WARN_ON(!atomic_read(&pi_state->refcount));
 
 			atomic_inc(&pi_state->refcount);
-			me->pi_state = pi_state;
+			*ps = pi_state;
 
 			return 0;
 		}
@@ -492,7 +524,7 @@ lookup_pi_state(u32 uval, struct futex_h
 	rt_mutex_init_proxy_locked(&pi_state->pi_mutex, p);
 
 	/* Store the key for possible exit cleanups: */
-	pi_state->key = me->key;
+	pi_state->key = *key;
 
 	spin_lock_irq(&p->pi_lock);
 	WARN_ON(!list_empty(&pi_state->list));
@@ -502,7 +534,7 @@ lookup_pi_state(u32 uval, struct futex_h
 
 	put_task_struct(p);
 
-	me->pi_state = pi_state;
+	*ps = pi_state;
 
 	return 0;
 }
@@ -561,6 +593,8 @@ static int wake_futex_pi(u32 __user *uad
 	 */
 	if (!(uval & FUTEX_OWNER_DIED)) {
 		newval = FUTEX_WAITERS | new_owner->pid;
+		/* Keep the FUTEX_WAITER_REQUEUED flag if it was set */
+		newval |= (uval & FUTEX_WAITER_REQUEUED);
 
 		pagefault_disable();
 		curval = futex_atomic_cmpxchg_inatomic(uaddr, uval, newval);
@@ -664,6 +698,254 @@ out:
 }
 
 /*
+ * Called from futex_requeue_pi.
+ * Set FUTEX_WAITERS and FUTEX_WAITE