Re: [PATCH 2.6.21-rc3-mm2 3/4] futex_requeue_pi optimization

2007-03-20 Thread Pierre Peiffer

Peter Zijlstra wrote:

> Unfortunately not, nonlinear vmas don't have a linear relation between
> address and offset. What you would need to do is do a linear walk of the
> page tables. But even that might not suffice if nonlinear vmas may form
> a non-injective, surjective mapping.
>
> /me checks..
>
> Hmm, yes that seems valid, so in general, this reverse mapping does not
> uniquely exist for non-linear vmas. :-(
>
> What to do... disallow futexes in nonlinear mappings, store the address
> in the key?

Storing the address in the key seems to be the only solution... :-/

> the vma_prio_tree would be able to give all vmas associated with a
> mapping.

Thanks for your help.

--
Pierre
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2.6.21-rc3-mm2 3/4] futex_requeue_pi optimization

2007-03-20 Thread Peter Zijlstra
On Tue, 2007-03-20 at 16:32 +0100, Pierre Peiffer wrote:
> Peter Zijlstra wrote:
> >> +static void *get_futex_address(union futex_key *key)
> >> +{
> >> +	void *uaddr;
> >> +
> >> +	if (key->both.offset & 1) {
> >> +		/* shared mapping */
> >> +		uaddr = (void*)((key->shared.pgoff << PAGE_SHIFT)
> >> +				+ key->shared.offset - 1);
> >> +	} else {
> >> +		/* private mapping */
> >> +		uaddr = (void*)(key->private.address + key->private.offset);
> >> +	}
> >> +
> >> +	return uaddr;
> >> +}
> > 
> > This will not work for nonlinear vmas, granted, not a lot of ppl stick
> > futexes in nonlinear vmas, but the futex_key stuff handles it, this
> > doesn't.
> 
> Indeed! Thanks for pointing this out to me.
> 
> Since I'm not familiar with the VM, does this code look more correct to you?

Unfortunately not, nonlinear vmas don't have a linear relation between
address and offset. What you would need to do is do a linear walk of the
page tables. But even that might not suffice if nonlinear vmas may form
a non-injective, surjective mapping.

/me checks..

Hmm, yes that seems valid, so in general, this reverse mapping does not
uniquely exist for non-linear vmas. :-(

What to do... disallow futexes in nonlinear mappings, store the address
in the key?

> static void *get_futex_address(union futex_key *key)
> {
> 	void *uaddr;
> 	struct vm_area_struct *vma = current->mm->mmap;
> 
> 	if (key->both.offset & 1) {
> 		/* shared mapping */
> 		struct file * vmf;
> 
> 		do {
> 			if ((vmf = vma->vm_file)
> 			    && (key->shared.inode == vmf->f_dentry->d_inode))
> 				break;
> 			vma = vma->vm_next;
> 		} while (vma);
> 
> 		if (likely(!(vma->vm_flags & VM_NONLINEAR)))
> 			uaddr = (void*)((key->shared.pgoff << PAGE_SHIFT)
> 					+ key->shared.offset - 1);
> 		else
> 			uaddr = (void*) vma->vm_start
> 				+ ((key->shared.pgoff - vma->vm_pgoff)
> 				   << PAGE_SHIFT)
> 				+ key->shared.offset - 1;
> 	} else {
> 		/* private mapping */
> 		uaddr = (void*)(key->private.address + key->private.offset);
> 	}
> 
> 	return uaddr;
> }
> 
> Or is there a more direct way to retrieve the vma corresponding to the given
> inode?

the vma_prio_tree would be able to give all vmas associated with a
mapping.



Re: [PATCH 2.6.21-rc3-mm2 3/4] futex_requeue_pi optimization

2007-03-20 Thread Pierre Peiffer

Peter Zijlstra wrote:

>> +static void *get_futex_address(union futex_key *key)
>> +{
>> +	void *uaddr;
>> +
>> +	if (key->both.offset & 1) {
>> +		/* shared mapping */
>> +		uaddr = (void*)((key->shared.pgoff << PAGE_SHIFT)
>> +				+ key->shared.offset - 1);
>> +	} else {
>> +		/* private mapping */
>> +		uaddr = (void*)(key->private.address + key->private.offset);
>> +	}
>> +
>> +	return uaddr;
>> +}
>
> This will not work for nonlinear vmas, granted, not a lot of ppl stick
> futexes in nonlinear vmas, but the futex_key stuff handles it, this
> doesn't.

Indeed! Thanks for pointing this out to me.

Since I'm not familiar with the VM, does this code look more correct to you?

static void *get_futex_address(union futex_key *key)
{
	void *uaddr;
	struct vm_area_struct *vma = current->mm->mmap;

	if (key->both.offset & 1) {
		/* shared mapping */
		struct file * vmf;

		do {
			if ((vmf = vma->vm_file)
			    && (key->shared.inode == vmf->f_dentry->d_inode))
				break;
			vma = vma->vm_next;
		} while (vma);

		if (likely(!(vma->vm_flags & VM_NONLINEAR)))
			uaddr = (void*)((key->shared.pgoff << PAGE_SHIFT)
					+ key->shared.offset - 1);
		else
			uaddr = (void*) vma->vm_start
				+ ((key->shared.pgoff - vma->vm_pgoff)
				   << PAGE_SHIFT)
				+ key->shared.offset - 1;
	} else {
		/* private mapping */
		uaddr = (void*)(key->private.address + key->private.offset);
	}

	return uaddr;
}

Or is there a more direct way to retrieve the vma corresponding to the given
inode?

Thanks,

--
Pierre


Re: [PATCH 2.6.21-rc3-mm2 3/4] futex_requeue_pi optimization

2007-03-16 Thread Peter Zijlstra
On Tue, 2007-03-13 at 10:52 +0100, [EMAIL PROTECTED] wrote:
> plain text document attachment (futex-requeue-pi.diff)
> This patch provides the futex_requeue_pi functionality.
> 
> This provides an optimization, already used for (normal) futexes, to be used
> for PI-futexes.
> 
> This optimization is currently used by glibc in pthread_cond_broadcast, when
> using "normal" mutexes. With futex_requeue_pi, it can be used with
> PRIO_INHERIT mutexes too.
> 
> Signed-off-by: Pierre Peiffer <[EMAIL PROTECTED]>
> 
> ---

>  /*
> + * Retrieve the original address used to compute this key
> + */
> +static void *get_futex_address(union futex_key *key)
> +{
> +	void *uaddr;
> +
> +	if (key->both.offset & 1) {
> +		/* shared mapping */
> +		uaddr = (void*)((key->shared.pgoff << PAGE_SHIFT)
> +				+ key->shared.offset - 1);
> +	} else {
> +		/* private mapping */
> +		uaddr = (void*)(key->private.address + key->private.offset);
> +	}
> +
> +	return uaddr;
> +}

This will not work for nonlinear vmas, granted, not a lot of ppl stick
futexes in nonlinear vmas, but the futex_key stuff handles it, this
doesn't.




[PATCH 2.6.21-rc3-mm2 3/4] futex_requeue_pi optimization

2007-03-13 Thread Pierre . Peiffer
This patch provides the futex_requeue_pi functionality.

It extends the requeue optimization, already used for (normal) futexes, to
PI-futexes.

This optimization is currently used by glibc in pthread_cond_broadcast, when
using "normal" mutexes. With futex_requeue_pi, it can be used with PRIO_INHERIT
mutexes too.

Signed-off-by: Pierre Peiffer <[EMAIL PROTECTED]>

---
 include/linux/futex.h   |    8
 kernel/futex.c          |  557 +++-
 kernel/futex_compat.c   |    3
 kernel/rtmutex.c        |   41 ---
 kernel/rtmutex_common.h |   34 ++
 5 files changed, 555 insertions(+), 88 deletions(-)

Index: b/include/linux/futex.h
===================================================================
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -15,6 +15,7 @@
 #define FUTEX_LOCK_PI		6
 #define FUTEX_UNLOCK_PI		7
 #define FUTEX_TRYLOCK_PI	8
+#define FUTEX_CMP_REQUEUE_PI	9
 
 /*
  * Support for robust futexes: the kernel cleans up held futexes at
@@ -83,9 +84,14 @@ struct robust_list_head {
 #define FUTEX_OWNER_DIED	0x40000000
 
 /*
+ * Some processes have been requeued on this PI-futex
+ */
+#define FUTEX_WAITER_REQUEUED	0x20000000
+
+/*
  * The rest of the robust-futex field is for the TID:
  */
-#define FUTEX_TID_MASK		0x3fffffff
+#define FUTEX_TID_MASK		0x0fffffff
 
 /*
  * This limit protects against a deliberately circular list.
Index: b/kernel/futex.c
===================================================================
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -53,6 +53,12 @@
 
 #include "rtmutex_common.h"
 
+#ifdef CONFIG_DEBUG_RT_MUTEXES
+# include "rtmutex-debug.h"
+#else
+# include "rtmutex.h"
+#endif
+
 #define FUTEX_HASHBITS (CONFIG_BASE_SMALL ? 4 : 8)
 
 /*
@@ -102,6 +108,12 @@ struct futex_q {
 	/* Optional priority inheritance state: */
 	struct futex_pi_state *pi_state;
 	struct task_struct *task;
+
+	/*
+	 * This waiter is used in case of requeue from a
+	 * normal futex to a PI-futex
+	 */
+	struct rt_mutex_waiter waiter;
 };
 
 /*
@@ -224,6 +236,25 @@ int get_futex_key(u32 __user *uaddr, uni
 EXPORT_SYMBOL_GPL(get_futex_key);
 
 /*
+ * Retrieve the original address used to compute this key
+ */
+static void *get_futex_address(union futex_key *key)
+{
+	void *uaddr;
+
+	if (key->both.offset & 1) {
+		/* shared mapping */
+		uaddr = (void*)((key->shared.pgoff << PAGE_SHIFT)
+				+ key->shared.offset - 1);
+	} else {
+		/* private mapping */
+		uaddr = (void*)(key->private.address + key->private.offset);
+	}
+
+	return uaddr;
+}
+
+/*
  * Take a reference to the resource addressed by a key.
  * Can be called while holding spinlocks.
  *
@@ -439,7 +470,8 @@ void exit_pi_state_list(struct task_stru
 }
 
 static int
-lookup_pi_state(u32 uval, struct futex_hash_bucket *hb, struct futex_q *me)
+lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
+		union futex_key *key, struct futex_pi_state **ps)
 {
 	struct futex_pi_state *pi_state = NULL;
 	struct futex_q *this, *next;
@@ -450,7 +482,7 @@ lookup_pi_state(u32 uval, struct futex_h
 	head = &hb->chain;
 
 	plist_for_each_entry_safe(this, next, head, list) {
-		if (match_futex(&this->key, &me->key)) {
+		if (match_futex(&this->key, key)) {
 			/*
 			 * Another waiter already exists - bump up
 			 * the refcount and return its pi_state:
@@ -465,7 +497,7 @@ lookup_pi_state(u32 uval, struct futex_h
 			WARN_ON(!atomic_read(&pi_state->refcount));
 
 			atomic_inc(&pi_state->refcount);
-			me->pi_state = pi_state;
+			*ps = pi_state;
 
 			return 0;
 		}
@@ -492,7 +524,7 @@ lookup_pi_state(u32 uval, struct futex_h
 	rt_mutex_init_proxy_locked(&pi_state->pi_mutex, p);
 
 	/* Store the key for possible exit cleanups: */
-	pi_state->key = me->key;
+	pi_state->key = *key;
 
 	spin_lock_irq(&p->pi_lock);
 	WARN_ON(!list_empty(&pi_state->list));
@@ -502,7 +534,7 @@ lookup_pi_state(u32 uval, struct futex_h
 
 	put_task_struct(p);
 
-	me->pi_state = pi_state;
+	*ps = pi_state;
 
 	return 0;
 }
@@ -561,6 +593,8 @@ static int wake_futex_pi(u32 __user *uad
 	 */
 	if (!(uval & FUTEX_OWNER_DIED)) {
 		newval = FUTEX_WAITERS | new_owner->pid;
+		/* Keep the FUTEX_WAITER_REQUEUED flag if it was set */
+		newval |= (uval & FUTEX_WAITER_REQUEUED);
 
 		pagefault_disable();
 		curval = futex_atomic_cmpxchg_inatomic(uaddr, uval, newval);
@@ -664,6 +698,254 @@ out:
 }
 
 /*
+ * Called from futex_requeue_pi.
+ * Set FUTEX_WAITERS and FUTEX_WAITE