Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-12 Thread Eric W. Biederman
"Adam Litke" <[EMAIL PROTECTED]> writes:

> On 6/12/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
>> Adam Litke <[EMAIL PROTECTED]> writes:
>>
>> > Here's another breakage as a result of shared memory stacked files :(
>> >
>> > The NUMA policy for a VMA is determined by checking the following (in the
> order
>> > given):

Where that doesn't appear to be what the current code does.
If there is a VMA it doesn't appear that we check task->mempolicy.

>> > 1) vma->vm_ops->get_policy() (if defined)
>> > 2) vma->vm_policy (if defined)
>> > 3) task->mempolicy (if defined)
>> > 4) Fall back to default_policy
>> >
>> > By switching to stacked files for shared memory, get_policy() is now always
> set
>> > to shm_get_policy which is a wrapper function.  This causes us to stop at
> step
>> > 1, which yields NULL for hugetlb instead of task->mempolicy which was the
>> > previous (and correct) result.
>> >
>> > This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for
> the
>> > wrapped vm_ops.  Andi and Christoph, does this look right to you?
>>
>> I'm confused.
>>
>> I agree that the behavior you describe is correct.
>> However I only see two code paths were get_policy is called and
>> both of them take a NULL result and change it to task->mempolicy:
>
> The coffee hasn't quite absorbed yet, but don't those two code paths
> take a NULL result from get_policy() and turn it into default_policy,
> not task->mempolicy?

True, a NULL is turned into default_policy.  However the basic
confusion remains.   I don't see how we ever return task->mempolicy
on those paths if we have a vma, and those appear to be the only
call sites of get_policy.

>> From mm/mempolicy.c
>>
>> > long do_get_mempolicy(int *policy, nodemask_t *nmask,
>> >   unsigned long addr, unsigned long flags)
>> > {
>> >   int err;
>> >   struct mm_struct *mm = current->mm;
>> >   struct vm_area_struct *vma = NULL;
>> >   struct mempolicy *pol = current->mempolicy;
>> >
>> >   cpuset_update_task_memory_state();
>> >   if (flags & ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR))
>> >   return -EINVAL;
>> >   if (flags & MPOL_F_ADDR) {
>> >   down_read(>mmap_sem);
>> >   vma = find_vma_intersection(mm, addr, addr+1);
>> >   if (!vma) {
>> >   up_read(>mmap_sem);
>> >   return -EFAULT;
>> >   }
>> >   if (vma->vm_ops && vma->vm_ops->get_policy)
>> >   pol = vma->vm_ops->get_policy(vma, addr);
>> >   else
>> >   pol = vma->vm_policy;
>> >   } else if (addr)
>> >   return -EINVAL;
>> >
>> >   if (!pol)
>> >   pol = _policy;
>>
>>
>>
>> > /* Return effective policy for a VMA */
>> > static struct mempolicy * get_vma_policy(struct task_struct *task,
>> >   struct vm_area_struct *vma, unsigned long addr)
>> > {
>> >   struct mempolicy *pol = task->mempolicy;
>> >
>> >   if (vma) {
>> >   if (vma->vm_ops && vma->vm_ops->get_policy)
>> >   pol = vma->vm_ops->get_policy(vma, addr);
>> >   else if (vma->vm_policy &&
>> >   vma->vm_policy->policy != MPOL_DEFAULT)
>> >   pol = vma->vm_policy;
>> >   }
>> >   if (!pol)
>> >   pol = _policy;
>> >   return pol;
>> > }
>>
>>
>> Does this perhaps need to be:
>> > Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
>> >
>> > diff --git a/ipc/shm.c b/ipc/shm.c
>> > index 4fefbad..8d2672d 100644
>> > --- a/ipc/shm.c
>> > +++ b/ipc/shm.c
>> > @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct
>> > *vma, unsigned long addr)
>>
>> +   pol = NULL;
>> >
>> >   if (sfd->vm_ops->get_policy)
>> >   pol = sfd->vm_ops->get_policy(vma, addr);
>> > - else
>> > + else if (vma->vm_policy && vma->vm_policy->policy != MPOL_DEFAULT)
>> >   pol = vma->vm_policy;
>> >   return pol;
>> >  }
>> >  #endif
>
> afaict this would provide no way for pol to be set to task->mempolicy
> for hugetlb per my comment above.

Being dense I don't see a way for us to have ever returned task->mempolicy.

So perhaps something else changed.  Or something was made more
consistent and another bug was revealed.

I don't truly see the bug in shm_get_policy.  I can see why it would
not have the expected affect but that is different.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-12 Thread Adam Litke

On 6/12/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:

Adam Litke <[EMAIL PROTECTED]> writes:

> Here's another breakage as a result of shared memory stacked files :(
>
> The NUMA policy for a VMA is determined by checking the following (in the 
order
> given):
>
> 1) vma->vm_ops->get_policy() (if defined)
> 2) vma->vm_policy (if defined)
> 3) task->mempolicy (if defined)
> 4) Fall back to default_policy
>
> By switching to stacked files for shared memory, get_policy() is now always 
set
> to shm_get_policy which is a wrapper function.  This causes us to stop at step
> 1, which yields NULL for hugetlb instead of task->mempolicy which was the
> previous (and correct) result.
>
> This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for the
> wrapped vm_ops.  Andi and Christoph, does this look right to you?

I'm confused.

I agree that the behavior you describe is correct.
However I only see two code paths were get_policy is called and
both of them take a NULL result and change it to task->mempolicy:


The coffee hasn't quite absorbed yet, but don't those two code paths
take a NULL result from get_policy() and turn it into default_policy,
not task->mempolicy?


From mm/mempolicy.c

> long do_get_mempolicy(int *policy, nodemask_t *nmask,
>   unsigned long addr, unsigned long flags)
> {
>   int err;
>   struct mm_struct *mm = current->mm;
>   struct vm_area_struct *vma = NULL;
>   struct mempolicy *pol = current->mempolicy;
>
>   cpuset_update_task_memory_state();
>   if (flags & ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR))
>   return -EINVAL;
>   if (flags & MPOL_F_ADDR) {
>   down_read(>mmap_sem);
>   vma = find_vma_intersection(mm, addr, addr+1);
>   if (!vma) {
>   up_read(>mmap_sem);
>   return -EFAULT;
>   }
>   if (vma->vm_ops && vma->vm_ops->get_policy)
>   pol = vma->vm_ops->get_policy(vma, addr);
>   else
>   pol = vma->vm_policy;
>   } else if (addr)
>   return -EINVAL;
>
>   if (!pol)
>   pol = _policy;



> /* Return effective policy for a VMA */
> static struct mempolicy * get_vma_policy(struct task_struct *task,
>   struct vm_area_struct *vma, unsigned long addr)
> {
>   struct mempolicy *pol = task->mempolicy;
>
>   if (vma) {
>   if (vma->vm_ops && vma->vm_ops->get_policy)
>   pol = vma->vm_ops->get_policy(vma, addr);
>   else if (vma->vm_policy &&
>   vma->vm_policy->policy != MPOL_DEFAULT)
>   pol = vma->vm_policy;
>   }
>   if (!pol)
>   pol = _policy;
>   return pol;
> }


Does this perhaps need to be:
> Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
>
> diff --git a/ipc/shm.c b/ipc/shm.c
> index 4fefbad..8d2672d 100644
> --- a/ipc/shm.c
> +++ b/ipc/shm.c
> @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct
> *vma, unsigned long addr)

+   pol = NULL;
>
>   if (sfd->vm_ops->get_policy)
>   pol = sfd->vm_ops->get_policy(vma, addr);
> - else
> + else if (vma->vm_policy && vma->vm_policy->policy != MPOL_DEFAULT)
>   pol = vma->vm_policy;
>   return pol;
>  }
>  #endif


afaict this would provide no way for pol to be set to task->mempolicy
for hugetlb per my comment above.

--
Adam Litke ( agl at us.ibm.com )
IBM Linux Technology Center
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-12 Thread Andi Kleen

> Can we just double-check the refcounting please?
> 
> > index 4fefbad..8d2672d 100644
> > --- a/ipc/shm.c
> > +++ b/ipc/shm.c
> > @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct 
> > *vma, unsigned long addr)
> >  
> > if (sfd->vm_ops->get_policy)
> > pol = sfd->vm_ops->get_policy(vma, addr);
> 
> afacit this takes a ref on the underlying policy
> 
> > -   else
> > +   else if (vma->vm_policy)
> > pol = vma->vm_policy;
> > +   else
> > +   pol = current->mempolicy;
> 
> but these two do not.
> 
> > return pol;
> >  }
> >  #endif
> 
> Is is all correct?

No it looks broken.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-12 Thread William Lee Irwin III
On Tue, Jun 12, 2007 at 12:20:52AM -0600, Eric W. Biederman wrote:
> Does this perhaps need to be:
>> diff --git a/ipc/shm.c b/ipc/shm.c
>> index 4fefbad..8d2672d 100644
>> --- a/ipc/shm.c
>> +++ b/ipc/shm.c
>> @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct
>> *vma, unsigned long addr)
>> 
>> +pol = NULL;
>>  
>>  if (sfd->vm_ops->get_policy)
>>  pol = sfd->vm_ops->get_policy(vma, addr);
>> -else
>> +else if (vma->vm_policy && vma->vm_policy->policy != MPOL_DEFAULT)
>>  pol = vma->vm_policy;
>>  return pol;

Those paths are above the level where shm_get_policy() is called.
It may be that shm_get_policy() doesn't need to recapitulate them 
if it's only ever called through such codepaths. It's not clear to
me whether that's intended as an invariant or is coincidental and
not guaranteed for future callsites.


-- wli
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-12 Thread Eric W. Biederman
Adam Litke <[EMAIL PROTECTED]> writes:

> Here's another breakage as a result of shared memory stacked files :(
>
> The NUMA policy for a VMA is determined by checking the following (in the 
> order
> given):
>
> 1) vma->vm_ops->get_policy() (if defined)
> 2) vma->vm_policy (if defined)
> 3) task->mempolicy (if defined)
> 4) Fall back to default_policy
>
> By switching to stacked files for shared memory, get_policy() is now always 
> set
> to shm_get_policy which is a wrapper function.  This causes us to stop at step
> 1, which yields NULL for hugetlb instead of task->mempolicy which was the
> previous (and correct) result.
>
> This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for the
> wrapped vm_ops.  Andi and Christoph, does this look right to you?

I'm confused.

I agree that the behavior you describe is correct.
However I only see two code paths were get_policy is called and
both of them take a NULL result and change it to task->mempolicy:

>From mm/mempolicy.c

> long do_get_mempolicy(int *policy, nodemask_t *nmask,
>   unsigned long addr, unsigned long flags)
> {
>   int err;
>   struct mm_struct *mm = current->mm;
>   struct vm_area_struct *vma = NULL;
>   struct mempolicy *pol = current->mempolicy;
> 
>   cpuset_update_task_memory_state();
>   if (flags & ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR))
>   return -EINVAL;
>   if (flags & MPOL_F_ADDR) {
>   down_read(>mmap_sem);
>   vma = find_vma_intersection(mm, addr, addr+1);
>   if (!vma) {
>   up_read(>mmap_sem);
>   return -EFAULT;
>   }
>   if (vma->vm_ops && vma->vm_ops->get_policy)
>   pol = vma->vm_ops->get_policy(vma, addr);
>   else
>   pol = vma->vm_policy;
>   } else if (addr)
>   return -EINVAL;
> 
>   if (!pol)
>   pol = _policy;



> /* Return effective policy for a VMA */
> static struct mempolicy * get_vma_policy(struct task_struct *task,
>   struct vm_area_struct *vma, unsigned long addr)
> {
>   struct mempolicy *pol = task->mempolicy;
> 
>   if (vma) {
>   if (vma->vm_ops && vma->vm_ops->get_policy)
>   pol = vma->vm_ops->get_policy(vma, addr);
>   else if (vma->vm_policy &&
>   vma->vm_policy->policy != MPOL_DEFAULT)
>   pol = vma->vm_policy;
>   }
>   if (!pol)
>   pol = _policy;
>   return pol;
> }


Does this perhaps need to be:
> Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
>
> diff --git a/ipc/shm.c b/ipc/shm.c
> index 4fefbad..8d2672d 100644
> --- a/ipc/shm.c
> +++ b/ipc/shm.c
> @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct
> *vma, unsigned long addr)

+   pol = NULL;
>  
>   if (sfd->vm_ops->get_policy)
>   pol = sfd->vm_ops->get_policy(vma, addr);
> - else
> + else if (vma->vm_policy && vma->vm_policy->policy != MPOL_DEFAULT)
>   pol = vma->vm_policy;
>   return pol;
>  }
>  #endif

Sorry I'm just a little dense at the moment.

Eric

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-12 Thread Eric W. Biederman
Adam Litke [EMAIL PROTECTED] writes:

 Here's another breakage as a result of shared memory stacked files :(

 The NUMA policy for a VMA is determined by checking the following (in the 
 order
 given):

 1) vma-vm_ops-get_policy() (if defined)
 2) vma-vm_policy (if defined)
 3) task-mempolicy (if defined)
 4) Fall back to default_policy

 By switching to stacked files for shared memory, get_policy() is now always 
 set
 to shm_get_policy which is a wrapper function.  This causes us to stop at step
 1, which yields NULL for hugetlb instead of task-mempolicy which was the
 previous (and correct) result.

 This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for the
 wrapped vm_ops.  Andi and Christoph, does this look right to you?

I'm confused.

I agree that the behavior you describe is correct.
However I only see two code paths were get_policy is called and
both of them take a NULL result and change it to task-mempolicy:

From mm/mempolicy.c

 long do_get_mempolicy(int *policy, nodemask_t *nmask,
   unsigned long addr, unsigned long flags)
 {
   int err;
   struct mm_struct *mm = current-mm;
   struct vm_area_struct *vma = NULL;
   struct mempolicy *pol = current-mempolicy;
 
   cpuset_update_task_memory_state();
   if (flags  ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR))
   return -EINVAL;
   if (flags  MPOL_F_ADDR) {
   down_read(mm-mmap_sem);
   vma = find_vma_intersection(mm, addr, addr+1);
   if (!vma) {
   up_read(mm-mmap_sem);
   return -EFAULT;
   }
   if (vma-vm_ops  vma-vm_ops-get_policy)
   pol = vma-vm_ops-get_policy(vma, addr);
   else
   pol = vma-vm_policy;
   } else if (addr)
   return -EINVAL;
 
   if (!pol)
   pol = default_policy;



 /* Return effective policy for a VMA */
 static struct mempolicy * get_vma_policy(struct task_struct *task,
   struct vm_area_struct *vma, unsigned long addr)
 {
   struct mempolicy *pol = task-mempolicy;
 
   if (vma) {
   if (vma-vm_ops  vma-vm_ops-get_policy)
   pol = vma-vm_ops-get_policy(vma, addr);
   else if (vma-vm_policy 
   vma-vm_policy-policy != MPOL_DEFAULT)
   pol = vma-vm_policy;
   }
   if (!pol)
   pol = default_policy;
   return pol;
 }


Does this perhaps need to be:
 Signed-off-by: Adam Litke [EMAIL PROTECTED]

 diff --git a/ipc/shm.c b/ipc/shm.c
 index 4fefbad..8d2672d 100644
 --- a/ipc/shm.c
 +++ b/ipc/shm.c
 @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct
 *vma, unsigned long addr)

+   pol = NULL;
  
   if (sfd-vm_ops-get_policy)
   pol = sfd-vm_ops-get_policy(vma, addr);
 - else
 + else if (vma-vm_policy  vma-vm_policy-policy != MPOL_DEFAULT)
   pol = vma-vm_policy;
   return pol;
  }
  #endif

Sorry I'm just a little dense at the moment.

Eric

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-12 Thread William Lee Irwin III
On Tue, Jun 12, 2007 at 12:20:52AM -0600, Eric W. Biederman wrote:
 Does this perhaps need to be:
 diff --git a/ipc/shm.c b/ipc/shm.c
 index 4fefbad..8d2672d 100644
 --- a/ipc/shm.c
 +++ b/ipc/shm.c
 @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct
 *vma, unsigned long addr)
 
 +pol = NULL;
  
  if (sfd-vm_ops-get_policy)
  pol = sfd-vm_ops-get_policy(vma, addr);
 -else
 +else if (vma-vm_policy  vma-vm_policy-policy != MPOL_DEFAULT)
  pol = vma-vm_policy;
  return pol;

Those paths are above the level where shm_get_policy() is called.
It may be that shm_get_policy() doesn't need to recapitulate them 
if it's only ever called through such codepaths. It's not clear to
me whether that's intended as an invariant or is coincidental and
not guaranteed for future callsites.


-- wli
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-12 Thread Andi Kleen

 Can we just double-check the refcounting please?
 
  index 4fefbad..8d2672d 100644
  --- a/ipc/shm.c
  +++ b/ipc/shm.c
  @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct 
  *vma, unsigned long addr)
   
  if (sfd-vm_ops-get_policy)
  pol = sfd-vm_ops-get_policy(vma, addr);
 
 afacit this takes a ref on the underlying policy
 
  -   else
  +   else if (vma-vm_policy)
  pol = vma-vm_policy;
  +   else
  +   pol = current-mempolicy;
 
 but these two do not.
 
  return pol;
   }
   #endif
 
 Is is all correct?

No it looks broken.

-Andi
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-12 Thread Adam Litke

On 6/12/07, Eric W. Biederman [EMAIL PROTECTED] wrote:

Adam Litke [EMAIL PROTECTED] writes:

 Here's another breakage as a result of shared memory stacked files :(

 The NUMA policy for a VMA is determined by checking the following (in the 
order
 given):

 1) vma-vm_ops-get_policy() (if defined)
 2) vma-vm_policy (if defined)
 3) task-mempolicy (if defined)
 4) Fall back to default_policy

 By switching to stacked files for shared memory, get_policy() is now always 
set
 to shm_get_policy which is a wrapper function.  This causes us to stop at step
 1, which yields NULL for hugetlb instead of task-mempolicy which was the
 previous (and correct) result.

 This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for the
 wrapped vm_ops.  Andi and Christoph, does this look right to you?

I'm confused.

I agree that the behavior you describe is correct.
However I only see two code paths were get_policy is called and
both of them take a NULL result and change it to task-mempolicy:


The coffee hasn't quite absorbed yet, but don't those two code paths
take a NULL result from get_policy() and turn it into default_policy,
not task-mempolicy?


From mm/mempolicy.c

 long do_get_mempolicy(int *policy, nodemask_t *nmask,
   unsigned long addr, unsigned long flags)
 {
   int err;
   struct mm_struct *mm = current-mm;
   struct vm_area_struct *vma = NULL;
   struct mempolicy *pol = current-mempolicy;

   cpuset_update_task_memory_state();
   if (flags  ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR))
   return -EINVAL;
   if (flags  MPOL_F_ADDR) {
   down_read(mm-mmap_sem);
   vma = find_vma_intersection(mm, addr, addr+1);
   if (!vma) {
   up_read(mm-mmap_sem);
   return -EFAULT;
   }
   if (vma-vm_ops  vma-vm_ops-get_policy)
   pol = vma-vm_ops-get_policy(vma, addr);
   else
   pol = vma-vm_policy;
   } else if (addr)
   return -EINVAL;

   if (!pol)
   pol = default_policy;



 /* Return effective policy for a VMA */
 static struct mempolicy * get_vma_policy(struct task_struct *task,
   struct vm_area_struct *vma, unsigned long addr)
 {
   struct mempolicy *pol = task-mempolicy;

   if (vma) {
   if (vma-vm_ops  vma-vm_ops-get_policy)
   pol = vma-vm_ops-get_policy(vma, addr);
   else if (vma-vm_policy 
   vma-vm_policy-policy != MPOL_DEFAULT)
   pol = vma-vm_policy;
   }
   if (!pol)
   pol = default_policy;
   return pol;
 }


Does this perhaps need to be:
 Signed-off-by: Adam Litke [EMAIL PROTECTED]

 diff --git a/ipc/shm.c b/ipc/shm.c
 index 4fefbad..8d2672d 100644
 --- a/ipc/shm.c
 +++ b/ipc/shm.c
 @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct
 *vma, unsigned long addr)

+   pol = NULL;

   if (sfd-vm_ops-get_policy)
   pol = sfd-vm_ops-get_policy(vma, addr);
 - else
 + else if (vma-vm_policy  vma-vm_policy-policy != MPOL_DEFAULT)
   pol = vma-vm_policy;
   return pol;
  }
  #endif


afaict this would provide no way for pol to be set to task-mempolicy
for hugetlb per my comment above.

--
Adam Litke ( agl at us.ibm.com )
IBM Linux Technology Center
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-12 Thread Eric W. Biederman
Adam Litke [EMAIL PROTECTED] writes:

 On 6/12/07, Eric W. Biederman [EMAIL PROTECTED] wrote:
 Adam Litke [EMAIL PROTECTED] writes:

  Here's another breakage as a result of shared memory stacked files :(
 
  The NUMA policy for a VMA is determined by checking the following (in the
 order
  given):

Where that doesn't appear to be what the current code does.
If there is a VMA it doesn't appear that we check task-mempolicy.

  1) vma-vm_ops-get_policy() (if defined)
  2) vma-vm_policy (if defined)
  3) task-mempolicy (if defined)
  4) Fall back to default_policy
 
  By switching to stacked files for shared memory, get_policy() is now always
 set
  to shm_get_policy which is a wrapper function.  This causes us to stop at
 step
  1, which yields NULL for hugetlb instead of task-mempolicy which was the
  previous (and correct) result.
 
  This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for
 the
  wrapped vm_ops.  Andi and Christoph, does this look right to you?

 I'm confused.

 I agree that the behavior you describe is correct.
 However I only see two code paths were get_policy is called and
 both of them take a NULL result and change it to task-mempolicy:

 The coffee hasn't quite absorbed yet, but don't those two code paths
 take a NULL result from get_policy() and turn it into default_policy,
 not task-mempolicy?

True, a NULL is turned into default_policy.  However the basic
confusion remains.   I don't see how we ever return task-mempolicy
on those paths if we have a vma, and those appear to be the only
call sites of get_policy.

 From mm/mempolicy.c

  long do_get_mempolicy(int *policy, nodemask_t *nmask,
unsigned long addr, unsigned long flags)
  {
int err;
struct mm_struct *mm = current-mm;
struct vm_area_struct *vma = NULL;
struct mempolicy *pol = current-mempolicy;
 
cpuset_update_task_memory_state();
if (flags  ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR))
return -EINVAL;
if (flags  MPOL_F_ADDR) {
down_read(mm-mmap_sem);
vma = find_vma_intersection(mm, addr, addr+1);
if (!vma) {
up_read(mm-mmap_sem);
return -EFAULT;
}
if (vma-vm_ops  vma-vm_ops-get_policy)
pol = vma-vm_ops-get_policy(vma, addr);
else
pol = vma-vm_policy;
} else if (addr)
return -EINVAL;
 
if (!pol)
pol = default_policy;



  /* Return effective policy for a VMA */
  static struct mempolicy * get_vma_policy(struct task_struct *task,
struct vm_area_struct *vma, unsigned long addr)
  {
struct mempolicy *pol = task-mempolicy;
 
if (vma) {
if (vma-vm_ops  vma-vm_ops-get_policy)
pol = vma-vm_ops-get_policy(vma, addr);
else if (vma-vm_policy 
vma-vm_policy-policy != MPOL_DEFAULT)
pol = vma-vm_policy;
}
if (!pol)
pol = default_policy;
return pol;
  }


 Does this perhaps need to be:
  Signed-off-by: Adam Litke [EMAIL PROTECTED]
 
  diff --git a/ipc/shm.c b/ipc/shm.c
  index 4fefbad..8d2672d 100644
  --- a/ipc/shm.c
  +++ b/ipc/shm.c
  @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct
  *vma, unsigned long addr)

 +   pol = NULL;
 
if (sfd-vm_ops-get_policy)
pol = sfd-vm_ops-get_policy(vma, addr);
  - else
  + else if (vma-vm_policy  vma-vm_policy-policy != MPOL_DEFAULT)
pol = vma-vm_policy;
return pol;
   }
   #endif

 afaict this would provide no way for pol to be set to task-mempolicy
 for hugetlb per my comment above.

Being dense I don't see a way for us to have ever returned task-mempolicy.

So perhaps something else changed.  Or something was made more
consistent and another bug was revealed.

I don't truly see the bug in shm_get_policy.  I can see why it would
not have the expected affect but that is different.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-11 Thread William Lee Irwin III
On Mon, Jun 11, 2007 at 09:30:20PM -0700, Andrew Morton wrote:
> Can we just double-check the refcounting please?

The refcounting for mpol's doesn't look good in general. I'm more
curious as to what releases the refcounts. alloc_page_vma(), for
instance, does get_vma_policy() which eventually takes a reference,
without ever releasing the reference it acquires. get_vma_policy()
itself uses a similar idiom to that used in aglitke's patch. I think
mpol refcounting needs to be addressed elsewhere besides this patch.


-- wli
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-11 Thread Andrew Morton
On Mon, 11 Jun 2007 16:34:54 -0500 Adam Litke <[EMAIL PROTECTED]> wrote:

> Here's another breakage as a result of shared memory stacked files :(
> 
> The NUMA policy for a VMA is determined by checking the following (in the 
> order
> given):
> 
> 1) vma->vm_ops->get_policy() (if defined)
> 2) vma->vm_policy (if defined)
> 3) task->mempolicy (if defined)
> 4) Fall back to default_policy
> 
> By switching to stacked files for shared memory, get_policy() is now always 
> set
> to shm_get_policy which is a wrapper function.  This causes us to stop at step
> 1, which yields NULL for hugetlb instead of task->mempolicy which was the
> previous (and correct) result.
> 
> This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for the
> wrapped vm_ops.  Andi and Christoph, does this look right to you?
> 

Can we just double-check the refcounting please?

> index 4fefbad..8d2672d 100644
> --- a/ipc/shm.c
> +++ b/ipc/shm.c
> @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct 
> *vma, unsigned long addr)
>  
>   if (sfd->vm_ops->get_policy)
>   pol = sfd->vm_ops->get_policy(vma, addr);

afacit this takes a ref on the underlying policy

> - else
> + else if (vma->vm_policy)
>   pol = vma->vm_policy;
> + else
> + pol = current->mempolicy;

but these two do not.

>   return pol;
>  }
>  #endif

Is is all correct?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-11 Thread William Lee Irwin III
On Mon, Jun 11, 2007 at 04:34:54PM -0500, Adam Litke wrote:
> Here's another breakage as a result of shared memory stacked files :(
> The NUMA policy for a VMA is determined by checking the following (in
> the order given):
> 1) vma->vm_ops->get_policy() (if defined)
> 2) vma->vm_policy (if defined)
> 3) task->mempolicy (if defined)
> 4) Fall back to default_policy
> By switching to stacked files for shared memory, get_policy() is now
> always set to shm_get_policy which is a wrapper function.  This
> causes us to stop at step 1, which yields NULL for hugetlb instead of
> task->mempolicy which was the previous (and correct) result.
> This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for the
> wrapped vm_ops.  Andi and Christoph, does this look right to you?
> Signed-off-by: Adam Litke <[EMAIL PROTECTED]>

Thanks for fielding this. The fix is certainly clear enough.

Acked-by: William Irwin <[EMAIL PROTECTED]>


-- wli
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-11 Thread dean gaudet
On Mon, 11 Jun 2007, Adam Litke wrote:

> Here's another breakage as a result of shared memory stacked files :(
> 
> The NUMA policy for a VMA is determined by checking the following (in the 
> order
> given):
> 
> 1) vma->vm_ops->get_policy() (if defined)
> 2) vma->vm_policy (if defined)
> 3) task->mempolicy (if defined)
> 4) Fall back to default_policy
> 
> By switching to stacked files for shared memory, get_policy() is now always 
> set
> to shm_get_policy which is a wrapper function.  This causes us to stop at step
> 1, which yields NULL for hugetlb instead of task->mempolicy which was the
> previous (and correct) result.
> 
> This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for the
> wrapped vm_ops.  Andi and Christoph, does this look right to you?

thanks for the patch -- seems to do the trick for me.  it seems like it 
would be a candidate for stable series as well.

-dean
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-11 Thread Adam Litke
Here's another breakage as a result of shared memory stacked files :(

The NUMA policy for a VMA is determined by checking the following (in the order
given):

1) vma->vm_ops->get_policy() (if defined)
2) vma->vm_policy (if defined)
3) task->mempolicy (if defined)
4) Fall back to default_policy

By switching to stacked files for shared memory, get_policy() is now always set
to shm_get_policy which is a wrapper function.  This causes us to stop at step
1, which yields NULL for hugetlb instead of task->mempolicy which was the
previous (and correct) result.

This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for the
wrapped vm_ops.  Andi and Christoph, does this look right to you?

Signed-off-by: Adam Litke <[EMAIL PROTECTED]>

diff --git a/ipc/shm.c b/ipc/shm.c
index 4fefbad..8d2672d 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct 
*vma, unsigned long addr)
 
if (sfd->vm_ops->get_policy)
pol = sfd->vm_ops->get_policy(vma, addr);
-   else
+   else if (vma->vm_policy)
pol = vma->vm_policy;
+   else
+   pol = current->mempolicy;
return pol;
 }
 #endif

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-11 Thread Adam Litke
Here's another breakage as a result of shared memory stacked files :(

The NUMA policy for a VMA is determined by checking the following (in the order
given):

1) vma-vm_ops-get_policy() (if defined)
2) vma-vm_policy (if defined)
3) task-mempolicy (if defined)
4) Fall back to default_policy

By switching to stacked files for shared memory, get_policy() is now always set
to shm_get_policy which is a wrapper function.  This causes us to stop at step
1, which yields NULL for hugetlb instead of task-mempolicy which was the
previous (and correct) result.

This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for the
wrapped vm_ops.  Andi and Christoph, does this look right to you?

Signed-off-by: Adam Litke [EMAIL PROTECTED]

diff --git a/ipc/shm.c b/ipc/shm.c
index 4fefbad..8d2672d 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct 
*vma, unsigned long addr)
 
if (sfd-vm_ops-get_policy)
pol = sfd-vm_ops-get_policy(vma, addr);
-   else
+   else if (vma-vm_policy)
pol = vma-vm_policy;
+   else
+   pol = current-mempolicy;
return pol;
 }
 #endif

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-11 Thread dean gaudet
On Mon, 11 Jun 2007, Adam Litke wrote:

 Here's another breakage as a result of shared memory stacked files :(
 
 The NUMA policy for a VMA is determined by checking the following (in the 
 order
 given):
 
 1) vma-vm_ops-get_policy() (if defined)
 2) vma-vm_policy (if defined)
 3) task-mempolicy (if defined)
 4) Fall back to default_policy
 
 By switching to stacked files for shared memory, get_policy() is now always 
 set
 to shm_get_policy which is a wrapper function.  This causes us to stop at step
 1, which yields NULL for hugetlb instead of task-mempolicy which was the
 previous (and correct) result.
 
 This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for the
 wrapped vm_ops.  Andi and Christoph, does this look right to you?

thanks for the patch -- seems to do the trick for me.  it seems like it 
would be a candidate for stable series as well.

-dean
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-11 Thread William Lee Irwin III
On Mon, Jun 11, 2007 at 04:34:54PM -0500, Adam Litke wrote:
 Here's another breakage as a result of shared memory stacked files :(
 The NUMA policy for a VMA is determined by checking the following (in
 the order given):
 1) vma-vm_ops-get_policy() (if defined)
 2) vma-vm_policy (if defined)
 3) task-mempolicy (if defined)
 4) Fall back to default_policy
 By switching to stacked files for shared memory, get_policy() is now
 always set to shm_get_policy which is a wrapper function.  This
 causes us to stop at step 1, which yields NULL for hugetlb instead of
 task-mempolicy which was the previous (and correct) result.
 This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for the
 wrapped vm_ops.  Andi and Christoph, does this look right to you?
 Signed-off-by: Adam Litke [EMAIL PROTECTED]

Thanks for fielding this. The fix is certainly clear enough.

Acked-by: William Irwin [EMAIL PROTECTED]


-- wli
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-11 Thread Andrew Morton
On Mon, 11 Jun 2007 16:34:54 -0500 Adam Litke [EMAIL PROTECTED] wrote:

 Here's another breakage as a result of shared memory stacked files :(
 
 The NUMA policy for a VMA is determined by checking the following (in the 
 order
 given):
 
 1) vma-vm_ops-get_policy() (if defined)
 2) vma-vm_policy (if defined)
 3) task-mempolicy (if defined)
 4) Fall back to default_policy
 
 By switching to stacked files for shared memory, get_policy() is now always 
 set
 to shm_get_policy which is a wrapper function.  This causes us to stop at step
 1, which yields NULL for hugetlb instead of task-mempolicy which was the
 previous (and correct) result.
 
 This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for the
 wrapped vm_ops.  Andi and Christoph, does this look right to you?
 

Can we just double-check the refcounting please?

 index 4fefbad..8d2672d 100644
 --- a/ipc/shm.c
 +++ b/ipc/shm.c
 @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct 
 *vma, unsigned long addr)
  
   if (sfd-vm_ops-get_policy)
   pol = sfd-vm_ops-get_policy(vma, addr);

afacit this takes a ref on the underlying policy

 - else
 + else if (vma-vm_policy)
   pol = vma-vm_policy;
 + else
 + pol = current-mempolicy;

but these two do not.

   return pol;
  }
  #endif

Is is all correct?
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

2007-06-11 Thread William Lee Irwin III
On Mon, Jun 11, 2007 at 09:30:20PM -0700, Andrew Morton wrote:
 Can we just double-check the refcounting please?

The refcounting for mpol's doesn't look good in general. I'm more
curious as to what releases the refcounts. alloc_page_vma(), for
instance, does get_vma_policy() which eventually takes a reference,
without ever releasing the reference it acquires. get_vma_policy()
itself uses a similar idiom to that used in aglitke's patch. I think
mpol refcounting needs to be addressed elsewhere besides this patch.


-- wli
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/