Re: [RFC] mm: change find_vma() function

2015-12-21 Thread yalin wang

> On Dec 15, 2015, at 19:53, Kirill A. Shutemov 
>  wrote:
> 
> On Tue, Dec 15, 2015 at 02:41:21PM +0800, yalin wang wrote:
>>> On Dec 15, 2015, at 05:11, Kirill A. Shutemov  wrote:
>>> Anyway, I don't think it's possible to gain anything measurable from this
>>> optimization.
>>> 
>> the advantage is that if addr don’t belong to any vma, we don’t need loop 
>> all vma,
>> we can break earlier if we found the most closest vma which vma->end_add > 
>> addr,
> 
> Do you have any workload which can demonstrate the advantage?
> 
> — 
i add the log in find_vma() to see the call stack ,
it is very efficient in mmap() / munmap / do_execve() / get_unmaped_area() /
mem_cgroup_move_task()->walk_page_range()->find_vma() call ,

in most time the loop will break after search about 7 vm,
i don’t consider the cache pollution problem in this patch,
yeah, this patch will check the vm_prev->vm_end for every loop,
but this only happened when tmp->vm_end > addr ,
if you don’t not check this , you will continue to loop to check next rb ,
this will also pollute the cache ,

so the question is which one is better ?
i don’t have a better method to test this .
Any good ideas about this ?
how to test it ?

Thanks







--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-21 Thread yalin wang

> On Dec 15, 2015, at 19:53, Kirill A. Shutemov 
>  wrote:
> 
> On Tue, Dec 15, 2015 at 02:41:21PM +0800, yalin wang wrote:
>>> On Dec 15, 2015, at 05:11, Kirill A. Shutemov  wrote:
>>> Anyway, I don't think it's possible to gain anything measurable from this
>>> optimization.
>>> 
>> the advantage is that if addr don’t belong to any vma, we don’t need loop 
>> all vma,
>> we can break earlier if we found the most closest vma which vma->end_add > 
>> addr,
> 
> Do you have any workload which can demonstrate the advantage?
> 
> — 
i add the log in find_vma() to see the call stack ,
it is very efficient in mmap() / munmap / do_execve() / get_unmaped_area() /
mem_cgroup_move_task()->walk_page_range()->find_vma() call ,

in most time the loop will break after search about 7 vm,
i don’t consider the cache pollution problem in this patch,
yeah, this patch will check the vm_prev->vm_end for every loop,
but this only happened when tmp->vm_end > addr ,
if you don’t not check this , you will continue to loop to check next rb ,
this will also pollute the cache ,

so the question is which one is better ?
i don’t have a better method to test this .
Any good ideas about this ?
how to test it ?

Thanks







--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-15 Thread Kirill A. Shutemov
On Tue, Dec 15, 2015 at 02:41:21PM +0800, yalin wang wrote:
> > On Dec 15, 2015, at 05:11, Kirill A. Shutemov  wrote:
> > Anyway, I don't think it's possible to gain anything measurable from this
> > optimization.
> > 
> the advantage is that if addr don’t belong to any vma, we don’t need loop all 
> vma,
> we can break earlier if we found the most closest vma which vma->end_add > 
> addr,

Do you have any workload which can demonstrate the advantage?

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-15 Thread Andrea Arcangeli
On Tue, Dec 15, 2015 at 02:41:21PM +0800, yalin wang wrote:
> 
> > On Dec 15, 2015, at 05:11, Kirill A. Shutemov  wrote:
> > 
> > On Mon, Dec 14, 2015 at 06:55:09PM +0100, Oleg Nesterov wrote:
> >> On 12/14, Kirill A. Shutemov wrote:
> >>> 
> >>> On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin wang wrote:
>  change find_vma() to break ealier when found the adderss
>  is not in any vma, don't need loop to search all vma.
>  
>  Signed-off-by: yalin wang 
>  ---
>  mm/mmap.c | 3 +++
>  1 file changed, 3 insertions(+)
>  
>  diff --git a/mm/mmap.c b/mm/mmap.c
>  index b513f20..8294c9b 100644
>  --- a/mm/mmap.c
>  +++ b/mm/mmap.c
>  @@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct 
>  *mm, unsigned long addr)
>   vma = tmp;
>   if (tmp->vm_start <= addr)
>   break;
>  +if (!tmp->vm_prev || tmp->vm_prev->vm_end <= 
>  addr)
>  +break;
>  +
> >>> 
> >>> This 'break' would return 'tmp' as found vma.
> >> 
> >> But this would be right?
> > 
> > Hm. Right. Sorry for my tone.
> > 
> > I think the right condition is 'tmp->vm_prev->vm_end < addr', not '<=' as
> > vm_end is the first byte after the vma. But it's equivalent in practice
> > here.
> > 
> this should be <= here,
> because vma’s effect address space doesn’t include vm_end add,
> so if an address vm_end <= add , this means this addr don’t belong to this 
> vma,

We need to return the vma with lowest vm_end that satisifes "addr <
vm_end", so if vm_prev has "addr >= vm_end" would imply we can return
"tmp", that includes if addr == vm_prev->vm_end yes. If we'd spend CPU
for this it's worth to optimize for the case of tmp->vm_start ==
tmp->vm_prev->vm_end too and stop in such case too.

> 
> > Anyway, I don't think it's possible to gain anything measurable from this
> > optimization.
> > 
> the advantage is that if addr don’t belong to any vma, we don’t need loop all 
> vma,
> we can break earlier if we found the most closest vma which vma->end_add > 
> addr,

But that costs CPU for all cases were we cannot stop: it would waste
cachelines to check vmas that we would otherwise not even touch at
all. All those vmas are in different addresses, they're not
contiguous, plus the vma is pretty large object anyway (even if they
were contiguous).

So this will requires 2 cachelines instead of 1, for each vma we
encounter during the rbtree lookup. And for a large tree we may not
have to rescan those vmas of the vm_prev while moving down, so even if
the CPU cache could fit those extra vm_prev cachelines they would be
just useless as we continue the lookup down the tree.

So if a tree is very large and this optimization only allows us to
skip the last few level of tree walk, it'll still double up the
cachline cost of all the upper level lookups. Which may greatly exceed
the last few levels we skept.

Proper (probably non trivial) math could calculate the exact
probability this ends up being an optimization vs a slowdown. My gut
feeling is that for a small tree this increases performance, for a
large tree this decreases performance and the breakpoint is somewhere
in the middle. However a small tree runs fast anyway.

Last but not the least, one easy thing we can trivially tell already,
is that the worst case latency for the worst possible slowest lookup
would definitely be worsened and it would almost double with this
patch applied. So this is enough for me to be against applying this
patch.

All special cases we already hard optimize it with vmacache_find so if
that is about a special case with userland knowledge that should go
elsewhere and we already do those kind of fast path optimizations.

The patched code is the one that runs if we're not in a special
userland fast path case and to me the highest priority for this piece
of code is to optimize to guarantee the least harmful worst case for
the most inefficient lookup, and this patch would make the worst case
twice as slow as it is now for the worst case.

I think keeping the worst case as fast as possible is the highest
priority here.

Thanks,
Andrea
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-15 Thread Andrea Arcangeli
On Tue, Dec 15, 2015 at 02:41:21PM +0800, yalin wang wrote:
> 
> > On Dec 15, 2015, at 05:11, Kirill A. Shutemov  wrote:
> > 
> > On Mon, Dec 14, 2015 at 06:55:09PM +0100, Oleg Nesterov wrote:
> >> On 12/14, Kirill A. Shutemov wrote:
> >>> 
> >>> On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin wang wrote:
>  change find_vma() to break ealier when found the adderss
>  is not in any vma, don't need loop to search all vma.
>  
>  Signed-off-by: yalin wang 
>  ---
>  mm/mmap.c | 3 +++
>  1 file changed, 3 insertions(+)
>  
>  diff --git a/mm/mmap.c b/mm/mmap.c
>  index b513f20..8294c9b 100644
>  --- a/mm/mmap.c
>  +++ b/mm/mmap.c
>  @@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct 
>  *mm, unsigned long addr)
>   vma = tmp;
>   if (tmp->vm_start <= addr)
>   break;
>  +if (!tmp->vm_prev || tmp->vm_prev->vm_end <= 
>  addr)
>  +break;
>  +
> >>> 
> >>> This 'break' would return 'tmp' as found vma.
> >> 
> >> But this would be right?
> > 
> > Hm. Right. Sorry for my tone.
> > 
> > I think the right condition is 'tmp->vm_prev->vm_end < addr', not '<=' as
> > vm_end is the first byte after the vma. But it's equivalent in practice
> > here.
> > 
> this should be <= here,
> because vma’s effect address space doesn’t include vm_end add,
> so if an address vm_end <= add , this means this addr don’t belong to this 
> vma,

We need to return the vma with lowest vm_end that satisifes "addr <
vm_end", so if vm_prev has "addr >= vm_end" would imply we can return
"tmp", that includes if addr == vm_prev->vm_end yes. If we'd spend CPU
for this it's worth to optimize for the case of tmp->vm_start ==
tmp->vm_prev->vm_end too and stop in such case too.

> 
> > Anyway, I don't think it's possible to gain anything measurable from this
> > optimization.
> > 
> the advantage is that if addr don’t belong to any vma, we don’t need loop all 
> vma,
> we can break earlier if we found the most closest vma which vma->end_add > 
> addr,

But that costs CPU for all cases were we cannot stop: it would waste
cachelines to check vmas that we would otherwise not even touch at
all. All those vmas are in different addresses, they're not
contiguous, plus the vma is pretty large object anyway (even if they
were contiguous).

So this will requires 2 cachelines instead of 1, for each vma we
encounter during the rbtree lookup. And for a large tree we may not
have to rescan those vmas of the vm_prev while moving down, so even if
the CPU cache could fit those extra vm_prev cachelines they would be
just useless as we continue the lookup down the tree.

So if a tree is very large and this optimization only allows us to
skip the last few level of tree walk, it'll still double up the
cachline cost of all the upper level lookups. Which may greatly exceed
the last few levels we skept.

Proper (probably non trivial) math could calculate the exact
probability this ends up being an optimization vs a slowdown. My gut
feeling is that for a small tree this increases performance, for a
large tree this decreases performance and the breakpoint is somewhere
in the middle. However a small tree runs fast anyway.

Last but not the least, one easy thing we can trivially tell already,
is that the worst case latency for the worst possible slowest lookup
would definitely be worsened and it would almost double with this
patch applied. So this is enough for me to be against applying this
patch.

All special cases we already hard optimize it with vmacache_find so if
that is about a special case with userland knowledge that should go
elsewhere and we already do those kind of fast path optimizations.

The patched code is the one that runs if we're not in a special
userland fast path case and to me the highest priority for this piece
of code is to optimize to guarantee the least harmful worst case for
the most inefficient lookup, and this patch would make the worst case
twice as slow as it is now for the worst case.

I think keeping the worst case as fast as possible is the highest
priority here.

Thanks,
Andrea
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-15 Thread Kirill A. Shutemov
On Tue, Dec 15, 2015 at 02:41:21PM +0800, yalin wang wrote:
> > On Dec 15, 2015, at 05:11, Kirill A. Shutemov  wrote:
> > Anyway, I don't think it's possible to gain anything measurable from this
> > optimization.
> > 
> the advantage is that if addr don’t belong to any vma, we don’t need loop all 
> vma,
> we can break earlier if we found the most closest vma which vma->end_add > 
> addr,

Do you have any workload which can demonstrate the advantage?

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-14 Thread yalin wang

> On Dec 15, 2015, at 05:11, Kirill A. Shutemov  wrote:
> 
> On Mon, Dec 14, 2015 at 06:55:09PM +0100, Oleg Nesterov wrote:
>> On 12/14, Kirill A. Shutemov wrote:
>>> 
>>> On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin wang wrote:
 change find_vma() to break ealier when found the adderss
 is not in any vma, don't need loop to search all vma.
 
 Signed-off-by: yalin wang 
 ---
 mm/mmap.c | 3 +++
 1 file changed, 3 insertions(+)
 
 diff --git a/mm/mmap.c b/mm/mmap.c
 index b513f20..8294c9b 100644
 --- a/mm/mmap.c
 +++ b/mm/mmap.c
 @@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct 
 *mm, unsigned long addr)
vma = tmp;
if (tmp->vm_start <= addr)
break;
 +  if (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr)
 +  break;
 +
>>> 
>>> This 'break' would return 'tmp' as found vma.
>> 
>> But this would be right?
> 
> Hm. Right. Sorry for my tone.
> 
> I think the right condition is 'tmp->vm_prev->vm_end < addr', not '<=' as
> vm_end is the first byte after the vma. But it's equivalent in practice
> here.
> 
this should be <= here,
because vma’s effect address space doesn’t include vm_end add,
so if an address vm_end <= add , this means this addr don’t belong to this vma,

> Anyway, I don't think it's possible to gain anything measurable from this
> optimization.
> 
the advantage is that if addr don’t belong to any vma, we don’t need loop all 
vma,
we can break earlier if we found the most closest vma which vma->end_add > addr,
>> 
>> Not that I think this optimization makes sense, I simply do not know,
>> but to me this change looks technically correct at first glance...
>> 
>> But the changelog is wrong or I missed something. This change can stop
>> the main loop earlier; if "tmp" is the first vma,
> 
> For the first vma, we don't get anything comparing to what we have now:
> check for !rb_node on the next iteration would have the same trade off and
> effect as the proposed check.
Yes
> 
>> or if the previous one is below the address.
> 
> Yes, but would it compensate additional check on each 'tmp->vm_end > addr'
> iteration to the point? That's not obvious.
> 
>> Or perhaps I just misread that "not in any vma" note in the changelog.
>> 
>> No?
>> 
>> Oleg.
>> 

i have test it, it works fine. :)
Thanks




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-14 Thread Kirill A. Shutemov
On Mon, Dec 14, 2015 at 06:55:09PM +0100, Oleg Nesterov wrote:
> On 12/14, Kirill A. Shutemov wrote:
> >
> > On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin wang wrote:
> > > change find_vma() to break ealier when found the adderss
> > > is not in any vma, don't need loop to search all vma.
> > >
> > > Signed-off-by: yalin wang 
> > > ---
> > >  mm/mmap.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > index b513f20..8294c9b 100644
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct 
> > > *mm, unsigned long addr)
> > >   vma = tmp;
> > >   if (tmp->vm_start <= addr)
> > >   break;
> > > + if (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr)
> > > + break;
> > > +
> >
> > This 'break' would return 'tmp' as found vma.
> 
> But this would be right?

Hm. Right. Sorry for my tone.

I think the right condition is 'tmp->vm_prev->vm_end < addr', not '<=' as
vm_end is the first byte after the vma. But it's equivalent in practice
here.

Anyway, I don't think it's possible to gain anything measurable from this
optimization.

> 
> Not that I think this optimization makes sense, I simply do not know,
> but to me this change looks technically correct at first glance...
> 
> But the changelog is wrong or I missed something. This change can stop
> the main loop earlier; if "tmp" is the first vma,

For the first vma, we don't get anything comparing to what we have now:
check for !rb_node on the next iteration would have the same trade off and
effect as the proposed check.

> or if the previous one is below the address.

Yes, but would it compensate additional check on each 'tmp->vm_end > addr'
iteration to the point? That's not obvious.

> Or perhaps I just misread that "not in any vma" note in the changelog.
> 
> No?
> 
> Oleg.
> 

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-14 Thread Oleg Nesterov
On 12/14, Kirill A. Shutemov wrote:
>
> On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin wang wrote:
> > change find_vma() to break ealier when found the adderss
> > is not in any vma, don't need loop to search all vma.
> >
> > Signed-off-by: yalin wang 
> > ---
> >  mm/mmap.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index b513f20..8294c9b 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, 
> > unsigned long addr)
> > vma = tmp;
> > if (tmp->vm_start <= addr)
> > break;
> > +   if (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr)
> > +   break;
> > +
>
> This 'break' would return 'tmp' as found vma.

But this would be right?

Not that I think this optimization makes sense, I simply do not know,
but to me this change looks technically correct at first glance...

But the changelog is wrong or I missed something. This change can stop
the main loop earlier; if "tmp" is the first vma, or if the previous one
is below the address. Or perhaps I just misread that "not in any vma"
note in the changelog.

No?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-14 Thread Kirill A. Shutemov
On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin wang wrote:
> change find_vma() to break ealier when found the adderss
> is not in any vma, don't need loop to search all vma.
> 
> Signed-off-by: yalin wang 
> ---
>  mm/mmap.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index b513f20..8294c9b 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, 
> unsigned long addr)
>   vma = tmp;
>   if (tmp->vm_start <= addr)
>   break;
> + if (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr)
> + break;
> +

This 'break' would return 'tmp' as found vma.

Have you even tried to test the code?

>   rb_node = rb_node->rb_left;
>   } else
>   rb_node = rb_node->rb_right;
> -- 
> 1.9.1
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-14 Thread Kirill A. Shutemov
On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin wang wrote:
> change find_vma() to break ealier when found the adderss
> is not in any vma, don't need loop to search all vma.
> 
> Signed-off-by: yalin wang 
> ---
>  mm/mmap.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index b513f20..8294c9b 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, 
> unsigned long addr)
>   vma = tmp;
>   if (tmp->vm_start <= addr)
>   break;
> + if (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr)
> + break;
> +

This 'break' would return 'tmp' as found vma.

Have you even tried to test the code?

>   rb_node = rb_node->rb_left;
>   } else
>   rb_node = rb_node->rb_right;
> -- 
> 1.9.1
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-14 Thread yalin wang

> On Dec 15, 2015, at 05:11, Kirill A. Shutemov  wrote:
> 
> On Mon, Dec 14, 2015 at 06:55:09PM +0100, Oleg Nesterov wrote:
>> On 12/14, Kirill A. Shutemov wrote:
>>> 
>>> On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin wang wrote:
 change find_vma() to break ealier when found the adderss
 is not in any vma, don't need loop to search all vma.
 
 Signed-off-by: yalin wang 
 ---
 mm/mmap.c | 3 +++
 1 file changed, 3 insertions(+)
 
 diff --git a/mm/mmap.c b/mm/mmap.c
 index b513f20..8294c9b 100644
 --- a/mm/mmap.c
 +++ b/mm/mmap.c
 @@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct 
 *mm, unsigned long addr)
vma = tmp;
if (tmp->vm_start <= addr)
break;
 +  if (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr)
 +  break;
 +
>>> 
>>> This 'break' would return 'tmp' as found vma.
>> 
>> But this would be right?
> 
> Hm. Right. Sorry for my tone.
> 
> I think the right condition is 'tmp->vm_prev->vm_end < addr', not '<=' as
> vm_end is the first byte after the vma. But it's equivalent in practice
> here.
> 
this should be <= here,
because vma’s effect address space doesn’t include vm_end add,
so if an address vm_end <= add , this means this addr don’t belong to this vma,

> Anyway, I don't think it's possible to gain anything measurable from this
> optimization.
> 
the advantage is that if addr don’t belong to any vma, we don’t need loop all 
vma,
we can break earlier if we found the most closest vma which vma->end_add > addr,
>> 
>> Not that I think this optimization makes sense, I simply do not know,
>> but to me this change looks technically correct at first glance...
>> 
>> But the changelog is wrong or I missed something. This change can stop
>> the main loop earlier; if "tmp" is the first vma,
> 
> For the first vma, we don't get anything comparing to what we have now:
> check for !rb_node on the next iteration would have the same trade off and
> effect as the proposed check.
Yes
> 
>> or if the previous one is below the address.
> 
> Yes, but would it compensate additional check on each 'tmp->vm_end > addr'
> iteration to the point? That's not obvious.
> 
>> Or perhaps I just misread that "not in any vma" note in the changelog.
>> 
>> No?
>> 
>> Oleg.
>> 

i have test it, it works fine. :)
Thanks




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-14 Thread Oleg Nesterov
On 12/14, Kirill A. Shutemov wrote:
>
> On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin wang wrote:
> > change find_vma() to break ealier when found the adderss
> > is not in any vma, don't need loop to search all vma.
> >
> > Signed-off-by: yalin wang 
> > ---
> >  mm/mmap.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index b513f20..8294c9b 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, 
> > unsigned long addr)
> > vma = tmp;
> > if (tmp->vm_start <= addr)
> > break;
> > +   if (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr)
> > +   break;
> > +
>
> This 'break' would return 'tmp' as found vma.

But this would be right?

Not that I think this optimization makes sense, I simply do not know,
but to me this change looks technically correct at first glance...

But the changelog is wrong or I missed something. This change can stop
the main loop earlier; if "tmp" is the first vma, or if the previous one
is below the address. Or perhaps I just misread that "not in any vma"
note in the changelog.

No?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mm: change find_vma() function

2015-12-14 Thread Kirill A. Shutemov
On Mon, Dec 14, 2015 at 06:55:09PM +0100, Oleg Nesterov wrote:
> On 12/14, Kirill A. Shutemov wrote:
> >
> > On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin wang wrote:
> > > change find_vma() to break ealier when found the adderss
> > > is not in any vma, don't need loop to search all vma.
> > >
> > > Signed-off-by: yalin wang 
> > > ---
> > >  mm/mmap.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > index b513f20..8294c9b 100644
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -2064,6 +2064,9 @@ struct vm_area_struct *find_vma(struct mm_struct 
> > > *mm, unsigned long addr)
> > >   vma = tmp;
> > >   if (tmp->vm_start <= addr)
> > >   break;
> > > + if (!tmp->vm_prev || tmp->vm_prev->vm_end <= addr)
> > > + break;
> > > +
> >
> > This 'break' would return 'tmp' as found vma.
> 
> But this would be right?

Hm. Right. Sorry for my tone.

I think the right condition is 'tmp->vm_prev->vm_end < addr', not '<=' as
vm_end is the first byte after the vma. But it's equivalent in practice
here.

Anyway, I don't think it's possible to gain anything measurable from this
optimization.

> 
> Not that I think this optimization makes sense, I simply do not know,
> but to me this change looks technically correct at first glance...
> 
> But the changelog is wrong or I missed something. This change can stop
> the main loop earlier; if "tmp" is the first vma,

For the first vma, we don't get anything comparing to what we have now:
check for !rb_node on the next iteration would have the same trade off and
effect as the proposed check.

> or if the previous one is below the address.

Yes, but would it compensate additional check on each 'tmp->vm_end > addr'
iteration to the point? That's not obvious.

> Or perhaps I just misread that "not in any vma" note in the changelog.
> 
> No?
> 
> Oleg.
> 

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/