Re: [PATCH 1/2] mm: Fix vma_is_anonymous() false-positives

2018-07-17 Thread Michal Hocko
On Tue 17-07-18 12:30:30, Kirill A. Shutemov wrote:
[...]
> You propose quite a big redesign on how we handle anonymous VMAs.
> Feel free to propose the patch(set). But I don't think it would fly for
> stable@.

OK, fair enough. I thought this would be much easier in the end but I
admit I haven't tried that so I might have underestimated the whole
thing.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 1/2] mm: Fix vma_is_anonymous() false-positives

2018-07-17 Thread Kirill A. Shutemov
On Tue, Jul 17, 2018 at 09:00:53AM +, Michal Hocko wrote:
> On Mon 16-07-18 23:38:46, Kirill A. Shutemov wrote:
> > On Mon, Jul 16, 2018 at 07:40:42PM +0200, Michal Hocko wrote:
> > > On Mon 16-07-18 17:47:39, Kirill A. Shutemov wrote:
> > > > On Mon, Jul 16, 2018 at 04:22:45PM +0200, Michal Hocko wrote:
> > > > > On Mon 16-07-18 17:04:41, Kirill A. Shutemov wrote:
> > > > > > On Mon, Jul 16, 2018 at 01:30:28PM +, Michal Hocko wrote:
> > > > > > > On Tue 10-07-18 13:48:58, Andrew Morton wrote:
> > > > > > > > On Tue, 10 Jul 2018 16:48:20 +0300 "Kirill A. Shutemov" 
> > > > > > > >  wrote:
> > > > > > > > 
> > > > > > > > > vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous
> > > > > > > > > VMA. This is unreliable as ->mmap may not set ->vm_ops.
> > > > > > > > > 
> > > > > > > > > False-positive vma_is_anonymous() may lead to crashes:
> > > > > > > > > 
> > > > > > > > > ...
> > > > > > > > > 
> > > > > > > > > This can be fixed by assigning anonymous VMAs own vm_ops and not relying
> > > > > > > > > on it being NULL.
> > > > > > > > > 
> > > > > > > > > If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
> > > > > > > > > dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs.
> > > > > > > > 
> > > > > > > > Is there a smaller, simpler fix which we can use for backporting
> > > > > > > > purposes and save the larger rework for development kernels?
> > > > > > > 
> > > > > > > Why cannot we simply keep anon vma with null vm_ops and set dummy_vm_ops
> > > > > > > for all users who do not initialize it in their mmap callbacks?
> > > > > > > Basically have a sanity check&fixup in call_mmap?
> > > > > > 
> > > > > > As I said, there's a corner case of MAP_PRIVATE of /dev/zero.
> > > > > 
> > > > > This is really creative. I really didn't think about that. I am
> > > > > wondering whether this really has to be handled as a private anonymous
> > > > > mapping implicitly. Why does vma_is_anonymous has to succeed for these
> > > > > mappings? Why cannot we simply handle it as any other file backed
> > > > > PRIVATE mapping?
> > > > 
> > > > Because it's established way to create anonymous mappings in Linux.
> > > > And we cannot break the semantics.
> > > 
> > > How exactly would semantic break? You would still get zero pages on read
> > > faults and anonymous pages on CoW. So basically the same thing as for
> > > any other file backed MAP_PRIVATE mapping.
> > 
> > You are wrong about zero page.
> 
> Well, if we redirect ->fault to do_anonymous_page and

Yeah. And it will make write fault to allocate *two* pages. One in
do_anonymous_page() and one in do_cow_fault(). Just no.

We have a reason why anon VMAs handled separately. It's possible to unify
them, but it requires substantial ground work.

> > And you won't get THP.
> 
> huge_fault to do_huge_pmd_anonymous_page then we should emulate the
> standard anonymous mapping.
> 
> > And I'm sure there's more differences. Just grep for
> > vma_is_anonymous().
> 
> I am sorry to push on this but if we have one odd case I would rather
> handle it and have a simple _rule_ that every mmap provide _has_ to
> provide vm_ops and have a trivial fix up at a single place rather than
> patch a subtle placeholders you were proposing.
> 
> I will not insist of course but this looks less fragile to me.

You propose quite a big redesign on how we handle anonymous VMAs.
Feel free to propose the patch(set). But I don't think it would fly for
stable@.

-- 
 Kirill A. Shutemov


Re: [PATCH 1/2] mm: Fix vma_is_anonymous() false-positives

2018-07-17 Thread Michal Hocko
On Mon 16-07-18 23:38:46, Kirill A. Shutemov wrote:
> On Mon, Jul 16, 2018 at 07:40:42PM +0200, Michal Hocko wrote:
> > On Mon 16-07-18 17:47:39, Kirill A. Shutemov wrote:
> > > On Mon, Jul 16, 2018 at 04:22:45PM +0200, Michal Hocko wrote:
> > > > On Mon 16-07-18 17:04:41, Kirill A. Shutemov wrote:
> > > > > On Mon, Jul 16, 2018 at 01:30:28PM +, Michal Hocko wrote:
> > > > > > On Tue 10-07-18 13:48:58, Andrew Morton wrote:
> > > > > > > On Tue, 10 Jul 2018 16:48:20 +0300 "Kirill A. Shutemov" 
> > > > > > >  wrote:
> > > > > > > 
> > > > > > > > vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous
> > > > > > > > VMA. This is unreliable as ->mmap may not set ->vm_ops.
> > > > > > > > 
> > > > > > > > False-positive vma_is_anonymous() may lead to crashes:
> > > > > > > > 
> > > > > > > > ...
> > > > > > > > 
> > > > > > > > This can be fixed by assigning anonymous VMAs own vm_ops and not relying
> > > > > > > > on it being NULL.
> > > > > > > > 
> > > > > > > > If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
> > > > > > > > dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs.
> > > > > > > 
> > > > > > > Is there a smaller, simpler fix which we can use for backporting
> > > > > > > purposes and save the larger rework for development kernels?
> > > > > > 
> > > > > > Why cannot we simply keep anon vma with null vm_ops and set dummy_vm_ops
> > > > > > for all users who do not initialize it in their mmap callbacks?
> > > > > > Basically have a sanity check&fixup in call_mmap?
> > > > > 
> > > > > As I said, there's a corner case of MAP_PRIVATE of /dev/zero.
> > > > 
> > > > This is really creative. I really didn't think about that. I am
> > > > wondering whether this really has to be handled as a private anonymous
> > > > mapping implicitly. Why does vma_is_anonymous has to succeed for these
> > > > mappings? Why cannot we simply handle it as any other file backed
> > > > PRIVATE mapping?
> > > 
> > > Because it's established way to create anonymous mappings in Linux.
> > > And we cannot break the semantics.
> > 
> > How exactly would semantic break? You would still get zero pages on read
> > faults and anonymous pages on CoW. So basically the same thing as for
> > any other file backed MAP_PRIVATE mapping.
> 
> You are wrong about zero page.

Well, if we redirect ->fault to do_anonymous_page and

> And you won't get THP.

huge_fault to do_huge_pmd_anonymous_page then we should emulate the
standard anonymous mapping.

> And I'm sure there's more differences. Just grep for
> vma_is_anonymous().

I am sorry to push on this but if we have one odd case I would rather
handle it and have a simple _rule_ that every mmap provide _has_ to
provide vm_ops and have a trivial fix up at a single place rather than
patch a subtle placeholders you were proposing.

I will not insist of course but this looks less fragile to me.

-- 
Michal Hocko
SUSE Labs


Re: [PATCH 1/2] mm: Fix vma_is_anonymous() false-positives

2018-07-16 Thread Kirill A. Shutemov
On Mon, Jul 16, 2018 at 07:40:42PM +0200, Michal Hocko wrote:
> On Mon 16-07-18 17:47:39, Kirill A. Shutemov wrote:
> > On Mon, Jul 16, 2018 at 04:22:45PM +0200, Michal Hocko wrote:
> > > On Mon 16-07-18 17:04:41, Kirill A. Shutemov wrote:
> > > > On Mon, Jul 16, 2018 at 01:30:28PM +, Michal Hocko wrote:
> > > > > On Tue 10-07-18 13:48:58, Andrew Morton wrote:
> > > > > > On Tue, 10 Jul 2018 16:48:20 +0300 "Kirill A. Shutemov" 
> > > > > >  wrote:
> > > > > > 
> > > > > > > vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous
> > > > > > > VMA. This is unreliable as ->mmap may not set ->vm_ops.
> > > > > > > 
> > > > > > > False-positive vma_is_anonymous() may lead to crashes:
> > > > > > > 
> > > > > > > ...
> > > > > > > 
> > > > > > > This can be fixed by assigning anonymous VMAs own vm_ops and not relying
> > > > > > > on it being NULL.
> > > > > > > 
> > > > > > > If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
> > > > > > > dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs.
> > > > > > 
> > > > > > Is there a smaller, simpler fix which we can use for backporting
> > > > > > purposes and save the larger rework for development kernels?
> > > > > 
> > > > > Why cannot we simply keep anon vma with null vm_ops and set dummy_vm_ops
> > > > > for all users who do not initialize it in their mmap callbacks?
> > > > > Basically have a sanity check&fixup in call_mmap?
> > > > 
> > > > As I said, there's a corner case of MAP_PRIVATE of /dev/zero.
> > > 
> > > This is really creative. I really didn't think about that. I am
> > > wondering whether this really has to be handled as a private anonymous
> > > mapping implicitly. Why does vma_is_anonymous has to succeed for these
> > > mappings? Why cannot we simply handle it as any other file backed
> > > PRIVATE mapping?
> > 
> > Because it's established way to create anonymous mappings in Linux.
> > And we cannot break the semantics.
> 
> How exactly would semantic break? You would still get zero pages on read
> faults and anonymous pages on CoW. So basically the same thing as for
> any other file backed MAP_PRIVATE mapping.

You are wrong about zero page. And you won't get THP. And I'm sure there's
more differences. Just grep for vma_is_anonymous().

-- 
 Kirill A. Shutemov


Re: [PATCH 1/2] mm: Fix vma_is_anonymous() false-positives

2018-07-16 Thread Michal Hocko
On Mon 16-07-18 17:47:39, Kirill A. Shutemov wrote:
> On Mon, Jul 16, 2018 at 04:22:45PM +0200, Michal Hocko wrote:
> > On Mon 16-07-18 17:04:41, Kirill A. Shutemov wrote:
> > > On Mon, Jul 16, 2018 at 01:30:28PM +, Michal Hocko wrote:
> > > > On Tue 10-07-18 13:48:58, Andrew Morton wrote:
> > > > > On Tue, 10 Jul 2018 16:48:20 +0300 "Kirill A. Shutemov" 
> > > > >  wrote:
> > > > > 
> > > > > > vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous
> > > > > > VMA. This is unreliable as ->mmap may not set ->vm_ops.
> > > > > > 
> > > > > > False-positive vma_is_anonymous() may lead to crashes:
> > > > > > 
> > > > > > ...
> > > > > > 
> > > > > > This can be fixed by assigning anonymous VMAs own vm_ops and not relying
> > > > > > on it being NULL.
> > > > > > 
> > > > > > If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
> > > > > > dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs.
> > > > > 
> > > > > Is there a smaller, simpler fix which we can use for backporting
> > > > > purposes and save the larger rework for development kernels?
> > > > 
> > > > Why cannot we simply keep anon vma with null vm_ops and set dummy_vm_ops
> > > > for all users who do not initialize it in their mmap callbacks?
> > > > Basically have a sanity check&fixup in call_mmap?
> > > 
> > > As I said, there's a corner case of MAP_PRIVATE of /dev/zero.
> > 
> > This is really creative. I really didn't think about that. I am
> > wondering whether this really has to be handled as a private anonymous
> > mapping implicitly. Why does vma_is_anonymous has to succeed for these
> > mappings? Why cannot we simply handle it as any other file backed
> > PRIVATE mapping?
> 
> Because it's established way to create anonymous mappings in Linux.
> And we cannot break the semantics.

How exactly would semantic break? You would still get zero pages on read
faults and anonymous pages on CoW. So basically the same thing as for
any other file backed MAP_PRIVATE mapping.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 1/2] mm: Fix vma_is_anonymous() false-positives

2018-07-16 Thread Kirill A. Shutemov
On Mon, Jul 16, 2018 at 04:22:45PM +0200, Michal Hocko wrote:
> On Mon 16-07-18 17:04:41, Kirill A. Shutemov wrote:
> > On Mon, Jul 16, 2018 at 01:30:28PM +, Michal Hocko wrote:
> > > On Tue 10-07-18 13:48:58, Andrew Morton wrote:
> > > > On Tue, 10 Jul 2018 16:48:20 +0300 "Kirill A. Shutemov" 
> > > >  wrote:
> > > > 
> > > > > vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous
> > > > > VMA. This is unreliable as ->mmap may not set ->vm_ops.
> > > > > 
> > > > > False-positive vma_is_anonymous() may lead to crashes:
> > > > > 
> > > > > ...
> > > > > 
> > > > > This can be fixed by assigning anonymous VMAs own vm_ops and not relying
> > > > > on it being NULL.
> > > > > 
> > > > > If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
> > > > > dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs.
> > > > 
> > > > Is there a smaller, simpler fix which we can use for backporting
> > > > purposes and save the larger rework for development kernels?
> > > 
> > > Why cannot we simply keep anon vma with null vm_ops and set dummy_vm_ops
> > > for all users who do not initialize it in their mmap callbacks?
> > > Basically have a sanity check&fixup in call_mmap?
> > 
> > As I said, there's a corner case of MAP_PRIVATE of /dev/zero.
> 
> This is really creative. I really didn't think about that. I am
> wondering whether this really has to be handled as a private anonymous
> mapping implicitly. Why does vma_is_anonymous has to succeed for these
> mappings? Why cannot we simply handle it as any other file backed
> PRIVATE mapping?

Because it's established way to create anonymous mappings in Linux.
And we cannot break the semantics.
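
For readers who have not met the idiom, here is a minimal userspace sketch
of it: a MAP_PRIVATE mapping of /dev/zero, which hands back zero-filled
memory whose writes stay private to the process, much like
MAP_PRIVATE | MAP_ANONYMOUS does. The helper names map_zero_private() and
is_zero_filled() are made up for this illustration and do not come from
the patch.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map len bytes of /dev/zero privately.
 * Returns MAP_FAILED on error, like mmap() itself.
 */
static void *map_zero_private(size_t len)
{
	void *p;
	int fd = open("/dev/zero", O_RDWR);

	if (fd < 0)
		return MAP_FAILED;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	close(fd);	/* the mapping stays valid after the fd is closed */
	return p;
}

/* Returns 1 if every byte in [p, p + len) is zero, 0 otherwise. */
static int is_zero_filled(const void *p, size_t len)
{
	const unsigned char *bytes = p;
	size_t i;

	for (i = 0; i < len; i++)
		if (bytes[i])
			return 0;
	return 1;
}
```

A caller would check the result against MAP_FAILED, use the memory as
ordinary zero-initialized storage, and munmap() it when done.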

-- 
 Kirill A. Shutemov


Re: [PATCH 1/2] mm: Fix vma_is_anonymous() false-positives

2018-07-16 Thread Michal Hocko
On Mon 16-07-18 17:04:41, Kirill A. Shutemov wrote:
> On Mon, Jul 16, 2018 at 01:30:28PM +, Michal Hocko wrote:
> > On Tue 10-07-18 13:48:58, Andrew Morton wrote:
> > > On Tue, 10 Jul 2018 16:48:20 +0300 "Kirill A. Shutemov" 
> > >  wrote:
> > > 
> > > > vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous
> > > > VMA. This is unreliable as ->mmap may not set ->vm_ops.
> > > > 
> > > > False-positive vma_is_anonymous() may lead to crashes:
> > > > 
> > > > ...
> > > > 
> > > > This can be fixed by assigning anonymous VMAs own vm_ops and not relying
> > > > on it being NULL.
> > > > 
> > > > If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
> > > > dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs.
> > > 
> > > Is there a smaller, simpler fix which we can use for backporting
> > > purposes and save the larger rework for development kernels?
> > 
> > Why cannot we simply keep anon vma with null vm_ops and set dummy_vm_ops
> > for all users who do not initialize it in their mmap callbacks?
> > Basically have a sanity check&fixup in call_mmap?
> 
> As I said, there's a corner case of MAP_PRIVATE of /dev/zero.

This is really creative. I really didn't think about that. I am
wondering whether this really has to be handled as a private anonymous
mapping implicitly. Why does vma_is_anonymous has to succeed for these
mappings? Why cannot we simply handle it as any other file backed
PRIVATE mapping?

-- 
Michal Hocko
SUSE Labs


Re: [PATCH 1/2] mm: Fix vma_is_anonymous() false-positives

2018-07-16 Thread Kirill A. Shutemov
On Mon, Jul 16, 2018 at 01:30:28PM +, Michal Hocko wrote:
> On Tue 10-07-18 13:48:58, Andrew Morton wrote:
> > On Tue, 10 Jul 2018 16:48:20 +0300 "Kirill A. Shutemov" 
> >  wrote:
> > 
> > > vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous
> > > VMA. This is unreliable as ->mmap may not set ->vm_ops.
> > > 
> > > False-positive vma_is_anonymous() may lead to crashes:
> > > 
> > > ...
> > > 
> > > This can be fixed by assigning anonymous VMAs own vm_ops and not relying
> > > on it being NULL.
> > > 
> > > If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
> > > dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs.
> > 
> > Is there a smaller, simpler fix which we can use for backporting
> > purposes and save the larger rework for development kernels?
> 
> Why cannot we simply keep anon vma with null vm_ops and set dummy_vm_ops
> for all users who do not initialize it in their mmap callbacks?
> Basically have a sanity check&fixup in call_mmap?

As I said, there's a corner case of MAP_PRIVATE of /dev/zero. It has to
produce an anonymous VMA, but in mmap_region() we cannot distinguish it from
a broken ->mmap handler.

See my attempt

6dc296e7df4c ("mm: make sure all file VMAs have ->vm_ops set")

and its revert

 28c553d0aa0a ("revert "mm: make sure all file VMAs have ->vm_ops set"")
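
To make the corner case concrete, here is a small userspace model (not
kernel code). The structures are stripped down, and mmap_zero_private(),
mmap_forgetful_driver() and call_mmap_with_fixup() are hypothetical names
standing in for a private /dev/zero mapping, a driver that never sets
->vm_ops, and the proposed sanity check. Both handlers leave ->vm_ops NULL,
so the fixup cannot tell them apart:

```c
#include <stdbool.h>
#include <stddef.h>

/* Stripped-down stand-ins for the kernel structures. */
struct vm_operations_struct {
	int unused;
};

struct vm_area_struct {
	const struct vm_operations_struct *vm_ops;
};

static const struct vm_operations_struct dummy_vm_ops = { 0 };

/* The helper as described in the thread: NULL ->vm_ops means anonymous. */
static bool vma_is_anonymous(const struct vm_area_struct *vma)
{
	return !vma->vm_ops;
}

/* Hypothetical: private /dev/zero mapping, which wants an anonymous VMA. */
static int mmap_zero_private(struct vm_area_struct *vma)
{
	vma->vm_ops = NULL;
	return 0;
}

/* Hypothetical: a driver that simply never touches ->vm_ops. */
static int mmap_forgetful_driver(struct vm_area_struct *vma)
{
	(void)vma;
	return 0;
}

/* Hypothetical fixup: any handler that leaves ->vm_ops NULL gets the dummy. */
static int call_mmap_with_fixup(int (*mmap_fn)(struct vm_area_struct *),
				struct vm_area_struct *vma)
{
	int err = mmap_fn(vma);

	if (!err && !vma->vm_ops)
		vma->vm_ops = &dummy_vm_ops;
	return err;
}
```

In this model both VMAs end up with dummy_vm_ops after the fixup, so
vma_is_anonymous() turns false for the /dev/zero mapping as well.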

-- 
 Kirill A. Shutemov


Re: [PATCH 1/2] mm: Fix vma_is_anonymous() false-positives

2018-07-16 Thread Michal Hocko
On Tue 10-07-18 13:48:58, Andrew Morton wrote:
> On Tue, 10 Jul 2018 16:48:20 +0300 "Kirill A. Shutemov" 
>  wrote:
> 
> > vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous
> > VMA. This is unreliable as ->mmap may not set ->vm_ops.
> > 
> > False-positive vma_is_anonymous() may lead to crashes:
> > 
> > ...
> > 
> > This can be fixed by assigning anonymous VMAs own vm_ops and not relying
> > on it being NULL.
> > 
> > If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
> > dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs.
> 
> Is there a smaller, simpler fix which we can use for backporting
> purposes and save the larger rework for development kernels?

Why cannot we simply keep anon vma with null vm_ops and set dummy_vm_ops
for all users who do not initialize it in their mmap callbacks?
Basically have a sanity check&fixup in call_mmap?
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 1/2] mm: Fix vma_is_anonymous() false-positives

2018-07-11 Thread Kirill A. Shutemov
On Tue, Jul 10, 2018 at 01:48:58PM -0700, Andrew Morton wrote:
> On Tue, 10 Jul 2018 16:48:20 +0300 "Kirill A. Shutemov" 
>  wrote:
> 
> > vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous
> > VMA. This is unreliable as ->mmap may not set ->vm_ops.
> > 
> > False-positive vma_is_anonymous() may lead to crashes:
> > 
> > ...
> > 
> > This can be fixed by assigning anonymous VMAs own vm_ops and not relying
> > on it being NULL.
> > 
> > If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
> > dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs.
> 
> Is there a smaller, simpler fix which we can use for backporting
> purposes and save the larger rework for development kernels?

I've tried to move dummy_vm_ops stuff into a separate patch, but it didn't
work out.

In some cases (like in create_huge_pmd()/wp_huge_pmd()) we rely on
vma_is_anonymous() to guarantee that ->vm_ops is non-NULL. But with new
implementation of the helper there's no such guarantee. And I see crash in
create_huge_pmd().

We can add explicit ->vm_ops check in such places. But it's more risky.
I may miss some instances. dummy_vm_ops should be safer here.

I think it's better to backport whole patch.
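
A userspace model of that failure mode, as a sketch only. The enum and
create_huge_pmd_model() are invented for this illustration and the real
create_huge_pmd() is not reproduced here. It assumes the new helper compares
->vm_ops against anon_vm_ops. Under that assumption a VMA with NULL ->vm_ops
is classified as non-anonymous, and the ->vm_ops dereference that follows is
no longer protected:

```c
#include <stdbool.h>
#include <stddef.h>

/* Which branch the model takes; PATH_NULL_DEREF marks the would-be crash. */
enum pmd_path {
	PATH_ANON,
	PATH_HUGE_FAULT,
	PATH_FALLBACK,
	PATH_NULL_DEREF,
};

/* Stripped-down stand-ins for the kernel structures. */
struct vm_operations_struct {
	int (*huge_fault)(void);
};

struct vm_area_struct {
	const struct vm_operations_struct *vm_ops;
};

static const struct vm_operations_struct anon_vm_ops = { NULL };
static const struct vm_operations_struct dummy_vm_ops = { NULL };

/* Assumed shape of the new helper: identity check, not a NULL check. */
static bool vma_is_anonymous(const struct vm_area_struct *vma)
{
	return vma->vm_ops == &anon_vm_ops;
}

static enum pmd_path create_huge_pmd_model(const struct vm_area_struct *vma)
{
	if (vma_is_anonymous(vma))
		return PATH_ANON;
	/* Code that trusts the helper has no such check and would crash here. */
	if (!vma->vm_ops)
		return PATH_NULL_DEREF;
	if (vma->vm_ops->huge_fault)
		return PATH_HUGE_FAULT;
	return PATH_FALLBACK;
}
```

With dummy_vm_ops installed for every VMA that would otherwise have NULL
->vm_ops, the PATH_NULL_DEREF branch becomes unreachable in this model.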

> 
> >
> > ...
> >
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -71,6 +71,9 @@ int mmap_rnd_compat_bits __read_mostly = CONFIG_ARCH_MMAP_RND_COMPAT_BITS;
> >  static bool ignore_rlimit_data;
> >  core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644);
> >  
> > +const struct vm_operations_struct anon_vm_ops = {};
> > +const struct vm_operations_struct dummy_vm_ops = {};
> 
> Some nice comments here would be useful.  Especially for dummy_vm_ops. 
> Why does it exist, what is its role, etc.

Fixup is below.

> >  static void unmap_region(struct mm_struct *mm,
> > struct vm_area_struct *vma, struct vm_area_struct *prev,
> > unsigned long start, unsigned long end);
> > @@ -561,6 +564,8 @@ static unsigned long count_vma_pages_range(struct mm_struct *mm,
> >  void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma,
> > struct rb_node **rb_link, struct rb_node *rb_parent)
> >  {
> > +   WARN_ONCE(!vma->vm_ops, "missing vma->vm_ops");
> > +
> > /* Update tracking information for the gap following the new vma. */
> > if (vma->vm_next)
> > vma_gap_update(vma->vm_next);
> > @@ -1774,12 +1779,19 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> >  */
> > WARN_ON_ONCE(addr != vma->vm_start);
> >  
> > +   /* All mappings must have ->vm_ops set */
> > +   if (!vma->vm_ops)
> > +   vma->vm_ops = &dummy_vm_ops;
> 
> Can this happen?  Can we make it a rule that file_operations.mmap(vma)
> must initialize vma->vm_ops?  Should we have a WARN here to detect when
> the fs implementation failed to do that?

Yes, it can happen. KCOV doesn't set it now. And I'm pretty sure some
drivers do not set it too.

We can add warning here. But I'm not sure what value it would have.
It's perfectly fine to have no need in any of vm operations. Silently set
it to dummy_vm_ops should be good enough here.

> > addr = vma->vm_start;
> > vm_flags = vma->vm_flags;
> > } else if (vm_flags & VM_SHARED) {
> > error = shmem_zero_setup(vma);
> > if (error)
> > goto free_vma;
> > +   } else {
> > +   /* vma_is_anonymous() relies on this. */
> > +   vma->vm_ops = &anon_vm_ops;
> > }
> >  
> > vma_link(mm, vma, prev, rb_link, rb_parent);
> > ...
> >
> 

diff --git a/mm/mmap.c b/mm/mmap.c
index 0729ed06b01c..6f59ade58fa7 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -71,7 +71,16 @@ int mmap_rnd_compat_bits __read_mostly = CONFIG_ARCH_MMAP_RND_COMPAT_BITS;
 static bool ignore_rlimit_data;
 core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644);
 
+/*
+ * All anonymous VMAs have ->vm_ops set to anon_vm_ops.
+ * vma_is_anonymous() relies on anon_vm_ops to detect anonymous VMA.
+ */
 const struct vm_operations_struct anon_vm_ops = {};
+
+/*
+ * All VMAs have to have ->vm_ops set. dummy_vm_ops can be used if the VMA
+ * doesn't need to handle any of the operations.
+ */
 const struct vm_operations_struct dummy_vm_ops = {};
 
 static void unmap_region(struct mm_struct *mm,
-- 
 Kirill A. Shutemov


Re: [PATCH 1/2] mm: Fix vma_is_anonymous() false-positives

2018-07-10 Thread Andrew Morton
On Tue, 10 Jul 2018 16:48:20 +0300 "Kirill A. Shutemov" 
 wrote:

> vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous
> VMA. This is unreliable as ->mmap may not set ->vm_ops.
> 
> False-positive vma_is_anonymous() may lead to crashes:
> 
> ...
> 
> This can be fixed by assigning anonymous VMAs own vm_ops and not relying
> on it being NULL.
> 
> If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
> dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs.

Is there a smaller, simpler fix which we can use for backporting
purposes and save the larger rework for development kernels?

>
> ...
>
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -71,6 +71,9 @@ int mmap_rnd_compat_bits __read_mostly = CONFIG_ARCH_MMAP_RND_COMPAT_BITS;
>  static bool ignore_rlimit_data;
>  core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644);
>  
> +const struct vm_operations_struct anon_vm_ops = {};
> +const struct vm_operations_struct dummy_vm_ops = {};

Some nice comments here would be useful.  Especially for dummy_vm_ops. 
Why does it exist, what is its role, etc.

>  static void unmap_region(struct mm_struct *mm,
>   struct vm_area_struct *vma, struct vm_area_struct *prev,
>   unsigned long start, unsigned long end);
> @@ -561,6 +564,8 @@ static unsigned long count_vma_pages_range(struct mm_struct *mm,
>  void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma,
>   struct rb_node **rb_link, struct rb_node *rb_parent)
>  {
> + WARN_ONCE(!vma->vm_ops, "missing vma->vm_ops");
> +
>   /* Update tracking information for the gap following the new vma. */
>   if (vma->vm_next)
>   vma_gap_update(vma->vm_next);
> @@ -1774,12 +1779,19 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>*/
>   WARN_ON_ONCE(addr != vma->vm_start);
>  
> + /* All mappings must have ->vm_ops set */
> + if (!vma->vm_ops)
> + vma->vm_ops = &dummy_vm_ops;

Can this happen?  Can we make it a rule that file_operations.mmap(vma)
must initialize vma->vm_ops?  Should we have a WARN here to detect when
the fs implementation failed to do that?

>   addr = vma->vm_start;
>   vm_flags = vma->vm_flags;
>   } else if (vm_flags & VM_SHARED) {
>   error = shmem_zero_setup(vma);
>   if (error)
>   goto free_vma;
> + } else {
> + /* vma_is_anonymous() relies on this. */
> + vma->vm_ops = &anon_vm_ops;
>   }
>  
>   vma_link(mm, vma, prev, rb_link, rb_parent);
> ...
>