Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-23 Thread Daniel Vetter
On Tue, Feb 23, 2021 at 4:59 PM Neil Roberts  wrote:
>
> Daniel Vetter  writes:
>
> > drm_gem_shmem_fault() does not seem to check for purged objects at all.
> >
> > No idea how this works, or if it ever worked, but yeah something is
> > clearly still busted.
>
> Oh of course, the fault handler doesn’t check this. I’ve added a second
> patch to make it check and posted it as a separate series here:
>
> https://lists.freedesktop.org/archives/dri-devel/2021-February/298170.html
>
> The two patches combined make the IGT test pass.
>
> > Definitely a good idae to have an igt. btw to make that faster you can
> > either use the vm_drop_caches file from proc (it's a bit a hammer), or
> > what I recommend: Have a dedicated debugfs file to only drop
> > everything from your shrinker. That's much quicker and  controlled.
> > See e.g. ttm_tt_debugfs_shrink from d4bd7776a7ac ("drm/ttm: rework
> > ttm_tt page limit v4") which recently landed in drm-misc-next.
>
> I agree it would be great to have a debugfs option to trigger the purge.
> I wonder if someone more involved in Panfrost would like to implement
> this, because I am actually trying to work on VC4 and this is already
> turning out to be quite a lot of yak shaving :) I’d also like to
> implement the same debugfs option and IGT test for VC4.

If we push the shrinker setup into the helpers (this means we
minimally need an lru, and probably more reasonable locking that shmem
helpers uses right now) then we could have one debugfs file for all
drivers supporting purgeable objects. Could then even share some of
the igt, only the ioctl code would need to be driver specific.

It's a bit of work though.
-Daniel

> Thanks for the feedback.
>
> Regards,
> - Neil



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-23 Thread Neil Roberts
Daniel Vetter  writes:

> drm_gem_shmem_fault() does not seem to check for purged objects at all.
>
> No idea how this works, or if it ever worked, but yeah something is
> clearly still busted.

Oh of course, the fault handler doesn’t check this. I’ve added a second
patch to make it check and posted it as a separate series here:

https://lists.freedesktop.org/archives/dri-devel/2021-February/298170.html

The two patches combined make the IGT test pass.

> Definitely a good idae to have an igt. btw to make that faster you can
> either use the vm_drop_caches file from proc (it's a bit a hammer), or
> what I recommend: Have a dedicated debugfs file to only drop
> everything from your shrinker. That's much quicker and  controlled.
> See e.g. ttm_tt_debugfs_shrink from d4bd7776a7ac ("drm/ttm: rework
> ttm_tt page limit v4") which recently landed in drm-misc-next.

I agree it would be great to have a debugfs option to trigger the purge.
I wonder if someone more involved in Panfrost would like to implement
this, because I am actually trying to work on VC4 and this is already
turning out to be quite a lot of yak shaving :) I’d also like to
implement the same debugfs option and IGT test for VC4.

Thanks for the feedback.

Regards,
- Neil
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-23 Thread Daniel Vetter
On Tue, Feb 23, 2021 at 1:42 PM Neil Roberts  wrote:
>
> Daniel Vetter  writes:
>
> > Yeah plus Cc: stable for backporting and I think an igt or similar for
> > panfrost to check this works correctly would be pretty good too. Since
> > if it took us over 1 year to notice this bug it's pretty clear that
> > normal testing doesn't catch this. So very likely we'll break this
> > again.
>
> I made the IGT test below which seems to reproduce the bug. However, the
> kernel patch doesn’t fix it, so maybe there is something more subtle
> going on.
>
> https://gitlab.freedesktop.org/nroberts/igt-gpu-tools/-/commits/panfrost-purgemap/

drm_gem_shmem_fault() does not seem to check for purged objects at all.

No idea how this works, or if it ever worked, but yeah something is
clearly still busted.

Definitely a good idae to have an igt. btw to make that faster you can
either use the vm_drop_caches file from proc (it's a bit a hammer), or
what I recommend: Have a dedicated debugfs file to only drop
everything from your shrinker. That's much quicker and  controlled.
See e.g. ttm_tt_debugfs_shrink from d4bd7776a7ac ("drm/ttm: rework
ttm_tt page limit v4") which recently landed in drm-misc-next.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-23 Thread Neil Roberts
Daniel Vetter  writes:

> Yeah plus Cc: stable for backporting and I think an igt or similar for
> panfrost to check this works correctly would be pretty good too. Since
> if it took us over 1 year to notice this bug it's pretty clear that
> normal testing doesn't catch this. So very likely we'll break this
> again.

I made the IGT test below which seems to reproduce the bug. However, the
kernel patch doesn’t fix it, so maybe there is something more subtle
going on.

https://gitlab.freedesktop.org/nroberts/igt-gpu-tools/-/commits/panfrost-purgemap/

Regards,
- Neil
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-22 Thread Daniel Vetter
On Mon, Feb 22, 2021 at 03:24:17PM +0100, Thomas Zimmermann wrote:
> Hi
> 
> Am 17.02.21 um 17:59 schrieb Neil Roberts:
> > When mmapping the shmem, it would previously adjust the pgoff in the
> > vm_area_struct to remove the fake offset that is added to be able to
> > identify the buffer. This patch removes the adjustment and makes the
> > fault handler use the vm_fault address to calculate the page offset
> > instead. Although using this address is apparently discouraged, several
> > DRM drivers seem to be doing it anyway.
> > 
> > The problem with removing the pgoff is that it prevents
> > drm_vma_node_unmap from working because that searches the mapping tree
> > by address. That doesn't work because all of the mappings are at offset
> > 0. drm_vma_node_unmap is being used by the shmem helpers when purging
> > the buffer.
> 
> I just want to ask if this is how the mmap callbacks are supposed to work
> now?
> 
> I have a number of patches in here that convert several drivers to the GEM
> object's mmap callback. They might not be affected by the vm_pgoff bug, but
> I'd like to submit something that could work in the longer term.

Yeah we pretty much require the uniq vm_pgoff for runtime unmapping.
Especially with more dynamic memory managers like ttm that move buffers
around - for more static ones (most of the armsoc ones) it's just a bit a
security issue since you can potentially access memory after it's gone.
-Daniel

> Best regards
> Thomas
> 
> > 
> > It looks like panfrost is using drm_gem_shmem_purge so this might fix a
> > potential bug there.
> > 
> > Signed-off-by: Neil Roberts 
> > ---
> >   drivers/gpu/drm/drm_gem_shmem_helper.c | 12 +++-
> >   1 file changed, 7 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> > b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > index 9825c378dfa6..4b14157f1962 100644
> > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > @@ -526,11 +526,16 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault 
> > *vmf)
> > struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
> > loff_t num_pages = obj->size >> PAGE_SHIFT;
> > struct page *page;
> > +   pgoff_t page_offset;
> > -   if (vmf->pgoff >= num_pages || WARN_ON_ONCE(!shmem->pages))
> > +   /* We don't use vmf->pgoff since that has the fake offset */
> > +   page_offset = (vmf->address - vma->vm_start) >> PAGE_SHIFT;
> > +
> > +   if (page_offset < 0 || page_offset >= num_pages ||
> > +   WARN_ON_ONCE(!shmem->pages))
> > return VM_FAULT_SIGBUS;
> > -   page = shmem->pages[vmf->pgoff];
> > +   page = shmem->pages[page_offset];
> > return vmf_insert_page(vma, vmf->address, page);
> >   }
> > @@ -581,9 +586,6 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, 
> > struct vm_area_struct *vma)
> > struct drm_gem_shmem_object *shmem;
> > int ret;
> > -   /* Remove the fake offset */
> > -   vma->vm_pgoff -= drm_vma_node_start(>vma_node);
> > -
> > if (obj->import_attach) {
> > /* Drop the reference drm_gem_mmap_obj() acquired.*/
> > drm_gem_object_put(obj);
> > 
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
> 




> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-22 Thread Daniel Vetter
On Mon, Feb 22, 2021 at 09:34:43AM +0100, Boris Brezillon wrote:
> Hi Daniel,
> 
> On Fri, 19 Feb 2021 18:45:00 +0100
> Daniel Vetter  wrote:
> 
> > On Fri, Feb 19, 2021 at 5:17 PM Steven Price  wrote:
> > >
> > > On 19/02/2021 15:13, Daniel Vetter wrote:  
> > > > On Fri, Feb 19, 2021 at 01:36:06PM +, Steven Price wrote:  
> > > >> On 18/02/2021 18:20, Daniel Vetter wrote:  
> > > >>> On Thu, Feb 18, 2021 at 6:16 PM Rob Herring  
> > > >>> wrote:  
> > > 
> > >  On Thu, Feb 18, 2021 at 10:51 AM Steven Price  
> > >  wrote:  
> > > >
> > > > On 18/02/2021 16:38, Rob Herring wrote:  
> > > >> On Thu, Feb 18, 2021 at 10:15 AM Steven Price 
> > > >>  wrote:  
> > > >>>
> > > >>> On 18/02/2021 15:45, Alyssa Rosenzweig wrote:  
> > > > Yeah plus Cc: stable for backporting and I think an igt or 
> > > > similar for
> > > > panfrost to check this works correctly would be pretty good 
> > > > too. Since
> > > > if it took us over 1 year to notice this bug it's pretty clear 
> > > > that
> > > > normal testing doesn't catch this. So very likely we'll break 
> > > > this
> > > > again.  
> > > 
> > >  Unfortunately there are a lot of kernel bugs which are noticed 
> > >  during actual
> > >  use (but not CI runs), some of which have never been fixed. I do 
> > >  know
> > >  the shrinker impl is buggy for us, if this is the fix I'm very 
> > >  happy.  
> > > >>>
> > > >>> I doubt this will actually "fix" anything - if I understand 
> > > >>> correctly
> > > >>> then the sequence which is broken is:
> > > >>>
> > > >>>  * allocate BO, mmap to CPU
> > > >>>  * madvise(DONTNEED)
> > > >>>  * trigger purge
> > > >>>  * try to access the BO memory
> > > >>>
> > > >>> which is an invalid sequence for user space - the attempt to 
> > > >>> access
> > > >>> memory should cause a SIGSEGV. However because 
> > > >>> drm_vma_node_unmap() is
> > > >>> unable to find the mappings there may still be page table entries
> > > >>> present which would provide access to memory the kernel has 
> > > >>> freed. Which
> > > >>> is of course a big security hole and so this fix is needed.
> > > >>>
> > > >>> In what way do you find the shrinker impl buggy? I'm aware 
> > > >>> there's some
> > > >>> dodgy locking (although I haven't worked out how to fix it) - but 
> > > >>> AFAICT
> > > >>> it's more deadlock territory rather than lacking in locks. Are 
> > > >>> there
> > > >>> correctness issues?  
> > > >>
> > > >> What's there was largely a result of getting lockdep happy.
> > > >>  
> > > > btw for testing shrinkers recommended way is to have a debugfs 
> > > > file
> > > > that just force-shrinks everything. That way you avoid all the 
> > > > trouble
> > > > that tend to happen when you drive a system close to OOM on 
> > > > linux, and
> > > > it's also much faster.  
> > > 
> > >  2nding this as a good idea.
> > >   
> > > >>>
> > > >>> Sounds like a good idea to me too. But equally I'm wondering 
> > > >>> whether the
> > > >>> best (short term) solution is to actually disable the shrinker. 
> > > >>> I'm
> > > >>> somewhat surprised that nobody has got fed up with the "Purging 
> > > >>> xxx
> > > >>> bytes" message spam - which makes me think that most people are 
> > > >>> not
> > > >>> hitting memory pressure to trigger the shrinker.  
> > > >>
> > > >> If the shrinker is dodgy, then it's probably good to have the 
> > > >> messages
> > > >> to know if it ran.
> > > >>  
> > > >>> The shrinker on kbase caused a lot of grief - and the only way I 
> > > >>> managed
> > > >>> to get that under control was by writing a static analysis tool 
> > > >>> for the
> > > >>> locking, and by upsetting people by enforcing the (rather dumb) 
> > > >>> rules of
> > > >>> the tool on the code base. I've been meaning to look at whether 
> > > >>> sparse
> > > >>> can do a similar check of locks.  
> > > >>
> > > >> Lockdep doesn't cover it?  
> > > >
> > > > Short answer: no ;)  
> > > >>>
> > > >>> It's pretty good actually, if you correctly annotate things up.  
> > > >>
> > > >> I agree - it's pretty good, the problem is you need reasonable test
> > > >> coverage, and getting good test coverage of shrinkers is hard.
> > > >>  
> > > > The problem with lockdep is that you have to trigger the locking
> > > > scenario to get a warning out of it. For example you obviously 
> > > > won't get
> > > > any warnings about the shrinker without triggering the shrinker 
> > > > (which
> > > > means memory pressure since we don't have the 

Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-22 Thread Thomas Zimmermann

Hi

Am 17.02.21 um 17:59 schrieb Neil Roberts:

When mmapping the shmem, it would previously adjust the pgoff in the
vm_area_struct to remove the fake offset that is added to be able to
identify the buffer. This patch removes the adjustment and makes the
fault handler use the vm_fault address to calculate the page offset
instead. Although using this address is apparently discouraged, several
DRM drivers seem to be doing it anyway.

The problem with removing the pgoff is that it prevents
drm_vma_node_unmap from working because that searches the mapping tree
by address. That doesn't work because all of the mappings are at offset
0. drm_vma_node_unmap is being used by the shmem helpers when purging
the buffer.


I just want to ask if this is how the mmap callbacks are supposed to 
work now?


I have a number of patches in here that convert several drivers to the 
GEM object's mmap callback. They might not be affected by the vm_pgoff 
bug, but I'd like to submit something that could work in the longer term.


Best regards
Thomas



It looks like panfrost is using drm_gem_shmem_purge so this might fix a
potential bug there.

Signed-off-by: Neil Roberts 
---
  drivers/gpu/drm/drm_gem_shmem_helper.c | 12 +++-
  1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 9825c378dfa6..4b14157f1962 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -526,11 +526,16 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault 
*vmf)
struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
loff_t num_pages = obj->size >> PAGE_SHIFT;
struct page *page;
+   pgoff_t page_offset;
  
-	if (vmf->pgoff >= num_pages || WARN_ON_ONCE(!shmem->pages))

+   /* We don't use vmf->pgoff since that has the fake offset */
+   page_offset = (vmf->address - vma->vm_start) >> PAGE_SHIFT;
+
+   if (page_offset < 0 || page_offset >= num_pages ||
+   WARN_ON_ONCE(!shmem->pages))
return VM_FAULT_SIGBUS;
  
-	page = shmem->pages[vmf->pgoff];

+   page = shmem->pages[page_offset];
  
  	return vmf_insert_page(vma, vmf->address, page);

  }
@@ -581,9 +586,6 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, struct 
vm_area_struct *vma)
struct drm_gem_shmem_object *shmem;
int ret;
  
-	/* Remove the fake offset */

-   vma->vm_pgoff -= drm_vma_node_start(>vma_node);
-
if (obj->import_attach) {
/* Drop the reference drm_gem_mmap_obj() acquired.*/
drm_gem_object_put(obj);



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



OpenPGP_signature
Description: OpenPGP digital signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-22 Thread Boris Brezillon
Hi Daniel,

On Fri, 19 Feb 2021 18:45:00 +0100
Daniel Vetter  wrote:

> On Fri, Feb 19, 2021 at 5:17 PM Steven Price  wrote:
> >
> > On 19/02/2021 15:13, Daniel Vetter wrote:  
> > > On Fri, Feb 19, 2021 at 01:36:06PM +, Steven Price wrote:  
> > >> On 18/02/2021 18:20, Daniel Vetter wrote:  
> > >>> On Thu, Feb 18, 2021 at 6:16 PM Rob Herring  wrote: 
> > >>>  
> > 
> >  On Thu, Feb 18, 2021 at 10:51 AM Steven Price  
> >  wrote:  
> > >
> > > On 18/02/2021 16:38, Rob Herring wrote:  
> > >> On Thu, Feb 18, 2021 at 10:15 AM Steven Price  
> > >> wrote:  
> > >>>
> > >>> On 18/02/2021 15:45, Alyssa Rosenzweig wrote:  
> > > Yeah plus Cc: stable for backporting and I think an igt or 
> > > similar for
> > > panfrost to check this works correctly would be pretty good too. 
> > > Since
> > > if it took us over 1 year to notice this bug it's pretty clear 
> > > that
> > > normal testing doesn't catch this. So very likely we'll break this
> > > again.  
> > 
> >  Unfortunately there are a lot of kernel bugs which are noticed 
> >  during actual
> >  use (but not CI runs), some of which have never been fixed. I do 
> >  know
> >  the shrinker impl is buggy for us, if this is the fix I'm very 
> >  happy.  
> > >>>
> > >>> I doubt this will actually "fix" anything - if I understand 
> > >>> correctly
> > >>> then the sequence which is broken is:
> > >>>
> > >>>  * allocate BO, mmap to CPU
> > >>>  * madvise(DONTNEED)
> > >>>  * trigger purge
> > >>>  * try to access the BO memory
> > >>>
> > >>> which is an invalid sequence for user space - the attempt to access
> > >>> memory should cause a SIGSEGV. However because drm_vma_node_unmap() 
> > >>> is
> > >>> unable to find the mappings there may still be page table entries
> > >>> present which would provide access to memory the kernel has freed. 
> > >>> Which
> > >>> is of course a big security hole and so this fix is needed.
> > >>>
> > >>> In what way do you find the shrinker impl buggy? I'm aware there's 
> > >>> some
> > >>> dodgy locking (although I haven't worked out how to fix it) - but 
> > >>> AFAICT
> > >>> it's more deadlock territory rather than lacking in locks. Are there
> > >>> correctness issues?  
> > >>
> > >> What's there was largely a result of getting lockdep happy.
> > >>  
> > > btw for testing shrinkers recommended way is to have a debugfs 
> > > file
> > > that just force-shrinks everything. That way you avoid all the 
> > > trouble
> > > that tend to happen when you drive a system close to OOM on 
> > > linux, and
> > > it's also much faster.  
> > 
> >  2nding this as a good idea.
> >   
> > >>>
> > >>> Sounds like a good idea to me too. But equally I'm wondering 
> > >>> whether the
> > >>> best (short term) solution is to actually disable the shrinker. I'm
> > >>> somewhat surprised that nobody has got fed up with the "Purging xxx
> > >>> bytes" message spam - which makes me think that most people are not
> > >>> hitting memory pressure to trigger the shrinker.  
> > >>
> > >> If the shrinker is dodgy, then it's probably good to have the 
> > >> messages
> > >> to know if it ran.
> > >>  
> > >>> The shrinker on kbase caused a lot of grief - and the only way I 
> > >>> managed
> > >>> to get that under control was by writing a static analysis tool for 
> > >>> the
> > >>> locking, and by upsetting people by enforcing the (rather dumb) 
> > >>> rules of
> > >>> the tool on the code base. I've been meaning to look at whether 
> > >>> sparse
> > >>> can do a similar check of locks.  
> > >>
> > >> Lockdep doesn't cover it?  
> > >
> > > Short answer: no ;)  
> > >>>
> > >>> It's pretty good actually, if you correctly annotate things up.  
> > >>
> > >> I agree - it's pretty good, the problem is you need reasonable test
> > >> coverage, and getting good test coverage of shrinkers is hard.
> > >>  
> > > The problem with lockdep is that you have to trigger the locking
> > > scenario to get a warning out of it. For example you obviously won't 
> > > get
> > > any warnings about the shrinker without triggering the shrinker (which
> > > means memory pressure since we don't have the debugfs file to trigger 
> > > it).  
> > 
> >  Actually, you don't need debugfs. Writing to /proc/sys/vm/drop_caches
> >  will do it. Though maybe there's other code path scenarios that
> >  wouldn't cover.  
> > >>>
> > >>> Huh didn't know, but it's a bit a shotgun, plus it doesn't use
> > >>> fs_reclaim shrinker annotations, which means you don't have lockdep
> > >>> 

Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-19 Thread Daniel Vetter
On Fri, Feb 19, 2021 at 5:17 PM Steven Price  wrote:
>
> On 19/02/2021 15:13, Daniel Vetter wrote:
> > On Fri, Feb 19, 2021 at 01:36:06PM +, Steven Price wrote:
> >> On 18/02/2021 18:20, Daniel Vetter wrote:
> >>> On Thu, Feb 18, 2021 at 6:16 PM Rob Herring  wrote:
> 
>  On Thu, Feb 18, 2021 at 10:51 AM Steven Price  
>  wrote:
> >
> > On 18/02/2021 16:38, Rob Herring wrote:
> >> On Thu, Feb 18, 2021 at 10:15 AM Steven Price  
> >> wrote:
> >>>
> >>> On 18/02/2021 15:45, Alyssa Rosenzweig wrote:
> > Yeah plus Cc: stable for backporting and I think an igt or similar 
> > for
> > panfrost to check this works correctly would be pretty good too. 
> > Since
> > if it took us over 1 year to notice this bug it's pretty clear that
> > normal testing doesn't catch this. So very likely we'll break this
> > again.
> 
>  Unfortunately there are a lot of kernel bugs which are noticed 
>  during actual
>  use (but not CI runs), some of which have never been fixed. I do know
>  the shrinker impl is buggy for us, if this is the fix I'm very happy.
> >>>
> >>> I doubt this will actually "fix" anything - if I understand correctly
> >>> then the sequence which is broken is:
> >>>
> >>>  * allocate BO, mmap to CPU
> >>>  * madvise(DONTNEED)
> >>>  * trigger purge
> >>>  * try to access the BO memory
> >>>
> >>> which is an invalid sequence for user space - the attempt to access
> >>> memory should cause a SIGSEGV. However because drm_vma_node_unmap() is
> >>> unable to find the mappings there may still be page table entries
> >>> present which would provide access to memory the kernel has freed. 
> >>> Which
> >>> is of course a big security hole and so this fix is needed.
> >>>
> >>> In what way do you find the shrinker impl buggy? I'm aware there's 
> >>> some
> >>> dodgy locking (although I haven't worked out how to fix it) - but 
> >>> AFAICT
> >>> it's more deadlock territory rather than lacking in locks. Are there
> >>> correctness issues?
> >>
> >> What's there was largely a result of getting lockdep happy.
> >>
> > btw for testing shrinkers recommended way is to have a debugfs file
> > that just force-shrinks everything. That way you avoid all the 
> > trouble
> > that tend to happen when you drive a system close to OOM on linux, 
> > and
> > it's also much faster.
> 
>  2nding this as a good idea.
> 
> >>>
> >>> Sounds like a good idea to me too. But equally I'm wondering whether 
> >>> the
> >>> best (short term) solution is to actually disable the shrinker. I'm
> >>> somewhat surprised that nobody has got fed up with the "Purging xxx
> >>> bytes" message spam - which makes me think that most people are not
> >>> hitting memory pressure to trigger the shrinker.
> >>
> >> If the shrinker is dodgy, then it's probably good to have the messages
> >> to know if it ran.
> >>
> >>> The shrinker on kbase caused a lot of grief - and the only way I 
> >>> managed
> >>> to get that under control was by writing a static analysis tool for 
> >>> the
> >>> locking, and by upsetting people by enforcing the (rather dumb) rules 
> >>> of
> >>> the tool on the code base. I've been meaning to look at whether sparse
> >>> can do a similar check of locks.
> >>
> >> Lockdep doesn't cover it?
> >
> > Short answer: no ;)
> >>>
> >>> It's pretty good actually, if you correctly annotate things up.
> >>
> >> I agree - it's pretty good, the problem is you need reasonable test
> >> coverage, and getting good test coverage of shrinkers is hard.
> >>
> > The problem with lockdep is that you have to trigger the locking
> > scenario to get a warning out of it. For example you obviously won't get
> > any warnings about the shrinker without triggering the shrinker (which
> > means memory pressure since we don't have the debugfs file to trigger 
> > it).
> 
>  Actually, you don't need debugfs. Writing to /proc/sys/vm/drop_caches
>  will do it. Though maybe there's other code path scenarios that
>  wouldn't cover.
> >>>
> >>> Huh didn't know, but it's a bit a shotgun, plus it doesn't use
> >>> fs_reclaim shrinker annotations, which means you don't have lockdep
> >>> checks. I think at least, would need some deadlock and testing.
> >>
> >> The big problem with this sort of method for triggering the shrinkers is
> >> that they are called without (many) locks held. Whereas it's entirely
> >> possible for a shrinker to be called at (almost) any allocation in the
> >> kernel.
> >>
> >> Admittedly the Panfrost shrinkers are fairly safe - because most things are
> >> xxx_trylock(). kbase avoids trylock which makes 

Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-19 Thread Steven Price

On 19/02/2021 15:13, Daniel Vetter wrote:

On Fri, Feb 19, 2021 at 01:36:06PM +, Steven Price wrote:

On 18/02/2021 18:20, Daniel Vetter wrote:

On Thu, Feb 18, 2021 at 6:16 PM Rob Herring  wrote:


On Thu, Feb 18, 2021 at 10:51 AM Steven Price  wrote:


On 18/02/2021 16:38, Rob Herring wrote:

On Thu, Feb 18, 2021 at 10:15 AM Steven Price  wrote:


On 18/02/2021 15:45, Alyssa Rosenzweig wrote:

Yeah plus Cc: stable for backporting and I think an igt or similar for
panfrost to check this works correctly would be pretty good too. Since
if it took us over 1 year to notice this bug it's pretty clear that
normal testing doesn't catch this. So very likely we'll break this
again.


Unfortunately there are a lot of kernel bugs which are noticed during actual
use (but not CI runs), some of which have never been fixed. I do know
the shrinker impl is buggy for us, if this is the fix I'm very happy.


I doubt this will actually "fix" anything - if I understand correctly
then the sequence which is broken is:

 * allocate BO, mmap to CPU
 * madvise(DONTNEED)
 * trigger purge
 * try to access the BO memory

which is an invalid sequence for user space - the attempt to access
memory should cause a SIGSEGV. However because drm_vma_node_unmap() is
unable to find the mappings there may still be page table entries
present which would provide access to memory the kernel has freed. Which
is of course a big security hole and so this fix is needed.

In what way do you find the shrinker impl buggy? I'm aware there's some
dodgy locking (although I haven't worked out how to fix it) - but AFAICT
it's more deadlock territory rather than lacking in locks. Are there
correctness issues?


What's there was largely a result of getting lockdep happy.


btw for testing shrinkers recommended way is to have a debugfs file
that just force-shrinks everything. That way you avoid all the trouble
that tend to happen when you drive a system close to OOM on linux, and
it's also much faster.


2nding this as a good idea.



Sounds like a good idea to me too. But equally I'm wondering whether the
best (short term) solution is to actually disable the shrinker. I'm
somewhat surprised that nobody has got fed up with the "Purging xxx
bytes" message spam - which makes me think that most people are not
hitting memory pressure to trigger the shrinker.


If the shrinker is dodgy, then it's probably good to have the messages
to know if it ran.


The shrinker on kbase caused a lot of grief - and the only way I managed
to get that under control was by writing a static analysis tool for the
locking, and by upsetting people by enforcing the (rather dumb) rules of
the tool on the code base. I've been meaning to look at whether sparse
can do a similar check of locks.


Lockdep doesn't cover it?


Short answer: no ;)


It's pretty good actually, if you correctly annotate things up.


I agree - it's pretty good, the problem is you need reasonable test
coverage, and getting good test coverage of shrinkers is hard.


The problem with lockdep is that you have to trigger the locking
scenario to get a warning out of it. For example you obviously won't get
any warnings about the shrinker without triggering the shrinker (which
means memory pressure since we don't have the debugfs file to trigger it).


Actually, you don't need debugfs. Writing to /proc/sys/vm/drop_caches
will do it. Though maybe there's other code path scenarios that
wouldn't cover.


Huh didn't know, but it's a bit a shotgun, plus it doesn't use
fs_reclaim shrinker annotations, which means you don't have lockdep
checks. I think at least, would need some deadlock and testing.


The big problem with this sort of method for triggering the shrinkers is
that they are called without (many) locks held. Whereas it's entirely
possible for a shrinker to be called at (almost) any allocation in the
kernel.

Admittedly the Panfrost shrinkers are fairly safe - because most things are
xxx_trylock(). kbase avoids trylock which makes reclaim more reliable, but
means deadlocks are much easier.


This is why you need the fs_reclaim annotation. With that lockdep can
connect the dots. See also might_alloc() annotations I've added in 5.11 or
so.

Validating shrinkers for deadlocks is actually not that hard, you just
need the debugfs interface to run your shrinker at will under the
fs_reclaim_acquire/release annotations. You do _not_ need to hit the full
combinatorial test matrix of making sure that your shrinker is called in
any possible place where memory is allocated.


Cool - I hadn't looked at that code before, but it does look like it 
should pick up the problem cases. I wish that had existed back when I 
was dealing with kbase! :)



I have to admit I'm not 100% sure I've seen any lockdep warnings based
on buffer objects recently. I can trigger them based on jobs:


[snip]


Certainly here the mutex causing the problem is the shrinker_lock!

The above is triggered by chucking a whole ton of jobs which

Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-19 Thread Daniel Vetter
On Fri, Feb 19, 2021 at 01:36:06PM +, Steven Price wrote:
> On 18/02/2021 18:20, Daniel Vetter wrote:
> > On Thu, Feb 18, 2021 at 6:16 PM Rob Herring  wrote:
> > > 
> > > On Thu, Feb 18, 2021 at 10:51 AM Steven Price  
> > > wrote:
> > > > 
> > > > On 18/02/2021 16:38, Rob Herring wrote:
> > > > > On Thu, Feb 18, 2021 at 10:15 AM Steven Price  
> > > > > wrote:
> > > > > > 
> > > > > > On 18/02/2021 15:45, Alyssa Rosenzweig wrote:
> > > > > > > > Yeah plus Cc: stable for backporting and I think an igt or 
> > > > > > > > similar for
> > > > > > > > panfrost to check this works correctly would be pretty good 
> > > > > > > > too. Since
> > > > > > > > if it took us over 1 year to notice this bug it's pretty clear 
> > > > > > > > that
> > > > > > > > normal testing doesn't catch this. So very likely we'll break 
> > > > > > > > this
> > > > > > > > again.
> > > > > > > 
> > > > > > > Unfortunately there are a lot of kernel bugs which are noticed 
> > > > > > > during actual
> > > > > > > use (but not CI runs), some of which have never been fixed. I do 
> > > > > > > know
> > > > > > > the shrinker impl is buggy for us, if this is the fix I'm very 
> > > > > > > happy.
> > > > > > 
> > > > > > I doubt this will actually "fix" anything - if I understand 
> > > > > > correctly
> > > > > > then the sequence which is broken is:
> > > > > > 
> > > > > > * allocate BO, mmap to CPU
> > > > > > * madvise(DONTNEED)
> > > > > > * trigger purge
> > > > > > * try to access the BO memory
> > > > > > 
> > > > > > which is an invalid sequence for user space - the attempt to access
> > > > > > memory should cause a SIGSEGV. However because drm_vma_node_unmap() 
> > > > > > is
> > > > > > unable to find the mappings there may still be page table entries
> > > > > > present which would provide access to memory the kernel has freed. 
> > > > > > Which
> > > > > > is of course a big security hole and so this fix is needed.
> > > > > > 
> > > > > > In what way do you find the shrinker impl buggy? I'm aware there's 
> > > > > > some
> > > > > > dodgy locking (although I haven't worked out how to fix it) - but 
> > > > > > AFAICT
> > > > > > it's more deadlock territory rather than lacking in locks. Are there
> > > > > > correctness issues?
> > > > > 
> > > > > What's there was largely a result of getting lockdep happy.
> > > > > 
> > > > > > > > btw for testing shrinkers recommended way is to have a debugfs 
> > > > > > > > file
> > > > > > > > that just force-shrinks everything. That way you avoid all the 
> > > > > > > > trouble
> > > > > > > > that tend to happen when you drive a system close to OOM on 
> > > > > > > > linux, and
> > > > > > > > it's also much faster.
> > > > > > > 
> > > > > > > 2nding this as a good idea.
> > > > > > > 
> > > > > > 
> > > > > > Sounds like a good idea to me too. But equally I'm wondering 
> > > > > > whether the
> > > > > > best (short term) solution is to actually disable the shrinker. I'm
> > > > > > somewhat surprised that nobody has got fed up with the "Purging xxx
> > > > > > bytes" message spam - which makes me think that most people are not
> > > > > > hitting memory pressure to trigger the shrinker.
> > > > > 
> > > > > If the shrinker is dodgy, then it's probably good to have the messages
> > > > > to know if it ran.
> > > > > 
> > > > > > The shrinker on kbase caused a lot of grief - and the only way I 
> > > > > > managed
> > > > > > to get that under control was by writing a static analysis tool for 
> > > > > > the
> > > > > > locking, and by upsetting people by enforcing the (rather dumb) 
> > > > > > rules of
> > > > > > the tool on the code base. I've been meaning to look at whether 
> > > > > > sparse
> > > > > > can do a similar check of locks.
> > > > > 
> > > > > Lockdep doesn't cover it?
> > > > 
> > > > Short answer: no ;)
> > 
> > It's pretty good actually, if you correctly annotate things up.
> 
> I agree - it's pretty good, the problem is you need reasonable test
> coverage, and getting good test coverage of shrinkers is hard.
> 
> > > > The problem with lockdep is that you have to trigger the locking
> > > > scenario to get a warning out of it. For example you obviously won't get
> > > > any warnings about the shrinker without triggering the shrinker (which
> > > > means memory pressure since we don't have the debugfs file to trigger 
> > > > it).
> > > 
> > > Actually, you don't need debugfs. Writing to /proc/sys/vm/drop_caches
> > > will do it. Though maybe there's other code path scenarios that
> > > wouldn't cover.
> > 
> > Huh didn't know, but it's a bit a shotgun, plus it doesn't use
> > fs_reclaim shrinker annotations, which means you don't have lockdep
> > checks. I think at least, would need some deadlock and testing.
> 
> The big problem with this sort of method for triggering the shrinkers is
> that they are called without (many) locks held. Whereas it's entirely
> possible for a shrinker to be called at (almost) any allocation in the

Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-19 Thread Steven Price

On 18/02/2021 18:20, Daniel Vetter wrote:

On Thu, Feb 18, 2021 at 6:16 PM Rob Herring  wrote:


On Thu, Feb 18, 2021 at 10:51 AM Steven Price  wrote:


On 18/02/2021 16:38, Rob Herring wrote:

On Thu, Feb 18, 2021 at 10:15 AM Steven Price  wrote:


On 18/02/2021 15:45, Alyssa Rosenzweig wrote:

Yeah plus Cc: stable for backporting and I think an igt or similar for
panfrost to check this works correctly would be pretty good too. Since
if it took us over 1 year to notice this bug it's pretty clear that
normal testing doesn't catch this. So very likely we'll break this
again.


Unfortunately there are a lot of kernel bugs which are noticed during actual
use (but not CI runs), some of which have never been fixed. I do know
the shrinker impl is buggy for us, if this is the fix I'm very happy.


I doubt this will actually "fix" anything - if I understand correctly
then the sequence which is broken is:

* allocate BO, mmap to CPU
* madvise(DONTNEED)
* trigger purge
* try to access the BO memory

which is an invalid sequence for user space - the attempt to access
memory should cause a SIGSEGV. However because drm_vma_node_unmap() is
unable to find the mappings there may still be page table entries
present which would provide access to memory the kernel has freed. Which
is of course a big security hole and so this fix is needed.

In what way do you find the shrinker impl buggy? I'm aware there's some
dodgy locking (although I haven't worked out how to fix it) - but AFAICT
it's more deadlock territory rather than lacking in locks. Are there
correctness issues?


What's there was largely a result of getting lockdep happy.


btw for testing shrinkers recommended way is to have a debugfs file
that just force-shrinks everything. That way you avoid all the trouble
that tend to happen when you drive a system close to OOM on linux, and
it's also much faster.


2nding this as a good idea.



Sounds like a good idea to me too. But equally I'm wondering whether the
best (short term) solution is to actually disable the shrinker. I'm
somewhat surprised that nobody has got fed up with the "Purging xxx
bytes" message spam - which makes me think that most people are not
hitting memory pressure to trigger the shrinker.


If the shrinker is dodgy, then it's probably good to have the messages
to know if it ran.


The shrinker on kbase caused a lot of grief - and the only way I managed
to get that under control was by writing a static analysis tool for the
locking, and by upsetting people by enforcing the (rather dumb) rules of
the tool on the code base. I've been meaning to look at whether sparse
can do a similar check of locks.


Lockdep doesn't cover it?


Short answer: no ;)


It's pretty good actually, if you correctly annotate things up.


I agree - it's pretty good, the problem is you need reasonable test 
coverage, and getting good test coverage of shrinkers is hard.



The problem with lockdep is that you have to trigger the locking
scenario to get a warning out of it. For example you obviously won't get
any warnings about the shrinker without triggering the shrinker (which
means memory pressure since we don't have the debugfs file to trigger it).


Actually, you don't need debugfs. Writing to /proc/sys/vm/drop_caches
will do it. Though maybe there's other code path scenarios that
wouldn't cover.


Huh didn't know, but it's a bit a shotgun, plus it doesn't use
fs_reclaim shrinker annotations, which means you don't have lockdep
checks. I think at least, would need some deadlock and testing.


The big problem with this sort of method for triggering the shrinkers is 
that they are called without (many) locks held. Whereas it's entirely 
possible for a shrinker to be called at (almost) any allocation in the 
kernel.


Admittedly the Panfrost shrinkers are fairly safe - because most things 
are xxx_trylock(). kbase avoids trylock which makes reclaim more 
reliable, but means deadlocks are much easier.





I have to admit I'm not 100% sure I've seen any lockdep warnings based
on buffer objects recently. I can trigger them based on jobs:


[snip]


Certainly here the mutex causing the problem is the shrinker_lock!

The above is triggered by chucking a whole ton of jobs which
fault at the GPU.

Sadly I haven't found time to work out how to untangle the locks.


They are tricky because pretty much any memory allocation can trigger
things as I recall.


The above should only be possible with my dma_fence annotations, and
yes the point to bugs in the drm/scheduler. They shouldn't matter for
panfrost, and those patches aren't in upstream yet.


Yes that's on a (random version of) drm-misc - just what I happened to 
have built recently. Good news if that's not actually Panfrost's bug. I 
haven't had the time to track down what's going on yet.


Sounds like I'm perhaps being a bit unfair on the shrinkers - I'm just 
aware that I went down a rabbit hole before looking at changing the 
locks which started because I was 

Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-18 Thread Daniel Vetter
On Thu, Feb 18, 2021 at 6:16 PM Rob Herring  wrote:
>
> On Thu, Feb 18, 2021 at 10:51 AM Steven Price  wrote:
> >
> > On 18/02/2021 16:38, Rob Herring wrote:
> > > On Thu, Feb 18, 2021 at 10:15 AM Steven Price  
> > > wrote:
> > >>
> > >> On 18/02/2021 15:45, Alyssa Rosenzweig wrote:
> >  Yeah plus Cc: stable for backporting and I think an igt or similar for
> >  panfrost to check this works correctly would be pretty good too. Since
> >  if it took us over 1 year to notice this bug it's pretty clear that
> >  normal testing doesn't catch this. So very likely we'll break this
> >  again.
> > >>>
> > >>> Unfortunately there are a lot of kernel bugs which are noticed during 
> > >>> actual
> > >>> use (but not CI runs), some of which have never been fixed. I do know
> > >>> the shrinker impl is buggy for us, if this is the fix I'm very happy.
> > >>
> > >> I doubt this will actually "fix" anything - if I understand correctly
> > >> then the sequence which is broken is:
> > >>
> > >>* allocate BO, mmap to CPU
> > >>* madvise(DONTNEED)
> > >>* trigger purge
> > >>* try to access the BO memory
> > >>
> > >> which is an invalid sequence for user space - the attempt to access
> > >> memory should cause a SIGSEGV. However because drm_vma_node_unmap() is
> > >> unable to find the mappings there may still be page table entries
> > >> present which would provide access to memory the kernel has freed. Which
> > >> is of course a big security hole and so this fix is needed.
> > >>
> > >> In what way do you find the shrinker impl buggy? I'm aware there's some
> > >> dodgy locking (although I haven't worked out how to fix it) - but AFAICT
> > >> it's more deadlock territory rather than lacking in locks. Are there
> > >> correctness issues?
> > >
> > > What's there was largely a result of getting lockdep happy.
> > >
> >  btw for testing shrinkers recommended way is to have a debugfs file
> >  that just force-shrinks everything. That way you avoid all the trouble
> >  that tend to happen when you drive a system close to OOM on linux, and
> >  it's also much faster.
> > >>>
> > >>> 2nding this as a good idea.
> > >>>
> > >>
> > >> Sounds like a good idea to me too. But equally I'm wondering whether the
> > >> best (short term) solution is to actually disable the shrinker. I'm
> > >> somewhat surprised that nobody has got fed up with the "Purging xxx
> > >> bytes" message spam - which makes me think that most people are not
> > >> hitting memory pressure to trigger the shrinker.
> > >
> > > If the shrinker is dodgy, then it's probably good to have the messages
> > > to know if it ran.
> > >
> > >> The shrinker on kbase caused a lot of grief - and the only way I managed
> > >> to get that under control was by writing a static analysis tool for the
> > >> locking, and by upsetting people by enforcing the (rather dumb) rules of
> > >> the tool on the code base. I've been meaning to look at whether sparse
> > >> can do a similar check of locks.
> > >
> > > Lockdep doesn't cover it?
> >
> > Short answer: no ;)

It's pretty good actually, if you correctly annotate things up.

> > The problem with lockdep is that you have to trigger the locking
> > scenario to get a warning out of it. For example you obviously won't get
> > any warnings about the shrinker without triggering the shrinker (which
> > means memory pressure since we don't have the debugfs file to trigger it).
>
> Actually, you don't need debugfs. Writing to /proc/sys/vm/drop_caches
> will do it. Though maybe there's other code path scenarios that
> wouldn't cover.

Huh didn't know, but it's a bit a shotgun, plus it doesn't use
fs_reclaim shrinker annotations, which means you don't have lockdep
checks. I think at least, would need some deadlock and testing.
>
>
> > I have to admit I'm not 100% sure I've seen any lockdep warnings based
> > on buffer objects recently. I can trigger them based on jobs:
> >
> > -8<--
> > [  265.764474] ==
> > [  265.771380] WARNING: possible circular locking dependency detected
> > [  265.778294] 5.11.0-rc2+ #22 Tainted: GW
> > [  265.784148] --
> > [  265.791050] kworker/0:3/90 is trying to acquire lock:
> > [  265.796694] c0d982b0 (fs_reclaim){+.+.}-{0:0}, at: 
> > __fs_reclaim_acquire+0x0/0x38
> > [  265.804994]
> > [  265.804994] but task is already holding lock:
> > [  265.811513] c49a348c (>queue[j].lock){+.+.}-{3:3}, at: 
> > panfrost_reset+0x124/0x1cc [panfrost]
> > [  265.821375]
> > [  265.821375] which lock already depends on the new lock.
> > [  265.821375]
> > [  265.830524]
> > [  265.830524] the existing dependency chain (in reverse order) is:
> > [  265.838892]
> > [  265.838892] -> #2 (>queue[j].lock){+.+.}-{3:3}:
> > [  265.845996]mutex_lock_nested+0x18/0x20
> > [  265.850961]panfrost_scheduler_stop+0x1c/0x94 [panfrost]
> > [  265.857590] 

Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-18 Thread Rob Herring
On Thu, Feb 18, 2021 at 10:51 AM Steven Price  wrote:
>
> On 18/02/2021 16:38, Rob Herring wrote:
> > On Thu, Feb 18, 2021 at 10:15 AM Steven Price  wrote:
> >>
> >> On 18/02/2021 15:45, Alyssa Rosenzweig wrote:
>  Yeah plus Cc: stable for backporting and I think an igt or similar for
>  panfrost to check this works correctly would be pretty good too. Since
>  if it took us over 1 year to notice this bug it's pretty clear that
>  normal testing doesn't catch this. So very likely we'll break this
>  again.
> >>>
> >>> Unfortunately there are a lot of kernel bugs which are noticed during 
> >>> actual
> >>> use (but not CI runs), some of which have never been fixed. I do know
> >>> the shrinker impl is buggy for us, if this is the fix I'm very happy.
> >>
> >> I doubt this will actually "fix" anything - if I understand correctly
> >> then the sequence which is broken is:
> >>
> >>* allocate BO, mmap to CPU
> >>* madvise(DONTNEED)
> >>* trigger purge
> >>* try to access the BO memory
> >>
> >> which is an invalid sequence for user space - the attempt to access
> >> memory should cause a SIGSEGV. However because drm_vma_node_unmap() is
> >> unable to find the mappings there may still be page table entries
> >> present which would provide access to memory the kernel has freed. Which
> >> is of course a big security hole and so this fix is needed.
> >>
> >> In what way do you find the shrinker impl buggy? I'm aware there's some
> >> dodgy locking (although I haven't worked out how to fix it) - but AFAICT
> >> it's more deadlock territory rather than lacking in locks. Are there
> >> correctness issues?
> >
> > What's there was largely a result of getting lockdep happy.
> >
>  btw for testing shrinkers recommended way is to have a debugfs file
>  that just force-shrinks everything. That way you avoid all the trouble
>  that tend to happen when you drive a system close to OOM on linux, and
>  it's also much faster.
> >>>
> >>> 2nding this as a good idea.
> >>>
> >>
> >> Sounds like a good idea to me too. But equally I'm wondering whether the
> >> best (short term) solution is to actually disable the shrinker. I'm
> >> somewhat surprised that nobody has got fed up with the "Purging xxx
> >> bytes" message spam - which makes me think that most people are not
> >> hitting memory pressure to trigger the shrinker.
> >
> > If the shrinker is dodgy, then it's probably good to have the messages
> > to know if it ran.
> >
> >> The shrinker on kbase caused a lot of grief - and the only way I managed
> >> to get that under control was by writing a static analysis tool for the
> >> locking, and by upsetting people by enforcing the (rather dumb) rules of
> >> the tool on the code base. I've been meaning to look at whether sparse
> >> can do a similar check of locks.
> >
> > Lockdep doesn't cover it?
>
> Short answer: no ;)
>
> The problem with lockdep is that you have to trigger the locking
> scenario to get a warning out of it. For example you obviously won't get
> any warnings about the shrinker without triggering the shrinker (which
> means memory pressure since we don't have the debugfs file to trigger it).

Actually, you don't need debugfs. Writing to /proc/sys/vm/drop_caches
will do it. Though maybe there's other code path scenarios that
wouldn't cover.


> I have to admit I'm not 100% sure I've seen any lockdep warnings based
> on buffer objects recently. I can trigger them based on jobs:
>
> -8<--
> [  265.764474] ==
> [  265.771380] WARNING: possible circular locking dependency detected
> [  265.778294] 5.11.0-rc2+ #22 Tainted: GW
> [  265.784148] --
> [  265.791050] kworker/0:3/90 is trying to acquire lock:
> [  265.796694] c0d982b0 (fs_reclaim){+.+.}-{0:0}, at: 
> __fs_reclaim_acquire+0x0/0x38
> [  265.804994]
> [  265.804994] but task is already holding lock:
> [  265.811513] c49a348c (>queue[j].lock){+.+.}-{3:3}, at: 
> panfrost_reset+0x124/0x1cc [panfrost]
> [  265.821375]
> [  265.821375] which lock already depends on the new lock.
> [  265.821375]
> [  265.830524]
> [  265.830524] the existing dependency chain (in reverse order) is:
> [  265.838892]
> [  265.838892] -> #2 (>queue[j].lock){+.+.}-{3:3}:
> [  265.845996]mutex_lock_nested+0x18/0x20
> [  265.850961]panfrost_scheduler_stop+0x1c/0x94 [panfrost]
> [  265.857590]panfrost_reset+0x54/0x1cc [panfrost]
> [  265.863441]process_one_work+0x238/0x51c
> [  265.868503]worker_thread+0x22c/0x2e0
> [  265.873270]kthread+0x128/0x138
> [  265.877455]ret_from_fork+0x14/0x38
> [  265.882028]0x0
> [  265.884657]
> [  265.884657] -> #1 (dma_fence_map){}-{0:0}:
> [  265.891277]dma_resv_lockdep+0x1b4/0x290
> [  265.896339]do_one_initcall+0x5c/0x2e8
> [  265.901206]kernel_init_freeable+0x184/0x1d4
> [  265.906651]   

Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-18 Thread Steven Price

On 18/02/2021 16:38, Rob Herring wrote:

On Thu, Feb 18, 2021 at 10:15 AM Steven Price  wrote:


On 18/02/2021 15:45, Alyssa Rosenzweig wrote:

Yeah plus Cc: stable for backporting and I think an igt or similar for
panfrost to check this works correctly would be pretty good too. Since
if it took us over 1 year to notice this bug it's pretty clear that
normal testing doesn't catch this. So very likely we'll break this
again.


Unfortunately there are a lot of kernel bugs which are noticed during actual
use (but not CI runs), some of which have never been fixed. I do know
the shrinker impl is buggy for us, if this is the fix I'm very happy.


I doubt this will actually "fix" anything - if I understand correctly
then the sequence which is broken is:

   * allocate BO, mmap to CPU
   * madvise(DONTNEED)
   * trigger purge
   * try to access the BO memory

which is an invalid sequence for user space - the attempt to access
memory should cause a SIGSEGV. However because drm_vma_node_unmap() is
unable to find the mappings there may still be page table entries
present which would provide access to memory the kernel has freed. Which
is of course a big security hole and so this fix is needed.

In what way do you find the shrinker impl buggy? I'm aware there's some
dodgy locking (although I haven't worked out how to fix it) - but AFAICT
it's more deadlock territory rather than lacking in locks. Are there
correctness issues?


What's there was largely a result of getting lockdep happy.


btw for testing shrinkers recommended way is to have a debugfs file
that just force-shrinks everything. That way you avoid all the trouble
that tend to happen when you drive a system close to OOM on linux, and
it's also much faster.


2nding this as a good idea.



Sounds like a good idea to me too. But equally I'm wondering whether the
best (short term) solution is to actually disable the shrinker. I'm
somewhat surprised that nobody has got fed up with the "Purging xxx
bytes" message spam - which makes me think that most people are not
hitting memory pressure to trigger the shrinker.


If the shrinker is dodgy, then it's probably good to have the messages
to know if it ran.


The shrinker on kbase caused a lot of grief - and the only way I managed
to get that under control was by writing a static analysis tool for the
locking, and by upsetting people by enforcing the (rather dumb) rules of
the tool on the code base. I've been meaning to look at whether sparse
can do a similar check of locks.


Lockdep doesn't cover it?


Short answer: no ;)

The problem with lockdep is that you have to trigger the locking
scenario to get a warning out of it. For example you obviously won't get
any warnings about the shrinker without triggering the shrinker (which
means memory pressure since we don't have the debugfs file to trigger it).

I have to admit I'm not 100% sure I've seen any lockdep warnings based
on buffer objects recently. I can trigger them based on jobs:

-8<--
[  265.764474] ==
[  265.771380] WARNING: possible circular locking dependency detected
[  265.778294] 5.11.0-rc2+ #22 Tainted: GW
[  265.784148] --
[  265.791050] kworker/0:3/90 is trying to acquire lock:
[  265.796694] c0d982b0 (fs_reclaim){+.+.}-{0:0}, at: 
__fs_reclaim_acquire+0x0/0x38
[  265.804994]
[  265.804994] but task is already holding lock:
[  265.811513] c49a348c (>queue[j].lock){+.+.}-{3:3}, at: 
panfrost_reset+0x124/0x1cc [panfrost]
[  265.821375]
[  265.821375] which lock already depends on the new lock.
[  265.821375]
[  265.830524]
[  265.830524] the existing dependency chain (in reverse order) is:
[  265.838892]
[  265.838892] -> #2 (>queue[j].lock){+.+.}-{3:3}:
[  265.845996]mutex_lock_nested+0x18/0x20
[  265.850961]panfrost_scheduler_stop+0x1c/0x94 [panfrost]
[  265.857590]panfrost_reset+0x54/0x1cc [panfrost]
[  265.863441]process_one_work+0x238/0x51c
[  265.868503]worker_thread+0x22c/0x2e0
[  265.873270]kthread+0x128/0x138
[  265.877455]ret_from_fork+0x14/0x38
[  265.882028]0x0
[  265.884657]
[  265.884657] -> #1 (dma_fence_map){}-{0:0}:
[  265.891277]dma_resv_lockdep+0x1b4/0x290
[  265.896339]do_one_initcall+0x5c/0x2e8
[  265.901206]kernel_init_freeable+0x184/0x1d4
[  265.906651]kernel_init+0x8/0x11c
[  265.911029]ret_from_fork+0x14/0x38
[  265.915610]0x0
[  265.918247]
[  265.918247] -> #0 (fs_reclaim){+.+.}-{0:0}:
[  265.924579]lock_acquire+0x3a4/0x45c
[  265.929260]__fs_reclaim_acquire+0x28/0x38
[  265.934523]slab_pre_alloc_hook.constprop.28+0x1c/0x64
[  265.940948]kmem_cache_alloc_trace+0x38/0x114
[  265.946493]panfrost_job_run+0x60/0x2b4 [panfrost]
[  265.952540]drm_sched_resubmit_jobs+0x88/0xc4 [gpu_sched]
[  265.959256]panfrost_reset+0x174/0x1cc [panfrost]
[  

Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-18 Thread Rob Herring
On Thu, Feb 18, 2021 at 10:15 AM Steven Price  wrote:
>
> On 18/02/2021 15:45, Alyssa Rosenzweig wrote:
> >> Yeah plus Cc: stable for backporting and I think an igt or similar for
> >> panfrost to check this works correctly would be pretty good too. Since
> >> if it took us over 1 year to notice this bug it's pretty clear that
> >> normal testing doesn't catch this. So very likely we'll break this
> >> again.
> >
> > Unfortunately there are a lot of kernel bugs which are noticed during actual
> > use (but not CI runs), some of which have never been fixed. I do know
> > the shrinker impl is buggy for us, if this is the fix I'm very happy.
>
> I doubt this will actually "fix" anything - if I understand correctly
> then the sequence which is broken is:
>
>   * allocate BO, mmap to CPU
>   * madvise(DONTNEED)
>   * trigger purge
>   * try to access the BO memory
>
> which is an invalid sequence for user space - the attempt to access
> memory should cause a SIGSEGV. However because drm_vma_node_unmap() is
> unable to find the mappings there may still be page table entries
> present which would provide access to memory the kernel has freed. Which
> is of course a big security hole and so this fix is needed.
>
> In what way do you find the shrinker impl buggy? I'm aware there's some
> dodgy locking (although I haven't worked out how to fix it) - but AFAICT
> it's more deadlock territory rather than lacking in locks. Are there
> correctness issues?

What's there was largely a result of getting lockdep happy.

> >> btw for testing shrinkers recommended way is to have a debugfs file
> >> that just force-shrinks everything. That way you avoid all the trouble
> >> that tend to happen when you drive a system close to OOM on linux, and
> >> it's also much faster.
> >
> > 2nding this as a good idea.
> >
>
> Sounds like a good idea to me too. But equally I'm wondering whether the
> best (short term) solution is to actually disable the shrinker. I'm
> somewhat surprised that nobody has got fed up with the "Purging xxx
> bytes" message spam - which makes me think that most people are not
> hitting memory pressure to trigger the shrinker.

If the shrinker is dodgy, then it's probably good to have the messages
to know if it ran.

> The shrinker on kbase caused a lot of grief - and the only way I managed
> to get that under control was by writing a static analysis tool for the
> locking, and by upsetting people by enforcing the (rather dumb) rules of
> the tool on the code base. I've been meaning to look at whether sparse
> can do a similar check of locks.

Lockdep doesn't cover it?

Rob
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-18 Thread Alyssa Rosenzweig
> Yeah plus Cc: stable for backporting and I think an igt or similar for
> panfrost to check this works correctly would be pretty good too. Since
> if it took us over 1 year to notice this bug it's pretty clear that
> normal testing doesn't catch this. So very likely we'll break this
> again.

Unfortunately there are a lot of kernel bugs which are noticed during actual
use (but not CI runs), some of which have never been fixed. I do know
the shrinker impl is buggy for us, if this is the fix I'm very happy.

> btw for testing shrinkers recommended way is to have a debugfs file
> that just force-shrinks everything. That way you avoid all the trouble
> that tend to happen when you drive a system close to OOM on linux, and
> it's also much faster.

2nding this as a good idea.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-18 Thread Steven Price

On 18/02/2021 15:45, Alyssa Rosenzweig wrote:

Yeah plus Cc: stable for backporting and I think an igt or similar for
panfrost to check this works correctly would be pretty good too. Since
if it took us over 1 year to notice this bug it's pretty clear that
normal testing doesn't catch this. So very likely we'll break this
again.


Unfortunately there are a lot of kernel bugs which are noticed during actual
use (but not CI runs), some of which have never been fixed. I do know
the shrinker impl is buggy for us, if this is the fix I'm very happy.


I doubt this will actually "fix" anything - if I understand correctly 
then the sequence which is broken is:


 * allocate BO, mmap to CPU
 * madvise(DONTNEED)
 * trigger purge
 * try to access the BO memory

which is an invalid sequence for user space - the attempt to access 
memory should cause a SIGSEGV. However because drm_vma_node_unmap() is 
unable to find the mappings there may still be page table entries 
present which would provide access to memory the kernel has freed. Which 
is of course a big security hole and so this fix is needed.


In what way do you find the shrinker impl buggy? I'm aware there's some 
dodgy locking (although I haven't worked out how to fix it) - but AFAICT 
it's more deadlock territory rather than lacking in locks. Are there 
correctness issues?



btw for testing shrinkers recommended way is to have a debugfs file
that just force-shrinks everything. That way you avoid all the trouble
that tend to happen when you drive a system close to OOM on linux, and
it's also much faster.


2nding this as a good idea.



Sounds like a good idea to me too. But equally I'm wondering whether the 
best (short term) solution is to actually disable the shrinker. I'm 
somewhat surprised that nobody has got fed up with the "Purging xxx 
bytes" message spam - which makes me think that most people are not 
hitting memory pressure to trigger the shrinker.


The shrinker on kbase caused a lot of grief - and the only way I managed 
to get that under control was by writing a static analysis tool for the 
locking, and by upsetting people by enforcing the (rather dumb) rules of 
the tool on the code base. I've been meaning to look at whether sparse 
can do a similar check of locks.


Sadly at the moment I'm struggling to find time to look at such things.

Steve
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-18 Thread Daniel Vetter
On Thu, Feb 18, 2021 at 4:26 PM Steven Price  wrote:
>
> On 17/02/2021 16:59, Neil Roberts wrote:
> > When mmapping the shmem, it would previously adjust the pgoff in the
> > vm_area_struct to remove the fake offset that is added to be able to
> > identify the buffer. This patch removes the adjustment and makes the
> > fault handler use the vm_fault address to calculate the page offset
> > instead. Although using this address is apparently discouraged, several
> > DRM drivers seem to be doing it anyway.
> >
> > The problem with removing the pgoff is that it prevents
> > drm_vma_node_unmap from working because that searches the mapping tree
> > by address. That doesn't work because all of the mappings are at offset
> > 0. drm_vma_node_unmap is being used by the shmem helpers when purging
> > the buffer.
> >
> > It looks like panfrost is using drm_gem_shmem_purge so this might fix a
> > potential bug there.
> >
> > Signed-off-by: Neil Roberts 
>
> As the test robot points out pgoff_t is unsigned, so the <0 test makes
> no sense. But apart from that it looks good to me.
>
> I think this is worth a "Fixes:" line too - as you point out
> drm_vma_node_unmap() won't be working correctly - which means we're
> potentially leaving user space with pages pointing to freed memory - not
> good! 17acb9f35ed7 is my best guess at the commit that introduced this.

Yeah plus Cc: stable for backporting and I think an igt or similar for
panfrost to check this works correctly would be pretty good too. Since
if it took us over 1 year to notice this bug it's pretty clear that
normal testing doesn't catch this. So very likely we'll break this
again.

btw for testing shrinkers recommended way is to have a debugfs file
that just force-shrinks everything. That way you avoid all the trouble
that tend to happen when you drive a system close to OOM on linux, and
it's also much faster.
-Daniel

> Steve
>
> > ---
> >   drivers/gpu/drm/drm_gem_shmem_helper.c | 12 +++-
> >   1 file changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> > b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > index 9825c378dfa6..4b14157f1962 100644
> > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > @@ -526,11 +526,16 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault 
> > *vmf)
> >   struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
> >   loff_t num_pages = obj->size >> PAGE_SHIFT;
> >   struct page *page;
> > + pgoff_t page_offset;
> >
> > - if (vmf->pgoff >= num_pages || WARN_ON_ONCE(!shmem->pages))
> > + /* We don't use vmf->pgoff since that has the fake offset */
> > + page_offset = (vmf->address - vma->vm_start) >> PAGE_SHIFT;
> > +
> > + if (page_offset < 0 || page_offset >= num_pages ||
> > + WARN_ON_ONCE(!shmem->pages))
> >   return VM_FAULT_SIGBUS;
> >
> > - page = shmem->pages[vmf->pgoff];
> > + page = shmem->pages[page_offset];
> >
> >   return vmf_insert_page(vma, vmf->address, page);
> >   }
> > @@ -581,9 +586,6 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, 
> > struct vm_area_struct *vma)
> >   struct drm_gem_shmem_object *shmem;
> >   int ret;
> >
> > - /* Remove the fake offset */
> > - vma->vm_pgoff -= drm_vma_node_start(>vma_node);
> > -
> >   if (obj->import_attach) {
> >   /* Drop the reference drm_gem_mmap_obj() acquired.*/
> >   drm_gem_object_put(obj);
> >
>
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-18 Thread Steven Price

On 17/02/2021 16:59, Neil Roberts wrote:

When mmapping the shmem, it would previously adjust the pgoff in the
vm_area_struct to remove the fake offset that is added to be able to
identify the buffer. This patch removes the adjustment and makes the
fault handler use the vm_fault address to calculate the page offset
instead. Although using this address is apparently discouraged, several
DRM drivers seem to be doing it anyway.

The problem with removing the pgoff is that it prevents
drm_vma_node_unmap from working because that searches the mapping tree
by address. That doesn't work because all of the mappings are at offset
0. drm_vma_node_unmap is being used by the shmem helpers when purging
the buffer.

It looks like panfrost is using drm_gem_shmem_purge so this might fix a
potential bug there.

Signed-off-by: Neil Roberts 


As the test robot points out pgoff_t is unsigned, so the <0 test makes 
no sense. But apart from that it looks good to me.


I think this is worth a "Fixes:" line too - as you point out 
drm_vma_node_unmap() won't be working correctly - which means we're 
potentially leaving user space with pages pointing to freed memory - not 
good! 17acb9f35ed7 is my best guess at the commit that introduced this.


Steve


---
  drivers/gpu/drm/drm_gem_shmem_helper.c | 12 +++-
  1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 9825c378dfa6..4b14157f1962 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -526,11 +526,16 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault 
*vmf)
struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
loff_t num_pages = obj->size >> PAGE_SHIFT;
struct page *page;
+   pgoff_t page_offset;
  
-	if (vmf->pgoff >= num_pages || WARN_ON_ONCE(!shmem->pages))

+   /* We don't use vmf->pgoff since that has the fake offset */
+   page_offset = (vmf->address - vma->vm_start) >> PAGE_SHIFT;
+
+   if (page_offset < 0 || page_offset >= num_pages ||
+   WARN_ON_ONCE(!shmem->pages))
return VM_FAULT_SIGBUS;
  
-	page = shmem->pages[vmf->pgoff];

+   page = shmem->pages[page_offset];
  
  	return vmf_insert_page(vma, vmf->address, page);

  }
@@ -581,9 +586,6 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, struct 
vm_area_struct *vma)
struct drm_gem_shmem_object *shmem;
int ret;
  
-	/* Remove the fake offset */

-   vma->vm_pgoff -= drm_vma_node_start(>vma_node);
-
if (obj->import_attach) {
/* Drop the reference drm_gem_mmap_obj() acquired.*/
drm_gem_object_put(obj);



___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

2021-02-17 Thread kernel test robot
Hi Neil,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.11 next-20210217]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Neil-Roberts/drm-shmem-helper-Don-t-remove-the-offset-in-vm_area_struct-pgoff/20210218-012028
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
f40ddce88593482919761f74910f42f4b84c004b
config: i386-randconfig-m021-20210215 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

New smatch warnings:
drivers/gpu/drm/drm_gem_shmem_helper.c:534 drm_gem_shmem_fault() warn: unsigned 
'page_offset' is never less than zero.

Old smatch warnings:
include/drm/drm_vma_manager.h:225 drm_vma_node_unmap() warn: should 
'drm_vma_node_size(node) << 12' be a 64 bit type?

vim +/page_offset +534 drivers/gpu/drm/drm_gem_shmem_helper.c

   521  
   522  static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
   523  {
   524  struct vm_area_struct *vma = vmf->vma;
   525  struct drm_gem_object *obj = vma->vm_private_data;
   526  struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
   527  loff_t num_pages = obj->size >> PAGE_SHIFT;
   528  struct page *page;
   529  pgoff_t page_offset;
   530  
   531  /* We don't use vmf->pgoff since that has the fake offset */
   532  page_offset = (vmf->address - vma->vm_start) >> PAGE_SHIFT;
   533  
 > 534  if (page_offset < 0 || page_offset >= num_pages ||
   535  WARN_ON_ONCE(!shmem->pages))
   536  return VM_FAULT_SIGBUS;
   537  
   538  page = shmem->pages[page_offset];
   539  
   540  return vmf_insert_page(vma, vmf->address, page);
   541  }
   542  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel