Re: pages pinned for BO lifetime and security

2020-07-23 Thread Intel


On 2020-07-22 09:46, Daniel Vetter wrote:

On Wed, Jul 22, 2020 at 9:19 AM Christian König
 wrote:

Am 22.07.20 um 02:22 schrieb Gurchetan Singh:

Of the desktop GPU drivers, i915's shrinker certainly supports purging
to swap.  TTM is a bit hard to follow.  I can't really tell if amdgpu
or nouveau supports that.  virtio-gpu is more commonly found on
systems with swaps so I think it should follow the desktop practices?


What we do at least in the amdgpu, radeon, i915 and nouveau is to only allow it 
for scanout and that in turn is limited by the physical number of CRTCs on the 
board.

Somewhat aside, but I'm not sure the ttm shrinker really works like
that. I think there's two parts:
1. kernel thread which takes buffers and unbinds them when we're over
the ttm global limit. This is the ttm_shrink_work stuff, and it only
shrinks if the zone is over a hard limit. Below that it just leaves
buffers pinned.

2. Actual core mm shrinker, which releases buffers held in cache by
ttm_page_alloc_dma.c. But that only happens when buffers have been
unbound by the first thread, so anything below those hard limits is
not shrinkable. And iirc those hard limits are like half of system
memory or so (last time I looked through this stuff at least).

No idea why exactly things are like they are, since the first thread
already does a dma_resv_trylock, and that's enough to avoid locking
inversions when being called from 2. Or well, should be at least, for
reasonable driver design.

The only other thing I'm seeing is the global lru, but that could be
fixed by having a per-device core mm shrinker instance which directly
shrinks the per-device lru. And then we just globally balance like
with all shrinkers through the core mm "shrink everyone equally"
approach. You can even keep the separate page alloc shrinker, since
core mm always loops over all shrinkers - we're not the only ones
where shrinking one cache makes more memory available for another
cache to shrink, e.g. you can't throw out an inode without first
throwing out all the dentry pointing at them.

Another problem would be allocating memory while holding per-device
lru locks (since trylock on such a global lock in shrinkers is a
really bad idea, we know that from all the dev->struct_mutex lolz in
i915). But for ttm that's not a problem since all lru are spinlock, so
only GFP_ATOMIC allowed anyway, hence no problem.

Adding Thomas for this ttm tangent.
-Daniel


Hmm, so yes the TTM avoidance of 'DOS by pinning' mechanism is not 
really optimal but nobody has really cared to improve it yet.


TTM allows a certain amount of memory to be pinned across all TTM 
drivers in a system. That should include kmalloced memory for graphics 
objects, intended at that time to stop attacks like guessing a gem name 
and calling gem open repeatedly to pin all available kmalloc memory. 
Buffer object pages are also typically considered "pinned, but 
unpinnable" by this accounting. When the hard limit is reached, there is 
a direct reclaim by unbinding unpinnable pages from graphics, and 
copying the contents to shmem, on which the shrinkers then may operate. 
There is also a lower soft limit above which a separate kernel thread 
copies memory to shmem. These limits are run-time configurable.


There are also a couple of page pools that were added to TTM to cache 
unused device-coherent pages and uncached / write-combined pages. Since 
allocating such memory may be painfully slow. These page pools have 
their own shrinkers.


/Thomas


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: pages pinned for BO lifetime and security

2020-07-22 Thread Daniel Vetter
On Wed, Jul 22, 2020 at 9:27 PM Chia-I Wu  wrote:
>
> On Wed, Jul 22, 2020 at 4:28 AM Daniel Vetter  wrote:
> >
> > On Wed, Jul 22, 2020 at 1:12 PM Christian König
> >  wrote:
> > >
> > > Am 22.07.20 um 09:32 schrieb Daniel Vetter:
> > > > On Wed, Jul 22, 2020 at 9:19 AM Christian König
> > > >  wrote:
> > > >> Am 22.07.20 um 02:22 schrieb Gurchetan Singh:
> > > >>
> > > >> +Christian who added DMABUF_MOVE_NOTIFY which added the relevant blurb:
> > > >>
> > > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fdma-buf%2FKconfig%23n46data=02%7C01%7Cchristian.koenig%40amd.com%7C916f6a9b64884e21fd7f08d82e115c8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C63731122752271sdata=WKs3KSr3K1DdVmDaIZ%2FnV8VEBPWMGMSGweeay0HIOxw%3Dreserved=0
> > > >>
> > > >> Currently, the user seems to amdgpu for P2P dma-buf and it seems to 
> > > >> plumb ttm (*move_notify) callback to dma-buf.  We're not sure if it's 
> > > >> a security issue occurring across DRM drivers, or one more specific to 
> > > >> the new amdgpu use case.
> > > >>
> > > >> On Tue, Jul 21, 2020 at 1:03 PM Chia-I Wu  wrote:
> > > >>> Hi list,
> > > >>>
> > > >>> virtio-gpu is moving in the direction where BO pages are pinned for
> > > >>> the lifetime for simplicity.  I am wondering if that is considered a
> > > >>> security issue in general, especially aCan you elaborate a little bit 
> > > >>> what these other problems might be?  Memory fragmentation?fter 
> > > >>> running into the
> > > >>> description of the new DMABUF_MOVE_NOTIFY config option.
> > > >>
> > > >> Yes, that is generally considered a deny of service possibility and so 
> > > >> far Dave and Daniel have rejected all tries to upstream stuff like 
> > > >> this as far as I know.
> > > > Uh we have merged pretty much all arm-soc drivers without real
> > > > shrinkers. Whether that was a good idea or not is maybe different
> > > > question - now that we do have pretty good helpers maybe we should
> > > > poke this a bit more. But then SoCs Suck (tm).
> > >
> > > I was under the impression that those SoC drivers still use the GEM
> > > helpers which unpinns stuff when it is not in use. But I might be wrong.
> >
> > It's kinda mostly there, even some helpers for shrinking but a)
> > helpers on, not all drivers use it b) for purgeable objects only, not
> > generally for inactive stuff - there's no active use tracking c) cma
> > helpers (ok that one is only for vc4 as the render driver) don't even
> > have that. I had some slow burner series to get us towards dma_resv
> > locking in shmem helpers and then maybe even a common shrinker helper
> > with some "actually kick it out now" callback, but yeah never got
> > there.
> My quick survey of the SoC drivers also told me that they tend to
> demonstrate a) or b).
>
> About b), I was thinking maybe that's because the systems the drivers
> run on are historically swap-less.  There is no place to write the
> dirty pages back to and thus less incentive to support shrinking
> inactive objects.
>
> You both mentioned that the lack of swap is irrelevant (or at least
> not the only factor).  Can you elaborate a little bit on that?
> Shrinking inactive objects returns the pages to swap cache... hmm, I
> guess that helps memory defragmentation?

Yup, kernel can then move it around as it sees fit.

Also, zswap is a thing nowadays, and I think used by default on android now.
-Daniel

> >
> > So maybe per-device object shrinker helper would be something neat we
> > could lift out of ttm (when it's happening), maybe with a simple
> > callback somewhere in it's lru tracking. Probably best if the shrinker
> > lru is outright separate from anything else or it just gets messy.
> > -Daniel
> >
> > > > But for real gpus they do indeed all have shrinkers, and not just "pin
> > > > everything forever" model. Real gpus = stuff you might run on servers
> > > > or multi-app and all that stuff, not with a simple "we just kill all
> > > > background jobs if memory gets low" model like on android and other
> > > > such things.
> > > >
> > > >> DMA-buf an pinning for scanout are the only exceptions since the 
> > > >> implementation wouldn't have been possible otherwise.
> > > >>
> > > >>> Most drivers do not have a shrinker, or whether a BO is purgeable is
> > > >>> entirely controlled by the userspace (madvice).  They can be
> > > >>> categorized as "a security problem where userspace is able to pin
> > > >>> unrestricted amounts of memory".  But those drivers are normally found
> > > >>> on systems without swap.  I don't think the issue applies.
> > > >>
> > > >> This is completely independent of the availability of swap or not.
> > > >>
> > > >> Pinning of pages in large quantities can result in all kind of 
> > > >> problems and needs to be prevented even without swap.
> > > > Yeah you don't just kill swap, you kill a ton of other kernel services
> > > > with 

Re: pages pinned for BO lifetime and security

2020-07-22 Thread Chia-I Wu
On Wed, Jul 22, 2020 at 4:28 AM Daniel Vetter  wrote:
>
> On Wed, Jul 22, 2020 at 1:12 PM Christian König
>  wrote:
> >
> > Am 22.07.20 um 09:32 schrieb Daniel Vetter:
> > > On Wed, Jul 22, 2020 at 9:19 AM Christian König
> > >  wrote:
> > >> Am 22.07.20 um 02:22 schrieb Gurchetan Singh:
> > >>
> > >> +Christian who added DMABUF_MOVE_NOTIFY which added the relevant blurb:
> > >>
> > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fdma-buf%2FKconfig%23n46data=02%7C01%7Cchristian.koenig%40amd.com%7C916f6a9b64884e21fd7f08d82e115c8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C63731122752271sdata=WKs3KSr3K1DdVmDaIZ%2FnV8VEBPWMGMSGweeay0HIOxw%3Dreserved=0
> > >>
> > >> Currently, the user seems to amdgpu for P2P dma-buf and it seems to 
> > >> plumb ttm (*move_notify) callback to dma-buf.  We're not sure if it's a 
> > >> security issue occurring across DRM drivers, or one more specific to the 
> > >> new amdgpu use case.
> > >>
> > >> On Tue, Jul 21, 2020 at 1:03 PM Chia-I Wu  wrote:
> > >>> Hi list,
> > >>>
> > >>> virtio-gpu is moving in the direction where BO pages are pinned for
> > >>> the lifetime for simplicity.  I am wondering if that is considered a
> > >>> security issue in general, especially aCan you elaborate a little bit 
> > >>> what these other problems might be?  Memory fragmentation?fter running 
> > >>> into the
> > >>> description of the new DMABUF_MOVE_NOTIFY config option.
> > >>
> > >> Yes, that is generally considered a deny of service possibility and so 
> > >> far Dave and Daniel have rejected all tries to upstream stuff like this 
> > >> as far as I know.
> > > Uh we have merged pretty much all arm-soc drivers without real
> > > shrinkers. Whether that was a good idea or not is maybe different
> > > question - now that we do have pretty good helpers maybe we should
> > > poke this a bit more. But then SoCs Suck (tm).
> >
> > I was under the impression that those SoC drivers still use the GEM
> > helpers which unpinns stuff when it is not in use. But I might be wrong.
>
> It's kinda mostly there, even some helpers for shrinking but a)
> helpers on, not all drivers use it b) for purgeable objects only, not
> generally for inactive stuff - there's no active use tracking c) cma
> helpers (ok that one is only for vc4 as the render driver) don't even
> have that. I had some slow burner series to get us towards dma_resv
> locking in shmem helpers and then maybe even a common shrinker helper
> with some "actually kick it out now" callback, but yeah never got
> there.
My quick survey of the SoC drivers also told me that they tend to
demonstrate a) or b).

About b), I was thinking maybe that's because the systems the drivers
run on are historically swap-less.  There is no place to write the
dirty pages back to and thus less incentive to support shrinking
inactive objects.

You both mentioned that the lack of swap is irrelevant (or at least
not the only factor).  Can you elaborate a little bit on that?
Shrinking inactive objects returns the pages to swap cache... hmm, I
guess that helps memory defragmentation?

>
> So maybe per-device object shrinker helper would be something neat we
> could lift out of ttm (when it's happening), maybe with a simple
> callback somewhere in it's lru tracking. Probably best if the shrinker
> lru is outright separate from anything else or it just gets messy.
> -Daniel
>
> > > But for real gpus they do indeed all have shrinkers, and not just "pin
> > > everything forever" model. Real gpus = stuff you might run on servers
> > > or multi-app and all that stuff, not with a simple "we just kill all
> > > background jobs if memory gets low" model like on android and other
> > > such things.
> > >
> > >> DMA-buf an pinning for scanout are the only exceptions since the 
> > >> implementation wouldn't have been possible otherwise.
> > >>
> > >>> Most drivers do not have a shrinker, or whether a BO is purgeable is
> > >>> entirely controlled by the userspace (madvice).  They can be
> > >>> categorized as "a security problem where userspace is able to pin
> > >>> unrestricted amounts of memory".  But those drivers are normally found
> > >>> on systems without swap.  I don't think the issue applies.
> > >>
> > >> This is completely independent of the availability of swap or not.
> > >>
> > >> Pinning of pages in large quantities can result in all kind of problems 
> > >> and needs to be prevented even without swap.
> > > Yeah you don't just kill swap, you kill a ton of other kernel services
> > > with mass pinning. I think even the pinning of scanout buffers for
> > > i915 from system memory is somewhat questionable (but I guess small
> > > enough to not matter in practice).
> >
> > Yeah, we had a really hard time explaining that internally as well.
> >
> > Christian.
> >
> > >> Otherwise you can ran into problems even with simple I/O operations for 
> > >> 

Re: pages pinned for BO lifetime and security

2020-07-22 Thread Daniel Vetter
On Wed, Jul 22, 2020 at 1:12 PM Christian König
 wrote:
>
> Am 22.07.20 um 09:32 schrieb Daniel Vetter:
> > On Wed, Jul 22, 2020 at 9:19 AM Christian König
> >  wrote:
> >> Am 22.07.20 um 02:22 schrieb Gurchetan Singh:
> >>
> >> +Christian who added DMABUF_MOVE_NOTIFY which added the relevant blurb:
> >>
> >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fdma-buf%2FKconfig%23n46data=02%7C01%7Cchristian.koenig%40amd.com%7C916f6a9b64884e21fd7f08d82e115c8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C63731122752271sdata=WKs3KSr3K1DdVmDaIZ%2FnV8VEBPWMGMSGweeay0HIOxw%3Dreserved=0
> >>
> >> Currently, the user seems to amdgpu for P2P dma-buf and it seems to plumb 
> >> ttm (*move_notify) callback to dma-buf.  We're not sure if it's a security 
> >> issue occurring across DRM drivers, or one more specific to the new amdgpu 
> >> use case.
> >>
> >> On Tue, Jul 21, 2020 at 1:03 PM Chia-I Wu  wrote:
> >>> Hi list,
> >>>
> >>> virtio-gpu is moving in the direction where BO pages are pinned for
> >>> the lifetime for simplicity.  I am wondering if that is considered a
> >>> security issue in general, especially after running into the
> >>> description of the new DMABUF_MOVE_NOTIFY config option.
> >>
> >> Yes, that is generally considered a deny of service possibility and so far 
> >> Dave and Daniel have rejected all tries to upstream stuff like this as far 
> >> as I know.
> > Uh we have merged pretty much all arm-soc drivers without real
> > shrinkers. Whether that was a good idea or not is maybe different
> > question - now that we do have pretty good helpers maybe we should
> > poke this a bit more. But then SoCs Suck (tm).
>
> I was under the impression that those SoC drivers still use the GEM
> helpers which unpinns stuff when it is not in use. But I might be wrong.

It's kinda mostly there, even some helpers for shrinking but a)
helpers on, not all drivers use it b) for purgeable objects only, not
generally for inactive stuff - there's no active use tracking c) cma
helpers (ok that one is only for vc4 as the render driver) don't even
have that. I had some slow burner series to get us towards dma_resv
locking in shmem helpers and then maybe even a common shrinker helper
with some "actually kick it out now" callback, but yeah never got
there.

So maybe per-device object shrinker helper would be something neat we
could lift out of ttm (when it's happening), maybe with a simple
callback somewhere in it's lru tracking. Probably best if the shrinker
lru is outright separate from anything else or it just gets messy.
-Daniel

> > But for real gpus they do indeed all have shrinkers, and not just "pin
> > everything forever" model. Real gpus = stuff you might run on servers
> > or multi-app and all that stuff, not with a simple "we just kill all
> > background jobs if memory gets low" model like on android and other
> > such things.
> >
> >> DMA-buf an pinning for scanout are the only exceptions since the 
> >> implementation wouldn't have been possible otherwise.
> >>
> >>> Most drivers do not have a shrinker, or whether a BO is purgeable is
> >>> entirely controlled by the userspace (madvice).  They can be
> >>> categorized as "a security problem where userspace is able to pin
> >>> unrestricted amounts of memory".  But those drivers are normally found
> >>> on systems without swap.  I don't think the issue applies.
> >>
> >> This is completely independent of the availability of swap or not.
> >>
> >> Pinning of pages in large quantities can result in all kind of problems 
> >> and needs to be prevented even without swap.
> > Yeah you don't just kill swap, you kill a ton of other kernel services
> > with mass pinning. I think even the pinning of scanout buffers for
> > i915 from system memory is somewhat questionable (but I guess small
> > enough to not matter in practice).
>
> Yeah, we had a really hard time explaining that internally as well.
>
> Christian.
>
> >> Otherwise you can ran into problems even with simple I/O operations for 
> >> example.
> >>
> >>> Of the desktop GPU drivers, i915's shrinker certainly supports purging
> >>> to swap.  TTM is a bit hard to follow.  I can't really tell if amdgpu
> >>> or nouveau supports that.  virtio-gpu is more commonly found on
> >>> systems with swaps so I think it should follow the desktop practices?
> >>
> >> What we do at least in the amdgpu, radeon, i915 and nouveau is to only 
> >> allow it for scanout and that in turn is limited by the physical number of 
> >> CRTCs on the board.
> >>
> >>> Truth is, the emulated virtio-gpu device always supports page moves
> >>> with VIRTIO_GPU_CMD_RESOURCE_{ATTACH,DETACH}_BACKING.  It is just that
> >>> the driver does not make use of them.  That makes this less of an
> >>> issue because the driver can be fixed anytime (finger crossed that the
> >>> emulator won't have bugs in these untested paths).  This issue 

Re: pages pinned for BO lifetime and security

2020-07-22 Thread Christian König

Am 22.07.20 um 09:32 schrieb Daniel Vetter:

On Wed, Jul 22, 2020 at 9:19 AM Christian König
 wrote:

Am 22.07.20 um 02:22 schrieb Gurchetan Singh:

+Christian who added DMABUF_MOVE_NOTIFY which added the relevant blurb:

https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fdma-buf%2FKconfig%23n46data=02%7C01%7Cchristian.koenig%40amd.com%7C916f6a9b64884e21fd7f08d82e115c8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C63731122752271sdata=WKs3KSr3K1DdVmDaIZ%2FnV8VEBPWMGMSGweeay0HIOxw%3Dreserved=0

Currently, the user seems to amdgpu for P2P dma-buf and it seems to plumb ttm 
(*move_notify) callback to dma-buf.  We're not sure if it's a security issue 
occurring across DRM drivers, or one more specific to the new amdgpu use case.

On Tue, Jul 21, 2020 at 1:03 PM Chia-I Wu  wrote:

Hi list,

virtio-gpu is moving in the direction where BO pages are pinned for
the lifetime for simplicity.  I am wondering if that is considered a
security issue in general, especially after running into the
description of the new DMABUF_MOVE_NOTIFY config option.


Yes, that is generally considered a deny of service possibility and so far Dave 
and Daniel have rejected all tries to upstream stuff like this as far as I know.

Uh we have merged pretty much all arm-soc drivers without real
shrinkers. Whether that was a good idea or not is maybe different
question - now that we do have pretty good helpers maybe we should
poke this a bit more. But then SoCs Suck (tm).


I was under the impression that those SoC drivers still use the GEM 
helpers which unpinns stuff when it is not in use. But I might be wrong.




But for real gpus they do indeed all have shrinkers, and not just "pin
everything forever" model. Real gpus = stuff you might run on servers
or multi-app and all that stuff, not with a simple "we just kill all
background jobs if memory gets low" model like on android and other
such things.


DMA-buf an pinning for scanout are the only exceptions since the implementation 
wouldn't have been possible otherwise.


Most drivers do not have a shrinker, or whether a BO is purgeable is
entirely controlled by the userspace (madvice).  They can be
categorized as "a security problem where userspace is able to pin
unrestricted amounts of memory".  But those drivers are normally found
on systems without swap.  I don't think the issue applies.


This is completely independent of the availability of swap or not.

Pinning of pages in large quantities can result in all kind of problems and 
needs to be prevented even without swap.

Yeah you don't just kill swap, you kill a ton of other kernel services
with mass pinning. I think even the pinning of scanout buffers for
i915 from system memory is somewhat questionable (but I guess small
enough to not matter in practice).


Yeah, we had a really hard time explaining that internally as well.

Christian.


Otherwise you can ran into problems even with simple I/O operations for example.


Of the desktop GPU drivers, i915's shrinker certainly supports purging
to swap.  TTM is a bit hard to follow.  I can't really tell if amdgpu
or nouveau supports that.  virtio-gpu is more commonly found on
systems with swaps so I think it should follow the desktop practices?


What we do at least in the amdgpu, radeon, i915 and nouveau is to only allow it 
for scanout and that in turn is limited by the physical number of CRTCs on the 
board.


Truth is, the emulated virtio-gpu device always supports page moves
with VIRTIO_GPU_CMD_RESOURCE_{ATTACH,DETACH}_BACKING.  It is just that
the driver does not make use of them.  That makes this less of an
issue because the driver can be fixed anytime (finger crossed that the
emulator won't have bugs in these untested paths).  This issue becomes
more urgent because we are considering adding a new HW command[1]
where page moves will be disallowed.  We definitely don't want a HW
command that is inherently insecure, if BO pages pinned for the
lifetime is considered a security issue on desktops.


Yeah, that's probably not such a good idea :)

Well if the pinning is just for the duration of the hw command, it's
fine, just like batch buffers. But if it's long term pinning then that
doesn't sound like a good idea. RDMA has this as their inherit hw
programming model (except if your hw is really fancy and has hw page
fault handling on the rdma nic), and they hard limit such pins to what
you can mlock (or something similar within rdma).
-Daniel


Regards,
Christian.


[1] VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB

Re: pages pinned for BO lifetime and security

2020-07-22 Thread Christian König

Am 22.07.20 um 09:46 schrieb Daniel Vetter:

On Wed, Jul 22, 2020 at 9:19 AM Christian König
 wrote:

Am 22.07.20 um 02:22 schrieb Gurchetan Singh:

Of the desktop GPU drivers, i915's shrinker certainly supports purging
to swap.  TTM is a bit hard to follow.  I can't really tell if amdgpu
or nouveau supports that.  virtio-gpu is more commonly found on
systems with swaps so I think it should follow the desktop practices?


What we do at least in the amdgpu, radeon, i915 and nouveau is to only allow it 
for scanout and that in turn is limited by the physical number of CRTCs on the 
board.

Somewhat aside, but I'm not sure the ttm shrinker really works like
that. I think there's two parts:
1. kernel thread which takes buffers and unbinds them when we're over
the ttm global limit. This is the ttm_shrink_work stuff, and it only
shrinks if the zone is over a hard limit. Below that it just leaves
buffers pinned.

2. Actual core mm shrinker, which releases buffers held in cache by
ttm_page_alloc_dma.c. But that only happens when buffers have been
unbound by the first thread, so anything below those hard limits is
not shrinkable. And iirc those hard limits are like half of system
memory or so (last time I looked through this stuff at least).

No idea why exactly things are like they are, since the first thread
already does a dma_resv_trylock, and that's enough to avoid locking
inversions when being called from 2. Or well, should be at least, for
reasonable driver design.


Yes, that's currently a bit messy in TTM and not such a good design over 
all.



The only other thing I'm seeing is the global lru, but that could be
fixed by having a per-device core mm shrinker instance which directly
shrinks the per-device lru. And then we just globally balance like
with all shrinkers through the core mm "shrink everyone equally"
approach. You can even keep the separate page alloc shrinker, since
core mm always loops over all shrinkers - we're not the only ones
where shrinking one cache makes more memory available for another
cache to shrink, e.g. you can't throw out an inode without first
throwing out all the dentry pointing at them.


My plan is to replace all this with an explicit SWAP domain for buffer 
objects.


One idea was to make the SYSTEM and SWAP domain global and express all 
this with transits between the different domains. But having one 
shrinker per device sounds like an even better idea now.



Another problem would be allocating memory while holding per-device
lru locks (since trylock on such a global lock in shrinkers is a
really bad idea, we know that from all the dev->struct_mutex lolz in
i915). But for ttm that's not a problem since all lru are spinlock, so
only GFP_ATOMIC allowed anyway, hence no problem.


Yes, exactly.

Christian.



Adding Thomas for this ttm tangent.
-Daniel


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: pages pinned for BO lifetime and security

2020-07-22 Thread Daniel Vetter
On Wed, Jul 22, 2020 at 9:19 AM Christian König
 wrote:
> Am 22.07.20 um 02:22 schrieb Gurchetan Singh:
>> Of the desktop GPU drivers, i915's shrinker certainly supports purging
>> to swap.  TTM is a bit hard to follow.  I can't really tell if amdgpu
>> or nouveau supports that.  virtio-gpu is more commonly found on
>> systems with swaps so I think it should follow the desktop practices?
>
>
> What we do at least in the amdgpu, radeon, i915 and nouveau is to only allow 
> it for scanout and that in turn is limited by the physical number of CRTCs on 
> the board.

Somewhat aside, but I'm not sure the ttm shrinker really works like
that. I think there's two parts:
1. kernel thread which takes buffers and unbinds them when we're over
the ttm global limit. This is the ttm_shrink_work stuff, and it only
shrinks if the zone is over a hard limit. Below that it just leaves
buffers pinned.

2. Actual core mm shrinker, which releases buffers held in cache by
ttm_page_alloc_dma.c. But that only happens when buffers have been
unbound by the first thread, so anything below those hard limits is
not shrinkable. And iirc those hard limits are like half of system
memory or so (last time I looked through this stuff at least).

No idea why exactly things are like they are, since the first thread
already does a dma_resv_trylock, and that's enough to avoid locking
inversions when being called from 2. Or well, should be at least, for
reasonable driver design.

The only other thing I'm seeing is the global lru, but that could be
fixed by having a per-device core mm shrinker instance which directly
shrinks the per-device lru. And then we just globally balance like
with all shrinkers through the core mm "shrink everyone equally"
approach. You can even keep the separate page alloc shrinker, since
core mm always loops over all shrinkers - we're not the only ones
where shrinking one cache makes more memory available for another
cache to shrink, e.g. you can't throw out an inode without first
throwing out all the dentry pointing at them.

Another problem would be allocating memory while holding per-device
lru locks (since trylock on such a global lock in shrinkers is a
really bad idea, we know that from all the dev->struct_mutex lolz in
i915). But for ttm that's not a problem since all lru are spinlock, so
only GFP_ATOMIC allowed anyway, hence no problem.

Adding Thomas for this ttm tangent.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: pages pinned for BO lifetime and security

2020-07-22 Thread Daniel Vetter
On Wed, Jul 22, 2020 at 9:19 AM Christian König
 wrote:
>
> Am 22.07.20 um 02:22 schrieb Gurchetan Singh:
>
> +Christian who added DMABUF_MOVE_NOTIFY which added the relevant blurb:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/dma-buf/Kconfig#n46
>
> Currently, the user seems to amdgpu for P2P dma-buf and it seems to plumb ttm 
> (*move_notify) callback to dma-buf.  We're not sure if it's a security issue 
> occurring across DRM drivers, or one more specific to the new amdgpu use case.
>
> On Tue, Jul 21, 2020 at 1:03 PM Chia-I Wu  wrote:
>>
>> Hi list,
>>
>> virtio-gpu is moving in the direction where BO pages are pinned for
>> the lifetime for simplicity.  I am wondering if that is considered a
>> security issue in general, especially after running into the
>> description of the new DMABUF_MOVE_NOTIFY config option.
>
>
> Yes, that is generally considered a deny of service possibility and so far 
> Dave and Daniel have rejected all tries to upstream stuff like this as far as 
> I know.

Uh we have merged pretty much all arm-soc drivers without real
shrinkers. Whether that was a good idea or not is maybe different
question - now that we do have pretty good helpers maybe we should
poke this a bit more. But then SoCs Suck (tm).

But for real gpus they do indeed all have shrinkers, and not just "pin
everything forever" model. Real gpus = stuff you might run on servers
or multi-app and all that stuff, not with a simple "we just kill all
background jobs if memory gets low" model like on android and other
such things.

> DMA-buf an pinning for scanout are the only exceptions since the 
> implementation wouldn't have been possible otherwise.
>
>>
>> Most drivers do not have a shrinker, or whether a BO is purgeable is
>> entirely controlled by the userspace (madvice).  They can be
>> categorized as "a security problem where userspace is able to pin
>> unrestricted amounts of memory".  But those drivers are normally found
>> on systems without swap.  I don't think the issue applies.
>
>
> This is completely independent of the availability of swap or not.
>
> Pinning of pages in large quantities can result in all kind of problems and 
> needs to be prevented even without swap.

Yeah you don't just kill swap, you kill a ton of other kernel services
with mass pinning. I think even the pinning of scanout buffers for
i915 from system memory is somewhat questionable (but I guess small
enough to not matter in practice).

> Otherwise you can ran into problems even with simple I/O operations for 
> example.
>
>>
>> Of the desktop GPU drivers, i915's shrinker certainly supports purging
>> to swap.  TTM is a bit hard to follow.  I can't really tell if amdgpu
>> or nouveau supports that.  virtio-gpu is more commonly found on
>> systems with swaps so I think it should follow the desktop practices?
>
>
> What we do at least in the amdgpu, radeon, i915 and nouveau is to only allow 
> it for scanout and that in turn is limited by the physical number of CRTCs on 
> the board.
>
>>
>> Truth is, the emulated virtio-gpu device always supports page moves
>> with VIRTIO_GPU_CMD_RESOURCE_{ATTACH,DETACH}_BACKING.  It is just that
>> the driver does not make use of them.  That makes this less of an
>> issue because the driver can be fixed anytime (finger crossed that the
>> emulator won't have bugs in these untested paths).  This issue becomes
>> more urgent because we are considering adding a new HW command[1]
>> where page moves will be disallowed.  We definitely don't want a HW
>> command that is inherently insecure, if BO pages pinned for the
>> lifetime is considered a security issue on desktops.
>
>
> Yeah, that's probably not such a good idea :)

Well if the pinning is just for the duration of the hw command, it's
fine, just like batch buffers. But if it's long term pinning then that
doesn't sound like a good idea. RDMA has this as their inherit hw
programming model (except if your hw is really fancy and has hw page
fault handling on the rdma nic), and they hard limit such pins to what
you can mlock (or something similar within rdma).
-Daniel

>
> Regards,
> Christian.
>
>>
>> [1] VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB
>> https://gitlab.freedesktop.org/virgl/drm-misc-next/-/blob/virtio-gpu-next/include/uapi/linux/virtio_gpu.h#L396
>
>
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: pages pinned for BO lifetime and security

2020-07-22 Thread Christian König

Am 22.07.20 um 09:19 schrieb Christian König:

Am 22.07.20 um 02:22 schrieb Gurchetan Singh:

+Christian who added DMABUF_MOVE_NOTIFY which added the relevant blurb:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/dma-buf/Kconfig#n46 



Currently, the user seems to amdgpu for P2P dma-buf and it seems to 
plumb ttm (*move_notify) callback to dma-buf.  We're not sure if it's 
a security issue occurring across DRM drivers, or one more specific 
to the new amdgpu use case.


On Tue, Jul 21, 2020 at 1:03 PM Chia-I Wu > wrote:


Hi list,

virtio-gpu is moving in the direction where BO pages are pinned for
the lifetime for simplicity.  I am wondering if that is considered a
security issue in general, especially after running into the
description of the new DMABUF_MOVE_NOTIFY config option.



Yes, that is generally considered a deny of service possibility and so 
far Dave and Daniel have rejected all tries to upstream stuff like 
this as far as I know.


DMA-buf an pinning for scanout are the only exceptions since the 
implementation wouldn't have been possible otherwise.


Or better say for scanout pinning is a hardware requirement. For DMA-buf 
we obviously can have a better approach :)


Christian.





Most drivers do not have a shrinker, or whether a BO is purgeable is
entirely controlled by the userspace (madvice).  They can be
categorized as "a security problem where userspace is able to pin
unrestricted amounts of memory".  But those drivers are normally
found
on systems without swap.  I don't think the issue applies.



This is completely independent of the availability of swap or not.

Pinning of pages in large quantities can result in all kind of 
problems and needs to be prevented even without swap.


Otherwise you can ran into problems even with simple I/O operations 
for example.




Of the desktop GPU drivers, i915's shrinker certainly supports
purging
to swap.  TTM is a bit hard to follow.  I can't really tell if amdgpu
or nouveau supports that.  virtio-gpu is more commonly found on
systems with swaps so I think it should follow the desktop practices?



What we do at least in the amdgpu, radeon, i915 and nouveau is to only 
allow it for scanout and that in turn is limited by the physical 
number of CRTCs on the board.




Truth is, the emulated virtio-gpu device always supports page moves
with VIRTIO_GPU_CMD_RESOURCE_{ATTACH,DETACH}_BACKING.  It is just
that
the driver does not make use of them.  That makes this less of an
issue because the driver can be fixed anytime (finger crossed
that the
emulator won't have bugs in these untested paths).  This issue
becomes
more urgent because we are considering adding a new HW command[1]
where page moves will be disallowed.  We definitely don't want a HW
command that is inherently insecure, if BO pages pinned for the
lifetime is considered a security issue on desktops.



Yeah, that's probably not such a good idea :)

Regards,
Christian.



[1] VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB

https://gitlab.freedesktop.org/virgl/drm-misc-next/-/blob/virtio-gpu-next/include/uapi/linux/virtio_gpu.h#L396







___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: pages pinned for BO lifetime and security

2020-07-22 Thread Christian König

Am 22.07.20 um 02:22 schrieb Gurchetan Singh:

+Christian who added DMABUF_MOVE_NOTIFY which added the relevant blurb:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/dma-buf/Kconfig#n46 



Currently, the user seems to amdgpu for P2P dma-buf and it seems to 
plumb ttm (*move_notify) callback to dma-buf.  We're not sure if it's 
a security issue occurring across DRM drivers, or one more specific to 
the new amdgpu use case.


On Tue, Jul 21, 2020 at 1:03 PM Chia-I Wu > wrote:


Hi list,

virtio-gpu is moving in the direction where BO pages are pinned for
the lifetime for simplicity.  I am wondering if that is considered a
security issue in general, especially after running into the
description of the new DMABUF_MOVE_NOTIFY config option.



Yes, that is generally considered a deny of service possibility and so 
far Dave and Daniel have rejected all tries to upstream stuff like this 
as far as I know.


DMA-buf an pinning for scanout are the only exceptions since the 
implementation wouldn't have been possible otherwise.




Most drivers do not have a shrinker, or whether a BO is purgeable is
entirely controlled by the userspace (madvice).  They can be
categorized as "a security problem where userspace is able to pin
unrestricted amounts of memory".  But those drivers are normally found
on systems without swap.  I don't think the issue applies.



This is completely independent of the availability of swap or not.

Pinning of pages in large quantities can result in all kind of problems 
and needs to be prevented even without swap.


Otherwise you can ran into problems even with simple I/O operations for 
example.




Of the desktop GPU drivers, i915's shrinker certainly supports purging
to swap.  TTM is a bit hard to follow.  I can't really tell if amdgpu
or nouveau supports that.  virtio-gpu is more commonly found on
systems with swaps so I think it should follow the desktop practices?



What we do at least in the amdgpu, radeon, i915 and nouveau is to only 
allow it for scanout and that in turn is limited by the physical number 
of CRTCs on the board.




Truth is, the emulated virtio-gpu device always supports page moves
with VIRTIO_GPU_CMD_RESOURCE_{ATTACH,DETACH}_BACKING.  It is just that
the driver does not make use of them.  That makes this less of an
issue because the driver can be fixed anytime (finger crossed that the
emulator won't have bugs in these untested paths).  This issue becomes
more urgent because we are considering adding a new HW command[1]
where page moves will be disallowed.  We definitely don't want a HW
command that is inherently insecure, if BO pages pinned for the
lifetime is considered a security issue on desktops.



Yeah, that's probably not such a good idea :)

Regards,
Christian.



[1] VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB

https://gitlab.freedesktop.org/virgl/drm-misc-next/-/blob/virtio-gpu-next/include/uapi/linux/virtio_gpu.h#L396





___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: pages pinned for BO lifetime and security

2020-07-21 Thread Gurchetan Singh
+Christian who added DMABUF_MOVE_NOTIFY which added the relevant blurb:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/dma-buf/Kconfig#n46

Currently, the user seems to amdgpu for P2P dma-buf and it seems to plumb
ttm (*move_notify) callback to dma-buf.  We're not sure if it's a security
issue occurring across DRM drivers, or one more specific to the new amdgpu
use case.

On Tue, Jul 21, 2020 at 1:03 PM Chia-I Wu  wrote:

> Hi list,
>
> virtio-gpu is moving in the direction where BO pages are pinned for
> the lifetime for simplicity.  I am wondering if that is considered a
> security issue in general, especially after running into the
> description of the new DMABUF_MOVE_NOTIFY config option.
>
> Most drivers do not have a shrinker, or whether a BO is purgeable is
> entirely controlled by the userspace (madvice).  They can be
> categorized as "a security problem where userspace is able to pin
> unrestricted amounts of memory".  But those drivers are normally found
> on systems without swap.  I don't think the issue applies.
>
> Of the desktop GPU drivers, i915's shrinker certainly supports purging
> to swap.  TTM is a bit hard to follow.  I can't really tell if amdgpu
> or nouveau supports that.  virtio-gpu is more commonly found on
> systems with swaps so I think it should follow the desktop practices?
>
> Truth is, the emulated virtio-gpu device always supports page moves
> with VIRTIO_GPU_CMD_RESOURCE_{ATTACH,DETACH}_BACKING.  It is just that
> the driver does not make use of them.  That makes this less of an
> issue because the driver can be fixed anytime (finger crossed that the
> emulator won't have bugs in these untested paths).  This issue becomes
> more urgent because we are considering adding a new HW command[1]
> where page moves will be disallowed.  We definitely don't want a HW
> command that is inherently insecure, if BO pages pinned for the
> lifetime is considered a security issue on desktops.
>
> [1] VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB
>
> https://gitlab.freedesktop.org/virgl/drm-misc-next/-/blob/virtio-gpu-next/include/uapi/linux/virtio_gpu.h#L396
>
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


pages pinned for BO lifetime and security

2020-07-21 Thread Chia-I Wu
Hi list,

virtio-gpu is moving in the direction where BO pages are pinned for
the lifetime for simplicity.  I am wondering if that is considered a
security issue in general, especially after running into the
description of the new DMABUF_MOVE_NOTIFY config option.

Most drivers do not have a shrinker, or whether a BO is purgeable is
entirely controlled by the userspace (madvice).  They can be
categorized as "a security problem where userspace is able to pin
unrestricted amounts of memory".  But those drivers are normally found
on systems without swap.  I don't think the issue applies.

Of the desktop GPU drivers, i915's shrinker certainly supports purging
to swap.  TTM is a bit hard to follow.  I can't really tell if amdgpu
or nouveau supports that.  virtio-gpu is more commonly found on
systems with swaps so I think it should follow the desktop practices?

Truth is, the emulated virtio-gpu device always supports page moves
with VIRTIO_GPU_CMD_RESOURCE_{ATTACH,DETACH}_BACKING.  It is just that
the driver does not make use of them.  That makes this less of an
issue because the driver can be fixed anytime (finger crossed that the
emulator won't have bugs in these untested paths).  This issue becomes
more urgent because we are considering adding a new HW command[1]
where page moves will be disallowed.  We definitely don't want a HW
command that is inherently insecure, if BO pages pinned for the
lifetime is considered a security issue on desktops.

[1] VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB
https://gitlab.freedesktop.org/virgl/drm-misc-next/-/blob/virtio-gpu-next/include/uapi/linux/virtio_gpu.h#L396
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel