RE: [PATCH] drm: add func to better detect wether swiotlb is needed
> -Original Message- > From: Michael D Labriola [mailto:michael.d.labri...@gmail.com] > Sent: 19 February 2019 23:08 > To: dri-devel@lists.freedesktop.org; Alex Deucher > ; Christian Koenig ; > Chunming Zhou ; amd-...@lists.freedesktop.org; Monk > Liu > Cc: Juergen Gross ; Christoph Hellwig > ; Andrew Cooper ; Paul > Durrant ; xen-de...@lists.xen.org; Jan Beulich > ; Konrad Rzeszutek Wilk ; > Michael D Labriola > Subject: [PATCH] drm: add func to better detect wether swiotlb is needed > > This commit fixes DRM failures on Xen PV systems that were introduced in > v4.17 by the following commits: > > 82626363 drm: add func to get max iomem address v2 > fd5fd480 drm/amdgpu: only enable swiotlb alloc when need v2 > 1bc3d3cc drm/radeon: only enable swiotlb path when need v2 > > The introduction of ->need_swiotlb to the ttm_dma_populate() conditionals > in the radeon and amdgpu device drivers causes Gnome to immediately crash > on Xen PV systems, returning the user to the login screen. The following > kernel errors get logged: > > [ 28.554259] radeon_dp_aux_transfer_native: 200 callbacks suppressed > [ 31.219821] radeon :01:00.0: swiotlb buffer is full (sz: 2097152 > bytes) > [ 31.220030] [drm:radeon_gem_object_create [radeon]] *ERROR* Failed to > allocate GEM object (16384000, 2, 4096, -14) > [ 31.226109] radeon :01:00.0: swiotlb buffer is full (sz: 2097152 > bytes) > [ 31.226300] [drm:radeon_gem_object_create [radeon]] *ERROR* Failed to > allocate GEM object (16384000, 2, 4096, -14) > [ 31.300734] gnome-shell[1935]: segfault at 88 ip 7f39151cd904 sp > 7ffc97611ad8 error 4 in libmutter-cogl.so[7f3915178000+aa000] > [ 31.300745] Code: 5f c3 0f 1f 40 00 48 8b 47 78 48 8b 40 40 ff e0 66 0f > 1f 44 00 00 48 8b 47 78 48 8b 40 48 ff e0 66 0f 1f 44 00 00 48 8b 47 78 > <48> 8b 80 88 00 00 00 ff e0 0f 1f 00 48 8b 47 78 48 8b 40 68 ff e0 > [ 38.193302] radeon_dp_aux_transfer_native: 116 callbacks suppressed > [ 40.009317] radeon :01:00.0: swiotlb buffer is full (sz: 2097152 > bytes) > [ 40.009488] [drm:radeon_gem_object_create [radeon]] *ERROR* Failed to > allocate GEM object (16384000, 2, 4096, -14) > [ 40.015114] radeon :01:00.0: swiotlb buffer is full (sz: 2097152 > bytes) > [ 40.015297] [drm:radeon_gem_object_create [radeon]] *ERROR* Failed to > allocate GEM object (16384000, 2, 4096, -14) > [ 40.028302] gnome-shell[2431]: segfault at 2dadf40 ip 02dadf40 > sp 7ffcd24ea5f8 error 15 > [ 40.028306] Code: 20 6e 31 00 00 00 00 00 00 00 00 37 e3 3d 2d 7f 00 00 > 80 f4 e6 3d 2d 7f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > <00> 00 00 00 00 00 00 00 c1 00 00 00 00 00 00 00 80 e1 d2 03 00 00 > > This commit renames drm_get_max_iomem() to drm_need_swiotlb(), adds a > xen_pv_domain() check to it, and moves the bit shifting comparison that > always follows its usage into the function (simplifying the drm driver > code). > > Signed-off-by: Michael D Labriola > --- > drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 2 +- > drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 2 +- > drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 2 +- > drivers/gpu/drm/drm_memory.c | 19 --- > drivers/gpu/drm/radeon/radeon_device.c | 2 +- > include/drm/drm_cache.h| 2 +- > 6 files changed, 21 insertions(+), 8 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c > b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c > index 910c4ce..6bc0266 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c > @@ -1029,7 +1029,7 @@ static int gmc_v7_0_sw_init(void *handle) > pci_set_consistent_dma_mask(adev->pdev, DMA_BIT_MASK(32)); > pr_warn("amdgpu: No coherent DMA available\n"); > } > - adev->need_swiotlb = drm_get_max_iomem() > ((u64)1 << dma_bits); > + adev->need_swiotlb = drm_need_swiotlb(dma_bits); > > r = gmc_v7_0_init_microcode(adev); > if (r) { > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c > b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c > index 747c068..8638adf 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c > @@ -1155,7 +1155,7 @@ static int gmc_v8_0_sw_init(void *handle) > pci_set_consistent_dma_mask(adev->pdev, DMA_BIT_MASK(32)); > pr_warn("amdgpu: No coherent DMA available\n"); > } > - adev->need_swiotlb = drm_get_max_iomem() > ((u64)1 << dma_bits); > + adev->need_swiotlb = drm_need_swiotlb(dma_bits); > > r = gmc_v8_0_init_microcode(adev); > if (r) { > diff --git
RE: [Xen-devel] [PATCH 0/1] drm/xen-zcopy: Add Xen zero-copy helper DRM driver
> -Original Message- > From: Oleksandr Andrushchenko [mailto:andr2...@gmail.com] > Sent: 18 April 2018 11:21 > To: Paul Durrant ; Roger Pau Monne > > Cc: jgr...@suse.com; Artem Mygaiev ; > Dongwon Kim ; airl...@linux.ie; > oleksandr_andrushche...@epam.com; linux-ker...@vger.kernel.org; dri- > de...@lists.freedesktop.org; Potrola, MateuszX > ; xen-de...@lists.xenproject.org; > daniel.vet...@intel.com; boris.ostrov...@oracle.com; Matt Roper > > Subject: Re: [Xen-devel] [PATCH 0/1] drm/xen-zcopy: Add Xen zero-copy > helper DRM driver > > On 04/18/2018 01:18 PM, Paul Durrant wrote: > >> -Original Message- > >> From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On > Behalf > >> Of Roger Pau Monné > >> Sent: 18 April 2018 11:11 > >> To: Oleksandr Andrushchenko > >> Cc: jgr...@suse.com; Artem Mygaiev ; > >> Dongwon Kim ; airl...@linux.ie; > >> oleksandr_andrushche...@epam.com; linux-ker...@vger.kernel.org; > dri- > >> de...@lists.freedesktop.org; Potrola, MateuszX > >> ; xen-de...@lists.xenproject.org; > >> daniel.vet...@intel.com; boris.ostrov...@oracle.com; Matt Roper > >> > >> Subject: Re: [Xen-devel] [PATCH 0/1] drm/xen-zcopy: Add Xen zero-copy > >> helper DRM driver > >> > >> On Wed, Apr 18, 2018 at 11:01:12AM +0300, Oleksandr Andrushchenko > >> wrote: > >>> On 04/18/2018 10:35 AM, Roger Pau Monné wrote: > >>>> On Wed, Apr 18, 2018 at 09:38:39AM +0300, Oleksandr Andrushchenko > >> wrote: > >>>>> On 04/17/2018 11:57 PM, Dongwon Kim wrote: > >>>>>> On Tue, Apr 17, 2018 at 09:59:28AM +0200, Daniel Vetter wrote: > >>>>>>> On Mon, Apr 16, 2018 at 12:29:05PM -0700, Dongwon Kim wrote: > >>>>> 3.2 Backend exports dma-buf to xen-front > >>>>> > >>>>> In this case Dom0 pages are shared with DomU. As before, DomU can > >> only write > >>>>> to these pages, not any other page from Dom0, so it can be still > >> considered > >>>>> safe. > >>>>> But, the following must be considered (highlighted in xen-front's > Kernel > >>>>> documentation): > >>>>> - If guest domain dies then pages/grants received from the backend > >> cannot > >>>>> be claimed back - think of it as memory lost to Dom0 (won't be used > >> for > >>>>> any > >>>>> other guest) > >>>>> - Misbehaving guest may send too many requests to the backend > >> exhausting > >>>>> its grant references and memory (consider this from security POV). > >> As the > >>>>> backend runs in the trusted domain we also assume that it is > trusted > >> as > >>>>> well, > >>>>> e.g. must take measures to prevent DDoS attacks. > >>>> I cannot parse the above sentence: > >>>> > >>>> "As the backend runs in the trusted domain we also assume that it is > >>>> trusted as well, e.g. must take measures to prevent DDoS attacks." > >>>> > >>>> What's the relation between being trusted and protecting from DoS > >>>> attacks? > >>> I mean that we trust the backend that it can prevent Dom0 > >>> from crashing in case DomU's frontend misbehaves, e.g. > >>> if the frontend sends too many memory requests etc. > >>>> In any case, all? PV protocols are implemented with the frontend > >>>> sharing pages to the backend, and I think there's a reason why this > >>>> model is used, and it should continue to be used. > >>> This is the first use-case above. But there are real-world > >>> use-cases (embedded in my case) when physically contiguous memory > >>> needs to be shared, one of the possible ways to achieve this is > >>> to share contiguous memory from Dom0 to DomU (the second use-case > >> above) > >>>> Having to add logic in the backend to prevent such attacks means > >>>> that: > >>>> > >>>>- We need more code in the backend, which increases complexity and > >>>> chances of bugs. > >>>>- Such code/logic could be wrong, thus allowing DoS. > >>> You can live without this code at all, but this is then up to > >>> backend which may make Dom0 down because of DomU's frontend > doin
RE: [Xen-devel] [PATCH 0/1] drm/xen-zcopy: Add Xen zero-copy helper DRM driver
> -Original Message- > From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On Behalf > Of Roger Pau Monné > Sent: 18 April 2018 11:11 > To: Oleksandr Andrushchenko > Cc: jgr...@suse.com; Artem Mygaiev ; > Dongwon Kim ; airl...@linux.ie; > oleksandr_andrushche...@epam.com; linux-ker...@vger.kernel.org; dri- > de...@lists.freedesktop.org; Potrola, MateuszX > ; xen-de...@lists.xenproject.org; > daniel.vet...@intel.com; boris.ostrov...@oracle.com; Matt Roper > > Subject: Re: [Xen-devel] [PATCH 0/1] drm/xen-zcopy: Add Xen zero-copy > helper DRM driver > > On Wed, Apr 18, 2018 at 11:01:12AM +0300, Oleksandr Andrushchenko > wrote: > > On 04/18/2018 10:35 AM, Roger Pau Monné wrote: > > > On Wed, Apr 18, 2018 at 09:38:39AM +0300, Oleksandr Andrushchenko > wrote: > > > > On 04/17/2018 11:57 PM, Dongwon Kim wrote: > > > > > On Tue, Apr 17, 2018 at 09:59:28AM +0200, Daniel Vetter wrote: > > > > > > On Mon, Apr 16, 2018 at 12:29:05PM -0700, Dongwon Kim wrote: > > > > 3.2 Backend exports dma-buf to xen-front > > > > > > > > In this case Dom0 pages are shared with DomU. As before, DomU can > only write > > > > to these pages, not any other page from Dom0, so it can be still > considered > > > > safe. > > > > But, the following must be considered (highlighted in xen-front's Kernel > > > > documentation): > > > > - If guest domain dies then pages/grants received from the backend > cannot > > > > be claimed back - think of it as memory lost to Dom0 (won't be used > for > > > > any > > > > other guest) > > > > - Misbehaving guest may send too many requests to the backend > exhausting > > > > its grant references and memory (consider this from security POV). > As the > > > > backend runs in the trusted domain we also assume that it is trusted > as > > > > well, > > > > e.g. must take measures to prevent DDoS attacks. > > > I cannot parse the above sentence: > > > > > > "As the backend runs in the trusted domain we also assume that it is > > > trusted as well, e.g. must take measures to prevent DDoS attacks." > > > > > > What's the relation between being trusted and protecting from DoS > > > attacks? > > I mean that we trust the backend that it can prevent Dom0 > > from crashing in case DomU's frontend misbehaves, e.g. > > if the frontend sends too many memory requests etc. > > > In any case, all? PV protocols are implemented with the frontend > > > sharing pages to the backend, and I think there's a reason why this > > > model is used, and it should continue to be used. > > This is the first use-case above. But there are real-world > > use-cases (embedded in my case) when physically contiguous memory > > needs to be shared, one of the possible ways to achieve this is > > to share contiguous memory from Dom0 to DomU (the second use-case > above) > > > Having to add logic in the backend to prevent such attacks means > > > that: > > > > > > - We need more code in the backend, which increases complexity and > > > chances of bugs. > > > - Such code/logic could be wrong, thus allowing DoS. > > You can live without this code at all, but this is then up to > > backend which may make Dom0 down because of DomU's frontend doing > evil > > things > > IMO we should design protocols that do not allow such attacks instead > of having to defend against them. > > > > > 4. xen-front/backend/xen-zcopy synchronization > > > > > > > > 4.1. As I already said in 2) all the inter VM communication happens > between > > > > xen-front and the backend, xen-zcopy is NOT involved in that. > > > > When xen-front wants to destroy a display buffer (dumb/dma-buf) it > issues a > > > > XENDISPL_OP_DBUF_DESTROY command (opposite to > XENDISPL_OP_DBUF_CREATE). > > > > This call is synchronous, so xen-front expects that backend does free > the > > > > buffer pages on return. > > > > > > > > 4.2. Backend, on XENDISPL_OP_DBUF_DESTROY: > > > > - closes all dumb handles/fd's of the buffer according to [3] > > > > - issues DRM_IOCTL_XEN_ZCOPY_DUMB_WAIT_FREE IOCTL to xen- > zcopy to make > > > > sure > > > > the buffer is freed (think of it as it waits for dma-buf->release > > > > callback) > > > So this zcopy thing keeps some kind of track of the memory usage? Why > > > can't the user-space backend keep track of the buffer usage? > > Because there is no dma-buf UAPI which allows to track the buffer life cycle > > (e.g. wait until dma-buf's .release callback is called) > > > > - replies to xen-front that the buffer can be destroyed. > > > > This way deletion of the buffer happens synchronously on both Dom0 > and DomU > > > > sides. In case if DRM_IOCTL_XEN_ZCOPY_DUMB_WAIT_FREE returns > with time-out > > > > error > > > > (BTW, wait time is a parameter of this IOCTL), Xen will defer grant > > > > reference > > > > removal and will retry later until those are free. > > > > > > > > Hope this helps understand how buffers are synchronously deleted in > case > > > > of xen-zcopy with a single protocol co