Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-07-02 Thread Nathan Chancellor
Hi Will and Robin,

On Fri, Jul 02, 2021 at 04:13:50PM +0100, Robin Murphy wrote:
> On 2021-07-02 14:58, Will Deacon wrote:
> > Hi Nathan,
> > 
> > On Thu, Jul 01, 2021 at 12:52:20AM -0700, Nathan Chancellor wrote:
> > > On 7/1/2021 12:40 AM, Will Deacon wrote:
> > > > On Wed, Jun 30, 2021 at 08:56:51AM -0700, Nathan Chancellor wrote:
> > > > > On Wed, Jun 30, 2021 at 12:43:48PM +0100, Will Deacon wrote:
> > > > > > On Wed, Jun 30, 2021 at 05:17:27PM +0800, Claire Chang wrote:
> > > > > > > `BUG: unable to handle page fault for address: 003a8290` 
> > > > > > > and
> > > > > > > the fact it crashed at `_raw_spin_lock_irqsave` look like the 
> > > > > > > memory
> > > > > > > (maybe dev->dma_io_tlb_mem) was corrupted?
> > > > > > > The dev->dma_io_tlb_mem should be set here
> > > > > > > (https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/pci/probe.c#n2528)
> > > > > > > through device_initialize.
> > > > > > 
> > > > > > I'm less sure about this. 'dma_io_tlb_mem' should be pointing at
> > > > > > 'io_tlb_default_mem', which is a page-aligned allocation from 
> > > > > > memblock.
> > > > > > The spinlock is at offset 0x24 in that structure, and looking at the
> > > > > > register dump from the crash:
> > > > > > 
> > > > > > Jun 29 18:28:42 hp-4300G kernel: RSP: 0018:adb4013db9e8 EFLAGS: 
> > > > > > 00010006
> > > > > > Jun 29 18:28:42 hp-4300G kernel: RAX: 003a8290 RBX: 
> > > > > >  RCX: 8900572ad580
> > > > > > Jun 29 18:28:42 hp-4300G kernel: RDX: 89005653f024 RSI: 
> > > > > > 000c RDI: 1d17
> > > > > > Jun 29 18:28:42 hp-4300G kernel: RBP: 0a20d000 R08: 
> > > > > > 000c R09: 
> > > > > > Jun 29 18:28:42 hp-4300G kernel: R10: 0a20d000 R11: 
> > > > > > 89005653f000 R12: 0212
> > > > > > Jun 29 18:28:42 hp-4300G kernel: R13: 1000 R14: 
> > > > > > 0002 R15: 0020
> > > > > > Jun 29 18:28:42 hp-4300G kernel: FS:  7f1f8898ea40() 
> > > > > > GS:89005728() knlGS:
> > > > > > Jun 29 18:28:42 hp-4300G kernel: CS:  0010 DS:  ES:  CR0: 
> > > > > > 80050033
> > > > > > Jun 29 18:28:42 hp-4300G kernel: CR2: 003a8290 CR3: 
> > > > > > 0001020d CR4: 00350ee0
> > > > > > Jun 29 18:28:42 hp-4300G kernel: Call Trace:
> > > > > > Jun 29 18:28:42 hp-4300G kernel:  _raw_spin_lock_irqsave+0x39/0x50
> > > > > > Jun 29 18:28:42 hp-4300G kernel:  swiotlb_tbl_map_single+0x12b/0x4c0
> > > > > > 
> > > > > > Then that correlates with R11 holding the 'dma_io_tlb_mem' pointer 
> > > > > > and
> > > > > > RDX pointing at the spinlock. Yet RAX is holding junk :/
> > > > > > 
> > > > > > I agree that enabling KASAN would be a good idea, but I also think 
> > > > > > we
> > > > > > probably need to get some more information out of 
> > > > > > swiotlb_tbl_map_single()
> > > > > > to see what exactly is going wrong in there.
> > > > > 
> > > > > I can certainly enable KASAN and if there is any debug print I can add
> > > > > or dump anything, let me know!
> > > > 
> > > > I bit the bullet and took v5.13 with swiotlb/for-linus-5.14 merged in, 
> > > > built
> > > > x86 defconfig and ran it on my laptop. However, it seems to work fine!
> > > > 
> > > > Please can you share your .config?
> > > 
> > > Sure thing, it is attached. It is just Arch Linux's config run through
> > > olddefconfig. The original is below in case you need to diff it.
> > > 
> > > https://raw.githubusercontent.com/archlinux/svntogit-packages/9045405dc835527164f3034b3ceb9a67c7a53cd4/trunk/config
> > > 
> > > If there is anything more that I can provide, please let me know.
> > 
> > I eventually got this booting (for some reason it was causing LD to SEGV
> > trying to link it for a while...) and sadly it works fine on my laptop. Hmm.

Seems like it might be something specific to the amdgpu module?

> > Did you manage to try again with KASAN?

Yes, it took a few times to reproduce the issue but I did manage to get
a dmesg, please find it attached. I built from commit 7d31f1c65cc9 ("swiotlb:
fix implicit debugfs declarations") in Konrad's tree.

> > It might also be worth taking the IOMMU out of the equation, since that
> > interfaces differently with SWIOTLB and I couldn't figure out the code path
> > from the log you provided. What happens if you boot with "amd_iommu=off
> > swiotlb=force"?
> 
> Oh, now there's a thing... the chat from the IOMMU API in the boot log
> implies that the IOMMU *should* be in the picture - we see that default
> domains are IOMMU_DOMAIN_DMA and the GPU :0c:00.0 was added to a
> group. That means dev->dma_ops should be set and DMA API calls should be
> going through iommu-dma, yet the callstack in the crash says we've gone
> straight from dma_map_page_attrs() to swiotlb_map(), implying the inline
> dma_direct_map_page() path.
> 
> If dev->dma_ops didn't 
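
For readers following Robin's reasoning: the crash going straight from
dma_map_page_attrs() to swiotlb_map() implies the inline direct-mapping
path, which is only taken when the device has no dma_map_ops installed.
Below is a minimal userspace model of that dispatch decision; the names
and structures are illustrative stand-ins, not the kernel's real API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of the DMA API dispatch discussed
 * above: dma_map_page_attrs() inlines the dma_direct path only when
 * dev->dma_ops is NULL; otherwise it indirects through the ops table
 * (iommu-dma in this report). Illustrative names only. */

enum map_path { MAP_DIRECT, MAP_VIA_OPS };

struct dummy_dev {
	const void *dma_ops;	/* stand-in for dev->dma_ops */
};

static enum map_path pick_map_path(const struct dummy_dev *dev)
{
	/* A NULL ops pointer selects the inline dma_direct path,
	 * which is what the crash callstack implies happened. */
	return dev->dma_ops ? MAP_VIA_OPS : MAP_DIRECT;
}
```

So if the IOMMU really had set up iommu-dma ops for the GPU, the
callstack should not show the direct path at all.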

Re: [PATCH 2/2] drm/vc4: hdmi: Convert to gpiod

2021-07-02 Thread Nathan Chancellor
On Fri, Jul 02, 2021 at 03:16:46PM +0200, Maxime Ripard wrote:
> Hi Nathan,
> 
> On Thu, Jul 01, 2021 at 08:29:34PM -0700, Nathan Chancellor wrote:
> > On Mon, May 24, 2021 at 03:18:52PM +0200, Maxime Ripard wrote:
> > > The new gpiod interface takes care of parsing the GPIO flags and
> > > returns the logical value when accessing an active-low GPIO, so
> > > switching to it simplifies the driver a lot.
> > > 
> > > Signed-off-by: Maxime Ripard 
> > > ---
> > >  drivers/gpu/drm/vc4/vc4_hdmi.c | 24 +++-
> > >  drivers/gpu/drm/vc4/vc4_hdmi.h |  3 +--
> > >  2 files changed, 8 insertions(+), 19 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c 
> > > b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > > index ccc6c8079dc6..34622c59f6a7 100644
> > > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> > > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > > @@ -159,10 +159,9 @@ vc4_hdmi_connector_detect(struct drm_connector 
> > > *connector, bool force)
> > >   struct vc4_hdmi *vc4_hdmi = connector_to_vc4_hdmi(connector);
> > >   bool connected = false;
> > >  
> > > - if (vc4_hdmi->hpd_gpio) {
> > > - if (gpio_get_value_cansleep(vc4_hdmi->hpd_gpio) ^
> > > - vc4_hdmi->hpd_active_low)
> > > - connected = true;
> > > + if (vc4_hdmi->hpd_gpio &&
> > > + gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio)) {
> > > + connected = true;
> > >   } else if (drm_probe_ddc(vc4_hdmi->ddc)) {
> > >   connected = true;
> > >   } else if (HDMI_READ(HDMI_HOTPLUG) & VC4_HDMI_HOTPLUG_CONNECTED) {
> > > @@ -1993,7 +1992,6 @@ static int vc4_hdmi_bind(struct device *dev, struct 
> > > device *master, void *data)
> > >   struct vc4_hdmi *vc4_hdmi;
> > >   struct drm_encoder *encoder;
> > >   struct device_node *ddc_node;
> > > - u32 value;
> > >   int ret;
> > >  
> > >   vc4_hdmi = devm_kzalloc(dev, sizeof(*vc4_hdmi), GFP_KERNEL);
> > > @@ -2031,18 +2029,10 @@ static int vc4_hdmi_bind(struct device *dev, 
> > > struct device *master, void *data)
> > >   /* Only use the GPIO HPD pin if present in the DT, otherwise
> > >* we'll use the HDMI core's register.
> > >*/
> > > - if (of_find_property(dev->of_node, "hpd-gpios", &value)) {
> > > - enum of_gpio_flags hpd_gpio_flags;
> > > -
> > > - vc4_hdmi->hpd_gpio = of_get_named_gpio_flags(dev->of_node,
> > > -  "hpd-gpios", 0,
> > > -  &hpd_gpio_flags);
> > > - if (vc4_hdmi->hpd_gpio < 0) {
> > > - ret = vc4_hdmi->hpd_gpio;
> > > - goto err_put_ddc;
> > > - }
> > > -
> > > - vc4_hdmi->hpd_active_low = hpd_gpio_flags & OF_GPIO_ACTIVE_LOW;
> > > + vc4_hdmi->hpd_gpio = devm_gpiod_get_optional(dev, "hpd", GPIOD_IN);
> > > + if (IS_ERR(vc4_hdmi->hpd_gpio)) {
> > > + ret = PTR_ERR(vc4_hdmi->hpd_gpio);
> > > + goto err_put_ddc;
> > >   }
> > >  
> > >   vc4_hdmi->disable_wifi_frequencies =
> > > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h 
> > > b/drivers/gpu/drm/vc4/vc4_hdmi.h
> > > index 060bcaefbeb5..2688a55461d6 100644
> > > --- a/drivers/gpu/drm/vc4/vc4_hdmi.h
> > > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.h
> > > @@ -146,8 +146,7 @@ struct vc4_hdmi {
> > >   /* VC5 Only */
> > >   void __iomem *rm_regs;
> > >  
> > > - int hpd_gpio;
> > > - bool hpd_active_low;
> > > + struct gpio_desc *hpd_gpio;
> > >  
> > >   /*
> > >* On some systems (like the RPi4), some modes are in the same
> > > -- 
> > > 2.31.1
> > 
> > This patch as commit 6800234ceee0 ("drm/vc4: hdmi: Convert to gpiod")
> > causes my Raspberry Pi 3 to lock up shortly after boot in combination
> > with commit 411efa18e4b0 ("drm/vc4: hdmi: Move the HSM clock enable to
> > runtime_pm"). The serial console and ssh are completely unresponsive and
> > I do not see any messages in dmesg with "debug ignore_loglevel". The
> > device is running with a 32-bit kernel (multi_v7_defconfig) with 32-bit
> > userspace. If there is any further information that I can provide,
> > please let me know.
> 
> Thanks for reporting this. The same bug was reported on Wednesday
> on the RPi repo here:
> https://github.com/raspberrypi/linux/pull/4418
> 
> More specifically, this commit should fix it:
> https://github.com/raspberrypi/linux/pull/4418/commits/6d404373c20a794da3d6a7b4f1373903183bb5d0
> 
> Even though it's based on the 5.10 kernel, it should apply without any
> warning on a mainline tree. Let me know if it fixes your issue too.

Thank you for the links and the quick reply. Unfortunately, I applied
this patch on top of commit d6b63b5b7d7f ("Merge tag 'sound-5.14-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound") in mainline,
which still reproduces this issue, and it did not fix it. In fact, the
board did not even get to the raspberrypi login prompt before it locked
up, again without any real output in the serial console except for maybe
this message?

[7.582480] vc4-drm 
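
For reference, the semantic difference the gpiod conversion relies on —
the GPIO core folding the active-low polarity into the returned value —
can be modeled in plain C. This is an illustrative sketch, not the real
kernel API; all names here are invented stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: legacy gpio_get_value_cansleep() returns the raw line
 * level and every caller must XOR in the active-low flag itself,
 * while gpiod_get_value_cansleep() returns the logical value with
 * the polarity already applied by the core. */

struct model_gpio {
	bool raw_level;		/* physical line state */
	bool active_low;	/* polarity flag from the DT */
};

/* legacy-style interface: raw value only */
static bool model_gpio_get_value(const struct model_gpio *g)
{
	return g->raw_level;
}

/* gpiod-style interface: polarity folded in by the core */
static bool model_gpiod_get_value(const struct model_gpio *g)
{
	return g->raw_level ^ g->active_low;
}

/* old driver pattern: must carry an hpd_active_low flag around */
static bool detect_old(const struct model_gpio *g)
{
	return model_gpio_get_value(g) ^ g->active_low;
}

/* new driver pattern: one call, no polarity bookkeeping */
static bool detect_new(const struct model_gpio *g)
{
	return model_gpiod_get_value(g);
}
```

This is why the patch can delete the hpd_active_low field entirely: the
XOR moves from the driver into the (modeled) core.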

Re: Start fixing the shared to exclusive fence dependencies.

2021-07-02 Thread Daniel Vetter
On Fri, Jul 02, 2021 at 01:16:38PM +0200, Christian König wrote:
> Hey Daniel,
> 
> even when you are not 100% done with the driver audit I think we should
> push that patch set here to drm-misc-next now so that it can end up in
> 5.15.

So I think I got them all, just need to type up some good docs all over
the place next week and send it out.
-Daniel

> 
> Not having any dependency between the exclusive and the shared fence
> signaling order is just way more defensive than the current model.
> 
> As discussed I'm holding back any amdgpu and TTM workarounds which could
> be removed for now.
> 
> Thoughts?
> 
> Thanks,
> Christian.
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH] drm/amdgpu: Return error if no RAS

2021-07-02 Thread Luben Tuikov
In amdgpu_ras_query_error_count() return an error
if the device doesn't support RAS. This prevents
the function from always having to set the values
of the integer pointers (if set), regardless of
whether RAS is supported or not, and so avoids
that side effect.

Also, if no pointers are set, don't count, since
we've no way of reporting the counts.

Also, give this function a kernel-doc.

Cc: Alexander Deucher 
Cc: John Clements 
Cc: Hawking Zhang 
Reported-by: Tom Rix 
Fixes: a46751fbcde505 ("drm/amdgpu: Fix RAS function interface")
Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 49 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h |  6 +--
 2 files changed, 38 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index c6ae63893dbdb2..ed698b2be79023 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -813,7 +813,7 @@ static int amdgpu_ras_enable_all_features(struct 
amdgpu_device *adev,
 
 /* query/inject/cure begin */
 int amdgpu_ras_query_error_status(struct amdgpu_device *adev,
-   struct ras_query_if *info)
+ struct ras_query_if *info)
 {
struct ras_manager *obj = amdgpu_ras_find_obj(adev, &info->head);
struct ras_err_data err_data = {0, 0, 0, NULL};
@@ -1047,17 +1047,32 @@ int amdgpu_ras_error_inject(struct amdgpu_device *adev,
return ret;
 }
 
-/* get the total error counts on all IPs */
-void amdgpu_ras_query_error_count(struct amdgpu_device *adev,
- unsigned long *ce_count,
- unsigned long *ue_count)
+/**
+ * amdgpu_ras_query_error_count -- Get error counts of all IPs
+ * @adev: pointer to AMD GPU device
+ * @ce_count: pointer to an integer to be set to the count of correctable
+ * errors.
+ * @ue_count: pointer to an integer to be set to the count of uncorrectable
+ * errors.
+ *
+ * If @ce_count or @ue_count is set, count and return the corresponding
+ * error counts in those integer pointers. Return 0 if the device
+ * supports RAS. Return -EINVAL if the device doesn't support RAS.
+ */
+int amdgpu_ras_query_error_count(struct amdgpu_device *adev,
+unsigned long *ce_count,
+unsigned long *ue_count)
 {
struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
struct ras_manager *obj;
unsigned long ce, ue;
 
if (!adev->ras_enabled || !con)
-   return;
+   return -EINVAL;
+
+   /* Don't count since no reporting.
+*/
+   if (!ce_count && !ue_count)
+   return 0;
 
ce = 0;
ue = 0;
@@ -1065,9 +1080,11 @@ void amdgpu_ras_query_error_count(struct amdgpu_device 
*adev,
struct ras_query_if info = {
.head = obj->head,
};
+   int res;
 
-   if (amdgpu_ras_query_error_status(adev, &info))
-   return;
+   res = amdgpu_ras_query_error_status(adev, &info);
+   if (res)
+   return res;
 
ce += info.ce_count;
ue += info.ue_count;
@@ -1078,6 +1095,8 @@ void amdgpu_ras_query_error_count(struct amdgpu_device 
*adev,
 
if (ue_count)
*ue_count = ue;
+
+   return 0;
 }
 /* query/inject/cure end */
 
@@ -2145,9 +2164,10 @@ static void amdgpu_ras_counte_dw(struct work_struct 
*work)
 
/* Cache new values.
 */
-   amdgpu_ras_query_error_count(adev, &ce_count, &ue_count);
-   atomic_set(&con->ras_ce_count, ce_count);
-   atomic_set(&con->ras_ue_count, ue_count);
+   if (amdgpu_ras_query_error_count(adev, &ce_count, &ue_count) == 0) {
+   atomic_set(&con->ras_ce_count, ce_count);
+   atomic_set(&con->ras_ue_count, ue_count);
+   }
+   }
 
pm_runtime_mark_last_busy(dev->dev);
 Out:
@@ -2320,9 +2340,10 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev,
 
/* Those are the cached values at init.
 */
-   amdgpu_ras_query_error_count(adev, &ce_count, &ue_count);
-   atomic_set(&con->ras_ce_count, ce_count);
-   atomic_set(&con->ras_ue_count, ue_count);
+   if (amdgpu_ras_query_error_count(adev, &ce_count, &ue_count) == 0) {
+   atomic_set(&con->ras_ce_count, ce_count);
+   atomic_set(&con->ras_ue_count, ue_count);
+   }
 
return 0;
 cleanup:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
index 283afd791db107..4d9c63f2f37718 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
@@ -491,9 +491,9 @@ int amdgpu_ras_request_reset_on_boot(struct amdgpu_device 
*adev,
 void amdgpu_ras_resume(struct amdgpu_device *adev);
 void 
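
The contract this patch introduces — fail loudly when RAS is absent
instead of silently leaving (or zeroing) the out-parameters, and have
callers consume the counts only on success — can be sketched as a small
userspace model. All names and the fake counts below are illustrative,
not the real driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EINVAL 22

/* Toy model of the new amdgpu_ras_query_error_count() contract:
 * return -EINVAL with no side effects when RAS is unsupported,
 * return 0 early if there is nowhere to report into, and only
 * write through the pointers on success. */
static int model_query_error_count(bool ras_enabled,
				   unsigned long *ce_count,
				   unsigned long *ue_count)
{
	if (!ras_enabled)
		return -MODEL_EINVAL;	/* no side effects on failure */
	if (!ce_count && !ue_count)
		return 0;		/* don't count: no way to report */
	if (ce_count)
		*ce_count = 3;		/* pretend we counted 3 CEs */
	if (ue_count)
		*ue_count = 1;		/* ... and 1 UE */
	return 0;
}

/* Caller pattern from the patch: cache a new value only when the
 * query succeeds, otherwise keep the previous cached value. */
static unsigned long model_cache_ce(bool ras_enabled, unsigned long prev)
{
	unsigned long ce = 12345;	/* deliberately garbage-like */

	if (model_query_error_count(ras_enabled, &ce, NULL) == 0)
		return ce;		/* never caches garbage */
	return prev;
}
```

This is exactly the uninitialized-read scenario Tom Rix's static
analysis flagged: without the return-code check, `prev` would have been
overwritten with the garbage local.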

Re: [PATCH 4/4] drm/msm: always wait for the exclusive fence

2021-07-02 Thread Daniel Vetter
On Fri, Jul 02, 2021 at 01:16:42PM +0200, Christian König wrote:
> Drivers also need to sync to the exclusive fence when
> a shared one is present.
> 
> Completely untested since the driver won't even compile on !ARM.

It's really not that hard to set up a cross-compiler; reasonable distros
all have packages for that now. It does explain why you tend to break the
arm build with drm-misc patches, though.

Please fix this.

> Signed-off-by: Christian König 

Reviewed-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/msm/msm_gem.c | 16 +++-
>  1 file changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> index a94a43de95ef..72a07e311de3 100644
> --- a/drivers/gpu/drm/msm/msm_gem.c
> +++ b/drivers/gpu/drm/msm/msm_gem.c
> @@ -817,17 +817,15 @@ int msm_gem_sync_object(struct drm_gem_object *obj,
>   struct dma_fence *fence;
>   int i, ret;
>  
> - fobj = dma_resv_shared_list(obj->resv);
> - if (!fobj || (fobj->shared_count == 0)) {
> - fence = dma_resv_excl_fence(obj->resv);
> - /* don't need to wait on our own fences, since ring is fifo */
> - if (fence && (fence->context != fctx->context)) {
> - ret = dma_fence_wait(fence, true);
> - if (ret)
> - return ret;
> - }
> + fence = dma_resv_excl_fence(obj->resv);
> + /* don't need to wait on our own fences, since ring is fifo */
> + if (fence && (fence->context != fctx->context)) {
> + ret = dma_fence_wait(fence, true);
> + if (ret)
> + return ret;
>   }
>  
> + fobj = dma_resv_shared_list(obj->resv);
>   if (!exclusive || !fobj)
>   return 0;
>  
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 3/4] drm/nouveau: always wait for the exclusive fence

2021-07-02 Thread Daniel Vetter
On Fri, Jul 02, 2021 at 01:16:41PM +0200, Christian König wrote:
> Drivers also need to sync to the exclusive fence when
> a shared one is present.
> 
> Signed-off-by: Christian König 

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
> b/drivers/gpu/drm/nouveau/nouveau_fence.c
> index 6b43918035df..05d0b3eb3690 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> @@ -358,7 +358,7 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct 
> nouveau_channel *chan, bool e
>   fobj = dma_resv_shared_list(resv);
>   fence = dma_resv_excl_fence(resv);
>  
> - if (fence && (!exclusive || !fobj || !fobj->shared_count)) {
> + if (fence) {
>   struct nouveau_channel *prev = NULL;
>   bool must_wait = true;
>  
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/amdgpu: initialize amdgpu_ras_query_error_count() error count parameters

2021-07-02 Thread Luben Tuikov
That's a good find, but I'd rather functions have no side effects. I'll follow 
up with a patch which correctly fixes this.

Regards,
Luben

On 2021-07-02 3:52 p.m., t...@redhat.com wrote:
> From: Tom Rix 
>
> Static analysis reports this problem
> amdgpu_ras.c:2324:2: warning: 2nd function call argument is an
>   uninitialized value
> atomic_set(&con->ras_ce_count, ce_count);
> ^~~~
>
> ce_count is normally set by the earlier call to
> amdgpu_ras_query_error_count().  But amdgpu_ras_query_error_count()
> can return early without setting, leaving its error count parameters
> in a garbage state.
>
> Initialize the error count parameters earlier.
>
> Fixes: a46751fbcde5 ("drm/amdgpu: Fix RAS function interface")
> Signed-off-by: Tom Rix 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 875874ea745ec..c80fa545aa2b8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1056,6 +1056,12 @@ void amdgpu_ras_query_error_count(struct amdgpu_device 
> *adev,
>   struct ras_manager *obj;
>   unsigned long ce, ue;
>  
> + if (ce_count)
> + *ce_count = 0;
> +
> + if (ue_count)
> + *ue_count = 0;
> +
>   if (!adev->ras_enabled || !con)
>   return;
>  



Re: [PATCH 2/4] dma-buf: fix dma_resv_test_signaled test_all handling v2

2021-07-02 Thread Daniel Vetter
On Fri, Jul 02, 2021 at 01:16:40PM +0200, Christian König wrote:
> As the name implies, if testing all fences is requested we
> should indeed test all fences and not skip the exclusive
> one just because we see shared ones.
> 
> v2: fix logic once more
> 
> Signed-off-by: Christian König 

Reviewed-by: Daniel Vetter 

> ---
>  drivers/dma-buf/dma-resv.c | 33 -
>  1 file changed, 12 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index 4ab02b6c387a..18dd5a6ca06c 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -618,25 +618,21 @@ static inline int dma_resv_test_signaled_single(struct 
> dma_fence *passed_fence)
>   */
>  bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
>  {
> - unsigned int seq, shared_count;
> + struct dma_fence *fence;
> + unsigned int seq;
>   int ret;
>  
>   rcu_read_lock();
>  retry:
>   ret = true;
> - shared_count = 0;
> >   seq = read_seqcount_begin(&obj->seq);
>  
>   if (test_all) {
>   struct dma_resv_list *fobj = dma_resv_shared_list(obj);
> - unsigned int i;
> -
> - if (fobj)
> - shared_count = fobj->shared_count;
> + unsigned int i, shared_count;
>  
> + shared_count = fobj ? fobj->shared_count : 0;
>   for (i = 0; i < shared_count; ++i) {
> - struct dma_fence *fence;
> -
>   fence = rcu_dereference(fobj->shared[i]);
>   ret = dma_resv_test_signaled_single(fence);
>   if (ret < 0)
> @@ -644,24 +640,19 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool 
> test_all)
>   else if (!ret)
>   break;
>   }
> -
> > - if (read_seqcount_retry(&obj->seq, seq))
> - goto retry;
>   }
>  
> - if (!shared_count) {
> - struct dma_fence *fence_excl = dma_resv_excl_fence(obj);
> -
> - if (fence_excl) {
> - ret = dma_resv_test_signaled_single(fence_excl);
> - if (ret < 0)
> - goto retry;
> + fence = dma_resv_excl_fence(obj);
> + if (ret && fence) {
> + ret = dma_resv_test_signaled_single(fence);
> + if (ret < 0)
> + goto retry;
>  
> > - if (read_seqcount_retry(&obj->seq, seq))
> - goto retry;
> - }
>   }
>  
> > + if (read_seqcount_retry(&obj->seq, seq))
> + goto retry;
> +
>   rcu_read_unlock();
>   return ret;
>  }
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
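
The corrected test_all semantics from the patch above can be modeled in
plain C: with test_all set, the exclusive fence must still be checked
even when shared fences exist, whereas the old code skipped it whenever
shared_count was non-zero. A toy model under that assumption (plain
arrays stand in for the resv object; no RCU/seqcount retry logic):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for a dma_resv snapshot. */
struct model_resv {
	bool excl_present, excl_signaled;
	size_t shared_count;
	const bool *shared_signaled;
};

/* Models the v2 logic of dma_resv_test_signaled(): check the shared
 * fences when test_all is set, then fall through to the exclusive
 * fence unless a shared fence already answered "not signaled". */
static bool model_test_signaled(const struct model_resv *r, bool test_all)
{
	bool ret = true;
	size_t i;

	if (test_all) {
		for (i = 0; i < r->shared_count; i++) {
			ret = r->shared_signaled[i];
			if (!ret)
				break;
		}
	}

	/* The old code would return here whenever shared_count != 0,
	 * never looking at the exclusive fence. */
	if (ret && r->excl_present)
		ret = r->excl_signaled;

	return ret;
}
```

The interesting case is a signaled shared fence alongside an unsignaled
exclusive fence with test_all set: the fixed logic reports "not
signaled", where the old logic wrongly reported "signaled".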


Re: [PATCH 1/4] dma-buf: add some more kerneldoc to dma_resv_add_shared_fence

2021-07-02 Thread Daniel Vetter
On Fri, Jul 02, 2021 at 01:16:39PM +0200, Christian König wrote:
> Explicitly document that code can't assume that shared fences
> signal after the exclusive fence.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/dma-buf/dma-resv.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index f26c71747d43..4ab02b6c387a 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -235,7 +235,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
>   * @fence: the shared fence to add
>   *
>   * Add a fence to a shared slot, obj->lock must be held, and
> - * dma_resv_reserve_shared() has been called.
> + * dma_resv_reserve_shared() has been called. The shared fences can signal in
> + * any order and in particular there is no guarantee that shared fences signal
> + * after the exclusive one. Code relying on any signaling order is broken and
> + * needs to be fixed.

This feels like the last place I'd go look for how I should handle
dependencies. It's the function for adding shared fences after all, and has
absolutely nothing to do with whether we should wait for them.

I'll type up something else.
-Daniel

>   */
>  void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
>  {
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH v2 09/11] drm/gem: Delete gem array fencing helpers

2021-07-02 Thread Daniel Vetter
Integrated into the scheduler now and all users converted over.

Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/drm_gem.c | 96 ---
 include/drm/drm_gem.h |  5 --
 2 files changed, 101 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 68deb1de8235..24d49a2636e0 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1294,99 +1294,3 @@ drm_gem_unlock_reservations(struct drm_gem_object 
**objs, int count,
ww_acquire_fini(acquire_ctx);
 }
 EXPORT_SYMBOL(drm_gem_unlock_reservations);
-
-/**
- * drm_gem_fence_array_add - Adds the fence to an array of fences to be
- * waited on, deduplicating fences from the same context.
- *
- * @fence_array: array of dma_fence * for the job to block on.
- * @fence: the dma_fence to add to the list of dependencies.
- *
- * This functions consumes the reference for @fence both on success and error
- * cases.
- *
- * Returns:
- * 0 on success, or an error on failing to expand the array.
- */
-int drm_gem_fence_array_add(struct xarray *fence_array,
-   struct dma_fence *fence)
-{
-   struct dma_fence *entry;
-   unsigned long index;
-   u32 id = 0;
-   int ret;
-
-   if (!fence)
-   return 0;
-
-   /* Deduplicate if we already depend on a fence from the same context.
-* This lets the size of the array of deps scale with the number of
-* engines involved, rather than the number of BOs.
-*/
-   xa_for_each(fence_array, index, entry) {
-   if (entry->context != fence->context)
-   continue;
-
-   if (dma_fence_is_later(fence, entry)) {
-   dma_fence_put(entry);
-   xa_store(fence_array, index, fence, GFP_KERNEL);
-   } else {
-   dma_fence_put(fence);
-   }
-   return 0;
-   }
-
-   ret = xa_alloc(fence_array, &id, fence, xa_limit_32b, GFP_KERNEL);
-   if (ret != 0)
-   dma_fence_put(fence);
-
-   return ret;
-}
-EXPORT_SYMBOL(drm_gem_fence_array_add);
-
-/**
- * drm_gem_fence_array_add_implicit - Adds the implicit dependencies tracked
- * in the GEM object's reservation object to an array of dma_fences for use in
- * scheduling a rendering job.
- *
- * This should be called after drm_gem_lock_reservations() on your array of
- * GEM objects used in the job but before updating the reservations with your
- * own fences.
- *
- * @fence_array: array of dma_fence * for the job to block on.
- * @obj: the gem object to add new dependencies from.
- * @write: whether the job might write the object (so we need to depend on
- * shared fences in the reservation object).
- */
-int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
-struct drm_gem_object *obj,
-bool write)
-{
-   int ret;
-   struct dma_fence **fences;
-   unsigned int i, fence_count;
-
-   if (!write) {
-   struct dma_fence *fence =
-   dma_resv_get_excl_unlocked(obj->resv);
-
-   return drm_gem_fence_array_add(fence_array, fence);
-   }
-
-   ret = dma_resv_get_fences(obj->resv, NULL,
-   &fence_count, &fences);
-   if (ret || !fence_count)
-   return ret;
-
-   for (i = 0; i < fence_count; i++) {
-   ret = drm_gem_fence_array_add(fence_array, fences[i]);
-   if (ret)
-   break;
-   }
-
-   for (; i < fence_count; i++)
-   dma_fence_put(fences[i]);
-   kfree(fences);
-   return ret;
-}
-EXPORT_SYMBOL(drm_gem_fence_array_add_implicit);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 240049566592..6d5e33b89074 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -409,11 +409,6 @@ int drm_gem_lock_reservations(struct drm_gem_object 
**objs, int count,
  struct ww_acquire_ctx *acquire_ctx);
 void drm_gem_unlock_reservations(struct drm_gem_object **objs, int count,
 struct ww_acquire_ctx *acquire_ctx);
-int drm_gem_fence_array_add(struct xarray *fence_array,
-   struct dma_fence *fence);
-int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
-struct drm_gem_object *obj,
-bool write);
 int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
u32 handle, u64 *offset);
 
-- 
2.32.0.rc2



[PATCH v2 10/11] drm/sched: Don't store self-dependencies

2021-07-02 Thread Daniel Vetter
This is essentially part of drm_sched_dependency_optimized(), which
only amdgpu seems to make use of. Use it a bit more.

This would mean that as-is amdgpu can't use the dependency helpers, at
least not with the current approach amdgpu has for deciding whether a
vm_flush is needed. Since amdgpu also has very special rules around
implicit fencing it can't use those helpers either, and adding a
drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too
onerous. That way the special-case handling for amdgpu sticks out even
more, and we have a higher chance that reviewers who go across all
drivers won't miss it.

Reviewed-by: Lucas Stach 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 
Cc: Jack Zhang 
---
 drivers/gpu/drm/scheduler/sched_main.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 12d533486518..de76f7e14e0d 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -651,6 +651,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job,
if (!fence)
return 0;
 
+   /* if it's a fence from us it's guaranteed to be earlier */
+   if (fence->context == job->entity->fence_context ||
+   fence->context == job->entity->fence_context + 1) {
+   dma_fence_put(fence);
+   return 0;
+   }
+
/* Deduplicate if we already depend on a fence from the same context.
 * This lets the size of the array of deps scale with the number of
 * engines involved, rather than the number of BOs.
-- 
2.32.0.rc2
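
The self-dependency check in the hunk above is a simple context
comparison. A userspace sketch of the idea, with invented names (the
"+ 1" mirrors the scheduler's convention of using two adjacent fence
contexts per entity, for the scheduled and finished fences):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins, not the real drm_sched structures. */
struct model_fence  { uint64_t context; };
struct model_entity { uint64_t fence_context; };

/* Returns true when the fence may be dropped instead of stored as a
 * dependency: a fence from our own entity is guaranteed to signal
 * earlier because the ring is FIFO. */
static bool is_own_fence(const struct model_fence *f,
			 const struct model_entity *e)
{
	return f->context == e->fence_context ||
	       f->context == e->fence_context + 1;
}
```

In the real patch, a matching fence is simply dma_fence_put() and the
await returns 0 without growing the dependency array.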



[PATCH v2 11/11] drm/sched: Check locking in drm_sched_job_await_implicit

2021-07-02 Thread Daniel Vetter
You really need to hold the reservation here or all kinds of funny
things can happen between grabbing the dependencies and inserting the
new fences.

Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 
Cc: Jack Zhang 
---
 drivers/gpu/drm/scheduler/sched_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index de76f7e14e0d..47f869aff335 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -705,6 +705,8 @@ int drm_sched_job_await_implicit(struct drm_sched_job *job,
struct dma_fence **fences;
unsigned int i, fence_count;
 
+   dma_resv_assert_held(obj->resv);
+
if (!write) {
struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
 
-- 
2.32.0.rc2
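
The dma_resv_assert_held() added above is a lockdep-style contract
check: document and enforce, at the top of the function, that the
caller holds the reservation across both reading the dependencies and
inserting new fences. A userspace sketch of the pattern, with
hypothetical names (the real kernel helper splats via lockdep rather
than returning an error):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a lock whose held-ness we can query in debug code. */
struct model_resv { bool held; };

static void model_lock(struct model_resv *r)   { r->held = true; }
static void model_unlock(struct model_resv *r) { r->held = false; }

/* Returns 0 on success, -1 if called without the lock held. */
static int model_await_implicit(struct model_resv *r)
{
	if (!r->held)
		return -1;	/* contract violated: caller forgot the lock */
	/* ... safely collect dependencies and add new fences ... */
	return 0;
}

static int run_locked(void)
{
	struct model_resv r = { 0 };
	int ret;

	model_lock(&r);
	ret = model_await_implicit(&r);
	model_unlock(&r);
	return ret;
}

static int run_unlocked(void)
{
	struct model_resv r = { 0 };

	return model_await_implicit(&r);
}
```

The value of the assertion is exactly the window Daniel describes:
without the lock, fences can be added or replaced between the read and
the insert.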



[PATCH v2 06/11] drm/v3d: Move drm_sched_job_init to v3d_job_init

2021-07-02 Thread Daniel Vetter
Prep work for using the scheduler dependency handling. We need to call
drm_sched_job_init earlier so we can use the new drm_sched_job_await*
functions for dependency handling here.

v2: Slightly better commit message and rebase to include the
drm_sched_job_arm() call (Emma).

v3: Cleanup jobs under construction correctly (Emma)

Signed-off-by: Daniel Vetter 
Cc: Emma Anholt 
---
 drivers/gpu/drm/v3d/v3d_drv.h   |  1 +
 drivers/gpu/drm/v3d/v3d_gem.c   | 88 ++---
 drivers/gpu/drm/v3d/v3d_sched.c | 15 +++---
 3 files changed, 44 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 8a390738d65b..1d870261eaac 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -332,6 +332,7 @@ int v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
 int v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
  struct drm_file *file_priv);
+void v3d_job_cleanup(struct v3d_job *job);
 void v3d_job_put(struct v3d_job *job);
 void v3d_reset(struct v3d_dev *v3d);
 void v3d_invalidate_caches(struct v3d_dev *v3d);
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 69ac20e11b09..5eccd3658938 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -392,6 +392,12 @@ v3d_render_job_free(struct kref *ref)
v3d_job_free(ref);
 }
 
+void v3d_job_cleanup(struct v3d_job *job)
+{
+   drm_sched_job_cleanup(&job->base);
+   v3d_job_put(job);
+}
+
 void v3d_job_put(struct v3d_job *job)
 {
kref_put(&job->refcount, job->free);
@@ -433,9 +439,10 @@ v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
 static int
 v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 struct v3d_job *job, void (*free)(struct kref *ref),
-u32 in_sync)
+u32 in_sync, enum v3d_queue queue)
 {
struct dma_fence *in_fence = NULL;
+   struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
int ret;
 
job->v3d = v3d;
@@ -446,35 +453,33 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
return ret;
 
xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
+   ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
+v3d_priv);
+   if (ret)
+   goto fail;
 
ret = drm_syncobj_find_fence(file_priv, in_sync, 0, 0, &in_fence);
if (ret == -EINVAL)
-   goto fail;
+   goto fail_job;
 
ret = drm_gem_fence_array_add(&job->deps, in_fence);
if (ret)
-   goto fail;
+   goto fail_job;
 
kref_init(&job->refcount);
 
return 0;
+fail_job:
+   drm_sched_job_cleanup(&job->base);
 fail:
xa_destroy(&job->deps);
pm_runtime_put_autosuspend(v3d->drm.dev);
return ret;
 }
 
-static int
-v3d_push_job(struct v3d_file_priv *v3d_priv,
-struct v3d_job *job, enum v3d_queue queue)
+static void
+v3d_push_job(struct v3d_job *job)
 {
-   int ret;
-
-   ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
-v3d_priv);
-   if (ret)
-   return ret;
-
drm_sched_job_arm(&job->base);
 
job->done_fence = dma_fence_get(&job->base.s_fence->finished);
@@ -483,8 +488,6 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
kref_get(&job->refcount);
 
drm_sched_entity_push_job(&job->base);
-
-   return 0;
 }
 
 static void
@@ -530,7 +533,6 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
 {
struct v3d_dev *v3d = to_v3d_dev(dev);
-   struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
struct drm_v3d_submit_cl *args = data;
struct v3d_bin_job *bin = NULL;
struct v3d_render_job *render;
@@ -556,7 +558,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
INIT_LIST_HEAD(>unref_list);
 
ret = v3d_job_init(v3d, file_priv, &render->base,
-  v3d_render_job_free, args->in_sync_rcl);
+  v3d_render_job_free, args->in_sync_rcl, V3D_RENDER);
if (ret) {
kfree(render);
return ret;
@@ -570,7 +572,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
}
 
ret = v3d_job_init(v3d, file_priv, &bin->base,
-  v3d_job_free, args->in_sync_bcl);
+  v3d_job_free, args->in_sync_bcl, V3D_BIN);
if (ret) {
v3d_job_put(&render->base);
kfree(bin);
@@ -592,7 +594,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
goto fail;
}
 
-   ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0);
+   ret = v3d_job_init(v3d, file_priv, clean_job, 

[PATCH v2 07/11] drm/v3d: Use scheduler dependency handling

2021-07-02 Thread Daniel Vetter
With the prep work out of the way this isn't tricky anymore.

Aside: The chaining of the various jobs is a bit awkward, with the
possibility of failure in bad places. I think with the
drm_sched_job_init/arm split and maybe preloading the
job->dependencies xarray this should be fixable.

Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/v3d/v3d_drv.h   |  5 -
 drivers/gpu/drm/v3d/v3d_gem.c   | 25 -
 drivers/gpu/drm/v3d/v3d_sched.c | 29 +
 3 files changed, 9 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 1d870261eaac..f80f4ff1f7aa 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -192,11 +192,6 @@ struct v3d_job {
struct drm_gem_object **bo;
u32 bo_count;
 
-   /* Array of struct dma_fence * to block on before submitting this job.
-*/
-   struct xarray deps;
-   unsigned long last_dep;
-
/* v3d fence to be signaled by IRQ handler when the job is complete. */
struct dma_fence *irq_fence;
 
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 5eccd3658938..42b07ffbea5e 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -257,8 +257,8 @@ v3d_lock_bo_reservations(struct v3d_job *job,
return ret;
 
for (i = 0; i < job->bo_count; i++) {
-   ret = drm_gem_fence_array_add_implicit(&job->deps,
-  job->bo[i], true);
+   ret = drm_sched_job_await_implicit(&job->base,
+  job->bo[i], true);
if (ret) {
drm_gem_unlock_reservations(job->bo, job->bo_count,
acquire_ctx);
@@ -354,8 +354,6 @@ static void
 v3d_job_free(struct kref *ref)
 {
struct v3d_job *job = container_of(ref, struct v3d_job, refcount);
-   unsigned long index;
-   struct dma_fence *fence;
int i;
 
for (i = 0; i < job->bo_count; i++) {
@@ -364,11 +362,6 @@ v3d_job_free(struct kref *ref)
}
kvfree(job->bo);
 
-   xa_for_each(&job->deps, index, fence) {
-   dma_fence_put(fence);
-   }
-   xa_destroy(&job->deps);
-
dma_fence_put(job->irq_fence);
dma_fence_put(job->done_fence);
 
@@ -452,7 +445,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
if (ret < 0)
return ret;
 
-   xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
 v3d_priv);
if (ret)
@@ -462,7 +454,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
if (ret == -EINVAL)
goto fail_job;
 
-   ret = drm_gem_fence_array_add(&job->deps, in_fence);
+   ret = drm_sched_job_await_fence(&job->base, in_fence);
if (ret)
goto fail_job;
 
@@ -472,7 +464,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 fail_job:
drm_sched_job_cleanup(&job->base);
 fail:
-   xa_destroy(&job->deps);
pm_runtime_put_autosuspend(v3d->drm.dev);
return ret;
 }
@@ -619,8 +610,8 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
if (bin) {
v3d_push_job(&bin->base);
 
-   ret = drm_gem_fence_array_add(&render->base.deps,
- dma_fence_get(bin->base.done_fence));
+   ret = drm_sched_job_await_fence(&render->base.base,
+   dma_fence_get(bin->base.done_fence));
if (ret)
goto fail_unreserve;
}
@@ -630,7 +621,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
if (clean_job) {
struct dma_fence *render_fence =
dma_fence_get(render->base.done_fence);
-   ret = drm_gem_fence_array_add(&clean_job->deps, render_fence);
+   ret = drm_sched_job_await_fence(&clean_job->base, render_fence);
if (ret)
goto fail_unreserve;
v3d_push_job(clean_job);
@@ -820,8 +811,8 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
mutex_lock(&v3d->sched_lock);
v3d_push_job(&job->base);
 
-   ret = drm_gem_fence_array_add(&clean_job->deps,
- dma_fence_get(job->base.done_fence));
+   ret = drm_sched_job_await_fence(&clean_job->base,
+   dma_fence_get(job->base.done_fence));
if (ret)
goto fail_unreserve;
 
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 3f352d73af9c..f0de584f452c 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -13,7 +13,7 @@
  * jobs when bulk background jobs are queued up, we submit a new job
  * 

[PATCH v2 08/11] drm/etnaviv: Use scheduler dependency handling

2021-07-02 Thread Daniel Vetter
We need to pull the drm_sched_job_init much earlier, but that's very
minor surgery.

Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: etna...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.h|  5 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 32 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 61 +---
 drivers/gpu/drm/etnaviv/etnaviv_sched.h  |  3 +-
 4 files changed, 20 insertions(+), 81 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.h 
b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
index 98e60df882b6..63688e6e4580 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
@@ -80,9 +80,6 @@ struct etnaviv_gem_submit_bo {
u64 va;
struct etnaviv_gem_object *obj;
struct etnaviv_vram_mapping *mapping;
-   struct dma_fence *excl;
-   unsigned int nr_shared;
-   struct dma_fence **shared;
 };
 
 /* Created per submit-ioctl, to track bo's and cmdstream bufs, etc,
@@ -95,7 +92,7 @@ struct etnaviv_gem_submit {
struct etnaviv_file_private *ctx;
struct etnaviv_gpu *gpu;
struct etnaviv_iommu_context *mmu_context, *prev_mmu_context;
-   struct dma_fence *out_fence, *in_fence;
+   struct dma_fence *out_fence;
int out_fence_id;
struct list_head node; /* GPU active submit list */
struct etnaviv_cmdbuf cmdbuf;
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 4dd7d9d541c0..92478a50a580 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -188,16 +188,10 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit)
if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
continue;
 
-   if (bo->flags & ETNA_SUBMIT_BO_WRITE) {
-   ret = dma_resv_get_fences(robj, &bo->excl,
- &bo->nr_shared,
- &bo->shared);
-   if (ret)
-   return ret;
-   } else {
-   bo->excl = dma_resv_get_excl_unlocked(robj);
-   }
-
+   ret = drm_sched_job_await_implicit(&submit->sched_job, &bo->obj->base,
+  bo->flags & ETNA_SUBMIT_BO_WRITE);
+   if (ret)
+   return ret;
}
 
return ret;
@@ -403,8 +397,6 @@ static void submit_cleanup(struct kref *kref)
 
wake_up_all(&submit->gpu->fence_event);
 
-   if (submit->in_fence)
-   dma_fence_put(submit->in_fence);
if (submit->out_fence) {
/* first remove from IDR, so fence can not be found anymore */
mutex_lock(&submit->gpu->fence_lock);
@@ -537,6 +529,12 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
submit->exec_state = args->exec_state;
submit->flags = args->flags;
 
+   ret = drm_sched_job_init(&submit->sched_job,
+&ctx->sched_entity[args->pipe],
+submit->ctx);
+   if (ret)
+   goto err_submit_objects;
+
ret = submit_lookup_objects(submit, file, bos, args->nr_bos);
if (ret)
goto err_submit_objects;
@@ -549,11 +547,15 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
}
 
if (args->flags & ETNA_SUBMIT_FENCE_FD_IN) {
-   submit->in_fence = sync_file_get_fence(args->fence_fd);
-   if (!submit->in_fence) {
+   struct dma_fence *in_fence = sync_file_get_fence(args->fence_fd);
+   if (!in_fence) {
ret = -EINVAL;
goto err_submit_objects;
}
+
+   ret = drm_sched_job_await_fence(&submit->sched_job, in_fence);
+   if (ret)
+   goto err_submit_objects;
}
 
ret = submit_pin_objects(submit);
@@ -579,7 +581,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
if (ret)
goto err_submit_objects;
 
-   ret = etnaviv_sched_push_job(&ctx->sched_entity[args->pipe], submit);
+   ret = etnaviv_sched_push_job(submit);
if (ret)
goto err_submit_objects;
 
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 180bb633d5c5..c98d67320be3 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -17,58 +17,6 @@ module_param_named(job_hang_limit, etnaviv_job_hang_limit, int , 0444);
 static int etnaviv_hw_jobs_limit = 4;
 module_param_named(hw_job_limit, etnaviv_hw_jobs_limit, int , 0444);
 
-static struct 

[PATCH v2 05/11] drm/lima: use scheduler dependency tracking

2021-07-02 Thread Daniel Vetter
Nothing special going on here.

Aside from reviewing the code: it seems like drm_sched_job_arm() should be
moved into lima_sched_context_queue_task and put under some mutex
together with drm_sched_push_job(). See the kerneldoc for
drm_sched_push_job().

Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/lima/lima_gem.c   |  4 ++--
 drivers/gpu/drm/lima/lima_sched.c | 21 -
 drivers/gpu/drm/lima/lima_sched.h |  3 ---
 3 files changed, 2 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index c528f40981bb..e54a88d5037a 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -267,7 +267,7 @@ static int lima_gem_sync_bo(struct lima_sched_task *task, struct lima_bo *bo,
if (explicit)
return 0;
 
-   return drm_gem_fence_array_add_implicit(&task->deps, &bo->base.base, write);
+   return drm_sched_job_await_implicit(&task->base, &bo->base.base, write);
 }
 
 static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit)
@@ -285,7 +285,7 @@ static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit)
if (err)
return err;
 
-   err = drm_gem_fence_array_add(&submit->task->deps, fence);
+   err = drm_sched_job_await_fence(&submit->task->base, fence);
if (err) {
dma_fence_put(fence);
return err;
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index e968b5a8f0b0..99d5f6f1a882 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -134,24 +134,15 @@ int lima_sched_task_init(struct lima_sched_task *task,
task->num_bos = num_bos;
task->vm = lima_vm_get(vm);
 
-   xa_init_flags(&task->deps, XA_FLAGS_ALLOC);
-
return 0;
 }
 
 void lima_sched_task_fini(struct lima_sched_task *task)
 {
-   struct dma_fence *fence;
-   unsigned long index;
int i;
 
drm_sched_job_cleanup(&task->base);
 
-   xa_for_each(&task->deps, index, fence) {
-   dma_fence_put(fence);
-   }
-   xa_destroy(&task->deps);
-
if (task->bos) {
for (i = 0; i < task->num_bos; i++)
drm_gem_object_put(&task->bos[i]->base.base);
@@ -186,17 +177,6 @@ struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
return fence;
 }
 
-static struct dma_fence *lima_sched_dependency(struct drm_sched_job *job,
-  struct drm_sched_entity *entity)
-{
-   struct lima_sched_task *task = to_lima_task(job);
-
-   if (!xa_empty(&task->deps))
-   return xa_erase(&task->deps, task->last_dep++);
-
-   return NULL;
-}
-
 static int lima_pm_busy(struct lima_device *ldev)
 {
int ret;
@@ -472,7 +452,6 @@ static void lima_sched_free_job(struct drm_sched_job *job)
 }
 
 static const struct drm_sched_backend_ops lima_sched_ops = {
-   .dependency = lima_sched_dependency,
.run_job = lima_sched_run_job,
.timedout_job = lima_sched_timedout_job,
.free_job = lima_sched_free_job,
diff --git a/drivers/gpu/drm/lima/lima_sched.h 
b/drivers/gpu/drm/lima/lima_sched.h
index ac70006b0e26..6a11764d87b3 100644
--- a/drivers/gpu/drm/lima/lima_sched.h
+++ b/drivers/gpu/drm/lima/lima_sched.h
@@ -23,9 +23,6 @@ struct lima_sched_task {
struct lima_vm *vm;
void *frame;
 
-   struct xarray deps;
-   unsigned long last_dep;
-
struct lima_bo **bos;
int num_bos;
 
-- 
2.32.0.rc2



[PATCH v2 04/11] drm/panfrost: use scheduler dependency tracking

2021-07-02 Thread Daniel Vetter
Just deletes some code that's now more shared.

Note that thanks to the split into drm_sched_job_init/arm we can now
easily pull the _init() part from under the submission lock way ahead
where we're adding the sync file in-fences as dependencies.

v2: Correctly clean up the partially set up job, now that job_init()
and job_arm() are apart (Emma).
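The shape this gives the submit path — fallible setup outside the lock,
an unfailable point of no return under it — can be sketched in userspace
C. All model_* names below are invented for illustration; this is not the
panfrost API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch: init + dependency collection may fail and runs
 * without the submission lock; arm + push happen under the lock with no
 * error path left. */
struct model_ctx {
	bool locked;
	int pushed_jobs;
};

static int model_init_and_add_deps(bool deps_ok)
{
	/* May fail; nothing irreversible has happened yet. */
	return deps_ok ? 0 : -1;
}

static void model_arm_and_push(struct model_ctx *ctx)
{
	/* Point of no return: only reached while the lock is held. */
	assert(ctx->locked);
	ctx->pushed_jobs++;
}

static int model_submit(struct model_ctx *ctx, bool deps_ok)
{
	int ret = model_init_and_add_deps(deps_ok);
	if (ret)
		return ret;          /* easy unwind, lock never taken */

	ctx->locked = true;          /* stands in for mutex_lock() */
	model_arm_and_push(ctx);
	ctx->locked = false;         /* stands in for mutex_unlock() */
	return 0;
}
```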

Reviewed-by: Steven Price  (v1)
Signed-off-by: Daniel Vetter 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 16 ---
 drivers/gpu/drm/panfrost/panfrost_job.c | 37 +++--
 drivers/gpu/drm/panfrost/panfrost_job.h |  5 +---
 3 files changed, 17 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 1ffaef5ec5ff..9f53bea07d61 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -218,7 +218,7 @@ panfrost_copy_in_sync(struct drm_device *dev,
if (ret)
goto fail;
 
-   ret = drm_gem_fence_array_add(&job->deps, fence);
+   ret = drm_sched_job_await_fence(&job->base, fence);
 
if (ret)
goto fail;
@@ -236,7 +236,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
struct drm_panfrost_submit *args = data;
struct drm_syncobj *sync_out = NULL;
struct panfrost_job *job;
-   int ret = 0;
+   int ret = 0, slot;
 
if (!args->jc)
return -EINVAL;
@@ -258,14 +258,20 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
 
kref_init(&job->refcount);
 
-   xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
-
job->pfdev = pfdev;
job->jc = args->jc;
job->requirements = args->requirements;
job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
job->file_priv = file->driver_priv;
 
+   slot = panfrost_job_get_slot(job);
+
+   ret = drm_sched_job_init(&job->base,
+&job->file_priv->sched_entity[slot],
+NULL);
+   if (ret)
+   goto fail_job_put;
+
ret = panfrost_copy_in_sync(dev, file, args, job);
if (ret)
goto fail_job;
@@ -283,6 +289,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
drm_syncobj_replace_fence(sync_out, job->render_done_fence);
 
 fail_job:
+   drm_sched_job_cleanup(&job->base);
+fail_job_put:
panfrost_job_put(job);
 fail_out_sync:
if (sync_out)
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 4bc962763e1f..86c843d8822e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -102,7 +102,7 @@ static struct dma_fence *panfrost_fence_create(struct panfrost_device *pfdev, in
return &fence->base;
 }
 
-static int panfrost_job_get_slot(struct panfrost_job *job)
+int panfrost_job_get_slot(struct panfrost_job *job)
 {
/* JS0: fragment jobs.
 * JS1: vertex/tiler jobs
@@ -242,13 +242,13 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
 
 static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
  int bo_count,
- struct xarray *deps)
+ struct drm_sched_job *job)
 {
int i, ret;
 
for (i = 0; i < bo_count; i++) {
/* panfrost always uses write mode in its current uapi */
-   ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
+   ret = drm_sched_job_await_implicit(job, bos[i], true);
if (ret)
return ret;
}
@@ -269,31 +269,21 @@ static void panfrost_attach_object_fences(struct drm_gem_object **bos,
 int panfrost_job_push(struct panfrost_job *job)
 {
struct panfrost_device *pfdev = job->pfdev;
-   int slot = panfrost_job_get_slot(job);
-   struct drm_sched_entity *entity = &job->file_priv->sched_entity[slot];
struct ww_acquire_ctx acquire_ctx;
int ret = 0;
 
-
ret = drm_gem_lock_reservations(job->bos, job->bo_count,
&acquire_ctx);
if (ret)
return ret;
 
mutex_lock(&pfdev->sched_lock);
-
-   ret = drm_sched_job_init(&job->base, entity, NULL);
-   if (ret) {
-   mutex_unlock(&pfdev->sched_lock);
-   goto unlock;
-   }
-
drm_sched_job_arm(&job->base);
 
job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
 
ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
- &job->deps);
+  

[PATCH v2 02/11] drm/sched: Add dependency tracking

2021-07-02 Thread Daniel Vetter
Instead of just a callback we can just glue in the gem helpers that
panfrost, v3d and lima currently use. There's really not that many
ways to skin this cat.

On the naming bikeshed: The idea for using _await_ to denote adding
dependencies to a job comes from i915, where that's used quite
extensively all over the place, in lots of datastructures.
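The per-context deduplication done by the new await helper in the diff
below can be modelled in plain userspace C. Everything here (the model_*
names, a fixed-size array standing in for the xarray) is a hypothetical
sketch of the idea, not the kernel API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model fence: a timeline (context) plus a sequence number; a larger
 * seqno on the same context means a later fence. */
struct model_fence {
	uint64_t context;
	uint64_t seqno;
};

#define MODEL_MAX_DEPS 16

struct model_job {
	struct model_fence deps[MODEL_MAX_DEPS];
	size_t num_deps;
};

/* Keep only the latest fence per context, so the dependency array
 * scales with the number of engines rather than the number of BOs.
 * Returns 0 on success, -1 when the array is full. */
int model_job_await_fence(struct model_job *job, struct model_fence fence)
{
	size_t i;

	for (i = 0; i < job->num_deps; i++) {
		if (job->deps[i].context != fence.context)
			continue;
		/* Same context: keep whichever fence signals later. */
		if (fence.seqno > job->deps[i].seqno)
			job->deps[i] = fence;
		return 0;
	}

	if (job->num_deps == MODEL_MAX_DEPS)
		return -1;
	job->deps[job->num_deps++] = fence;
	return 0;
}
```

Awaiting two fences from the same context leaves a single entry holding
the later one, which is exactly the xa_for_each dedup loop in the patch.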

v2: Rebased.

Reviewed-by: Steven Price  (v1)
Signed-off-by: Daniel Vetter 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Nirmoy Das 
Cc: Boris Brezillon 
Cc: Luben Tuikov 
Cc: Alex Deucher 
Cc: Jack Zhang 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/scheduler/sched_entity.c |  18 +++-
 drivers/gpu/drm/scheduler/sched_main.c   | 103 +++
 include/drm/gpu_scheduler.h  |  31 ++-
 3 files changed, 146 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index f7347c284886..b6f72fafd504 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -211,6 +211,19 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
job->sched->ops->free_job(job);
 }
 
+static struct dma_fence *
+drm_sched_job_dependency(struct drm_sched_job *job,
+struct drm_sched_entity *entity)
+{
+   if (!xa_empty(&job->dependencies))
+   return xa_erase(&job->dependencies, job->last_dependency++);
+
+   if (job->sched->ops->dependency)
+   return job->sched->ops->dependency(job, entity);
+
+   return NULL;
+}
+
 /**
  * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
  *
@@ -229,7 +242,7 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
struct drm_sched_fence *s_fence = job->s_fence;
 
/* Wait for all dependencies to avoid data corruptions */
-   while ((f = job->sched->ops->dependency(job, entity)))
+   while ((f = drm_sched_job_dependency(job, entity)))
dma_fence_wait(f, false);
 
drm_sched_fence_scheduled(s_fence);
@@ -419,7 +432,6 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
  */
 struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
 {
-   struct drm_gpu_scheduler *sched = entity->rq->sched;
struct drm_sched_job *sched_job;
 
sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
@@ -427,7 +439,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
return NULL;
 
while ((entity->dependency =
-   sched->ops->dependency(sched_job, entity))) {
+   drm_sched_job_dependency(sched_job, entity))) {
trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
 
if (drm_sched_entity_add_dependency_cb(entity))
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 5e84e1500c32..12d533486518 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -605,6 +605,8 @@ int drm_sched_job_init(struct drm_sched_job *job,
 
INIT_LIST_HEAD(&job->list);
 
+   xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC);
+
return 0;
 }
 EXPORT_SYMBOL(drm_sched_job_init);
@@ -628,6 +630,98 @@ void drm_sched_job_arm(struct drm_sched_job *job)
 }
 EXPORT_SYMBOL(drm_sched_job_arm);
 
+/**
+ * drm_sched_job_await_fence - adds the fence as a job dependency
+ * @job: scheduler job to add the dependencies to
+ * @fence: the dma_fence to add to the list of dependencies.
+ *
+ * Note that @fence is consumed in both the success and error cases.
+ *
+ * Returns:
+ * 0 on success, or an error on failing to expand the array.
+ */
+int drm_sched_job_await_fence(struct drm_sched_job *job,
+ struct dma_fence *fence)
+{
+   struct dma_fence *entry;
+   unsigned long index;
+   u32 id = 0;
+   int ret;
+
+   if (!fence)
+   return 0;
+
+   /* Deduplicate if we already depend on a fence from the same context.
+* This lets the size of the array of deps scale with the number of
+* engines involved, rather than the number of BOs.
+*/
+   xa_for_each(&job->dependencies, index, entry) {
+   if (entry->context != fence->context)
+   continue;
+
+   if (dma_fence_is_later(fence, entry)) {
+   dma_fence_put(entry);
+   xa_store(&job->dependencies, index, fence, GFP_KERNEL);
+   } else {
+   dma_fence_put(fence);
+   }
+   return 0;
+   }
+
+   ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
+   if (ret != 0)
+   

[PATCH v2 03/11] drm/sched: drop entity parameter from drm_sched_push_job

2021-07-02 Thread Daniel Vetter
Originally a job was only bound to the queue when we pushed it, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task, simplify that too.

Reviewed-by: Steven Price  (v1)
Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: Emma Anholt 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Alex Deucher 
Cc: Nirmoy Das 
Cc: Dave Airlie 
Cc: Chen Li 
Cc: Lee Jones 
Cc: Deepak R Varma 
Cc: Kevin Wang 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Dennis Li 
Cc: Boris Brezillon 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  | 2 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 2 +-
 drivers/gpu/drm/lima/lima_gem.c  | 3 +--
 drivers/gpu/drm/lima/lima_sched.c| 5 ++---
 drivers/gpu/drm/lima/lima_sched.h| 3 +--
 drivers/gpu/drm/panfrost/panfrost_job.c  | 2 +-
 drivers/gpu/drm/scheduler/sched_entity.c | 6 ++
 drivers/gpu/drm/v3d/v3d_gem.c| 2 +-
 include/drm/gpu_scheduler.h  | 3 +--
 10 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index a4ec092af9a7..18f63567fb69 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1267,7 +1267,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 
trace_amdgpu_cs_ioctl(job);
amdgpu_vm_bo_trace_cs(&fpriv->vm, &p->ticket);
-   drm_sched_entity_push_job(&job->base, entity);
+   drm_sched_entity_push_job(&job->base);
 
amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 5ddb955d2315..b86099c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -174,7 +174,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
 
*f = dma_fence_get(&job->base.s_fence->finished);
amdgpu_job_free_resources(job);
-   drm_sched_entity_push_job(&job->base, entity);
+   drm_sched_entity_push_job(&job->base);
 
return 0;
 }
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 05f412204118..180bb633d5c5 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -178,7 +178,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
/* the scheduler holds on to the job now */
kref_get(&submit->refcount);
 
-   drm_sched_entity_push_job(&submit->sched_job, sched_entity);
+   drm_sched_entity_push_job(&submit->sched_job);
 
 out_unlock:
mutex_unlock(&submit->gpu->fence_lock);
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index de62966243cd..c528f40981bb 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -359,8 +359,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit)
goto err_out2;
}
 
-   fence = lima_sched_context_queue_task(
-   submit->ctx->context + submit->pipe, submit->task);
+   fence = lima_sched_context_queue_task(submit->task);
 
for (i = 0; i < submit->nr_bos; i++) {
if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index 38f755580507..e968b5a8f0b0 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -177,13 +177,12 @@ void lima_sched_context_fini(struct lima_sched_pipe *pipe,
drm_sched_entity_fini(&context->base);
 }
 
-struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context *context,
-   struct lima_sched_task *task)
+struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
 {
struct dma_fence *fence = dma_fence_get(&task->base.s_fence->finished);
 
trace_lima_task_submit(task);
-   drm_sched_entity_push_job(&task->base, &context->base);
+   drm_sched_entity_push_job(&task->base);
return fence;
 }
 
diff --git a/drivers/gpu/drm/lima/lima_sched.h 
b/drivers/gpu/drm/lima/lima_sched.h
index 90f03c48ef4a..ac70006b0e26 100644
--- a/drivers/gpu/drm/lima/lima_sched.h
+++ b/drivers/gpu/drm/lima/lima_sched.h
@@ -98,8 +98,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe,
atomic_t *guilty);
 void lima_sched_context_fini(struct lima_sched_pipe *pipe,
 struct lima_sched_context 

[PATCH v2 01/11] drm/sched: Split drm_sched_job_init

2021-07-02 Thread Daniel Vetter
This is a very confusingly named function, because it does not just
init an object, it also arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that's a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.
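As a rough illustration of the contract described above, here is a
userspace model of the lifecycle: cleanup must cope with a job torn down
either after arming (normal path) or between init and arm (error unwind).
The model_* names are invented; the real code tracks far more state.

```c
#include <assert.h>

/* Hypothetical state machine mirroring the init/arm split: resources
 * acquired at init time must be releasable whether or not arm() was
 * reached. */
enum model_state { MODEL_UNINIT, MODEL_INITED, MODEL_ARMED, MODEL_FREED };

struct model_sched_job {
	enum model_state state;
};

int model_job_init(struct model_sched_job *job)
{
	job->state = MODEL_INITED;   /* dependencies may be added now */
	return 0;
}

void model_job_arm(struct model_sched_job *job)
{
	/* Point of no return: after this the job must be pushed. */
	job->state = MODEL_ARMED;
}

/* Valid on both an armed job and one that failed between init and arm. */
void model_job_cleanup(struct model_sched_job *job)
{
	assert(job->state == MODEL_INITED || job->state == MODEL_ARMED);
	job->state = MODEL_FREED;
}
```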

Acked-by: Steven Price  (v2)
Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Masahiro Yamada 
Cc: Kees Cook 
Cc: Adam Borowski 
Cc: Nick Terrell 
Cc: Mauro Carvalho Chehab 
Cc: Paul Menzel 
Cc: Sami Tolvanen 
Cc: Viresh Kumar 
Cc: Alex Deucher 
Cc: Dave Airlie 
Cc: Nirmoy Das 
Cc: Deepak R Varma 
Cc: Lee Jones 
Cc: Kevin Wang 
Cc: Chen Li 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Dennis Li 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Sonny Jiang 
Cc: Boris Brezillon 
Cc: Tian Tao 
Cc: Jack Zhang 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Emma Anholt 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
 drivers/gpu/drm/lima/lima_sched.c|  2 ++
 drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
 drivers/gpu/drm/scheduler/sched_entity.c |  6 ++--
 drivers/gpu/drm/scheduler/sched_fence.c  | 17 +
 drivers/gpu/drm/scheduler/sched_main.c   | 46 +---
 drivers/gpu/drm/v3d/v3d_gem.c|  2 ++
 include/drm/gpu_scheduler.h  |  7 +++-
 10 files changed, 74 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index c5386d13eb4a..a4ec092af9a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
if (r)
goto error_unlock;
 
+   drm_sched_job_arm(&job->base);
+
/* No memory allocation is allowed while holding the notifier lock.
 * The lock is held until amdgpu_cs_submit is finished and fence is
 * added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
if (r)
return r;
 
+   drm_sched_job_arm(&job->base);
+
*f = dma_fence_get(&job->base.s_fence->finished);
amdgpu_job_free_resources(job);
drm_sched_entity_push_job(&job->base, entity);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index feb6da1b6ceb..05f412204118 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
if (ret)
goto out_unlock;
 
+   drm_sched_job_arm(&submit->sched_job);
+
submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
submit->out_fence, 0,
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index dba8329937a3..38f755580507 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
return err;
}
 
+   drm_sched_job_arm(&task->base);
+
task->num_bos = num_bos;
task->vm = lima_vm_get(vm);
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 71a72fb50e6b..2992dc85325f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job)
goto unlock;
}
 
+   drm_sched_job_arm(&job->base);
+
job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
 
ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 

[PATCH v2 00/11] drm/scheduler dependency tracking

2021-07-02 Thread Daniel Vetter
Hi all

2nd major round of my scheduler dependency handling patches.

Emma noticed a big fumble in that I just didn't bother cleaning up between
drm_sched_job_init() and drm_sched_job_arm(). This here should fix it now.

Review and testing very much welcome.

Cheers, Daniel

Daniel Vetter (11):
  drm/sched: Split drm_sched_job_init
  drm/sched: Add dependency tracking
  drm/sched: drop entity parameter from drm_sched_push_job
  drm/panfrost: use scheduler dependency tracking
  drm/lima: use scheduler dependency tracking
  drm/v3d: Move drm_sched_job_init to v3d_job_init
  drm/v3d: Use scheduler dependency handling
  drm/etnaviv: Use scheduler dependency handling
  drm/gem: Delete gem array fencing helpers
  drm/sched: Don't store self-dependencies
  drm/sched: Check locking in drm_sched_job_await_implicit

 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |   4 +-
 drivers/gpu/drm/drm_gem.c|  96 ---
 drivers/gpu/drm/etnaviv/etnaviv_gem.h|   5 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c |  32 ++--
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  63 +---
 drivers/gpu/drm/etnaviv/etnaviv_sched.h  |   3 +-
 drivers/gpu/drm/lima/lima_gem.c  |   7 +-
 drivers/gpu/drm/lima/lima_sched.c|  28 +---
 drivers/gpu/drm/lima/lima_sched.h|   6 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c  |  16 +-
 drivers/gpu/drm/panfrost/panfrost_job.c  |  39 +
 drivers/gpu/drm/panfrost/panfrost_job.h  |   5 +-
 drivers/gpu/drm/scheduler/sched_entity.c |  30 ++--
 drivers/gpu/drm/scheduler/sched_fence.c  |  17 +-
 drivers/gpu/drm/scheduler/sched_main.c   | 158 ++-
 drivers/gpu/drm/v3d/v3d_drv.h|   6 +-
 drivers/gpu/drm/v3d/v3d_gem.c| 115 ++
 drivers/gpu/drm/v3d/v3d_sched.c  |  44 +-
 include/drm/drm_gem.h|   5 -
 include/drm/gpu_scheduler.h  |  41 -
 21 files changed, 330 insertions(+), 394 deletions(-)

-- 
2.32.0.rc2



Re: [PATCH] drm/vc4: dsi: Only register our component once a DSI device is attached

2021-07-02 Thread Laurent Pinchart
Hi Dave,

On Fri, Jul 02, 2021 at 06:44:22PM +0100, Dave Stevenson wrote:
> On Fri, 2 Jul 2021 at 17:47, Laurent Pinchart wrote:
> > On Mon, Jun 21, 2021 at 04:59:51PM +0300, Laurent Pinchart wrote:
> >> On Mon, Jun 21, 2021 at 04:09:05PM +0300, Laurent Pinchart wrote:
> >>> On Mon, Jun 21, 2021 at 03:56:16PM +0300, Laurent Pinchart wrote:
>  On Mon, Jun 21, 2021 at 12:49:14PM +0100, Dave Stevenson wrote:
> > On Sun, 20 Jun 2021 at 23:49, Laurent Pinchart wrote:
> >> On Sun, Jun 20, 2021 at 09:42:27PM +0300, Laurent Pinchart wrote:
> >>> On Sun, Jun 20, 2021 at 03:29:03PM +0100, Dave Stevenson wrote:
>  On Sun, 20 Jun 2021 at 04:26, Laurent Pinchart wrote:
> >
> > Hi Maxime,
> >
> > I'm testing this, and I'm afraid it causes an issue with all the
> > I2C-controlled bridges. I'm focussing on the newly merged 
> > ti-sn65dsi83
> > driver at the moment, but other are affected the same way.
> >
> > With this patch, the DSI component is only added when the DSI 
> > device is
> > attached to the host with mipi_dsi_attach(). In the ti-sn65dsi83 
> > driver,
> > this happens in the bridge attach callback, which is called when the
> > bridge is attached by a call to drm_bridge_attach() in 
> > vc4_dsi_bind().
> > This creates a circular dependency, and the DRM/KMS device is never
> > created.
> >
> > How should this be solved ? Dave, I think you have shown an 
> > interest in
> > the sn65dsi83 recently, any help would be appreciated. On a side 
> > note,
> > I've tested the ti-sn65dsi83 driver on a v5.10 RPi kernel, without 
> > much
> > success (on top of commit e1499baa0b0c I get a very weird frame 
> > rate -
> > 147 fps or 99 fps instead of 60 fps - and nothing on the screen, 
> > and on
> > top of the latest v5.10 RPi branch, I get lock-related warnings at 
> > every
> > page flip), which is why I tried v5.12 and noticed this patch. Is it
> > worth trying to bring up the display on the v5.10 RPi kernel in 
> > parallel
> > to fixing the issue introduced in this patch, or is DSI known to be
> > broken there ?
> 
>  I've been looking at SN65DSI83/4, but as I don't have any hardware
>  I've largely been suggesting things to try to those on the forums who
>  do [1].
> 
>  My branch at 
>  https://github.com/6by9/linux/tree/rpi-5.10.y-sn65dsi8x-marek
>  is the latest one I've worked on. It's rpi-5.10.y with Marek's driver
>  cherry-picked, and an overlay and simple-panel definition by others.
>  It also has a rework for vc4_dsi to use pm_runtime, instead of
>  breaking up the DSI bridge chain (which is flawed as it never calls
>  the bridge mode_set or mode_valid functions which sn65dsi83 relies
>  on).
> >>
> >> I've looked at that, and I'm afraid it doesn't go in the right
> >> direction. The drm_encoder.crtc field is deprecated and documented as
> >> only meaningful for non-atomic drivers. You're not introducing its
> >> usage, but moving the configuration code from .enable() to the runtime
> >> PM resume handler will make it impossible to fix this. The driver should
> >> instead move to the .atomic_enable() function. If you need
> >> enable/pre_enable in the DSI encoder, then you should turn it into a
> >> drm_bridge.
> >
> > Is this something you're looking at by any chance ? I'm testing the
> > ti-sn65dsi83 driver with VC4. I've spent a couple of hours debugging,
> > only to realise that the vc4_dsi driver (before the rework you mention
> > above) doesn't call .mode_set() on the bridges... Applying my sn65dsi83
> > series that removes .mode_set() didn't help much as vc4_dsi doesn't call
> > the atomic operations either :-) I'll test your branch now.
> 
> This is one of the reasons for my email earlier today - thank you for
> your reply.
> 
> The current mainline vc4_dsi driver deliberately breaks the bridge
> chain so that it gets called before the panel/bridge pre_enable and
> can power everything up, therefore pre_enable can call host_transfer
> to configure the panel/bridge over the DSI interface.
> However we've both noted that it doesn't forward on the mode_set and
> mode_valid calls, and my investigations say that it doesn't have
> enough information to make those calls.
> 
> My branch returns the chain to normal, and tries to use pm_runtime to
> power up the PHY at the first usage (host_transfer or _enable). The
> PHY enable needs to know the link frequency to use, hence my question
> over how that should be determined.
> Currently it's coming from drm_encoder.crtc, but you say that's
> deprecated. If a mode hasn't been set then we have no clock
> information and bad things will happen.

To make sure 

[PATCH] drm/i915: Improve debug Kconfig texts a bit

2021-07-02 Thread Daniel Vetter
We're not consistently recommending these for developers only.

I stumbled over this due to DRM_I915_LOW_LEVEL_TRACEPOINTS, which was
added in

commit 354d036fcf70654cff2e2cbdda54a835d219b9d2
Author: Tvrtko Ursulin 
Date:   Tue Feb 21 11:01:42 2017 +

drm/i915/tracepoints: Add request submit and execute tracepoints

to "alleviate the performance impact concerns."

Which is nonsense.

Tvrtko and Joonas pointed out on irc that the real (but undocumented)
reason was stable abi concerns for tracepoints, see

https://lwn.net/Articles/705270/

and the specific change that was blocked around tracepoints:

https://lwn.net/Articles/442113/

Anyway, to make it a notch clearer why we have this Kconfig option,
consistently add the "Recommended for driver developers only." note to it
and all the other debug options we have.

Cc: Tvrtko Ursulin 
Cc: Joonas Lahtinen 
Cc: Matthew Brost 
Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/Kconfig.debug | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index 2ca88072d30f..f27c0b5873f7 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -215,6 +215,8 @@ config DRM_I915_LOW_LEVEL_TRACEPOINTS
  This provides the ability to precisely monitor engine utilisation
  and also analyze the request dependency resolving timeline.
 
+ Recommended for driver developers only.
+
  If in doubt, say "N".
 
 config DRM_I915_DEBUG_VBLANK_EVADE
@@ -228,6 +230,8 @@ config DRM_I915_DEBUG_VBLANK_EVADE
  is exceeded, even if there isn't an actual risk of missing
  the vblank.
 
+ Recommended for driver developers only.
+
  If in doubt, say "N".
 
 config DRM_I915_DEBUG_RUNTIME_PM
@@ -240,4 +244,6 @@ config DRM_I915_DEBUG_RUNTIME_PM
  runtime PM functionality. This may introduce overhead during
  driver loading, suspend and resume operations.
 
+ Recommended for driver developers only.
+
  If in doubt, say "N"
-- 
2.32.0.rc2



[PATCH] drm/amdgpu: initialize amdgpu_ras_query_error_count() error count parameters

2021-07-02 Thread trix
From: Tom Rix 

Static analysis reports this problem
amdgpu_ras.c:2324:2: warning: 2nd function call argument is an
  uninitialized value
atomic_set(&con->ras_ce_count, ce_count);
^~~~

ce_count is normally set by the earlier call to
amdgpu_ras_query_error_count().  But amdgpu_ras_query_error_count()
can return early without setting, leaving its error count parameters
in a garbage state.

Initialize the error count parameters earlier.

Fixes: a46751fbcde5 ("drm/amdgpu: Fix RAS function interface")
Signed-off-by: Tom Rix 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 875874ea745ec..c80fa545aa2b8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1056,6 +1056,12 @@ void amdgpu_ras_query_error_count(struct amdgpu_device 
*adev,
struct ras_manager *obj;
unsigned long ce, ue;
 
+   if (ce_count)
+   *ce_count = 0;
+
+   if (ue_count)
+   *ue_count = 0;
+
if (!adev->ras_enabled || !con)
return;
 
-- 
2.26.3



Re: [PATCH v5 0/2] drm/i915: IRQ fixes

2021-07-02 Thread Daniel Vetter
On Thu, Jul 01, 2021 at 07:36:16PM +0200, Thomas Zimmermann wrote:
> Fix a bug in the usage of IRQs and cleanup references to the DRM
> IRQ midlayer.
> 
> Preferably this patchset would be merged through drm-misc-next.
> 
> v5:
>   * go back to _hardirq() after CI tests reported atomic
> context in PCI probe; add rsp comment
> v4:
>   * switch IRQ code to intel_synchronize_irq() (Daniel)
> v3:
>   * also use intel_synchronize_hardirq() from other callsite
> v2:
>   * split patch
>   * also fix comment
>   * add intel_synchronize_hardirq() (Ville)
>   * update Fixes tag (Daniel)

Ok now I actually pushed the right patch set.
-Daniel

> 
> Thomas Zimmermann (2):
>   drm/i915: Use the correct IRQ during resume
>   drm/i915: Drop all references to DRM IRQ midlayer
> 
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c   |  2 +-
>  drivers/gpu/drm/i915/gt/intel_ring_submission.c |  7 +--
>  drivers/gpu/drm/i915/i915_drv.c |  1 -
>  drivers/gpu/drm/i915/i915_irq.c | 10 +-
>  drivers/gpu/drm/i915/i915_irq.h |  1 +
>  5 files changed, 12 insertions(+), 9 deletions(-)
> 
> 
> base-commit: 67f5a18128770817e4218a9e496d2bf5047c51e8
> prerequisite-patch-id: c2b2f08f0eccc9f5df0c0da49fa1d36267deb11d
> prerequisite-patch-id: c67e5d886a47b7d0266d81100837557fda34cb24
> prerequisite-patch-id: 0cca17365e65370fa95d193ed2f1c88917ee1aef
> prerequisite-patch-id: 12b9894350a0b56579d29542943465ef5134751c
> prerequisite-patch-id: 3e1c37d3425f4820fe36ea3da57c65e166fe0ee5
> prerequisite-patch-id: 1017c860a0bf95ce370d82b8db1745f5548fb321
> prerequisite-patch-id: dcc022baab7c172978de9809702c2f4f54323047
> prerequisite-patch-id: 0d05ee247042b43d5ab8f3af216e708a8e09bee8
> prerequisite-patch-id: 110c411161bed6072c32185940fcd052d0bdb09a
> prerequisite-patch-id: d2d1aeccffdfadf2b951487b8605f59c795d84cf
> prerequisite-patch-id: 85fe31e27ca13adc0d1bcc7c19b1ce238a77ee6a
> prerequisite-patch-id: c61fdacbe035ba5c17f1ff393bc9087f16aaea7b
> prerequisite-patch-id: c4821af5dbba4d121769f1da85d91fbb53020ec0
> prerequisite-patch-id: 0b20ef3302abfe6dc123dbc54b9dd087865f935b
> prerequisite-patch-id: d34eb96cbbdeb91870ace4250ea75920b1653dc2
> prerequisite-patch-id: 7f64fce347d15232134d7636ca7a8d9f5bf1a3a0
> prerequisite-patch-id: c83be7a285eb6682cdae0df401ab5d4c208f036b
> prerequisite-patch-id: eb1a44d2eb2685cea154dd3f17f5f463dfafd39a
> prerequisite-patch-id: 92a8c37dae4b8394fd6702f4af58ac7815ac3069
> prerequisite-patch-id: f0237988fe4ae6eba143432d1ace8beb52d935f8
> prerequisite-patch-id: bcf4d29437ed7cb78225dec4c99249eb40c18302
> prerequisite-patch-id: 6407b4c7f1b80af8d329d5f796b30da11959e936
> prerequisite-patch-id: 4a69e6e49d691b555f0e0874d638cd204dcb0c48
> prerequisite-patch-id: be09cfa8a67dd435a25103b85bd4b1649c5190a3
> prerequisite-patch-id: 813ecc9f94251c3d669155faf64c0c9e6a458393
> prerequisite-patch-id: beb2b5000a1682cbd74a7e2ab1566fcae5bccbf0
> prerequisite-patch-id: 754c8878611864475a0b75fd49ff38e71a21c795
> prerequisite-patch-id: d7d4bac3c19f94ba9593143b3c147d83d82cb71f
> prerequisite-patch-id: 983d1efbe060743f5951e474961fa431d886d757
> prerequisite-patch-id: 3c78b20c3b9315cd39e0ae9ea1510c6121bf9ca9
> --
> 2.32.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2] drm/dp_mst: Fix return code on sideband message failure

2021-07-02 Thread Lyude Paul
JFYI: will try to take a look at this at the start of next week

On Tue, 2021-06-29 at 16:07 -0700, Kuogee Hsieh wrote:
> From: Rajkumar Subbiah 
> 
> Commit 2f015ec6eab6 ("drm/dp_mst: Add sideband down request tracing +
> selftests") added some debug code for sideband message tracing. But
> it seems to have unintentionally changed the behavior on sideband message
> failure. It catches and returns failure only if DRM_UT_DP is enabled.
> Otherwise it ignores the error code and returns success. So on an MST
> unplug, the caller is unaware that the clear payload message failed and
> ends up waiting for 4 seconds for the response. Fixes the issue by
> returning the proper error code.
> 
> Changes in V2:
> -- Revise commit text as review comment
> -- add Fixes text
> 
> Fixes: 2f015ec6eab6 ("drm/dp_mst: Add sideband down request tracing +
> selftests")
> 
> Signed-off-by: Rajkumar Subbiah 
> Signed-off-by: Kuogee Hsieh 
> 
> Reviewed-by: Stephen Boyd 
> ---
>  drivers/gpu/drm/drm_dp_mst_topology.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c
> b/drivers/gpu/drm/drm_dp_mst_topology.c
> index 1590144..8d97430 100644
> --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> @@ -2887,11 +2887,13 @@ static int process_single_tx_qlock(struct
> drm_dp_mst_topology_mgr *mgr,
> idx += tosend + 1;
>  
> ret = drm_dp_send_sideband_msg(mgr, up, chunk, idx);
> -   if (unlikely(ret) && drm_debug_enabled(DRM_UT_DP)) {
> -   struct drm_printer p = drm_debug_printer(DBG_PREFIX);
> +   if (unlikely(ret)) {
> +   if (drm_debug_enabled(DRM_UT_DP)) {
> +   struct drm_printer p = drm_debug_printer(DBG_PREFIX);
>  
> -   drm_printf(&p, "sideband msg failed to send\n");
> -   drm_dp_mst_dump_sideband_msg_tx(&p, txmsg);
> +   drm_printf(&p, "sideband msg failed to send\n");
> +   drm_dp_mst_dump_sideband_msg_tx(&p, txmsg);
> +   }
> return ret;
> }
>  

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat



Re: [PATCH] dma-buf: fix and rework dma_buf_poll v5

2021-07-02 Thread Daniel Vetter
On Fri, Jul 02, 2021 at 12:31:43PM +0200, Christian König wrote:
> Daniel pointed me towards this function and there are multiple obvious 
> problems
> in the implementation.
> 
> First of all the retry loop is not working as intended. In general the retry
> makes only sense if you grab the reference first and then check the sequence
> values.
> 
> Then we should always also wait for the exclusive fence.
> 
> It's also good practice to keep the reference around when installing callbacks
> to fences you don't own.
> 
> And last the whole implementation was unnecessary complex and rather hard to
> understand which could lead to probably unexpected behavior of the IOCTL.
> 
> Fix all this by reworking the implementation from scratch. Dropping the
> whole RCU approach and taking the lock instead.
> 
> Only mildly tested and needs a thoughtful review of the code.
> 
> v2: fix the reference counting as well
> v3: keep the excl fence handling as is for stable
> v4: back to testing all fences, drop RCU
> v5: handle in and out separately
> 
> Signed-off-by: Christian König 
> CC: sta...@vger.kernel.org
> ---
>  drivers/dma-buf/dma-buf.c | 152 +-
>  include/linux/dma-buf.h   |   2 +-
>  2 files changed, 68 insertions(+), 86 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index eadd1eaa2fb5..439e2379e1cb 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -72,7 +72,7 @@ static void dma_buf_release(struct dentry *dentry)
>* If you hit this BUG() it means someone dropped their ref to the
>* dma-buf while still having pending operation to the buffer.
>*/
> - BUG_ON(dmabuf->cb_shared.active || dmabuf->cb_excl.active);
> + BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
>  
>   dmabuf->ops->release(dmabuf);
>  
> @@ -202,16 +202,57 @@ static void dma_buf_poll_cb(struct dma_fence *fence, 
> struct dma_fence_cb *cb)
>   wake_up_locked_poll(dcb->poll, dcb->active);
>   dcb->active = 0;
>   spin_unlock_irqrestore(&dcb->poll->lock, flags);
> + dma_fence_put(fence);
> +}
> +
> +static bool dma_buf_poll_shared(struct dma_resv *resv,
> + struct dma_buf_poll_cb_t *dcb)
> +{
> + struct dma_resv_list *fobj = dma_resv_get_list(resv);
> + struct dma_fence *fence;
> + int i, r;
> +
> + if (!fobj)
> + return false;
> +
> + for (i = 0; i < fobj->shared_count; ++i) {
> + fence = rcu_dereference_protected(fobj->shared[i],
> +   dma_resv_held(resv));
> + dma_fence_get(fence);
> + r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
> + if (!r)
> + return true;
> + dma_fence_put(fence);
> + }
> +
> + return false;
> +}
> +
> +static bool dma_buf_poll_excl(struct dma_resv *resv,
> +   struct dma_buf_poll_cb_t *dcb)
> +{
> + struct dma_fence *fence = dma_resv_get_excl(resv);
> + int r;
> +
> + if (!fence)
> + return false;
> +
> + dma_fence_get(fence);
> + r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
> + if (!r)
> + return true;
> + dma_fence_put(fence);
> +
> + return false;
>  }
>  
>  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>  {
>   struct dma_buf *dmabuf;
>   struct dma_resv *resv;
> - struct dma_resv_list *fobj;
> - struct dma_fence *fence_excl;
> + unsigned shared_count;
>   __poll_t events;
> - unsigned shared_count, seq;
> + int r, i;
>  
>   dmabuf = file->private_data;
>   if (!dmabuf || !dmabuf->resv)
> @@ -225,101 +266,42 @@ static __poll_t dma_buf_poll(struct file *file, 
> poll_table *poll)
>   if (!events)
>   return 0;
>  
> -retry:
> - seq = read_seqcount_begin(&resv->seq);
> - rcu_read_lock();
> -
> - fobj = rcu_dereference(resv->fence);
> - if (fobj)
> - shared_count = fobj->shared_count;
> - else
> - shared_count = 0;
> - fence_excl = rcu_dereference(resv->fence_excl);
> - if (read_seqcount_retry(&resv->seq, seq)) {
> - rcu_read_unlock();
> - goto retry;
> - }
> -
> - if (fence_excl && (!(events & EPOLLOUT) || shared_count == 0)) {
> - struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
> - __poll_t pevents = EPOLLIN;
> + dma_resv_lock(resv, NULL);
>  
> - if (shared_count == 0)
> - pevents |= EPOLLOUT;
> + if (events & EPOLLOUT) {
> + struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_out;
>  
> + /* Check that callback isn't busy */
>   spin_lock_irq(&dmabuf->poll.lock);
> - if (dcb->active) {
> - dcb->active |= pevents;
> - events &= ~pevents;
> - } else
> - dcb->active = pevents;
> + 

Re: [Intel-gfx] [PATCH v2 3/3] drm/i915/uapi: reject set_domain for discrete

2021-07-02 Thread Daniel Vetter
On Fri, Jul 02, 2021 at 03:31:08PM +0100, Tvrtko Ursulin wrote:
> 
> On 01/07/2021 16:10, Matthew Auld wrote:
> > The CPU domain should be static for discrete, and on DG1 we don't need
> > any flushing since everything is already coherent, so really all this
> 
> Knowledge of the write combine buffer is assumed to be had by anyone involved?
> 
> > does is an object wait, for which we have an ioctl. Longer term the
> > desired caching should be an immutable creation time property for the
> > BO, which can be set with something like gem_create_ext.
> > 
> > One other user is iris + userptr, which uses the set_domain to probe all
> > the pages to check if the GUP succeeds, however keeping the set_domain
> > around just for that seems rather scuffed. We could equally just submit
> > a dummy batch, which should hopefully be good enough, otherwise adding a
> > new creation time flag for userptr might be an option. Although longer
> > term we will also have vm_bind, which should also be a nice fit for
> > this, so adding a whole new flag is likely overkill.
> 
> Execbuf sounds horrible. But it all reminds me of past work by Chris which is 
> surprisingly hard to find in the archives. Patches like:
> 
> commit 7706a433388016983052a27c0fd74a64b1897ae7
> Author: Chris Wilson 
> Date:   Wed Nov 8 17:04:07 2017 +
> 
> drm/i915/userptr: Probe existence of backing struct pages upon creation
> Jason Ekstrand requested a more efficient method than userptr+set-domain
> to determine if the userptr object was backed by a complete set of pages
> upon creation. To be more efficient than simply populating the userptr
> using get_user_pages() (as done by the call to set-domain or execbuf),
> we can walk the tree of vm_area_struct and check for gaps or vma not
> backed by struct page (VM_PFNMAP). The question is how to handle
> VM_MIXEDMAP which may be either struct page or pfn backed...
> 
> commit 7ca21d3390eec23db99b8131ed18bc036efaba18
> Author: Chris Wilson 
> Date:   Wed Nov 8 17:48:22 2017 +
> 
> drm/i915/userptr: Add a flag to populate the userptr on creation
> Acquiring the backing struct pages for the userptr range is not free;
> the first client for userptr would insist on frequently creating userptr
> objects ahead of time and not use them. For that first client, deferring
> the cost of populating the userptr (calling get_user_pages()) to the
> actual execbuf was a substantial improvement. However, not all clients
> are the same, and most would like to validate that the userptr is valid
> and backed by struct pages upon creation, so offer a
> I915_USERPTR_POPULATE flag to do just that.
> Note that big difference between I915_USERPTR_POPULATE and the deferred
> scheme is that POPULATE is guaranteed to be synchronous, the result is
> known before the ioctl returns (and the handle exposed). However, due to
> system memory pressure, the object may be paged out before use,
> requiring them to be paged back in on execbuf (as may always happen).
> 
> At least with the first one I think I was skeptical, since probing at
> point A makes a weak test versus userptr getting used at point B.
> Populate is kind of same really when user controls the backing store. At
> least these two arguments I think stand if we are trying to sell these
> flags as validation. But if the idea is limited to pure preload, with no
> guarantees that it keeps working by time of real use, then I guess it
> may be passable.

Well we've thrown this out again because there was no userspace. But if
this is requested by mesa, then the _PROBE flag should be entirely
sufficient.

Since I don't want to hold up dg1 pciids on this it'd be nice if we could
just go ahead with the dummy batch, if Ken/Jordan don't object - iris is
the only umd that needs this.

> Disclaimer that I haven't been following the story on why it is
> desirable to abandon set domain. Only judging from this series, mmap
> caching mode is implied from the object? Should set domain availability
> be driven by the object backing store instead of outright rejection?

In theory yes.

In practice UMDs have followed along, and all the APIs now allocate objects
with static properties; the only reason we ever call set_domain is due to
slightly outdated buffer caching schemes dating back to og libdrm from
12+ years ago.

The other practical reason is that clflush is simply the slowest way to
upload data of all the ones we have :-)

So even when this comes back I don't expect this ioctl will come back.
> 
> Regards,
> 
> Tvrtko
> > Suggested-by: Daniel Vetter 
> > Signed-off-by: Matthew Auld 
> > Cc: Thomas Hellström 
> > Cc: Maarten Lankhorst 
> > Cc: Jordan Justen 
> > Cc: Kenneth Graunke 
> > Cc: Jason Ekstrand 
> > Cc: Daniel Vetter 
> > Cc: Ramalingam C 
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git 

Re: [PATCH 2/3] drm/i915/uapi: reject caching ioctls for discrete

2021-07-02 Thread Daniel Vetter
On Thu, Jul 01, 2021 at 03:36:49PM +0100, Matthew Auld wrote:
> It's a noop on DG1, and in the future when need to support other devices
> which let us control the coherency, then it should be an immutable
> creation time property for the BO.
> 
> Suggested-by: Daniel Vetter 
> Signed-off-by: Matthew Auld 
> Cc: Thomas Hellström 
> Cc: Maarten Lankhorst 
> Cc: Kenneth Graunke 
> Cc: Jason Ekstrand 
> Cc: Daniel Vetter 
> Cc: Ramalingam C 

For this and the next can you pls add kerneldoc for the uapi structs and
then add a note there that on dgfx they're disallowed? Same for the next
one.

At least I'd like if we can document uapi here as we go, so that we have
something to point people to when they as "what has changed? what should I
do in my userspace driver?".

Also please make sure these two have acks from mesa devs before you land
them.

Thanks, Daniel

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> index 7d1400b13429..43004bef55cb 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> @@ -268,6 +268,9 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, 
> void *data,
>   struct drm_i915_gem_object *obj;
>   int err = 0;
>  
> + if (IS_DGFX(to_i915(dev)))
> + return -ENODEV;
> +
>   rcu_read_lock();
>   obj = i915_gem_object_lookup_rcu(file, args->handle);
>   if (!obj) {
> @@ -303,6 +306,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, 
> void *data,
>   enum i915_cache_level level;
>   int ret = 0;
>  
> + if (IS_DGFX(i915))
> + return -ENODEV;
> +
>   switch (args->caching) {
>   case I915_CACHING_NONE:
>   level = I915_CACHE_NONE;
> -- 
> 2.26.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v4 0/2] drm/i915: IRQ fixes

2021-07-02 Thread Daniel Vetter
On Thu, Jul 01, 2021 at 10:58:31AM +0200, Thomas Zimmermann wrote:
> Fix a bug in the usage of IRQs and cleanup references to the DRM
> IRQ midlayer.
> 
> Preferably this patchset would be merged through drm-misc-next.
> 
> v4:
>   * switch IRQ code to intel_synchronize_irq() (Daniel)
> v3:
>   * also use intel_synchronize_hardirq() from other callsite
> v2:
>   * split patch
>   * also fix comment
>   * add intel_synchronize_hardirq() (Ville)
>   * update Fixes tag (Daniel)
> 
> Thomas Zimmermann (2):
>   drm/i915: Use the correct IRQ during resume
>   drm/i915: Drop all references to DRM IRQ midlayer

Both pushed to drm-intel-gt-next, thanks for your patches.
-Daniel

> 
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 2 +-
>  drivers/gpu/drm/i915/gt/intel_ring_submission.c | 2 +-
>  drivers/gpu/drm/i915/i915_drv.c | 1 -
>  drivers/gpu/drm/i915/i915_irq.c | 5 -
>  4 files changed, 2 insertions(+), 8 deletions(-)
> 
> 
> base-commit: 67f5a18128770817e4218a9e496d2bf5047c51e8
> --
> 2.32.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm: mxsfb: Enable recovery on underflow

2021-07-02 Thread Marek Vasut

On 7/1/21 12:49 AM, Marek Vasut wrote:

On 6/21/21 2:13 PM, Laurent Pinchart wrote:

Hi Marek,

Thank you for the patch.

On Mon, Jun 21, 2021 at 12:47:01AM +0200, Marek Vasut wrote:

There is some sort of corner case behavior of the controller,
which could rarely be triggered at least on i.MX6SX connected
to 800x480 DPI panel and i.MX8MM connected to DPI->DSI->LVDS
bridged 1920x1080 panel (and likely on other setups too), where
the image on the panel shifts to the right and wraps around.
This happens either when the controller is enabled on boot or
even later during run time. The condition does not correct
itself automatically, i.e. the display image remains shifted.

It seems this problem is known and is due to sporadic underflows
of the LCDIF FIFO. While the LCDIF IP does have underflow/overflow
IRQs, neither of the IRQs trigger and neither IRQ status bit is
asserted when this condition occurs.

All known revisions of the LCDIF IP have CTRL1 RECOVER_ON_UNDERFLOW
bit, which is described in the reference manual since i.MX23 as
"
   Set this bit to enable the LCDIF block to recover in the next
   field/frame if there was an underflow in the current field/frame.
"
Enable this bit to mitigate the sporadic underflows.

Fixes: 45d59d704080 ("drm: Add new driver for MXSFB controller")
Signed-off-by: Marek Vasut 
Cc: Daniel Abrecht 
Cc: Emil Velikov 
Cc: Laurent Pinchart 
Cc: Lucas Stach 
Cc: Stefan Agner 
---
  drivers/gpu/drm/mxsfb/mxsfb_kms.c  | 29 +
  drivers/gpu/drm/mxsfb/mxsfb_regs.h |  1 +
  2 files changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/mxsfb/mxsfb_kms.c 
b/drivers/gpu/drm/mxsfb/mxsfb_kms.c

index 300e7bab0f43..01e0f525360f 100644
--- a/drivers/gpu/drm/mxsfb/mxsfb_kms.c
+++ b/drivers/gpu/drm/mxsfb/mxsfb_kms.c
@@ -115,6 +115,35 @@ static void mxsfb_enable_controller(struct 
mxsfb_drm_private *mxsfb)

  reg |= VDCTRL4_SYNC_SIGNALS_ON;
  writel(reg, mxsfb->base + LCDC_VDCTRL4);
+    /*
+ * Enable recovery on underflow.
+ *
+ * There is some sort of corner case behavior of the controller,
+ * which could rarely be triggered at least on i.MX6SX connected
+ * to 800x480 DPI panel and i.MX8MM connected to DPI->DSI->LVDS
+ * bridged 1920x1080 panel (and likely on other setups too), where
+ * the image on the panel shifts to the right and wraps around.
+ * This happens either when the controller is enabled on boot or
+ * even later during run time. The condition does not correct
+ * itself automatically, i.e. the display image remains shifted.
+ *
+ * It seems this problem is known and is due to sporadic underflows
+ * of the LCDIF FIFO. While the LCDIF IP does have 
underflow/overflow

+ * IRQs, neither of the IRQs trigger and neither IRQ status bit is
+ * asserted when this condition occurs.
+ *
+ * All known revisions of the LCDIF IP have CTRL1 RECOVER_ON_UNDERFLOW

+ * bit, which is described in the reference manual since i.MX23 as
+ * "
+ *   Set this bit to enable the LCDIF block to recover in the next
+ *   field/frame if there was an underflow in the current field/frame.

+ * "
+ * Enable this bit to mitigate the sporadic underflows.
+ */
+    reg = readl(mxsfb->base + LCDC_CTRL1);
+    reg |= CTRL1_RECOVER_ON_UNDERFLOW;
+    writel(reg, mxsfb->base + LCDC_CTRL1);


Looks good to me. Thanks for the detailed explanation.

Reviewed-by: Laurent Pinchart 


So who do I CC to pick it? Robert ? There are a few more mxsfb fixes 
which are RB'd and would be nice if they were picked too.


+CC Daniel, can those RB'd mxsfb patches be picked ?


Re: [PATCH -next] drm: vmwgfx: add header file for ttm_range_manager

2021-07-02 Thread Daniel Vetter
On Wed, Jun 30, 2021 at 08:36:29PM +, Zack Rusin wrote:
> 
> 
> > On Jun 30, 2021, at 16:32, Randy Dunlap  wrote:
> > 
> > Add a header file for ttm_range_manager function prototypes to
> > eliminate build errors:
> > 
> > ../drivers/gpu/drm/vmwgfx/vmwgfx_drv.c: In function ‘vmw_vram_manager_init’:
> > ../drivers/gpu/drm/vmwgfx/vmwgfx_drv.c:678:8: error: implicit declaration 
> > of function ‘ttm_range_man_init’; did you mean ‘ttm_tt_mgr_init’? 
> > [-Werror=implicit-function-declaration]
> >  ret = ttm_range_man_init(&dev_priv->bdev, TTM_PL_VRAM, false,
> > ../drivers/gpu/drm/vmwgfx/vmwgfx_drv.c: In function ‘vmw_vram_manager_fini’:
> > ../drivers/gpu/drm/vmwgfx/vmwgfx_drv.c:690:2: error: implicit declaration 
> > of function ‘ttm_range_man_fini’; did you mean ‘ttm_pool_mgr_fini’? 
> > [-Werror=implicit-function-declaration]
> >  ttm_range_man_fini(&dev_priv->bdev, TTM_PL_VRAM);
> > 
> > Fixes: 9c3006a4cc1b ("drm/ttm: remove available_caching")
> > Fixes: a343160235f5 ("drm/vmwgfx/ttm: fix the non-THP cleanup path.")
> > Signed-off-by: Randy Dunlap 
> > Cc: "VMware Graphics" 
> > Cc: Roland Scheidegger 
> > Cc: Zack Rusin 
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: Dave Airlie 
> > Cc: Christian König 
> 
> Thank you. That change has been part of drm-misc for a few weeks now:
> https://cgit.freedesktop.org/drm/drm-misc/commit/?id=352a81b71ea0a3ce8f929aa60afe369d738a0c6a
> I think it should be part of the next merge of drm-misc to linux-next. If not 
> I’ll port it to drm-misc-fixes.

It should probably be in drm-misc-next-fixes. drm-misc-next is for 5.15.
drm-misc-fixes was for 5.14 and will only reopen after -rc1.

See  
https://drm.pages.freedesktop.org/maintainer-tools/committer-drm-misc.html#where-do-i-apply-my-patch

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v7 0/5] drm: address potential UAF bugs with drm_master ptrs

2021-07-02 Thread Daniel Vetter
On Fri, Jul 02, 2021 at 12:53:53AM +0800, Desmond Cheong Zhi Xi wrote:
> This patch series addresses potential use-after-free errors when 
> dereferencing pointers to struct drm_master. These were identified after one 
> such bug was caught by Syzbot in drm_getunique():
> https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f803
> 
> The series is broken up into five patches:
> 
> 1. Move a call to drm_is_current_master() out from a section locked by 
> drm_device->mode_config.mutex in drm_mode_getconnector(). This patch does not apply 
> to stable.
> 
> 2. Move a call to _drm_lease_held() out from the section locked by 
> drm_device->mode_config.idr_mutex in __drm_mode_object_find().
> 
> 3. Implement a locked version of drm_is_current_master() function that's used 
> within drm_auth.c.
> 
> 4. Serialize drm_file.master by introducing a new lock that's held whenever 
> the value of drm_file.master changes.
> 
> 5. Identify areas in drm_lease.c where pointers to struct drm_master are 
> dereferenced, and ensure that the master pointers are not freed during use.
> 
> Changes in v6 -> v7:
> - Patch 2:
> Modify code alignment as suggested by the intel-gfx CI.
> 
> Update commit message based on the changes to patch 5.
> 
> - Patch 4:
> Add patch 4 to the series. This patch adds a new lock to serialize 
> drm_file.master, in response to the lockdep splat by the intel-gfx CI.
> 
> - Patch 5:
> Move kerneldoc comment about protecting drm_file.master with 
> drm_device.master_mutex into patch 4.
> 
> Update drm_file_get_master to use the new drm_file.master_lock instead of 
> drm_device.master_mutex, in response to the lockdep splat by the intel-gfx CI.

So there's another one now because master->leases is protected by the
mode_config.idr_mutex, and that's a bit awkward to untangle.

Also I'm really surprised that there was no lockdep splat through the atomic
code anywhere. The reason seems to be that somehow CI rebooted first before
it managed to run any of the kms_atomic tests, and we can only hit this
when we go through the atomic kms ioctl; the legacy kms ioctls don't have
that specific issue.

Anyway I think this approach doesn't look too workable, and we need
something new.

But first things first: Are you still on board working on this? You
started with a simple patch to fix a UAF bug, now we're deep into
reworking tricky locking ... If you feel like you want out I'm totally
fine with that.

Anyway, I think we need to split drm_device->master_mutex up into two
parts:

- One part that protects the actual access/changes, which I think for
  simplicity we'll just leave as the current lock. That lock is a very
  inner lock, since for the drm_lease.c stuff it has to nest within
  mode_config.idr_mutex even.

- Now the issue with checking master status/leases/whatever as an
  innermost lock is that you can race, it's a classic time of check vs
  time of use race: By the time we actually use the thing we validated
  we're allowed to use, we might not have access anymore. There are two
  reasons for that:

  * DROPMASTER ioctl could remove the master rights, which removes access
rights also for all leases

  * REVOKE_LEASE ioctl can do the same but only for a specific lease

  This is the thing we're trying to protect against in fbcon code, but
  that's very spotty protection because all the ioctls by other users
  aren't actually protected against this.

  So I think for this we need some kind of big reader lock.

Now for the implementation, there's a few things:

- I think best option for this big reader lock would be to just use srcu.
  We only need to flush out all current readers when we drop master or
  revoke a lease, so synchronize_srcu is perfectly good enough for this
  purpose.

- The fbdev code would switch over to srcu in
  drm_master_internal_acquire() and drm_master_internal_release(). Ofc
  within drm_master_internal_acquire we'd still need to check master
  status with the normal master_mutex.

- While we revamp all this we should fix the ioctl checks in drm_ioctl.c.
  Just noticed that drm_ioctl_permit() could and should be unexported,
  last user was removed.

  Within drm_ioctl_kernel we'd then replace the check for
  drm_is_current_master with the drm_master_internal_acquire/release.

- This alone does nothing, we still need to make sure that dropmaster and
  revoke_lease ioctl flush out all other access before they return to
  userspace. We can't just call synchronize_srcu because due to the ioctl
  code in drm_ioctl_kernel we're in that srcu section, we'd need to add a
  DRM_MASTER_FLUSH ioctl flag which we'd check only when DRM_MASTER is
  set, and use to call synchronize_srcu. Maybe wrap that in a
  drm_master_flush or so, or perhaps a drm_master_internal_release_flush.

- Also maybe we should drop the _internal_ from that name. Feels a bit
  wrong when we're also going to use this in the ioctl handler.
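
The scheme sketched above could look roughly like this. This is a sketch
only: the drm_master_* function names and the srcu variable are invented for
illustration, and only the SRCU primitives themselves are real kernel API.

```c
/* Sketch of the big-reader-lock idea, under the naming assumptions above. */
DEFINE_STATIC_SRCU(drm_master_srcu);

/* Read side: taken around any ioctl that relies on master/lease status.
 * srcu_read_lock() is cheap and never blocks, so ioctl paths stay fast. */
static int drm_master_acquire(void)
{
	return srcu_read_lock(&drm_master_srcu);
}

static void drm_master_release(int idx)
{
	srcu_read_unlock(&drm_master_srcu, idx);
}

/* Write side: DROPMASTER / REVOKE_LEASE would flush out all current
 * readers before returning to userspace, so nobody keeps using access
 * that has just been revoked. */
static void drm_master_flush(void)
{
	synchronize_srcu(&drm_master_srcu);
}
```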

Thoughts? Totally silly and overkill?

Cheers, Daniel


> Changes in v5 -> v6:
> - Patch 2:
> 

Re: [Freedreno] [RFC 2/6] drm/msm/dpu: support setting up two independent DSI connectors

2021-07-02 Thread abhinavk

On 2021-07-02 10:10, Dmitry Baryshkov wrote:

On 02/07/2021 18:52, abhin...@codeaurora.org wrote:

On 2021-07-02 02:20, Dmitry Baryshkov wrote:

On 02/07/2021 00:12, abhin...@codeaurora.org wrote:

On 2021-06-09 14:17, Dmitry Baryshkov wrote:

Move setting up encoders from set_encoder_mode to
_dpu_kms_initialize_dsi() / _dpu_kms_initialize_displayport(). This
allows us to support not only "single DSI" and "dual DSI" but also "two
independent DSI" configurations. In future this would also help adding
support for multiple DP connectors.

Signed-off-by: Dmitry Baryshkov 
I will have to see Bjorn's changes to check why it was dependent on 
this cleanup.

Is the plan to call _dpu_kms_initialize_displayport() twice?


Yes. He needs to initialize several displayport interfaces. With the
current code he has to map ids in the set_encoder_mode, using encoder
ids (to fill up the info.h_tile_instance, which is hardcoded to 0 for
DP in the current code).

But I am still not able to put together where the dependency of that
series on this one is. Can you please elaborate on that a little bit?


It is possible to support independent outputs with the current code. I
did that for DSI, Bjorn did for DP. However it results in quite an
ugly code to map received encoder in set_encoder_mode back to the DSI
(DP) instances to fill the h_tiles. If we drop the whole
set_encoder_mode story and call dpu_encoder_setup right from the
_dpu_kms_initialize_dsi() (or _dpu_kms_initialize_displayport()),
supporting multiple outputs becomes an easy task.


Okay got it, I think it will become more clear once he posts.



---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 89 -
 1 file changed, 44 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 1d3a4f395e74..b63e1c948ff2 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -471,30 +471,55 @@ static int _dpu_kms_initialize_dsi(struct drm_device *dev,
 struct dpu_kms *dpu_kms)
 {
 struct drm_encoder *encoder = NULL;
+    struct msm_display_info info;
 int i, rc = 0;

 if (!(priv->dsi[0] || priv->dsi[1]))
 return rc;

-    /*TODO: Support two independent DSI connectors */
-    encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI);
-    if (IS_ERR(encoder)) {
-    DPU_ERROR("encoder init failed for dsi display\n");
-    return PTR_ERR(encoder);
-    }
-
-    priv->encoders[priv->num_encoders++] = encoder;
-
 for (i = 0; i < ARRAY_SIZE(priv->dsi); i++) {
 if (!priv->dsi[i])
 continue;

+    if (!encoder) {
+    encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI);
+    if (IS_ERR(encoder)) {
+    DPU_ERROR("encoder init failed for dsi display\n");
+    return PTR_ERR(encoder);
+    }
+
+    priv->encoders[priv->num_encoders++] = encoder;
+
+    memset(, 0, sizeof(info));
+    info.intf_type = encoder->encoder_type;
+    info.capabilities = msm_dsi_is_cmd_mode(priv->dsi[i]) ?
+    MSM_DISPLAY_CAP_CMD_MODE :
+    MSM_DISPLAY_CAP_VID_MODE;
+    }
+
 rc = msm_dsi_modeset_init(priv->dsi[i], dev, encoder);
 if (rc) {
  DPU_ERROR("modeset_init failed for dsi[%d], rc = %d\n",
 i, rc);
 break;
 }
+
+    info.h_tile_instance[info.num_of_h_tiles++] = i;
+
+    if (!msm_dsi_is_dual_dsi(priv->dsi[i])) {


I would like to clarify the terminology of dual_dsi in the current 
DSI driver before the rest of the reviews.
Today IS_DUAL_DSI() means that two DSIs are driving the same display 
and the two DSIs are operating in master-slave mode and are being 
driven by the same PLL.


Yes

Usually, dual independent DSI means two DSIs driving two separate 
panels using two separate PLLs ( DSI0 with PLL0 and DSI1 with PLL1)


Let's stop calling it 'dual'. I'd suggest to continue using what was
there in the source file: 'two independent DSI'.

I assume that's happening due to the following logic, and both DSI PHYs 
are operating in STANDALONE mode:


 if (!IS_DUAL_DSI()) {
 ret = msm_dsi_host_register(msm_dsi->host, true);
 if (ret)
 return ret;

 msm_dsi_phy_set_usecase(msm_dsi->phy, MSM_DSI_PHY_STANDALONE);
 ret = msm_dsi_host_set_src_pll(msm_dsi->host, msm_dsi->phy);


Yes. If we have two independent DSI outputs, we'd like them to work in
STANDALONE mode.



+    rc = dpu_encoder_setup(dev, encoder, );
+    if (rc)
+    DPU_ERROR("failed to setup DPU encoder %d: rc:%d\n",
+    encoder->base.id, rc);
+    encoder = NULL;
+    }
+    }
+
+    if (encoder) {


We will hit this case only for split-DSI right? ( that is two DSIs 
driving the same panel ).


Yes, only in this case.

Even single DSI will 

[Bug 212469] plymouth animation freezes during shutdown

2021-07-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=212469

Norbert (asteri...@gmx.de) changed:

   What|Removed |Added

 CC||asteri...@gmx.de

--- Comment #8 from Norbert (asteri...@gmx.de) ---
I tried plymouth_0.9.5git20210323-0ubuntu1_amd64 without any change.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-02 Thread Boris Brezillon
On Fri, 2 Jul 2021 12:49:55 -0400
Alyssa Rosenzweig  wrote:

> > > ```  
> > > >  #define PANFROST_BO_REF_EXCLUSIVE  0x1
> > > > +#define PANFROST_BO_REF_NO_IMPLICIT_DEP0x2
> > > ```
> > > 
> > > This seems logically backwards. NO_IMPLICIT_DEP makes sense if we're
> > > trying to keep backwards compatibility, but here you're crafting a new
> > > interface totally from scratch. If anything, isn't BO_REF_IMPLICIT_DEP
> > > the flag you'd want?  
> > 
> > AFAICT, all other drivers make the no-implicit-dep an opt-in, and I
> > didn't want to do things differently in panfrost. But if that's really
> > an issue, I can make it an opt-out.  
> 
> I don't have strong feelings either way. I was just under the
> impressions other drivers did this for b/w compat reasons which don't
> apply here.

Okay, I think I'll keep it like that unless there's a strong reason to
make no-implicit-dep the default. It's safer to oversync than to skip
the synchronization, so it does feel like something the user should
explicitly enable.

> 
> > > Hmm. I'm not /opposed/ and I know kbase uses strides but it seems like
> > > somewhat unwarranted complexity, and there is a combinatoric explosion
> > > here (if jobs, bo refs, and syncobj refs use 3 different versions, as
> > > this encoding permits... as opposed to just specifying a UABI version or
> > > something like that)  
> > 
> > Sounds like a good idea. I'll add a version field and map that
> > to a  tuple.  
> 
> Cc Steven, does this make sense?

I have this approach working, and I must admit I prefer it to the
per-object stride field passed to the submit struct.



Re: [PATCH] drm/stm: ltdc: improve pm_runtime to stop clocks

2021-07-02 Thread Marek Vasut

On 7/2/21 11:23 AM, Raphael Gallais-Pou wrote:

Hello Marek,


Hi,


Sorry for the late answer.


No worries, take your time


On 6/30/21 2:35 AM, Marek Vasut wrote:

On 6/29/21 1:58 PM, Raphael GALLAIS-POU - foss wrote:

[...]


+++ b/drivers/gpu/drm/stm/ltdc.c
@@ -425,10 +425,17 @@ static void ltdc_crtc_atomic_enable(struct drm_crtc *crtc,
  {
  struct ltdc_device *ldev = crtc_to_ltdc(crtc);
  struct drm_device *ddev = crtc->dev;
+    int ret;
    DRM_DEBUG_DRIVER("\n");
  -    pm_runtime_get_sync(ddev->dev);
+    if (!pm_runtime_active(ddev->dev)) {
+    ret = pm_runtime_get_sync(ddev->dev);


All these if (!pm_runtime_active()) then pm_runtime_get_sync() calls 
look like workaround for some larger issue. Shouldn't the pm_runtime 
do some refcounting on its own , so this shouldn't be needed ?



This problem purely comes from the driver internals, so I don't think it 
is a workaround.


Because of the "ltdc_crtc_mode_set_nofb" function which does not have 
any "symmetrical" call, such as enable/disable functions, there were two 
calls to pm_runtime_get_sync against one call to pm_runtime_put_sync.


This instability resulted in the LTDC clocks being always enabled, even 
when the peripheral was disabled. This could be seen in the clk_summary 
as explained in the patch summary among other things.


By doing so, we first check if the clocks are not already activated, and 
in that case we call pm_runtime_get_sync.


I just have to wonder, how come other drivers don't need these if 
(!pm_runtime_active()) pm_runtime_get_sync() conditions. I think they 
just get/put the runtime PM within a call itself, not across function 
calls. Maybe that could be the right fix here too ?
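
A minimal sketch of the balanced pattern suggested here, with the runtime-PM
reference taken and dropped within the same call instead of being held across
callbacks. The function body is illustrative, not the actual ltdc code.

```c
/* Sketch only: each function that touches the hardware takes its own
 * runtime-PM reference and drops it before returning, so get/put calls
 * stay symmetrical and refcounting works without pm_runtime_active()
 * checks. */
static void ltdc_mode_set_sketch(struct drm_crtc *crtc)
{
	struct drm_device *ddev = crtc->dev;
	int ret;

	ret = pm_runtime_get_sync(ddev->dev);	/* +1 ref, resumes if needed */
	if (ret < 0)
		return;

	/* ... program the LTDC registers ... */

	pm_runtime_put_sync(ddev->dev);		/* -1 ref, symmetrical */
}
```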


Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-02 Thread Boris Brezillon
On Fri, 2 Jul 2021 12:49:55 -0400
Alyssa Rosenzweig  wrote:

> > > Why is there padding instead of putting point first?  
> > 
> > We can move the point field first, but we need to keep the explicit
> > padding: the struct has to be 64bit aligned because of the __u64 field
> > (which the compiler takes care of) but if we don't have an explicit
> > padding, the unused 32bits are undefined, which might cause trouble if
> > we extend the struct at some point, since we sort of expect that old
> > userspace keep this unused 32bit slot to 0, while new users set
> > non-zero values if they have to.  
> 
> Makes sense. Reordering still probably makes sense.

Actually, I can't re-order if I want the new in_syncs parser to work
with the old ioctl(), which you and Steven asked me to do :-).


Re: [PATCH] drm/vc4: dsi: Only register our component once a DSI device is attached

2021-07-02 Thread Dave Stevenson
Hi Laurent

On Fri, 2 Jul 2021 at 17:47, Laurent Pinchart
 wrote:
>
> Hi Dave,
>
> On Mon, Jun 21, 2021 at 04:59:51PM +0300, Laurent Pinchart wrote:
> > On Mon, Jun 21, 2021 at 04:09:05PM +0300, Laurent Pinchart wrote:
> > > On Mon, Jun 21, 2021 at 03:56:16PM +0300, Laurent Pinchart wrote:
> > > > On Mon, Jun 21, 2021 at 12:49:14PM +0100, Dave Stevenson wrote:
> > > > > On Sun, 20 Jun 2021 at 23:49, Laurent Pinchart wrote:
> > > > > > On Sun, Jun 20, 2021 at 09:42:27PM +0300, Laurent Pinchart wrote:
> > > > > > > On Sun, Jun 20, 2021 at 03:29:03PM +0100, Dave Stevenson wrote:
> > > > > > > > On Sun, 20 Jun 2021 at 04:26, Laurent Pinchart wrote:
> > > > > > > > >
> > > > > > > > > Hi Maxime,
> > > > > > > > >
> > > > > > > > > I'm testing this, and I'm afraid it causes an issue with all 
> > > > > > > > > the
> > > > > > > > > I2C-controlled bridges. I'm focussing on the newly merged 
> > > > > > > > > ti-sn65dsi83
> > > > > > > > > driver at the moment, but other are affected the same way.
> > > > > > > > >
> > > > > > > > > With this patch, the DSI component is only added when the DSI 
> > > > > > > > > device is
> > > > > > > > > attached to the host with mipi_dsi_attach(). In the 
> > > > > > > > > ti-sn65dsi83 driver,
> > > > > > > > > this happens in the bridge attach callback, which is called 
> > > > > > > > > when the
> > > > > > > > > bridge is attached by a call to drm_bridge_attach() in 
> > > > > > > > > vc4_dsi_bind().
> > > > > > > > > This creates a circular dependency, and the DRM/KMS device is 
> > > > > > > > > never
> > > > > > > > > created.
> > > > > > > > >
> > > > > > > > > How should this be solved ? Dave, I think you have shown an 
> > > > > > > > > interest in
> > > > > > > > > the sn65dsi83 recently, any help would be appreciated. On a 
> > > > > > > > > side note,
> > > > > > > > > I've tested the ti-sn65dsi83 driver on a v5.10 RPi kernel, 
> > > > > > > > > without much
> > > > > > > > > success (on top of commit e1499baa0b0c I get a very weird 
> > > > > > > > > frame rate -
> > > > > > > > > 147 fps or 99 fps instead of 60 fps - and nothing on the 
> > > > > > > > > screen, and on
> > > > > > > > > top of the latest v5.10 RPi branch, I get lock-related 
> > > > > > > > > warnings at every
> > > > > > > > > page flip), which is why I tried v5.12 and noticed this 
> > > > > > > > > patch. Is it
> > > > > > > > > worth trying to bring up the display on the v5.10 RPi kernel 
> > > > > > > > > in parallel
> > > > > > > > > to fixing the issue introduced in this patch, or is DSI known 
> > > > > > > > > to be
> > > > > > > > > broken there ?
> > > > > > > >
> > > > > > > > I've been looking at SN65DSI83/4, but as I don't have any 
> > > > > > > > hardware
> > > > > > > > I've largely been suggesting things to try to those on the 
> > > > > > > > forums who
> > > > > > > > do [1].
> > > > > > > >
> > > > > > > > My branch at 
> > > > > > > > https://github.com/6by9/linux/tree/rpi-5.10.y-sn65dsi8x-marek
> > > > > > > > is the latest one I've worked on. It's rpi-5.10.y with Marek's 
> > > > > > > > driver
> > > > > > > > cherry-picked, and an overlay and simple-panel definition by 
> > > > > > > > others.
> > > > > > > > It also has a rework for vc4_dsi to use pm_runtime, instead of
> > > > > > > > breaking up the DSI bridge chain (which is flawed as it never 
> > > > > > > > calls
> > > > > > > > the bridge mode_set or mode_valid functions which sn65dsi83 
> > > > > > > > relies
> > > > > > > > on).
> >
> > I've looked at that, and I'm afraid it doesn't go in the right
> > direction. The drm_encoder.crtc field is deprecated and documented as
> > only meaningful for non-atomic drivers. You're not introducing its
> > usage, but moving the configuration code from .enable() to the runtime
> > PM resume handler will make it impossible to fix this. The driver should
> > instead move to the .atomic_enable() function. If you need
> > enable/pre_enable in the DSI encoder, then you should turn it into a
> > drm_bridge.
>
> Is this something you're looking at by any chance ? I'm testing the
> ti-sn65dsi83 driver with VC4. I've spent a couple of hours debugging,
> only to realise that the vc4_dsi driver (before the rework you mention
> above) doesn't call .mode_set() on the bridges... Applying my sn65dsi83
> series that removes .mode_set() didn't help much as vc4_dsi doesn't call
> the atomic operations either :-) I'll test your branch now.

This is one of the reasons for my email earlier today - thank you for
your reply.

The current mainline vc4_dsi driver deliberately breaks the bridge
chain so that it gets called before the panel/bridge pre_enable and
can power everything up, therefore pre_enable can call host_transfer
to configure the panel/bridge over the DSI interface.
However we've both noted that it doesn't forward on the mode_set and
mode_valid calls, and my investigations say that it doesn't have
enough information to make those calls.

My branch returns the chain to normal, and 

Re: [Freedreno] [RFC 2/6] drm/msm/dpu: support setting up two independent DSI connectors

2021-07-02 Thread Dmitry Baryshkov

On 02/07/2021 18:52, abhin...@codeaurora.org wrote:

On 2021-07-02 02:20, Dmitry Baryshkov wrote:

On 02/07/2021 00:12, abhin...@codeaurora.org wrote:

On 2021-06-09 14:17, Dmitry Baryshkov wrote:

Move setting up encoders from set_encoder_mode to
_dpu_kms_initialize_dsi() / _dpu_kms_initialize_displayport(). This
allows us to support not only "single DSI" and "dual DSI" but also "two
independent DSI" configurations. In future this would also help adding
support for multiple DP connectors.

Signed-off-by: Dmitry Baryshkov 
I will have to see Bjorn's changes to check why it was dependent on 
this cleanup.

Is the plan to call _dpu_kms_initialize_displayport() twice?


Yes. He needs to initialize several displayport interfaces. With the
current code he has to map ids in the set_encoder_mode, using encoder
ids (to fill up the info.h_tile_instance, which is hardcoded to 0 for
DP in the current code).

But I am still not able to put together where the dependency of that
series on this one is. Can you please elaborate on that a little bit?


It is possible to support independent outputs with the current code. I
did that for DSI, Bjorn did for DP. However it results in quite an
ugly code to map received encoder in set_encoder_mode back to the DSI
(DP) instances to fill the h_tiles. If we drop the whole
set_encoder_mode story and call dpu_encoder_setup right from the
_dpu_kms_initialize_dsi() (or _dpu_kms_initialize_displayport()),
supporting multiple outputs becomes an easy task.


Okay got it, I think it will become more clear once he posts.



---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 89 -
 1 file changed, 44 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 1d3a4f395e74..b63e1c948ff2 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -471,30 +471,55 @@ static int _dpu_kms_initialize_dsi(struct drm_device *dev,
 struct dpu_kms *dpu_kms)
 {
 struct drm_encoder *encoder = NULL;
+    struct msm_display_info info;
 int i, rc = 0;

 if (!(priv->dsi[0] || priv->dsi[1]))
 return rc;

-    /*TODO: Support two independent DSI connectors */
-    encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI);
-    if (IS_ERR(encoder)) {
-    DPU_ERROR("encoder init failed for dsi display\n");
-    return PTR_ERR(encoder);
-    }
-
-    priv->encoders[priv->num_encoders++] = encoder;
-
 for (i = 0; i < ARRAY_SIZE(priv->dsi); i++) {
 if (!priv->dsi[i])
 continue;

+    if (!encoder) {
+    encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI);
+    if (IS_ERR(encoder)) {
+    DPU_ERROR("encoder init failed for dsi display\n");
+    return PTR_ERR(encoder);
+    }
+
+    priv->encoders[priv->num_encoders++] = encoder;
+
+    memset(, 0, sizeof(info));
+    info.intf_type = encoder->encoder_type;
+    info.capabilities = msm_dsi_is_cmd_mode(priv->dsi[i]) ?
+    MSM_DISPLAY_CAP_CMD_MODE :
+    MSM_DISPLAY_CAP_VID_MODE;
+    }
+
 rc = msm_dsi_modeset_init(priv->dsi[i], dev, encoder);
 if (rc) {
 DPU_ERROR("modeset_init failed for dsi[%d], rc = %d\n",
 i, rc);
 break;
 }
+
+    info.h_tile_instance[info.num_of_h_tiles++] = i;
+
+    if (!msm_dsi_is_dual_dsi(priv->dsi[i])) {


I would like to clarify the terminology of dual_dsi in the current 
DSI driver before the rest of the reviews.
Today IS_DUAL_DSI() means that two DSIs are driving the same display 
and the two DSIs are operating in master-slave mode and are being 
driven by the same PLL.


Yes

Usually, dual independent DSI means two DSIs driving two separate 
panels using two separate PLLs ( DSI0 with PLL0 and DSI1 with PLL1)


Let's stop calling it 'dual'. I'd suggest to continue using what was
there in the source file: 'two independent DSI'.

I assume that's happening due to the following logic, and both DSI PHYs 
are operating in STANDALONE mode:


 if (!IS_DUAL_DSI()) {
 ret = msm_dsi_host_register(msm_dsi->host, true);
 if (ret)
 return ret;

 msm_dsi_phy_set_usecase(msm_dsi->phy, MSM_DSI_PHY_STANDALONE);
 ret = msm_dsi_host_set_src_pll(msm_dsi->host, msm_dsi->phy);


Yes. If we have two independent DSI outputs, we'd like them to work in
STANDALONE mode.



+    rc = dpu_encoder_setup(dev, encoder, );
+    if (rc)
+    DPU_ERROR("failed to setup DPU encoder %d: rc:%d\n",
+    encoder->base.id, rc);
+    encoder = NULL;
+    }
+    }
+
+    if (encoder) {


We will hit this case only for split-DSI right? ( that is two DSIs 
driving the same panel ).


Yes, only in this case.

Even single DSI will be created in the above loop now. So this looks 
a bit confusing 

Re: [PATCH 1/2] drm/i915/gem: Correct the locking and pin pattern for dma-buf

2021-07-02 Thread Daniel Vetter
On Thu, Jul 1, 2021 at 4:24 PM Michael J. Ruhl  wrote:
>
> From: Thomas Hellström 
>
> If our exported dma-bufs are imported by another instance of our driver,
> that instance will typically have the imported dma-bufs locked during
> dma_buf_map_attachment(). But the exporter also locks the same reservation
> object in the map_dma_buf() callback, which leads to recursive locking.
>
> So taking the lock inside _pin_pages_unlocked() is incorrect.
>
> Additionally, the current pinning code path is contrary to the defined
> way that pinning should occur.
>
> Remove the explicit pin/unpin from the map/umap functions and move them
> to the attach/detach allowing correct locking to occur, and to match
> the static dma-buf drm_prime pattern.
>
> Add a live selftest to exercise both dynamic and non-dynamic
> exports.
>
> v2:
> - Extend the selftest with a fake dynamic importer.
> - Provide real pin and unpin callbacks to not abuse the interface.
> v3: (ruhl)
> - Remove the dynamic export support and move the pinning into the
>   attach/detach path.
>
> Reported-by: Michael J. Ruhl 
> Signed-off-by: Thomas Hellström 
> Signed-off-by: Michael J. Ruhl 

CI splat is because I got the locking rules wrong, I thought
->attach/detach are called under the dma_resv_lock, because when we
used the old dma_buf->lock those calls were protected by that lock
under the same critical section as adding/removing from the list. But
we changed that in

f45f57cce584 ("dma-buf: stop using the dmabuf->lock so much v2")
15fd552d186c ("dma-buf: change DMA-buf locking convention v3")

Because keeping dma_resv_lock over ->attach/detach would go boom on
all the ttm drivers, which pin/unpin the buffer in there. Iow we need
the unlocked version there, but also having this split up is a bit
awkward and might be good to patch up so that it's atomic again. Would
mean updating a bunch of drivers. Christian, any thoughts?

Mike, for now I'd just keep using the _unlocked variants and we should be fine.
-Daniel

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|  46 ++--
>  .../drm/i915/gem/selftests/i915_gem_dmabuf.c  | 111 +-
>  2 files changed, 143 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> index 616c3a2f1baf..00338c8d3739 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> @@ -12,6 +12,8 @@
>  #include "i915_gem_object.h"
>  #include "i915_scatterlist.h"
>
> +I915_SELFTEST_DECLARE(static bool force_different_devices;)
> +
>  static struct drm_i915_gem_object *dma_buf_to_obj(struct dma_buf *buf)
>  {
> return to_intel_bo(buf->priv);
> @@ -25,15 +27,11 @@ static struct sg_table *i915_gem_map_dma_buf(struct 
> dma_buf_attachment *attachme
> struct scatterlist *src, *dst;
> int ret, i;
>
> -   ret = i915_gem_object_pin_pages_unlocked(obj);
> -   if (ret)
> -   goto err;
> -
> /* Copy sg so that we make an independent mapping */
> st = kmalloc(sizeof(struct sg_table), GFP_KERNEL);
> if (st == NULL) {
> ret = -ENOMEM;
> -   goto err_unpin_pages;
> +   goto err;
> }
>
> ret = sg_alloc_table(st, obj->mm.pages->nents, GFP_KERNEL);
> @@ -58,8 +56,6 @@ static struct sg_table *i915_gem_map_dma_buf(struct 
> dma_buf_attachment *attachme
> sg_free_table(st);
>  err_free:
> kfree(st);
> -err_unpin_pages:
> -   i915_gem_object_unpin_pages(obj);
>  err:
> return ERR_PTR(ret);
>  }
> @@ -68,13 +64,9 @@ static void i915_gem_unmap_dma_buf(struct 
> dma_buf_attachment *attachment,
>struct sg_table *sg,
>enum dma_data_direction dir)
>  {
> -   struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
> -
> dma_unmap_sgtable(attachment->dev, sg, dir, DMA_ATTR_SKIP_CPU_SYNC);
> sg_free_table(sg);
> kfree(sg);
> -
> -   i915_gem_object_unpin_pages(obj);
>  }
>
>  static int i915_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct dma_buf_map 
> *map)
> @@ -168,7 +160,32 @@ static int i915_gem_end_cpu_access(struct dma_buf 
> *dma_buf, enum dma_data_direct
> return err;
>  }
>
> +/**
> + * i915_gem_dmabuf_attach - Do any extra attach work necessary
> + * @dmabuf: imported dma-buf
> + * @attach: new attach to do work on
> + *
> + */
> +static int i915_gem_dmabuf_attach(struct dma_buf *dmabuf,
> + struct dma_buf_attachment *attach)
> +{
> +   struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
> +
> +   assert_object_held(obj);
> +   return i915_gem_object_pin_pages(obj);
> +}
> +
> +static void i915_gem_dmabuf_detach(struct dma_buf *dmabuf,
> + struct dma_buf_attachment *attach)
> +{
> +   struct drm_i915_gem_object *obj = 

Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-02 Thread Alyssa Rosenzweig
> > What is handle? What is point?
> 
> Handle is a syncobj handle, point is the point in a syncobj timeline.
> I'll document those fields.

OK.

> > Why is there padding instead of putting point first?
> 
> We can move the point field first, but we need to keep the explicit
> padding: the struct has to be 64bit aligned because of the __u64 field
> (which the compiler takes care of) but if we don't have an explicit
> padding, the unused 32bits are undefined, which might cause trouble if
> we extend the struct at some point, since we sort of expect that old
> userspace keep this unused 32bit slot to 0, while new users set
> non-zero values if they have to.

Makes sense. Reordering still probably makes sense.

> > ```
> > >  #define PANFROST_BO_REF_EXCLUSIVE0x1
> > > +#define PANFROST_BO_REF_NO_IMPLICIT_DEP  0x2  
> > ```
> > 
> > This seems logically backwards. NO_IMPLICIT_DEP makes sense if we're
> > trying to keep backwards compatibility, but here you're crafting a new
> > interface totally from scratch. If anything, isn't BO_REF_IMPLICIT_DEP
> > the flag you'd want?
> 
> AFAICT, all other drivers make the no-implicit-dep an opt-in, and I
> didn't want to do things differently in panfrost. But if that's really
> an issue, I can make it an opt-out.

I don't have strong feelings either way. I was just under the
impressions other drivers did this for b/w compat reasons which don't
apply here.

> > Hmm. I'm not /opposed/ and I know kbase uses strides but it seems like
> > somewhat unwarranted complexity, and there is a combinatoric explosion
> > here (if jobs, bo refs, and syncobj refs use 3 different versions, as
> > this encoding permits... as opposed to just specifying a UABI version or
> > something like that)
> 
> Sounds like a good idea. I'll add a version field and map that
> to a <job_stride, bo_ref_stride, syncobj_ref_stride> tuple.

Cc Steven, does this make sense?

> > > + /**
> > > +  * If the submission fails, this encodes the index of the job
> > > +  * that failed.
> > > +  */
> > > + __u32 fail_idx;  
> > ```
> > 
> > What if multiple jobs fail?
> 
> We stop at the first failure. Note that it's not an execution failure,
> but a submission failure (AKA, userspace passed wrong params, like
> invalid BO or syncobj handles).

I see, ok.


Re: [PATCH] drm/vc4: dsi: Only register our component once a DSI device is attached

2021-07-02 Thread Laurent Pinchart
Hi Dave,

On Mon, Jun 21, 2021 at 04:59:51PM +0300, Laurent Pinchart wrote:
> On Mon, Jun 21, 2021 at 04:09:05PM +0300, Laurent Pinchart wrote:
> > On Mon, Jun 21, 2021 at 03:56:16PM +0300, Laurent Pinchart wrote:
> > > On Mon, Jun 21, 2021 at 12:49:14PM +0100, Dave Stevenson wrote:
> > > > On Sun, 20 Jun 2021 at 23:49, Laurent Pinchart wrote:
> > > > > On Sun, Jun 20, 2021 at 09:42:27PM +0300, Laurent Pinchart wrote:
> > > > > > On Sun, Jun 20, 2021 at 03:29:03PM +0100, Dave Stevenson wrote:
> > > > > > > On Sun, 20 Jun 2021 at 04:26, Laurent Pinchart wrote:
> > > > > > > >
> > > > > > > > Hi Maxime,
> > > > > > > >
> > > > > > > > I'm testing this, and I'm afraid it causes an issue with all the
> > > > > > > > I2C-controlled bridges. I'm focussing on the newly merged 
> > > > > > > > ti-sn65dsi83
> > > > > > > > driver at the moment, but other are affected the same way.
> > > > > > > >
> > > > > > > > With this patch, the DSI component is only added when the DSI 
> > > > > > > > device is
> > > > > > > > attached to the host with mipi_dsi_attach(). In the 
> > > > > > > > ti-sn65dsi83 driver,
> > > > > > > > this happens in the bridge attach callback, which is called 
> > > > > > > > when the
> > > > > > > > bridge is attached by a call to drm_bridge_attach() in 
> > > > > > > > vc4_dsi_bind().
> > > > > > > > This creates a circular dependency, and the DRM/KMS device is 
> > > > > > > > never
> > > > > > > > created.
> > > > > > > >
> > > > > > > > How should this be solved ? Dave, I think you have shown an 
> > > > > > > > interest in
> > > > > > > > the sn65dsi83 recently, any help would be appreciated. On a 
> > > > > > > > side note,
> > > > > > > > I've tested the ti-sn65dsi83 driver on a v5.10 RPi kernel, 
> > > > > > > > without much
> > > > > > > > success (on top of commit e1499baa0b0c I get a very weird frame 
> > > > > > > > rate -
> > > > > > > > 147 fps or 99 fps instead of 60 fps - and nothing on the 
> > > > > > > > screen, and on
> > > > > > > > top of the latest v5.10 RPi branch, I get lock-related warnings 
> > > > > > > > at every
> > > > > > > > page flip), which is why I tried v5.12 and noticed this patch. 
> > > > > > > > Is it
> > > > > > > > worth trying to bring up the display on the v5.10 RPi kernel in 
> > > > > > > > parallel
> > > > > > > > to fixing the issue introduced in this patch, or is DSI known 
> > > > > > > > to be
> > > > > > > > broken there ?
> > > > > > >
> > > > > > > I've been looking at SN65DSI83/4, but as I don't have any hardware
> > > > > > > I've largely been suggesting things to try to those on the forums 
> > > > > > > who
> > > > > > > do [1].
> > > > > > >
> > > > > > > My branch at 
> > > > > > > https://github.com/6by9/linux/tree/rpi-5.10.y-sn65dsi8x-marek
> > > > > > > is the latest one I've worked on. It's rpi-5.10.y with Marek's 
> > > > > > > driver
> > > > > > > cherry-picked, and an overlay and simple-panel definition by 
> > > > > > > others.
> > > > > > > It also has a rework for vc4_dsi to use pm_runtime, instead of
> > > > > > > breaking up the DSI bridge chain (which is flawed as it never 
> > > > > > > calls
> > > > > > > the bridge mode_set or mode_valid functions which sn65dsi83 relies
> > > > > > > on).
> 
> I've looked at that, and I'm afraid it doesn't go in the right
> direction. The drm_encoder.crtc field is deprecated and documented as
> only meaningful for non-atomic drivers. You're not introducing its
> usage, but moving the configuration code from .enable() to the runtime
> PM resume handler will make it impossible to fix this. The driver should
> instead move to the .atomic_enable() function. If you need
> enable/pre_enable in the DSI encoder, then you should turn it into a
> drm_bridge.

Is this something you're looking at by any chance ? I'm testing the
ti-sn65dsi83 driver with VC4. I've spent a couple of hours debugging,
only to realise that the vc4_dsi driver (before the rework you mention
above) doesn't call .mode_set() on the bridges... Applying my sn65dsi83
series that removes .mode_set() didn't help much as vc4_dsi doesn't call
the atomic operations either :-) I'll test your branch now.

> > > > > > > I ran it on Friday in the lab and encountered an issue with 
> > > > > > > vc4_dsi
> > > > > > > should vc4_dsi_encoder_mode_fixup wish for a divider of 7 
> > > > > > > (required
> > > > > > > for this 800x1280 panel over 4 lanes) where it resulted in an 
> > > > > > > invalid
> > > > > > > mode configuration. That resulted in patch [2] which then gave me
> > > > > > > sensible numbers.
> 
> I have that commit in my branch, but still get 125 fps instead of 60 fps
> with kmstest --flip (after reverting commit 1c3834201272 "drm/vc4:
> Increase the core clock based on HVS load"). I'm not sure if [2] is the
> cause of this, but there seems to be an improvement: in my previous
> tests, the mode was fixed up every time I would start the application,
> with the timings getting more and more bizarre at every run :-)
> 
> > > > 

Re: Questions over DSI within DRM.

2021-07-02 Thread Laurent Pinchart
Hi Dave,

(Expanding the CC list a bit)

On Fri, Jul 02, 2021 at 12:03:31PM +0100, Dave Stevenson wrote:
> Hi All
> 
> I'm trying to get DSI devices working reliably on the Raspberry Pi,
> but I'm hitting a number of places where it isn't clear as to the
> expected behaviour within DRM.

Not a surprise. I dread reading the rest of this e-mail though :-)

> Power on state. Many devices want the DSI clock and/or data lanes in
> LP-11 state when they are powered up.

When they are powered up, or when they are enabled ?

> With the normal calling sequence of:
> - panel/bridge pre_enable calls from connector towards the encoder.
> - encoder enable which also enables video.
> - panel/bridge enable calls from encoder to connector.
> there is no point at which the DSI tx is initialised but not
> transmitting video. What DSI states are expected to be adopted at each
> point?

That's undefined I'm afraid, and it should be documented. The upside is
that you can propose the behaviour that you need :-)

> On a similar theme, some devices want the clock lane in HS mode early
> so they can use it in place of an external oscillator, but the data
> lanes still in LP-11. There appears to be no way for the
> display/bridge to signal this requirement or it be achieved.

You're right. A long time ago, the omapdrm driver had an internal
infrastructure that didn't use drm_bridge or drm_panel and instead
required omapdrm-specific drivers for those components. It used to model
the display pipeline in a different way than drm_bridge, with the sink
explicitly setting the source state. A DSI sink could thus control its
enable sequence, interleaving programming of the sink with control of
the source.

Migrating omapdrm to the drm_bridge model took a really large effort,
which makes me believe that transitioning the whole subsystem to
sink-controlled sources would be close to impossible. We could add
DSI-specific operations, or add another enable bridge operation
(post_pre_enable ? :-D). Neither would scale, but it may be enough.

> host_transfer calls can supposedly be made at any time, however unless
> MIPI_DSI_MSG_USE_LPM is set in the message then we're meant to send it
> in high speed mode. If this is before a mode has been set, what
> defines the link frequency parameters at this point? Adopting a random
> default sounds like a good way to get undefined behaviour.
> 
> DSI burst mode needs to set the DSI link frequency independently of
> the display mode. How is that meant to be configured? I would have
> expected it to come from DT due to link frequency often being chosen
> based on EMC restrictions, but I don't see such a thing in any
> binding.

Undefined too. DSI support was added to DRM without any design effort,
it's more a hack than a real solution. The issue with devices that can
be controlled over both DSI and I2C is completely unhandled. So far
nobody has really cared about implementing DSI right as far as I can
tell.

> As a follow on, bridge devices can support burst mode (eg TI's
> SN65DSI83 that's just been merged), so it needs to know the desired
> panel timings for the output side of the bridge, but the DSI link
> timings to set up the bridge's PLL. What's the correct way for
> signalling that? drm_crtc_state->adjusted_mode vs
> drm_crtc_state->mode? Except mode is userspace's request, not what has
> been validated/updated by the panel/bridge.

adjusted_mode is also a bit of a hack, it solves very specific issues,
and its design assumes a single encoder in the chain with no extra
bridges. We should instead add modes to the bridge state, and negotiate
modes along the pipeline the same way we negotiate formats.

> vc4 has constraints that the DSI host interface is fed off an integer
> divider from a typically 3GHz clock, so the host interface needs to
> signal that burst mode is in use even if the panel/bridge doesn't need
> to run in burst mode. (This does mean that displays that require a
> very precise link frequency can not be supported).
> It currently updates the adjusted_mode via drm_encoder_helper_funcs
> mode_fixup, but is that the correct thing to do, or is there a better
> solution?
> I'd have expected the DSI tx to be responsible for configuring burst
> mode parameters anyway, so the mechanism required would seem to be
> just the normal approach for adopting burst mode if that is defined.
> 
> Some DSI host interfaces are implemented as bridges, others are
> encoders. Pro's and con's of each? I suspect I'm just missing the
> history here.

It's indeed history. drm_encoder can't go away as it has been erroneously
exposed to userspace, but going forward, everything should be a bridge.
The drm_encoder will still be required, but should just be a dummy,
representing the chain of bridges.

> When it comes to the MIPI_DSI_MODE_* flags, which ones are mutually
> exclusive, or are assumed based on others? Does a burst mode DSI sink
> set both MIPI_DSI_MODE_VIDEO and MIPI_DSI_MODE_VIDEO_BURST, or just
> the 

Re: [PATCH v3 4/7] drm/panfrost: Add the ability to create submit queues

2021-07-02 Thread Boris Brezillon
On Fri, 2 Jul 2021 17:49:10 +0200
Boris Brezillon  wrote:

> On Fri, 2 Jul 2021 16:05:30 +0100
> Steven Price  wrote:
> 
> > On 02/07/2021 15:32, Boris Brezillon wrote:  
> > > Needed to keep VkQueues isolated from each other.
> > > 
> > > v3:
> > > * Limit the number of submitqueue per context to 16
> > > * Fix a deadlock
> > > 
> > > Signed-off-by: Boris Brezillon 
> > 
> > 16 ought to be enough for anyone ;)
> > 
> > Reviewed-by: Steven Price   
> 
> Oops, forgot to change the submitqueue_get() prototype. Will address
> that in v4.

I meant submitqueue_create().


Re: [Freedreno] [RFC 2/6] drm/msm/dpu: support setting up two independent DSI connectors

2021-07-02 Thread abhinavk

On 2021-07-02 02:20, Dmitry Baryshkov wrote:

On 02/07/2021 00:12, abhin...@codeaurora.org wrote:

On 2021-06-09 14:17, Dmitry Baryshkov wrote:

Move setting up encoders from set_encoder_mode to
_dpu_kms_initialize_dsi() / _dpu_kms_initialize_displayport(). This
allows us to support not only "single DSI" and "dual DSI" but also
"two independent DSI" configurations. In future this would also help
adding support for multiple DP connectors.

Signed-off-by: Dmitry Baryshkov 
I will have to see Bjorn's changes to check why it was dependent on 
this cleanup.

Is the plan to call _dpu_kms_initialize_displayport() twice?


Yes. He needs to initialize several displayport interfaces. With the
current code he has to map ids in the set_encoder_mode, using encoder
ids (to fill up the info.h_tile_instance, which is hardcoded to 0 for
DP in the current code).

But still I am not able to put together where is the dependency on
that series with this one. Can you please elaborate on that a little bit?


It is possible to support independent outputs with the current code. I
did that for DSI, Bjorn did for DP. However it results in quite an
ugly code to map received encoder in set_encoder_mode back to the DSI
(DP) instances to fill the h_tiles. If we drop the whole
set_encoder_mode story and call dpu_encoder_setup right from the
_dpu_kms_initialize_dsi() (or _dpu_kms_initialize_displayport()),
supporting multiple outputs becomes an easy task.


Okay got it, I think it will become more clear once he posts.



---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 89 -
 1 file changed, 44 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 1d3a4f395e74..b63e1c948ff2 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -471,30 +471,55 @@ static int _dpu_kms_initialize_dsi(struct drm_device *dev,

 struct dpu_kms *dpu_kms)
 {
 struct drm_encoder *encoder = NULL;
+    struct msm_display_info info;
 int i, rc = 0;

 if (!(priv->dsi[0] || priv->dsi[1]))
 return rc;

-    /*TODO: Support two independent DSI connectors */
-    encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI);
-    if (IS_ERR(encoder)) {
-    DPU_ERROR("encoder init failed for dsi display\n");
-    return PTR_ERR(encoder);
-    }
-
-    priv->encoders[priv->num_encoders++] = encoder;
-
 for (i = 0; i < ARRAY_SIZE(priv->dsi); i++) {
 if (!priv->dsi[i])
 continue;

+    if (!encoder) {
+    encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI);
+    if (IS_ERR(encoder)) {
+    DPU_ERROR("encoder init failed for dsi display\n");
+    return PTR_ERR(encoder);
+    }
+
+    priv->encoders[priv->num_encoders++] = encoder;
+
+    memset(&info, 0, sizeof(info));
+    info.intf_type = encoder->encoder_type;
+    info.capabilities = msm_dsi_is_cmd_mode(priv->dsi[i]) ?
+    MSM_DISPLAY_CAP_CMD_MODE :
+    MSM_DISPLAY_CAP_VID_MODE;
+    }
+
 rc = msm_dsi_modeset_init(priv->dsi[i], dev, encoder);
 if (rc) {
 DPU_ERROR("modeset_init failed for dsi[%d], rc = %d\n",
 i, rc);
 break;
 }
+
+    info.h_tile_instance[info.num_of_h_tiles++] = i;
+
+    if (!msm_dsi_is_dual_dsi(priv->dsi[i])) {


I would like to clarify the terminology of dual_dsi in the current DSI 
driver before the rest of the reviews.
Today IS_DUAL_DSI() means that two DSIs are driving the same display
and the two DSIs are operating in master-slave mode and are being
driven by the same PLL.


Yes

Usually, dual independent DSI means two DSIs driving two separate 
panels using two separate PLLs ( DSI0 with PLL0 and DSI1 with PLL1)


Let's stop calling it 'dual'. I'd suggest to continue using what was
there in the source file: 'two independent DSI'.

I assume that's happening due to the following logic and both DSI PHYs are 
operating in STANDALONE mode:


     if (!IS_DUAL_DSI()) {
     ret = msm_dsi_host_register(msm_dsi->host, true);
     if (ret)
     return ret;

     msm_dsi_phy_set_usecase(msm_dsi->phy, 
MSM_DSI_PHY_STANDALONE);

     ret = msm_dsi_host_set_src_pll(msm_dsi->host, msm_dsi->phy);


Yes. If we have two independent DSI outputs, we'd like them to work in
STANDALONE mode.



+    rc = dpu_encoder_setup(dev, encoder, &info);
+    if (rc)
+    DPU_ERROR("failed to setup DPU encoder %d: rc:%d\n",
+    encoder->base.id, rc);
+    encoder = NULL;
+    }
+    }
+
+    if (encoder) {


We will hit this case only for split-DSI right? ( that is two DSIs 
driving the same panel ).


Yes, only in this case.

Even single DSI will be created in the above loop now. So this looks a 
bit confusing at the moment.


What is so confusing? I can 

Re: [PATCH v3 4/7] drm/panfrost: Add the ability to create submit queues

2021-07-02 Thread Boris Brezillon
On Fri, 2 Jul 2021 16:05:30 +0100
Steven Price  wrote:

> On 02/07/2021 15:32, Boris Brezillon wrote:
> > Needed to keep VkQueues isolated from each other.
> > 
> > v3:
> > * Limit the number of submitqueue per context to 16
> > * Fix a deadlock
> > 
> > Signed-off-by: Boris Brezillon   
> 
> 16 ought to be enough for anyone ;)
> 
> Reviewed-by: Steven Price 

Oops, forgot to change the submitqueue_get() prototype. Will address
that in v4.


Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-02 Thread Boris Brezillon
On Fri, 2 Jul 2021 11:13:16 -0400
Alyssa Rosenzweig  wrote:

> ```
> > +/* Syncobj reference passed at job submission time to encode explicit
> > + * input/output fences.
> > + */
> > +struct drm_panfrost_syncobj_ref {
> > +   __u32 handle;
> > +   __u32 pad;
> > +   __u64 point;
> > +};  
> ```
> 
> What is handle? What is point?

Handle is a syncobj handle, point is the point in a syncobj timeline.
I'll document those fields.

> Why is there padding instead of putting point first?

We can move the point field first, but we need to keep the explicit
padding: the struct has to be 64bit aligned because of the __u64 field
(which the compiler takes care of) but if we don't have an explicit
padding, the unused 32bits are undefined, which might cause trouble if
we extend the struct at some point, since we sort of expect that old
userspace keeps this unused 32bit slot set to 0, while new users set
non-zero values if they have to.

> 
> ```
> >  #define PANFROST_BO_REF_EXCLUSIVE  0x1
> > +#define PANFROST_BO_REF_NO_IMPLICIT_DEP0x2  
> ```
> 
> This seems logically backwards. NO_IMPLICIT_DEP makes sense if we're
> trying to keep backwards compatibility, but here you're crafting a new
> interface totally from scratch. If anything, isn't BO_REF_IMPLICIT_DEP
> the flag you'd want?

AFAICT, all other drivers make the no-implicit-dep an opt-in, and I
didn't want to do things differently in panfrost. But if that's really
an issue, I can make it an opt-out.

> 
> ```
> > +   /**
> > +* Stride of the jobs array (needed to ease extension of the
> > +* BATCH_SUBMIT ioctl). Should be set to
> > +* sizeof(struct drm_panfrost_job).
> > +*/
> > +   __u32 job_stride;  
> ...
> > +   /**
> > +* Stride of the BO and syncobj reference arrays (needed to ease
> > +* extension of the BATCH_SUBMIT ioctl). Should be set to
> > +* sizeof(struct drm_panfrost_bo_ref).
> > +*/
> > +   __u32 bo_ref_stride;
> > +   __u32 syncobj_ref_stride;  
> ```
> 
> Hmm. I'm not /opposed/ and I know kbase uses strides but it seems like
> somewhat unwarranted complexity, and there is a combinatoric explosion
> here (if jobs, bo refs, and syncobj refs use 3 different versions, as
> this encoding permits... as opposed to just specifying a UABI version or
> something like that)

Sounds like a good idea. I'll add a version field and map that
to a <job_stride, bo_ref_stride, syncobj_ref_stride> tuple.

> 
> ```
> > +   /**
> > +* If the submission fails, this encodes the index of the job
> > +* that failed.
> > +*/
> > +   __u32 fail_idx;  
> ```
> 
> What if multiple jobs fail?

We stop at the first failure. Note that it's not an execution failure,
but a submission failure (AKA, userspace passed wrong params, like
invalid BO or syncobj handles).


Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-02 Thread Alyssa Rosenzweig
> Better, but I was hoping we can mostly delete panfrost_ioctl_submit(),
> leaving something along the lines of:
> 
> static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
>   struct drm_file *file)
> {
>   struct panfrost_submitqueue *queue;
>   struct drm_panfrost_submit *args = data;
>   struct drm_panfrost_job submit_args = {
>   .head = args->jc,
>   .bos = args->bo_handles,
>   .in_syncs = args->in_syncs,
> >   .out_syncs = &args->out_sync, // FIXME
>   .in_sync_count = args->in_sync_count,
>   .out_sync_count = args->out_sync > 0 ? 1 : 0,
>   .bo_count = args->bo_handle_count,
>   .requirements = args->requirements
>   };
>   int ret;
> 
>   queue = panfrost_submitqueue_get(file->driver_priv, 0);
> 
> >   ret = panfrost_submit_job(dev, file, queue, &submit_args,
> sizeof(u32), ...);
> 
>   return ret;
> }
> 
> But obviously the out_sync part needs special handling as we can't just
> pass a kernel pointer in like that ;)

This, a dozen times this.


Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-02 Thread Steven Price
On 02/07/2021 15:32, Boris Brezillon wrote:
> This should help limit the number of ioctls when submitting multiple
> jobs. The new ioctl also supports syncobj timelines and BO access flags.
> 
> v3:
> * Re-use panfrost_get_job_bos() and panfrost_get_job_in_syncs() in the
>   old submit path
> 
> Signed-off-by: Boris Brezillon 

Better, but I was hoping we can mostly delete panfrost_ioctl_submit(),
leaving something along the lines of:

static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
struct drm_file *file)
{
struct panfrost_submitqueue *queue;
struct drm_panfrost_submit *args = data;
struct drm_panfrost_job submit_args = {
.head = args->jc,
.bos = args->bo_handles,
.in_syncs = args->in_syncs,
> .out_syncs = &args->out_sync, // FIXME
.in_sync_count = args->in_sync_count,
.out_sync_count = args->out_sync > 0 ? 1 : 0,
.bo_count = args->bo_handle_count,
.requirements = args->requirements
};
int ret;

queue = panfrost_submitqueue_get(file->driver_priv, 0);

> ret = panfrost_submit_job(dev, file, queue, &submit_args,
  sizeof(u32), ...);

return ret;
}

But obviously the out_sync part needs special handling as we can't just
pass a kernel pointer in like that ;)

I'd like the above to avoid the duplication of things like this:

> + kref_init(&job->refcount);
> +
> + job->pfdev = pfdev;
> + job->jc = args->head;
> + job->requirements = args->requirements;
> + job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
> + job->file_priv = file_priv->driver_priv;
> + xa_init_flags(&job->deps, XA_FLAGS_ALLOC);

As otherwise someone is going to mess up in the future and this is going
to diverge between the two ioctls.

Steve

> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 366 +++-
>  drivers/gpu/drm/panfrost/panfrost_job.c |   3 +
>  include/uapi/drm/panfrost_drm.h |  84 ++
>  3 files changed, 375 insertions(+), 78 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 6529e5972b47..e2897de6e77d 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -138,111 +138,95 @@ panfrost_get_job_mappings(struct drm_file *file_priv, 
> struct panfrost_job *job)
>   return 0;
>  }
>  
> -/**
> - * panfrost_lookup_bos() - Sets up job->bo[] with the GEM objects
> - * referenced by the job.
> - * @dev: DRM device
> - * @file_priv: DRM file for this fd
> - * @args: IOCTL args
> - * @job: job being set up
> - *
> - * Resolve handles from userspace to BOs and attach them to job.
> - *
> - * Note that this function doesn't need to unreference the BOs on
> - * failure, because that will happen at panfrost_job_cleanup() time.
> - */
> +#define PANFROST_BO_REF_ALLOWED_FLAGS \
> + (PANFROST_BO_REF_EXCLUSIVE | PANFROST_BO_REF_NO_IMPLICIT_DEP)
> +
>  static int
> -panfrost_lookup_bos(struct drm_device *dev,
> -   struct drm_file *file_priv,
> -   struct drm_panfrost_submit *args,
> -   struct panfrost_job *job)
> +panfrost_get_job_bos(struct drm_file *file_priv,
> +  u64 refs, u32 ref_stride, u32 count,
> +  struct panfrost_job *job)
>  {
> + void __user *in = u64_to_user_ptr(refs);
>   unsigned int i;
> - int ret;
>  
> - job->bo_count = args->bo_handle_count;
> + job->bo_count = count;
>  
> - if (!job->bo_count)
> + if (!count)
>   return 0;
>  
> + job->bos = kvmalloc_array(job->bo_count, sizeof(*job->bos),
> +   GFP_KERNEL | __GFP_ZERO);
>   job->bo_flags = kvmalloc_array(job->bo_count,
>  sizeof(*job->bo_flags),
>  GFP_KERNEL | __GFP_ZERO);
> - if (!job->bo_flags)
> + if (!job->bos || !job->bo_flags)
>   return -ENOMEM;
>  
> - for (i = 0; i < job->bo_count; i++)
> - job->bo_flags[i] = PANFROST_BO_REF_EXCLUSIVE;
> + for (i = 0; i < count; i++) {
> + struct drm_panfrost_bo_ref ref = { };
> + int ret;
>  
> - ret = drm_gem_objects_lookup(file_priv,
> -  (void __user *)(uintptr_t)args->bo_handles,
> -  job->bo_count, >bos);
> - if (ret)
> - return ret;
> + ret = copy_struct_from_user(&ref, sizeof(ref),
> + in + (i * ref_stride),
> + ref_stride);
> + if (ret)
> + return ret;
>  
> - return panfrost_get_job_mappings(file_priv, job);
> + /* Prior to the BATCH_SUBMIT ioctl all accessed BOs were
> +  * treated as exclusive.
> +   

Re: [PATCH v3] drm/dbi: Print errors for mipi_dbi_command()

2021-07-02 Thread Noralf Trønnes



Den 02.07.2021 15.56, skrev Linus Walleij:
> The macro mipi_dbi_command() does not report errors unless you wrap it
> in another macro to do the error reporting.
> 
> Report a rate-limited error so we know what is going on.
> 
> Drop the only user in DRM using mipi_dbi_command() and actually checking
> the error explicitly, let it use mipi_dbi_command_buf() directly
> instead.

You forgot to remove this section.

With that fixed:

Reviewed-by: Noralf Trønnes 

> 
> After this any code wishing to send command arrays can rely on
> mipi_dbi_command() providing an appropriate error message if something
> goes wrong.
> 
> Suggested-by: Noralf Trønnes 
> Suggested-by: Douglas Anderson 
> Signed-off-by: Linus Walleij 
> ---
> ChangeLog v2->v3:
> - Make the macro actually return the error value if need be, by
>   putting a single ret; at the end of the macro. (Neat trick from
>   StackOverflow!)
> - Switch the site where I switched mipi_dbi_command() to
>   mipi_dbi_command_buf() back to what it was.
> - Print the failed command in the error message.
> - Put the dbi in (parens) since drivers/gpu/drm/tiny/st7586.c was
>   passing &dbidev->dbi as parameter to mipi_dbi_command()
>   and this would expand to
>   struct device *dev = &&dbidev->dbi->spi->dev
>   which can't be parsed but
>   struct device *dev = &(&dbidev->dbi)->spi->dev;
>   should work. I hope.
> ChangeLog v1->v2:
> - Fish out the struct device * from the DBI SPI client and use
>   that to print the errors associated with the SPI device.
> ---
>  include/drm/drm_mipi_dbi.h | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/include/drm/drm_mipi_dbi.h b/include/drm/drm_mipi_dbi.h
> index f543d6e3e822..05e194958265 100644
> --- a/include/drm/drm_mipi_dbi.h
> +++ b/include/drm/drm_mipi_dbi.h
> @@ -183,7 +183,12 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer 
> *fb,
>  #define mipi_dbi_command(dbi, cmd, seq...) \
>  ({ \
>   const u8 d[] = { seq }; \
> - mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
> + struct device *dev = &(dbi)->spi->dev;  \
> + int ret; \
> + ret = mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
> + if (ret) \
> + dev_err_ratelimited(dev, "error %d when sending command %#02x\n", ret, cmd); \
> + ret; \
>  })
>  
>  #ifdef CONFIG_DEBUG_FS
> 


Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-07-02 Thread Robin Murphy

On 2021-07-02 14:58, Will Deacon wrote:

Hi Nathan,

On Thu, Jul 01, 2021 at 12:52:20AM -0700, Nathan Chancellor wrote:

On 7/1/2021 12:40 AM, Will Deacon wrote:

On Wed, Jun 30, 2021 at 08:56:51AM -0700, Nathan Chancellor wrote:

On Wed, Jun 30, 2021 at 12:43:48PM +0100, Will Deacon wrote:

On Wed, Jun 30, 2021 at 05:17:27PM +0800, Claire Chang wrote:

`BUG: unable to handle page fault for address: 003a8290` and
the fact it crashed at `_raw_spin_lock_irqsave` look like the memory
(maybe dev->dma_io_tlb_mem) was corrupted?
The dev->dma_io_tlb_mem should be set here
(https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/pci/probe.c#n2528)
through device_initialize.


I'm less sure about this. 'dma_io_tlb_mem' should be pointing at
'io_tlb_default_mem', which is a page-aligned allocation from memblock.
The spinlock is at offset 0x24 in that structure, and looking at the
register dump from the crash:

Jun 29 18:28:42 hp-4300G kernel: RSP: 0018:adb4013db9e8 EFLAGS: 00010006
Jun 29 18:28:42 hp-4300G kernel: RAX: 003a8290 RBX:  
RCX: 8900572ad580
Jun 29 18:28:42 hp-4300G kernel: RDX: 89005653f024 RSI: 000c 
RDI: 1d17
Jun 29 18:28:42 hp-4300G kernel: RBP: 0a20d000 R08: 000c 
R09: 
Jun 29 18:28:42 hp-4300G kernel: R10: 0a20d000 R11: 89005653f000 
R12: 0212
Jun 29 18:28:42 hp-4300G kernel: R13: 1000 R14: 0002 
R15: 0020
Jun 29 18:28:42 hp-4300G kernel: FS:  7f1f8898ea40() 
GS:89005728() knlGS:
Jun 29 18:28:42 hp-4300G kernel: CS:  0010 DS:  ES:  CR0: 
80050033
Jun 29 18:28:42 hp-4300G kernel: CR2: 003a8290 CR3: 0001020d 
CR4: 00350ee0
Jun 29 18:28:42 hp-4300G kernel: Call Trace:
Jun 29 18:28:42 hp-4300G kernel:  _raw_spin_lock_irqsave+0x39/0x50
Jun 29 18:28:42 hp-4300G kernel:  swiotlb_tbl_map_single+0x12b/0x4c0

Then that correlates with R11 holding the 'dma_io_tlb_mem' pointer and
RDX pointing at the spinlock. Yet RAX is holding junk :/

I agree that enabling KASAN would be a good idea, but I also think we
probably need to get some more information out of swiotlb_tbl_map_single()
to see see what exactly is going wrong in there.


I can certainly enable KASAN and if there is any debug print I can add
or dump anything, let me know!


I bit the bullet and took v5.13 with swiotlb/for-linus-5.14 merged in, built
x86 defconfig and ran it on my laptop. However, it seems to work fine!

Please can you share your .config?


Sure thing, it is attached. It is just Arch Linux's config run through
olddefconfig. The original is below in case you need to diff it.

https://raw.githubusercontent.com/archlinux/svntogit-packages/9045405dc835527164f3034b3ceb9a67c7a53cd4/trunk/config

If there is anything more that I can provide, please let me know.


I eventually got this booting (for some reason it was causing LD to SEGV
trying to link it for a while...) and sadly it works fine on my laptop. Hmm.

Did you manage to try again with KASAN?

It might also be worth taking the IOMMU out of the equation, since that
interfaces differently with SWIOTLB and I couldn't figure out the code path
from the log you provided. What happens if you boot with "amd_iommu=off
swiotlb=force"?


Oh, now there's a thing... the chat from the IOMMU API in the boot log 
implies that the IOMMU *should* be in the picture - we see that default 
domains are IOMMU_DOMAIN_DMA default and the GPU :0c:00.0 was added 
to a group. That means dev->dma_ops should be set and DMA API calls 
should be going through iommu-dma, yet the callstack in the crash says 
we've gone straight from dma_map_page_attrs() to swiotlb_map(), implying 
the inline dma_direct_map_page() path.


If dev->dma_ops didn't look right in the first place, it's perhaps less 
surprising that dev->dma_io_tlb_mem might be wild as well. It doesn't 
seem plausible that we should have a race between initialising the 
device and probing its driver, so maybe the whole dev pointer is getting 
trampled earlier in the callchain (or is fundamentally wrong to begin 
with, but from a quick skim of the amdgpu code it did look like 
adev->dev and adev->pdev are appropriately set early on by 
amdgpu_pci_probe()).



(although word of warning here: i915 dies horribly on my laptop if I pass
swiotlb=force, even with the distro 5.10 kernel)


FWIW I'd imagine you probably need to massively increase the SWIOTLB 
buffer size to have hope of that working.
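
A hedged sketch for anyone trying it (check
Documentation/admin-guide/kernel-parameters.txt for the authoritative
format): the swiotlb= parameter takes a slot count, at 2 KiB per slot,
so a command line along the lines of

```
amd_iommu=off swiotlb=131072,force
```

should give roughly a 256 MiB bounce buffer (versus the 64 MiB default)
while keeping the IOMMU out of the picture.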


Robin.


Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-02 Thread Alyssa Rosenzweig
```
> +/* Syncobj reference passed at job submission time to encode explicit
> + * input/output fences.
> + */
> +struct drm_panfrost_syncobj_ref {
> + __u32 handle;
> + __u32 pad;
> + __u64 point;
> +};
```

What is handle? What is point? Why is there padding instead of putting
point first?

```
>  #define PANFROST_BO_REF_EXCLUSIVE 0x1
> +#define PANFROST_BO_REF_NO_IMPLICIT_DEP  0x2
```

This seems logically backwards. NO_IMPLICIT_DEP makes sense if we're
trying to keep backwards compatibility, but here you're crafting a new
interface totally from scratch. If anything, isn't BO_REF_IMPLICIT_DEP
the flag you'd want?

```
> + /**
> +  * Stride of the jobs array (needed to ease extension of the
> +  * BATCH_SUBMIT ioctl). Should be set to
> +  * sizeof(struct drm_panfrost_job).
> +  */
> + __u32 job_stride;
...
> + /**
> +  * Stride of the BO and syncobj reference arrays (needed to ease
> +  * extension of the BATCH_SUBMIT ioctl). Should be set to
> +  * sizeof(struct drm_panfrost_bo_ref).
> +  */
> + __u32 bo_ref_stride;
> + __u32 syncobj_ref_stride;
```

Hmm. I'm not /opposed/ and I know kbase uses strides but it seems like
somewhat unwarranted complexity, and there is a combinatoric explosion
here (if jobs, bo refs, and syncobj refs use 3 different versions, as
this encoding permits... as opposed to just specifying a UABI version or
something like that)
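
For what it's worth, the compatibility the stride buys can be sketched
in userspace terms as a model of the kernel's copy_struct_from_user()
contract (which the patch uses); the helper name and structs below are
made up for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* dst is the kernel's current view of one array element; stride is
 * what userspace claims its element size is. Old userspace (smaller
 * stride): new fields read as zero. New userspace (larger stride):
 * the extra bytes must be zero, otherwise the caller is asking for
 * semantics this kernel doesn't implement. */
int copy_with_stride(void *dst, size_t dst_size,
		     const void *src, size_t stride)
{
	size_t n = stride < dst_size ? stride : dst_size;
	size_t i;

	memset(dst, 0, dst_size);
	memcpy(dst, src, n);

	for (i = dst_size; i < stride; i++) {
		if (((const uint8_t *)src)[i] != 0)
			return -E2BIG;	/* same error the kernel returns */
	}
	return 0;
}
```

Because each of the three arrays carries its own stride, each can grow
independently; the combinatoric explosion is real, but every
combination decodes through this one helper, which is presumably the
appeal over a single UABI version number.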

```
> + /**
> +  * If the submission fails, this encodes the index of the job
> +  * failed.
> +  */
> + __u32 fail_idx;
```

What if multiple jobs fail?

```
> + /**
> +  * ID of the queue to submit those jobs to. 0 is the default
> +  * submit queue and should always exists. If you need a dedicated
> +  * queue, create it with DRM_IOCTL_PANFROST_CREATE_SUBMITQUEUE.
> +  */
> + __u32 queue;
```

s/exists/exist/


Re: [Intel-gfx] [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+

2021-07-02 Thread Michal Wajdeczko



On 02.07.2021 10:09, Martin Peres wrote:
> On 02/07/2021 10:29, Pekka Paalanen wrote:
>> On Thu, 1 Jul 2021 21:28:06 +0200
>> Daniel Vetter  wrote:
>>
>>> On Thu, Jul 1, 2021 at 8:27 PM Martin Peres 
>>> wrote:

 On 01/07/2021 11:14, Pekka Paalanen wrote:
> On Wed, 30 Jun 2021 11:58:25 -0700
> John Harrison  wrote:
>  
>> On 6/30/2021 01:22, Martin Peres wrote:
>>> On 24/06/2021 10:05, Matthew Brost wrote:
 From: Daniele Ceraolo Spurio 

 Unblock GuC submission on Gen11+ platforms.

 Signed-off-by: Michal Wajdeczko 
 Signed-off-by: Daniele Ceraolo Spurio
 
 Signed-off-by: Matthew Brost 
 ---
     drivers/gpu/drm/i915/gt/uc/intel_guc.h    |  1 +
     drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c |  8 
     drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h |  3 +--
     drivers/gpu/drm/i915/gt/uc/intel_uc.c | 14
 +-
     4 files changed, 19 insertions(+), 7 deletions(-)
   
>
> ...
>  
 diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
 b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
 index 7a69c3c027e9..61be0aa81492 100644
 --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
 +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
 @@ -34,8 +34,15 @@ static void uc_expand_default_options(struct
 intel_uc *uc)
     return;
     }
     -    /* Default: enable HuC authentication only */
 -    i915->params.enable_guc = ENABLE_GUC_LOAD_HUC;
 +    /* Intermediate platforms are HuC authentication only */
 +    if (IS_DG1(i915) || IS_ALDERLAKE_S(i915)) {
 +    drm_dbg(>drm, "Disabling GuC only due to old
 platform\n");
>>>
>>> This comment does not seem accurate, given that DG1 is barely
>>> out, and
>>> ADL is not out yet. How about:
>>>
>>> "Disabling GuC on untested platforms"?
>>>   
>> Just because something is not in the shops yet does not mean it is
>> new.
>> Technology is always obsolete by the time it goes on sale.
>
> That is a very good reason to not use terminology like "new", "old",
> "current", "modern" etc. at all.
>
> End users like me definitely do not share your interpretation of
> "old".

 Yep, old and new is relative. In the end, what matters is the
 validation
 effort, which is why I was proposing "untested platforms".

 Also, remember that you are not writing these messages for Intel
 engineers, but instead are writing for Linux *users*.
>>>
>>> It's drm_dbg. Users don't read this stuff, at least not users with no
>>> clue what the driver does and stuff like that.
>>
>> If I had a problem, I would read it, and I have no clue what anything
>> of that is.
> 
> Exactly.
> 
> This level of defense for what is clearly a bad *debug* message (at the
> very least, the grammar) makes no sense at all!
> 
> I don't want to hear arguments like "Not my patch" from a developer
> literally sending the patch to the ML and who added his SoB to the
> patch, playing with words, or minimizing the problem of having such a
> message.

Agree that 'not my patch' is never a good excuse, but equally we can't
blame the original patch author, as the patch has been updated a few
times since then.

Maybe, to avoid confusion and simplify review, we could split this
patch in two: a first patch that actually unblocks GuC submission on
all Gen11+ (see __guc_submission_supported) and a second that updates
the defaults for Gen12+ (see uc_expand_default_options), since the
original patch (from ~2019) has evolved beyond what the title/commit
message says.

Then we can fix all messaging and make sure it's clear and understood.

Thanks,
Michal

> 
> All of the above are just clear signals for the community to get off
> your playground, which is frankly unacceptable. Your email address does
> not matter.
> 
> In the spirit of collaboration, your response should have been "Good
> catch, how about  or ?". This would not have wasted everyone's
> time in an attempt to just have it your way.
> 
> My level of confidence in this GuC transition was already low, but you
> guys are working hard to shoot yourself in the foot. Trust should be
> earned!
> 
> Martin
> 
>>
>>
>> Thanks,
>> pq
>>
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [PATCH v3 4/7] drm/panfrost: Add the ability to create submit queues

2021-07-02 Thread Steven Price
On 02/07/2021 15:32, Boris Brezillon wrote:
> Needed to keep VkQueues isolated from each other.
> 
> v3:
> * Limit the number of submitqueue per context to 16
> * Fix a deadlock
> 
> Signed-off-by: Boris Brezillon 

16 ought to be enough for anyone ;)

Reviewed-by: Steven Price 

> ---
>  drivers/gpu/drm/panfrost/Makefile |   3 +-
>  drivers/gpu/drm/panfrost/panfrost_device.h|   2 +-
>  drivers/gpu/drm/panfrost/panfrost_drv.c   |  69 +++--
>  drivers/gpu/drm/panfrost/panfrost_job.c   |  47 ++
>  drivers/gpu/drm/panfrost/panfrost_job.h   |   9 +-
>  .../gpu/drm/panfrost/panfrost_submitqueue.c   | 136 ++
>  .../gpu/drm/panfrost/panfrost_submitqueue.h   |  27 
>  include/uapi/drm/panfrost_drm.h   |  17 +++
>  8 files changed, 264 insertions(+), 46 deletions(-)
>  create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.c
>  create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.h
> 
> diff --git a/drivers/gpu/drm/panfrost/Makefile 
> b/drivers/gpu/drm/panfrost/Makefile
> index b71935862417..e99192b66ec9 100644
> --- a/drivers/gpu/drm/panfrost/Makefile
> +++ b/drivers/gpu/drm/panfrost/Makefile
> @@ -9,6 +9,7 @@ panfrost-y := \
>   panfrost_gpu.o \
>   panfrost_job.o \
>   panfrost_mmu.o \
> - panfrost_perfcnt.o
> + panfrost_perfcnt.o \
> + panfrost_submitqueue.o
>  
>  obj-$(CONFIG_DRM_PANFROST) += panfrost.o
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
> b/drivers/gpu/drm/panfrost/panfrost_device.h
> index 8b25278f34c8..51c0ba4e50f5 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -137,7 +137,7 @@ struct panfrost_mmu {
>  struct panfrost_file_priv {
>   struct panfrost_device *pfdev;
>  
> - struct drm_sched_entity sched_entity[NUM_JOB_SLOTS];
> + struct idr queues;
>  
>   struct panfrost_mmu *mmu;
>  };
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index b6b5997c9366..6529e5972b47 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -19,6 +19,7 @@
>  #include "panfrost_job.h"
>  #include "panfrost_gpu.h"
>  #include "panfrost_perfcnt.h"
> +#include "panfrost_submitqueue.h"
>  
>  static bool unstable_ioctls;
>  module_param_unsafe(unstable_ioctls, bool, 0600);
> @@ -250,6 +251,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
> void *data,
>   struct panfrost_device *pfdev = dev->dev_private;
>   struct drm_panfrost_submit *args = data;
>   struct drm_syncobj *sync_out = NULL;
> + struct panfrost_submitqueue *queue;
>   struct panfrost_job *job;
>   int ret = 0;
>  
> @@ -259,10 +261,16 @@ static int panfrost_ioctl_submit(struct drm_device 
> *dev, void *data,
>   if (args->requirements && args->requirements != PANFROST_JD_REQ_FS)
>   return -EINVAL;
>  
> + queue = panfrost_submitqueue_get(file->driver_priv, 0);
> + if (IS_ERR(queue))
> + return PTR_ERR(queue);
> +
>   if (args->out_sync > 0) {
>   sync_out = drm_syncobj_find(file, args->out_sync);
> - if (!sync_out)
> - return -ENODEV;
> + if (!sync_out) {
> + ret = -ENODEV;
> + goto fail_put_queue;
> + }
>   }
>  
>   job = kzalloc(sizeof(*job), GFP_KERNEL);
> @@ -289,7 +297,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
> void *data,
>   if (ret)
>   goto fail_job;
>  
> - ret = panfrost_job_push(job);
> + ret = panfrost_job_push(queue, job);
>   if (ret)
>   goto fail_job;
>  
> @@ -302,6 +310,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
> void *data,
>  fail_out_sync:
>   if (sync_out)
>   drm_syncobj_put(sync_out);
> +fail_put_queue:
> + panfrost_submitqueue_put(queue);
>  
>   return ret;
>  }
> @@ -451,6 +461,36 @@ static int panfrost_ioctl_madvise(struct drm_device 
> *dev, void *data,
>   return ret;
>  }
>  
> +static int
> +panfrost_ioctl_create_submitqueue(struct drm_device *dev, void *data,
> +   struct drm_file *file_priv)
> +{
> + struct panfrost_file_priv *priv = file_priv->driver_priv;
> + struct drm_panfrost_create_submitqueue *args = data;
> + struct panfrost_submitqueue *queue;
> +
> + queue = panfrost_submitqueue_create(priv, args->priority, args->flags);
> + if (IS_ERR(queue))
> + return PTR_ERR(queue);
> +
> + args->id = queue->id;
> + return 0;
> +}
> +
> +static int
> +panfrost_ioctl_destroy_submitqueue(struct drm_device *dev, void *data,
> +struct drm_file *file_priv)
> +{
> + struct panfrost_file_priv *priv = file_priv->driver_priv;
> + u32 id = *((u32 *)data);
> +
> + /* Default queue can't be destroyed. */
> + if 

[PATCH v3 4/7] drm/panfrost: Add the ability to create submit queues

2021-07-02 Thread Boris Brezillon
Needed to keep VkQueues isolated from each other.

v3:
* Limit the number of submitqueues per context to 16
* Fix a deadlock

Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/panfrost/Makefile |   3 +-
 drivers/gpu/drm/panfrost/panfrost_device.h|   2 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c   |  69 +++--
 drivers/gpu/drm/panfrost/panfrost_job.c   |  47 ++
 drivers/gpu/drm/panfrost/panfrost_job.h   |   9 +-
 .../gpu/drm/panfrost/panfrost_submitqueue.c   | 136 ++
 .../gpu/drm/panfrost/panfrost_submitqueue.h   |  27 
 include/uapi/drm/panfrost_drm.h   |  17 +++
 8 files changed, 264 insertions(+), 46 deletions(-)
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.h

diff --git a/drivers/gpu/drm/panfrost/Makefile 
b/drivers/gpu/drm/panfrost/Makefile
index b71935862417..e99192b66ec9 100644
--- a/drivers/gpu/drm/panfrost/Makefile
+++ b/drivers/gpu/drm/panfrost/Makefile
@@ -9,6 +9,7 @@ panfrost-y := \
panfrost_gpu.o \
panfrost_job.o \
panfrost_mmu.o \
-   panfrost_perfcnt.o
+   panfrost_perfcnt.o \
+   panfrost_submitqueue.o
 
 obj-$(CONFIG_DRM_PANFROST) += panfrost.o
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
b/drivers/gpu/drm/panfrost/panfrost_device.h
index 8b25278f34c8..51c0ba4e50f5 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -137,7 +137,7 @@ struct panfrost_mmu {
 struct panfrost_file_priv {
struct panfrost_device *pfdev;
 
-   struct drm_sched_entity sched_entity[NUM_JOB_SLOTS];
+   struct idr queues;
 
struct panfrost_mmu *mmu;
 };
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index b6b5997c9366..6529e5972b47 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -19,6 +19,7 @@
 #include "panfrost_job.h"
 #include "panfrost_gpu.h"
 #include "panfrost_perfcnt.h"
+#include "panfrost_submitqueue.h"
 
 static bool unstable_ioctls;
 module_param_unsafe(unstable_ioctls, bool, 0600);
@@ -250,6 +251,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
struct panfrost_device *pfdev = dev->dev_private;
struct drm_panfrost_submit *args = data;
struct drm_syncobj *sync_out = NULL;
+   struct panfrost_submitqueue *queue;
struct panfrost_job *job;
int ret = 0;
 
@@ -259,10 +261,16 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
if (args->requirements && args->requirements != PANFROST_JD_REQ_FS)
return -EINVAL;
 
+   queue = panfrost_submitqueue_get(file->driver_priv, 0);
+   if (IS_ERR(queue))
+   return PTR_ERR(queue);
+
if (args->out_sync > 0) {
sync_out = drm_syncobj_find(file, args->out_sync);
-   if (!sync_out)
-   return -ENODEV;
+   if (!sync_out) {
+   ret = -ENODEV;
+   goto fail_put_queue;
+   }
}
 
job = kzalloc(sizeof(*job), GFP_KERNEL);
@@ -289,7 +297,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
if (ret)
goto fail_job;
 
-   ret = panfrost_job_push(job);
+   ret = panfrost_job_push(queue, job);
if (ret)
goto fail_job;
 
@@ -302,6 +310,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
 fail_out_sync:
if (sync_out)
drm_syncobj_put(sync_out);
+fail_put_queue:
+   panfrost_submitqueue_put(queue);
 
return ret;
 }
@@ -451,6 +461,36 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, 
void *data,
return ret;
 }
 
+static int
+panfrost_ioctl_create_submitqueue(struct drm_device *dev, void *data,
+ struct drm_file *file_priv)
+{
+   struct panfrost_file_priv *priv = file_priv->driver_priv;
+   struct drm_panfrost_create_submitqueue *args = data;
+   struct panfrost_submitqueue *queue;
+
+   queue = panfrost_submitqueue_create(priv, args->priority, args->flags);
+   if (IS_ERR(queue))
+   return PTR_ERR(queue);
+
+   args->id = queue->id;
+   return 0;
+}
+
+static int
+panfrost_ioctl_destroy_submitqueue(struct drm_device *dev, void *data,
+  struct drm_file *file_priv)
+{
+   struct panfrost_file_priv *priv = file_priv->driver_priv;
+   u32 id = *((u32 *)data);
+
+   /* Default queue can't be destroyed. */
+   if (!id)
+   return -ENOENT;
+
+   return panfrost_submitqueue_destroy(priv, id);
+}
+
 int panfrost_unstable_ioctl_check(void)
 {
if (!unstable_ioctls)
@@ -465,6 +505,7 @@ panfrost_open(struct drm_device *dev, struct drm_file *file)
int ret;
  

[PATCH v3 6/7] drm/panfrost: Advertise the SYNCOBJ_TIMELINE feature

2021-07-02 Thread Boris Brezillon
Now that we have a new SUBMIT ioctl dealing with timeline syncobjs we
can advertise the feature.

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index e2897de6e77d..242a16246d79 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -791,7 +791,8 @@ DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops);
  * - 1.2 - adds AFBC_FEATURES query
  */
 static const struct drm_driver panfrost_drm_driver = {
-   .driver_features= DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ,
+   .driver_features= DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
+ DRIVER_SYNCOBJ_TIMELINE,
.open   = panfrost_open,
.postclose  = panfrost_postclose,
.ioctls = panfrost_drm_driver_ioctls,
-- 
2.31.1



[PATCH v3 7/7] drm/panfrost: Bump minor version to reflect the feature additions

2021-07-02 Thread Boris Brezillon
We now have a new ioctl that allows submitting multiple jobs at once
(among other things) and we support timelined syncobjs. Bump the
minor version number to reflect those changes.

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 242a16246d79..33cd34a1213c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -789,6 +789,8 @@ DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops);
  * - 1.0 - initial interface
  * - 1.1 - adds HEAP and NOEXEC flags for CREATE_BO
  * - 1.2 - adds AFBC_FEATURES query
+ * - 1.3 - adds the BATCH_SUBMIT, CREATE_SUBMITQUEUE, DESTROY_SUBMITQUEUE
+ *ioctls and advertises the SYNCOBJ_TIMELINE feature
  */
 static const struct drm_driver panfrost_drm_driver = {
.driver_features= DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
@@ -802,7 +804,7 @@ static const struct drm_driver panfrost_drm_driver = {
.desc   = "panfrost DRM",
.date   = "20180908",
.major  = 1,
-   .minor  = 2,
+   .minor  = 3,
 
.gem_create_object  = panfrost_gem_create_object,
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
-- 
2.31.1



[PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches

2021-07-02 Thread Boris Brezillon
This should help limit the number of ioctls when submitting multiple
jobs. The new ioctl also supports syncobj timelines and BO access flags.

v3:
* Re-use panfrost_get_job_bos() and panfrost_get_job_in_syncs() in the
  old submit path

Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 366 +++-
 drivers/gpu/drm/panfrost/panfrost_job.c |   3 +
 include/uapi/drm/panfrost_drm.h |  84 ++
 3 files changed, 375 insertions(+), 78 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 6529e5972b47..e2897de6e77d 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -138,111 +138,95 @@ panfrost_get_job_mappings(struct drm_file *file_priv, 
struct panfrost_job *job)
return 0;
 }
 
-/**
- * panfrost_lookup_bos() - Sets up job->bo[] with the GEM objects
- * referenced by the job.
- * @dev: DRM device
- * @file_priv: DRM file for this fd
- * @args: IOCTL args
- * @job: job being set up
- *
- * Resolve handles from userspace to BOs and attach them to job.
- *
- * Note that this function doesn't need to unreference the BOs on
- * failure, because that will happen at panfrost_job_cleanup() time.
- */
+#define PANFROST_BO_REF_ALLOWED_FLAGS \
+   (PANFROST_BO_REF_EXCLUSIVE | PANFROST_BO_REF_NO_IMPLICIT_DEP)
+
 static int
-panfrost_lookup_bos(struct drm_device *dev,
- struct drm_file *file_priv,
- struct drm_panfrost_submit *args,
- struct panfrost_job *job)
+panfrost_get_job_bos(struct drm_file *file_priv,
+u64 refs, u32 ref_stride, u32 count,
+struct panfrost_job *job)
 {
+   void __user *in = u64_to_user_ptr(refs);
unsigned int i;
-   int ret;
 
-   job->bo_count = args->bo_handle_count;
+   job->bo_count = count;
 
-   if (!job->bo_count)
+   if (!count)
return 0;
 
+   job->bos = kvmalloc_array(job->bo_count, sizeof(*job->bos),
+ GFP_KERNEL | __GFP_ZERO);
job->bo_flags = kvmalloc_array(job->bo_count,
   sizeof(*job->bo_flags),
   GFP_KERNEL | __GFP_ZERO);
-   if (!job->bo_flags)
+   if (!job->bos || !job->bo_flags)
return -ENOMEM;
 
-   for (i = 0; i < job->bo_count; i++)
-   job->bo_flags[i] = PANFROST_BO_REF_EXCLUSIVE;
+   for (i = 0; i < count; i++) {
+   struct drm_panfrost_bo_ref ref = { };
+   int ret;
 
-   ret = drm_gem_objects_lookup(file_priv,
-(void __user *)(uintptr_t)args->bo_handles,
-job->bo_count, >bos);
-   if (ret)
-   return ret;
+   ret = copy_struct_from_user(, sizeof(ref),
+   in + (i * ref_stride),
+   ref_stride);
+   if (ret)
+   return ret;
 
-   return panfrost_get_job_mappings(file_priv, job);
+   /* Prior to the BATCH_SUBMIT ioctl all accessed BOs were
+* treated as exclusive.
+*/
+   if (ref_stride == sizeof(u32))
+   ref.flags = PANFROST_BO_REF_EXCLUSIVE;
+
+   if ((ref.flags & ~PANFROST_BO_REF_ALLOWED_FLAGS))
+   return -EINVAL;
+
+   job->bos[i] = drm_gem_object_lookup(file_priv, ref.handle);
+   if (!job->bos[i])
+   return -EINVAL;
+
+   job->bo_flags[i] = ref.flags;
+   }
+
+   return 0;
 }
 
-/**
- * panfrost_copy_in_sync() - Sets up job->deps with the sync objects
- * referenced by the job.
- * @dev: DRM device
- * @file_priv: DRM file for this fd
- * @args: IOCTL args
- * @job: job being set up
- *
- * Resolve syncobjs from userspace to fences and attach them to job.
- *
- * Note that this function doesn't need to unreference the fences on
- * failure, because that will happen at panfrost_job_cleanup() time.
- */
 static int
-panfrost_copy_in_sync(struct drm_device *dev,
- struct drm_file *file_priv,
- struct drm_panfrost_submit *args,
- struct panfrost_job *job)
+panfrost_get_job_in_syncs(struct drm_file *file_priv,
+ u64 refs, u32 ref_stride,
+ u32 count, struct panfrost_job *job)
 {
-   u32 *handles;
-   int ret = 0;
-   int i, in_fence_count;
+   const void __user *in = u64_to_user_ptr(refs);
+   unsigned int i;
+   int ret;
 
-   in_fence_count = args->in_sync_count;
-
-   if (!in_fence_count)
+   if (!count)
return 0;
 
-   handles = kvmalloc_array(in_fence_count, sizeof(u32), GFP_KERNEL);
-   if (!handles) {
-   ret = -ENOMEM;
-  

[PATCH v3 3/7] drm/panfrost: Add BO access flags to relax dependencies between jobs

2021-07-02 Thread Boris Brezillon
Jobs reading from the same BO should not be serialized. Add access
flags so we can relax the implicit dependencies in that case. We force
exclusive access for now to keep the behavior unchanged, but a new
SUBMIT ioctl taking explicit access flags will be introduced.

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c |  9 +
 drivers/gpu/drm/panfrost/panfrost_job.c | 23 +++
 drivers/gpu/drm/panfrost/panfrost_job.h |  1 +
 include/uapi/drm/panfrost_drm.h |  2 ++
 4 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 9bbc9e78cc85..b6b5997c9366 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -164,6 +164,15 @@ panfrost_lookup_bos(struct drm_device *dev,
if (!job->bo_count)
return 0;
 
+   job->bo_flags = kvmalloc_array(job->bo_count,
+  sizeof(*job->bo_flags),
+  GFP_KERNEL | __GFP_ZERO);
+   if (!job->bo_flags)
+   return -ENOMEM;
+
+   for (i = 0; i < job->bo_count; i++)
+   job->bo_flags[i] = PANFROST_BO_REF_EXCLUSIVE;
+
ret = drm_gem_objects_lookup(file_priv,
 (void __user *)(uintptr_t)args->bo_handles,
 job->bo_count, >bos);
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index fdc1bd7ecf12..152245b122be 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -245,8 +245,16 @@ static int panfrost_acquire_object_fences(struct 
panfrost_job *job)
int i, ret;
 
for (i = 0; i < job->bo_count; i++) {
-   /* panfrost always uses write mode in its current uapi */
-   ret = drm_gem_fence_array_add_implicit(>deps, job->bos[i], 
true);
+   bool exclusive = job->bo_flags[i] & PANFROST_BO_REF_EXCLUSIVE;
+
+   if (!exclusive) {
+   ret = dma_resv_reserve_shared(job->bos[i]->resv, 1);
+   if (ret)
+   return ret;
+   }
+
+   ret = drm_gem_fence_array_add_implicit(>deps, job->bos[i],
+  exclusive);
if (ret)
return ret;
}
@@ -258,8 +266,14 @@ static void panfrost_attach_object_fences(struct 
panfrost_job *job)
 {
int i;
 
-   for (i = 0; i < job->bo_count; i++)
-   dma_resv_add_excl_fence(job->bos[i]->resv, 
job->render_done_fence);
+   for (i = 0; i < job->bo_count; i++) {
+   struct dma_resv *robj = job->bos[i]->resv;
+
+   if (job->bo_flags[i] & PANFROST_BO_REF_EXCLUSIVE)
+   dma_resv_add_excl_fence(robj, job->render_done_fence);
+   else
+   dma_resv_add_shared_fence(robj, job->render_done_fence);
+   }
 }
 
 int panfrost_job_push(struct panfrost_job *job)
@@ -340,6 +354,7 @@ static void panfrost_job_cleanup(struct kref *ref)
kvfree(job->bos);
}
 
+   kvfree(job->bo_flags);
kfree(job);
 }
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h 
b/drivers/gpu/drm/panfrost/panfrost_job.h
index 82306a03b57e..1cbc3621b663 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.h
+++ b/drivers/gpu/drm/panfrost/panfrost_job.h
@@ -32,6 +32,7 @@ struct panfrost_job {
 
struct panfrost_gem_mapping **mappings;
struct drm_gem_object **bos;
+   u32 *bo_flags;
u32 bo_count;
 
/* Fence to be signaled by drm-sched once its done with the job */
diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index 061e700dd06c..45d6c600475c 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -224,6 +224,8 @@ struct drm_panfrost_madvise {
__u32 retained;   /* out, whether backing store still exists */
 };
 
+#define PANFROST_BO_REF_EXCLUSIVE  0x1
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.31.1
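
To make the relaxation in this patch concrete, here is a toy userspace
model of the dma_resv rule it relies on (illustrative types, not the
kernel API): every job waits on the BO's last exclusive (write) fence,
but only a writer also waits on the outstanding shared (read) fences,
so two readers no longer serialize against each other.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_READERS 8

/* One reservation object: a single exclusive (write) fence slot plus
 * a list of shared (read) fences, loosely as in struct dma_resv. */
struct resv_model {
	int excl_fence;			/* 0 = none */
	int shared[MAX_READERS];
	size_t shared_count;
};

/* Number of fences a new job touching this BO must wait on,
 * mirroring the logic in panfrost_acquire_object_fences(). */
size_t acquire_deps(const struct resv_model *r, int exclusive)
{
	size_t deps = r->excl_fence ? 1 : 0;

	if (exclusive)			/* writers also wait for readers */
		deps += r->shared_count;
	return deps;
}

/* Publish the job's done-fence, mirroring panfrost_attach_object_fences(). */
void attach_fence(struct resv_model *r, int fence, int exclusive)
{
	if (exclusive) {
		r->excl_fence = fence;
		r->shared_count = 0;	/* a write supersedes older readers */
	} else {
		r->shared[r->shared_count++] = fence;
	}
}
```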



[PATCH v3 2/7] drm/panfrost: Move the mappings collection out of panfrost_lookup_bos()

2021-07-02 Thread Boris Brezillon
So we can re-use it from elsewhere.

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 52 ++---
 1 file changed, 29 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 1ffaef5ec5ff..9bbc9e78cc85 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -109,6 +109,34 @@ static int panfrost_ioctl_create_bo(struct drm_device 
*dev, void *data,
return 0;
 }
 
+static int
+panfrost_get_job_mappings(struct drm_file *file_priv, struct panfrost_job *job)
+{
+   struct panfrost_file_priv *priv = file_priv->driver_priv;
+   unsigned int i;
+
+   job->mappings = kvmalloc_array(job->bo_count,
+  sizeof(*job->mappings),
+  GFP_KERNEL | __GFP_ZERO);
+   if (!job->mappings)
+   return -ENOMEM;
+
+   for (i = 0; i < job->bo_count; i++) {
+   struct panfrost_gem_mapping *mapping;
+   struct panfrost_gem_object *bo;
+
+   bo = to_panfrost_bo(job->bos[i]);
+   mapping = panfrost_gem_mapping_get(bo, priv);
+   if (!mapping)
+   return -EINVAL;
+
+   atomic_inc(>gpu_usecount);
+   job->mappings[i] = mapping;
+   }
+
+   return 0;
+}
+
 /**
  * panfrost_lookup_bos() - Sets up job->bo[] with the GEM objects
  * referenced by the job.
@@ -128,8 +156,6 @@ panfrost_lookup_bos(struct drm_device *dev,
  struct drm_panfrost_submit *args,
  struct panfrost_job *job)
 {
-   struct panfrost_file_priv *priv = file_priv->driver_priv;
-   struct panfrost_gem_object *bo;
unsigned int i;
int ret;
 
@@ -144,27 +170,7 @@ panfrost_lookup_bos(struct drm_device *dev,
if (ret)
return ret;
 
-   job->mappings = kvmalloc_array(job->bo_count,
-  sizeof(struct panfrost_gem_mapping *),
-  GFP_KERNEL | __GFP_ZERO);
-   if (!job->mappings)
-   return -ENOMEM;
-
-   for (i = 0; i < job->bo_count; i++) {
-   struct panfrost_gem_mapping *mapping;
-
-   bo = to_panfrost_bo(job->bos[i]);
-   mapping = panfrost_gem_mapping_get(bo, priv);
-   if (!mapping) {
-   ret = -EINVAL;
-   break;
-   }
-
-   atomic_inc(>gpu_usecount);
-   job->mappings[i] = mapping;
-   }
-
-   return ret;
+   return panfrost_get_job_mappings(file_priv, job);
 }
 
 /**
-- 
2.31.1



[PATCH v3 1/7] drm/panfrost: Pass a job to panfrost_{acquire, attach}_object_fences()

2021-07-02 Thread Boris Brezillon
So we don't have to change the prototype if we extend the function.

v3:
* Fix subject

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_job.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 71a72fb50e6b..fdc1bd7ecf12 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -240,15 +240,13 @@ static void panfrost_job_hw_submit(struct panfrost_job 
*job, int js)
spin_unlock(>js->job_lock);
 }
 
-static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
- int bo_count,
- struct xarray *deps)
+static int panfrost_acquire_object_fences(struct panfrost_job *job)
 {
int i, ret;
 
-   for (i = 0; i < bo_count; i++) {
+   for (i = 0; i < job->bo_count; i++) {
/* panfrost always uses write mode in its current uapi */
-   ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
+   ret = drm_gem_fence_array_add_implicit(>deps, job->bos[i], 
true);
if (ret)
return ret;
}
@@ -256,14 +254,12 @@ static int panfrost_acquire_object_fences(struct 
drm_gem_object **bos,
return 0;
 }
 
-static void panfrost_attach_object_fences(struct drm_gem_object **bos,
- int bo_count,
- struct dma_fence *fence)
+static void panfrost_attach_object_fences(struct panfrost_job *job)
 {
int i;
 
-   for (i = 0; i < bo_count; i++)
-   dma_resv_add_excl_fence(bos[i]->resv, fence);
+   for (i = 0; i < job->bo_count; i++)
+   dma_resv_add_excl_fence(job->bos[i]->resv, 
job->render_done_fence);
 }
 
 int panfrost_job_push(struct panfrost_job *job)
@@ -290,8 +286,7 @@ int panfrost_job_push(struct panfrost_job *job)
 
job->render_done_fence = dma_fence_get(>base.s_fence->finished);
 
-   ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
->deps);
+   ret = panfrost_acquire_object_fences(job);
if (ret) {
mutex_unlock(>sched_lock);
goto unlock;
@@ -303,8 +298,7 @@ int panfrost_job_push(struct panfrost_job *job)
 
mutex_unlock(>sched_lock);
 
-   panfrost_attach_object_fences(job->bos, job->bo_count,
- job->render_done_fence);
+   panfrost_attach_object_fences(job);
 
 unlock:
drm_gem_unlock_reservations(job->bos, job->bo_count, _ctx);
-- 
2.31.1



[PATCH v3 0/7] drm/panfrost: drm/panfrost: Add a new submit ioctl

2021-07-02 Thread Boris Brezillon
Hello,

This is an attempt at providing a new submit ioctl that's more
Vulkan-friendly than the existing one. This ioctl

1/ allows passing several out syncobjs so we can easily update
   several fences/semaphores in a single ioctl() call
2/ allows passing several jobs so we don't have to have one ioctl
   per job-chain recorded in the command buffer
3/ supports disabling implicit dependencies as well as 
   non-exclusive access to BOs, thus removing unnecessary
   synchronization

I've also been looking at adding {IN,OUT}_FENCE_FD support (allowing
one to pass at most one sync_file object in input and/or creating a
sync_file FD embedding the render out fence), but it's not entirely
clear to me when that's useful. Indeed, we can already do the
sync_file <-> syncobj conversion using the
SYNCOBJ_{FD_TO_HANDLE,HANDLE_TO_FD} ioctls if we have to.
Note that, unlike Turnip, PanVk is using syncobjs to implement
vkQueueWaitIdle(), so the syncobj -> sync_file conversion doesn't
have to happen for each submission, but maybe there's a good reason
to use sync_files for that too. Any feedback on that aspect would
be useful I guess.

Any feedback on this new ioctl is welcome, in particular, do you
think other things are missing/would be nice to have for Vulkan?

Regards,

Boris

P.S.: basic igt tests for these new ioctls are available there [1]

[1] https://gitlab.freedesktop.org/bbrezillon/igt-gpu-tools/-/tree/panfrost-batch-submit

Changes in v3:
* Fix a deadlock in the submitqueue logic
* Limit the number of submitqueues per context to 16

Boris Brezillon (7):
  drm/panfrost: Pass a job to panfrost_{acquire,attach}_object_fences()
  drm/panfrost: Move the mappings collection out of
panfrost_lookup_bos()
  drm/panfrost: Add BO access flags to relax dependencies between jobs
  drm/panfrost: Add the ability to create submit queues
  drm/panfrost: Add a new ioctl to submit batches
  drm/panfrost: Advertise the SYNCOBJ_TIMELINE feature
  drm/panfrost: Bump minor version to reflect the feature additions

 drivers/gpu/drm/panfrost/Makefile |   3 +-
 drivers/gpu/drm/panfrost/panfrost_device.h|   2 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c   | 463 ++
 drivers/gpu/drm/panfrost/panfrost_job.c   |  89 ++--
 drivers/gpu/drm/panfrost/panfrost_job.h   |  10 +-
 .../gpu/drm/panfrost/panfrost_submitqueue.c   | 136 +
 .../gpu/drm/panfrost/panfrost_submitqueue.h   |  27 +
 include/uapi/drm/panfrost_drm.h   | 103 
 8 files changed, 689 insertions(+), 144 deletions(-)
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.h

-- 
2.31.1



Re: [Intel-gfx] [PATCH v2 3/3] drm/i915/uapi: reject set_domain for discrete

2021-07-02 Thread Tvrtko Ursulin



On 01/07/2021 16:10, Matthew Auld wrote:

The CPU domain should be static for discrete, and on DG1 we don't need
any flushing since everything is already coherent, so really all this


Knowledge of the write-combine buffer is assumed on the part of anyone involved?


does is an object wait, for which we have an ioctl. Longer term the
desired caching should be an immutable creation time property for the
BO, which can be set with something like gem_create_ext.

One other user is iris + userptr, which uses the set_domain to probe all
the pages to check if the GUP succeeds, however keeping the set_domain
around just for that seems rather scuffed. We could equally just submit
a dummy batch, which should hopefully be good enough, otherwise adding a
new creation time flag for userptr might be an option. Although longer
term we will also have vm_bind, which should also be a nice fit for
this, so adding a whole new flag is likely overkill.


Execbuf sounds horrible. But it all reminds me of past work by Chris which is 
surprisingly hard to find in the archives. Patches like:

commit 7706a433388016983052a27c0fd74a64b1897ae7
Author: Chris Wilson 
Date:   Wed Nov 8 17:04:07 2017 +

drm/i915/userptr: Probe existence of backing struct pages upon creation

Jason Ekstrand requested a more efficient method than userptr+set-domain

to determine if the userptr object was backed by a complete set of pages
upon creation. To be more efficient than simply populating the userptr
using get_user_pages() (as done by the call to set-domain or execbuf),
we can walk the tree of vm_area_struct and check for gaps or vma not
backed by struct page (VM_PFNMAP). The question is how to handle
VM_MIXEDMAP which may be either struct page or pfn backed...

commit 7ca21d3390eec23db99b8131ed18bc036efaba18
Author: Chris Wilson 
Date:   Wed Nov 8 17:48:22 2017 +

drm/i915/userptr: Add a flag to populate the userptr on creation

Acquiring the backing struct pages for the userptr range is not free;

the first client for userptr would insist on frequently creating userptr
objects ahead of time and not use them. For that first client, deferring
the cost of populating the userptr (calling get_user_pages()) to the
actual execbuf was a substantial improvement. However, not all clients
are the same, and most would like to validate that the userptr is valid
and backed by struct pages upon creation, so offer a
I915_USERPTR_POPULATE flag to do just that.

Note that big difference between I915_USERPTR_POPULATE and the deferred

scheme is that POPULATE is guaranteed to be synchronous, the result is
known before the ioctl returns (and the handle exposed). However, due to
system memory pressure, the object may be paged out before use,
requiring them to be paged back in on execbuf (as may always happen).

At least with the first one I think I was skeptical, since probing at point A 
makes for a weak test when the userptr only gets used at point B. Populate is really 
the same when the user controls the backing store. I think both arguments 
stand if we are trying to sell these flags as validation. But if the 
idea is limited to pure preload, with no guarantee that it keeps working by 
the time of real use, then I guess it may be passable.

Disclaimer that I haven't been following the story on why it is desirable to 
abandon set domain. Only judging from this series, mmap caching mode is implied 
from the object? Should set domain availability be driven by the object backing 
store instead of outright rejection?

Regards,

Tvrtko
 

Suggested-by: Daniel Vetter 
Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Maarten Lankhorst 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Jason Ekstrand 
Cc: Daniel Vetter 
Cc: Ramalingam C 
---
  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 43004bef55cb..b684a62bf3b0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -490,6 +490,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void 
*data,
u32 write_domain = args->write_domain;
int err;
  
+	if (IS_DGFX(to_i915(dev)))
+		return -ENODEV;
+
/* Only handle setting domains to types used by the CPU. */
if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS)
return -EINVAL;



Re: [PATCH v2] drm/dbi: Print errors for mipi_dbi_command()

2021-07-02 Thread kernel test robot
Hi Linus,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.13 next-20210701]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Linus-Walleij/drm-dbi-Print-errors-for-mipi_dbi_command/20210702-180745
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 3dbdb38e286903ec220aaf1fb29a8d94297da246
config: arm64-randconfig-r001-20210702 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 9eb613b2de3163686b1a4bd1160f15ac56a4b083)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm64 cross compiling tool for clang build
# apt-get install binutils-aarch64-linux-gnu
# https://github.com/0day-ci/linux/commit/42d93a52e398adbb1fe2dfbc895c649cc8d42780
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Linus-Walleij/drm-dbi-Print-errors-for-mipi_dbi_command/20210702-180745
git checkout 42d93a52e398adbb1fe2dfbc895c649cc8d42780
# save the attached .config to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross O=build_dir ARCH=arm64 SHELL=/bin/bash arch/arm64/kvm/ drivers/gpu/drm/tiny/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/tiny/st7586.c:260:2: error: member reference type 'struct 
>> mipi_dbi' is not a pointer; did you mean to use '.'?
   mipi_dbi_command(&dbidev->dbi, MIPI_DCS_SET_DISPLAY_OFF);
   ^
   include/drm/drm_mipi_dbi.h:186:27: note: expanded from macro 
'mipi_dbi_command'
   struct device *dev = &dbi->spi->dev; \
                         ^
>> drivers/gpu/drm/tiny/st7586.c:260:2: error: cannot take the address of an 
>> rvalue of type 'struct device *'
   mipi_dbi_command(&dbidev->dbi, MIPI_DCS_SET_DISPLAY_OFF);
   ^~~~
   include/drm/drm_mipi_dbi.h:186:23: note: expanded from macro 
'mipi_dbi_command'
   struct device *dev = &dbi->spi->dev; \
                        ^~
   2 errors generated.


vim +260 drivers/gpu/drm/tiny/st7586.c

eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  246  
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  247  
static void st7586_pipe_disable(struct drm_simple_display_pipe *pipe)
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  248  
{
84137b866e834a drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-07-22  249  
struct mipi_dbi_dev *dbidev = drm_to_mipi_dbi_dev(pipe->crtc.dev);
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  250  
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  251  
/*
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  252  
 * This callback is not protected by drm_dev_enter/exit since we want to
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  253  
 * turn off the display on regular driver unload. It's highly unlikely
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  254  
 * that the underlying SPI controller is gone should this be called 
after
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  255  
 * unplug.
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  256  
 */
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  257  
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  258  
DRM_DEBUG_KMS("\n");
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  259  
84137b866e834a drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-07-22 @260  
	mipi_dbi_command(&dbidev->dbi, MIPI_DCS_SET_DISPLAY_OFF);
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  261  
}
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  262  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [PATCH v2 4/7] drm/panfrost: Add the ability to create submit queues

2021-07-02 Thread Boris Brezillon
On Fri, 2 Jul 2021 09:58:06 -0400
Alyssa Rosenzweig  wrote:

> > > My Vulkan knowledge is limited so I'm not sure whether this is the right
> > > approach or not. In particular is it correct that an application can
> > > create a high priority queue which could affect other (normal priority)
> > > applications?  
> > 
> > That's what msm does (with no extra CAPS check AFAICT), and the
> > freedreno driver can already create high priority queues if
> > PIPE_CONTEXT_HIGH_PRIORITY is passed. Not saying that's okay to allow
> > userspace to tweak the priority, but if that's a problem, other drivers
> > are in trouble too ;-).  
> 
> Speaking of, how will PIPE_CONTEXT_HIGH_PRIORITY be implemented with the
> new ioctl()? I envisioned something much simpler (for the old ioctl),
> just adding a "high priority?" flag to the submit and internally
> creating the two queues of normal/high priority for drm_sched to work
> out. Is this juggling now moved to userspace?

That's what freedreno does. I guess we could create 2 default queues
(one normal and one high prio) and extend the old submit ioctl() to do
what you suggest if you see a good reason to not switch to the new
ioctl() directly. I mean, we'll have to keep support for both anyway,
but switching to the new ioctl() shouldn't be that hard (I can prepare
a MR transitioning the gallium driver to BATCH_SUBMIT if you want).


Re: [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+

2021-07-02 Thread Michal Wajdeczko



On 02.07.2021 15:12, Martin Peres wrote:
> On 02/07/2021 16:06, Michal Wajdeczko wrote:
>>
>>
>> On 02.07.2021 10:13, Martin Peres wrote:
>>> On 01/07/2021 21:24, Martin Peres wrote:
>>> [...]
>
>>
>>> +    i915->params.enable_guc = ENABLE_GUC_LOAD_HUC;
>>> +    return;
>>> +    }
>>> +
>>> +    /* Default: enable HuC authentication and GuC submission */
>>> +    i915->params.enable_guc = ENABLE_GUC_LOAD_HUC |
>>> ENABLE_GUC_SUBMISSION;
>>
>> This seems to be in contradiction with the GuC submission plan which
>> states:
>>
>> "Not enabled by default on any current platforms but can be
>> enabled via
>> modparam enable_guc".
>>
>
> I don't believe any current platform gets this point where GuC
> submission would be enabled by default. The first would be ADL-P which
> isn't out yet.

 Isn't that exactly what the line above does?
>>>
>>> In case you missed this crucial part of the review. Please answer the
>>> above question.
>>
>> I guess there is some misunderstanding here, and I must admit I had
>> similar doubt, but if you look beyond patch diff and check function code
>> you will find that the very condition is:
>>
>> /* Don't enable GuC/HuC on pre-Gen12 */
>> if (GRAPHICS_VER(i915) < 12) {
>>     i915->params.enable_guc = 0;
>>     return;
>> }
>>
>> so all pre-Gen12 platforms will continue to have GuC/HuC disabled.
> 
> Thanks Michal, but then the problem is the other way: how can one enable
> it on gen11?

this code here converts the default GuC auto mode (enable_guc=-1) into
the per-platform desired (tested) GuC/HuC enables.

To override that default, you may still use enable_guc=1 to explicitly
enable GuC submission, and since we also have this code:

+static bool __guc_submission_supported(struct intel_guc *guc)
+{
+   /* GuC submission is unavailable for pre-Gen11 */
+   return intel_guc_is_supported(guc) &&
+  INTEL_GEN(guc_to_gt(guc)->i915) >= 11;
+}

it should work on any Gen11+.

Michal

> 
> I like what Daniele was going for here: separating the capability from
> the user-requested value, but then it seems the patch stopped half way.
> How about never touching the parameter, and having an AND between the two
> values to get the effective enable_guc?
> 
> Right now, the code is really confusing :s
> 
> Thanks,
> Martin
> 
>>
>> Thanks,
>> Michal
>>


Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-07-02 Thread Will Deacon
Hi Nathan,

On Thu, Jul 01, 2021 at 12:52:20AM -0700, Nathan Chancellor wrote:
> On 7/1/2021 12:40 AM, Will Deacon wrote:
> > On Wed, Jun 30, 2021 at 08:56:51AM -0700, Nathan Chancellor wrote:
> > > On Wed, Jun 30, 2021 at 12:43:48PM +0100, Will Deacon wrote:
> > > > On Wed, Jun 30, 2021 at 05:17:27PM +0800, Claire Chang wrote:
> > > > > `BUG: unable to handle page fault for address: 003a8290` and
> > > > > the fact it crashed at `_raw_spin_lock_irqsave` look like the memory
> > > > > (maybe dev->dma_io_tlb_mem) was corrupted?
> > > > > The dev->dma_io_tlb_mem should be set here
> > > > > (https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/pci/probe.c#n2528)
> > > > > through device_initialize.
> > > > 
> > > > I'm less sure about this. 'dma_io_tlb_mem' should be pointing at
> > > > 'io_tlb_default_mem', which is a page-aligned allocation from memblock.
> > > > The spinlock is at offset 0x24 in that structure, and looking at the
> > > > register dump from the crash:
> > > > 
> > > > Jun 29 18:28:42 hp-4300G kernel: RSP: 0018:adb4013db9e8 EFLAGS: 
> > > > 00010006
> > > > Jun 29 18:28:42 hp-4300G kernel: RAX: 003a8290 RBX: 
> > > >  RCX: 8900572ad580
> > > > Jun 29 18:28:42 hp-4300G kernel: RDX: 89005653f024 RSI: 
> > > > 000c RDI: 1d17
> > > > Jun 29 18:28:42 hp-4300G kernel: RBP: 0a20d000 R08: 
> > > > 000c R09: 
> > > > Jun 29 18:28:42 hp-4300G kernel: R10: 0a20d000 R11: 
> > > > 89005653f000 R12: 0212
> > > > Jun 29 18:28:42 hp-4300G kernel: R13: 1000 R14: 
> > > > 0002 R15: 0020
> > > > Jun 29 18:28:42 hp-4300G kernel: FS:  7f1f8898ea40() 
> > > > GS:89005728() knlGS:
> > > > Jun 29 18:28:42 hp-4300G kernel: CS:  0010 DS:  ES:  CR0: 
> > > > 80050033
> > > > Jun 29 18:28:42 hp-4300G kernel: CR2: 003a8290 CR3: 
> > > > 0001020d CR4: 00350ee0
> > > > Jun 29 18:28:42 hp-4300G kernel: Call Trace:
> > > > Jun 29 18:28:42 hp-4300G kernel:  _raw_spin_lock_irqsave+0x39/0x50
> > > > Jun 29 18:28:42 hp-4300G kernel:  swiotlb_tbl_map_single+0x12b/0x4c0
> > > > 
> > > > Then that correlates with R11 holding the 'dma_io_tlb_mem' pointer and
> > > > RDX pointing at the spinlock. Yet RAX is holding junk :/
> > > > 
> > > > I agree that enabling KASAN would be a good idea, but I also think we
> > > > probably need to get some more information out of 
> > > > swiotlb_tbl_map_single()
> > > > to see see what exactly is going wrong in there.
> > > 
> > > I can certainly enable KASAN and if there is any debug print I can add
> > > or dump anything, let me know!
> > 
> > I bit the bullet and took v5.13 with swiotlb/for-linus-5.14 merged in, built
> > x86 defconfig and ran it on my laptop. However, it seems to work fine!
> > 
> > Please can you share your .config?
> 
> Sure thing, it is attached. It is just Arch Linux's config run through
> olddefconfig. The original is below in case you need to diff it.
> 
> https://raw.githubusercontent.com/archlinux/svntogit-packages/9045405dc835527164f3034b3ceb9a67c7a53cd4/trunk/config
> 
> If there is anything more that I can provide, please let me know.

I eventually got this booting (for some reason it was causing LD to SEGV
trying to link it for a while...) and sadly it works fine on my laptop. Hmm.

Did you manage to try again with KASAN?

It might also be worth taking the IOMMU out of the equation, since that
interfaces differently with SWIOTLB and I couldn't figure out the code path
from the log you provided. What happens if you boot with "amd_iommu=off
swiotlb=force"?

(although word of warning here: i915 dies horribly on my laptop if I pass
swiotlb=force, even with the distro 5.10 kernel)

Will
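Will's deduction above — that RDX points at a spinlock 0x24 bytes into the structure whose base is held in R11 — is plain pointer arithmetic on the dumped registers. The values below are copied as printed in the log (any truncated upper bits are ignored, since only the difference matters):

```c
#include <assert.h>
#include <stdint.h>

/* Register values from the crash dump in the quoted log. */
static const uint64_t rdx = 0x89005653f024ull; /* address the spinlock code dereferenced */
static const uint64_t r11 = 0x89005653f000ull; /* suspected 'dma_io_tlb_mem' base */

/* Offset of the dereferenced field within the structure. */
static uint64_t lock_offset(void)
{
	return rdx - r11;
}
```

If the result is 0x24, it is consistent with the spinlock sitting at offset 0x24 in the (page-aligned) io_tlb_default_mem allocation, which is the correlation Will describes.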


Re: [PATCH v2 4/7] drm/panfrost: Add the ability to create submit queues

2021-07-02 Thread Alyssa Rosenzweig
> > My Vulkan knowledge is limited so I'm not sure whether this is the right
> > approach or not. In particular is it correct that an application can
> > create a high priority queue which could affect other (normal priority)
> > applications?
> 
> That's what msm does (with no extra CAPS check AFAICT), and the
> freedreno driver can already create high priority queues if
> PIPE_CONTEXT_HIGH_PRIORITY is passed. Not saying that's okay to allow
> userspace to tweak the priority, but if that's a problem, other drivers
> are in trouble too ;-).

Speaking of, how will PIPE_CONTEXT_HIGH_PRIORITY be implemented with the
new ioctl()? I envisioned something much simpler (for the old ioctl),
just adding a "high priority?" flag to the submit and internally
creating the two queues of normal/high priority for drm_sched to work
out. Is this juggling now moved to userspace?


[PATCH v3] drm/dbi: Print errors for mipi_dbi_command()

2021-07-02 Thread Linus Walleij
The macro mipi_dbi_command() does not report errors unless you wrap it
in another macro to do the error reporting.

Report a rate-limited error so we know what is going on.

Drop the only user in DRM using mipi_dbi_command() and actually checking
the error explicitly, let it use mipi_dbi_command_buf() directly
instead.

After this any code wishing to send command arrays can rely on
mipi_dbi_command() providing an appropriate error message if something
goes wrong.

Suggested-by: Noralf Trønnes 
Suggested-by: Douglas Anderson 
Signed-off-by: Linus Walleij 
---
ChangeLog v2->v3:
- Make the macro actually return the error value if need be, by
  putting a single ret; at the end of the macro. (Neat trick from
  StackOverflow!)
- Switch the site where I switched mipi_dbi_command() to
  mipi_dbi_command_buf() back to what it was.
- Print the failed command in the error message.
- Put the dbi in (parens) since drivers/gpu/drm/tiny/st7586.c was
  passing &dbidev->dbi as parameter to mipi_dbi_command()
  and this would expand to
  struct device *dev = &&dbidev->dbi->spi->dev;
  which can't be parsed but
  struct device *dev = &(&dbidev->dbi)->spi->dev;
  should work. I hope.
ChangeLog v1->v2:
- Fish out the struct device * from the DBI SPI client and use
  that to print the errors associated with the SPI device.
---
 include/drm/drm_mipi_dbi.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/drm/drm_mipi_dbi.h b/include/drm/drm_mipi_dbi.h
index f543d6e3e822..05e194958265 100644
--- a/include/drm/drm_mipi_dbi.h
+++ b/include/drm/drm_mipi_dbi.h
@@ -183,7 +183,12 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer 
*fb,
 #define mipi_dbi_command(dbi, cmd, seq...) \
 ({ \
const u8 d[] = { seq }; \
-   mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
+   struct device *dev = &(dbi)->spi->dev;  \
+   int ret; \
+   ret = mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
+   if (ret) \
+   dev_err_ratelimited(dev, "error %d when sending command %#02x\n", ret, cmd); \
+   ret; \
 })
 
 #ifdef CONFIG_DEBUG_FS
-- 
2.31.1
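The two tricks in this v3 — a GNU statement-expression macro whose last expression becomes its value, and parenthesizing the macro argument so call sites passing `&obj->member` still expand correctly — can be reproduced outside the kernel. This is a toy with invented struct names, not the DRM code:

```c
#include <assert.h>

struct spi { int dev; };
struct dbi { struct spi *spi; };
struct dbidev { struct dbi dbi; };

/* Statement-expression macro: the final "*dev;" is the value of the
 * whole ({ ... }) block, mirroring the "ret;" trick in the patch.
 * Without the parentheses around (dbi), an argument like "&d.dbi"
 * would expand to "&&d.dbi->spi->dev" and fail to compile, exactly
 * as in the kernel test robot report for v2. */
#define DBI_DEV(dbi) \
({ \
	int *dev = &(dbi)->spi->dev; \
	*dev; \
})

static int demo(void)
{
	struct spi s = { .dev = 42 };
	struct dbidev d = { .dbi = { .spi = &s } };

	/* Same shape as mipi_dbi_command(&dbidev->dbi, ...). */
	return DBI_DEV(&d.dbi);
}
```

Statement expressions are a GCC/Clang extension, which is fine here since the kernel already requires them.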



Re: [PATCH v15 12/12] of: Add plumbing for restricted DMA pool

2021-07-02 Thread Guenter Roeck

On 7/2/21 6:18 AM, Will Deacon wrote:

On Fri, Jul 02, 2021 at 12:39:41PM +0100, Robin Murphy wrote:

On 2021-07-02 04:08, Guenter Roeck wrote:

On Thu, Jun 24, 2021 at 11:55:26PM +0800, Claire Chang wrote:

If a device is not behind an IOMMU, we look up the device node and set
up the restricted DMA when the restricted-dma-pool is presented.

Signed-off-by: Claire Chang 
Tested-by: Stefano Stabellini 
Tested-by: Will Deacon 


With this patch in place, all sparc and sparc64 qemu emulations
fail to boot. Symptom is that the root file system is not found.
Reverting this patch fixes the problem. Bisect log is attached.


Ah, OF_ADDRESS depends on !SPARC, so of_dma_configure_id() is presumably
returning an unexpected -ENODEV from the of_dma_set_restricted_buffer()
stub. That should probably be returning 0 instead, since either way it's not
an error condition for it to simply do nothing.


Something like below?



Yes, that does the trick.


Will

--->8


From 4d9dcb9210c1f37435b6088284e04b6b36ee8c4d Mon Sep 17 00:00:00 2001

From: Will Deacon 
Date: Fri, 2 Jul 2021 14:13:28 +0100
Subject: [PATCH] of: Return success from of_dma_set_restricted_buffer() when
  !OF_ADDRESS

When CONFIG_OF_ADDRESS=n, of_dma_set_restricted_buffer() returns -ENODEV
and breaks the boot for sparc[64] machines. Return 0 instead, since the
function is essentially a glorified NOP in this configuration.

Cc: Claire Chang 
Cc: Konrad Rzeszutek Wilk 
Reported-by: Guenter Roeck 
Suggested-by: Robin Murphy 
Link: https://lore.kernel.org/r/20210702030807.ga2685...@roeck-us.net
Signed-off-by: Will Deacon 


Tested-by: Guenter Roeck 


---
  drivers/of/of_private.h | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index 8fde97565d11..34dd548c5eac 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -173,7 +173,8 @@ static inline int of_dma_get_range(struct device_node *np,
  static inline int of_dma_set_restricted_buffer(struct device *dev,
   struct device_node *np)
  {
-   return -ENODEV;
+   /* Do nothing, successfully. */
+   return 0;
  }
  #endif
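The calling convention behind the fix can be sketched in a few lines of plain C (a simplified model, not the kernel code): the CONFIG_OF_ADDRESS=n stub is a no-op, and "nothing to do" must not look like an error to the caller, or every DMA configuration on that architecture aborts.

```c
#include <assert.h>

#define ENODEV 19

/* Old stub: reported an error even though there was nothing to do. */
static int old_stub(void)
{
	return -ENODEV;
}

/* Fixed stub, as in the patch. */
static int fixed_stub(void)
{
	/* Do nothing, successfully. */
	return 0;
}

/* Caller in the style of of_dma_configure_id(): any non-zero return
 * aborts DMA setup, which is what broke the sparc[64] boot. */
static int dma_configure(int (*set_restricted_buffer)(void))
{
	int ret = set_restricted_buffer();

	if (ret)
		return ret; /* boot-breaking path */
	return 0;
}
```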
  





Re: [PATCH v2 3/3] drm/i915/uapi: reject set_domain for discrete

2021-07-02 Thread Matthew Auld
On Thu, 1 Jul 2021 at 16:10, Matthew Auld  wrote:
>
> The CPU domain should be static for discrete, and on DG1 we don't need
> any flushing since everything is already coherent, so really all this
> does is an object wait, for which we have an ioctl. Longer term the
> desired caching should be an immutable creation time property for the
> BO, which can be set with something like gem_create_ext.
>
> One other user is iris + userptr, which uses the set_domain to probe all
> the pages to check if the GUP succeeds, however keeping the set_domain
> around just for that seems rather scuffed. We could equally just submit
> a dummy batch, which should hopefully be good enough, otherwise adding a
> new creation time flag for userptr might be an option. Although longer
> term we will also have vm_bind, which should also be a nice fit for
> this, so adding a whole new flag is likely overkill.

Kenneth, do you have a preference for the iris + userptr use case?
Adding the flag shouldn't be much work, if you feel the dummy batch is
too ugly. I don't mind either way.

>
> Suggested-by: Daniel Vetter 
> Signed-off-by: Matthew Auld 
> Cc: Thomas Hellström 
> Cc: Maarten Lankhorst 
> Cc: Jordan Justen 
> Cc: Kenneth Graunke 
> Cc: Jason Ekstrand 
> Cc: Daniel Vetter 
> Cc: Ramalingam C 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> index 43004bef55cb..b684a62bf3b0 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> @@ -490,6 +490,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void 
> *data,
> u32 write_domain = args->write_domain;
> int err;
>
> +   if (IS_DGFX(to_i915(dev)))
> +   return -ENODEV;
> +
> /* Only handle setting domains to types used by the CPU. */
> if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS)
> return -EINVAL;
> --
> 2.26.3
>


Re: [PATCH v15 12/12] of: Add plumbing for restricted DMA pool

2021-07-02 Thread Will Deacon
On Fri, Jul 02, 2021 at 12:39:41PM +0100, Robin Murphy wrote:
> On 2021-07-02 04:08, Guenter Roeck wrote:
> > On Thu, Jun 24, 2021 at 11:55:26PM +0800, Claire Chang wrote:
> > > If a device is not behind an IOMMU, we look up the device node and set
> > > up the restricted DMA when the restricted-dma-pool is presented.
> > > 
> > > Signed-off-by: Claire Chang 
> > > Tested-by: Stefano Stabellini 
> > > Tested-by: Will Deacon 
> > 
> > With this patch in place, all sparc and sparc64 qemu emulations
> > fail to boot. Symptom is that the root file system is not found.
> > Reverting this patch fixes the problem. Bisect log is attached.
> 
> Ah, OF_ADDRESS depends on !SPARC, so of_dma_configure_id() is presumably
> returning an unexpected -ENODEV from the of_dma_set_restricted_buffer()
> stub. That should probably be returning 0 instead, since either way it's not
> an error condition for it to simply do nothing.

Something like below?

Will

--->8

From 4d9dcb9210c1f37435b6088284e04b6b36ee8c4d Mon Sep 17 00:00:00 2001
From: Will Deacon 
Date: Fri, 2 Jul 2021 14:13:28 +0100
Subject: [PATCH] of: Return success from of_dma_set_restricted_buffer() when
 !OF_ADDRESS

When CONFIG_OF_ADDRESS=n, of_dma_set_restricted_buffer() returns -ENODEV
and breaks the boot for sparc[64] machines. Return 0 instead, since the
function is essentially a glorified NOP in this configuration.

Cc: Claire Chang 
Cc: Konrad Rzeszutek Wilk 
Reported-by: Guenter Roeck 
Suggested-by: Robin Murphy 
Link: https://lore.kernel.org/r/20210702030807.ga2685...@roeck-us.net
Signed-off-by: Will Deacon 
---
 drivers/of/of_private.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index 8fde97565d11..34dd548c5eac 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -173,7 +173,8 @@ static inline int of_dma_get_range(struct device_node *np,
 static inline int of_dma_set_restricted_buffer(struct device *dev,
   struct device_node *np)
 {
-   return -ENODEV;
+   /* Do nothing, successfully. */
+   return 0;
 }
 #endif
 
-- 
2.32.0.93.g670b81a890-goog



Re: [PATCH 2/2] drm/vc4: hdmi: Convert to gpiod

2021-07-02 Thread Maxime Ripard
Hi Nathan,

On Thu, Jul 01, 2021 at 08:29:34PM -0700, Nathan Chancellor wrote:
> On Mon, May 24, 2021 at 03:18:52PM +0200, Maxime Ripard wrote:
> > The new gpiod interface takes care of parsing the GPIO flags and to
> > return the logical value when accessing an active-low GPIO, so switching
> > to it simplifies a lot the driver.
> > 
> > Signed-off-by: Maxime Ripard 
> > ---
> >  drivers/gpu/drm/vc4/vc4_hdmi.c | 24 +++-
> >  drivers/gpu/drm/vc4/vc4_hdmi.h |  3 +--
> >  2 files changed, 8 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > index ccc6c8079dc6..34622c59f6a7 100644
> > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > @@ -159,10 +159,9 @@ vc4_hdmi_connector_detect(struct drm_connector 
> > *connector, bool force)
> > struct vc4_hdmi *vc4_hdmi = connector_to_vc4_hdmi(connector);
> > bool connected = false;
> >  
> > -   if (vc4_hdmi->hpd_gpio) {
> > -   if (gpio_get_value_cansleep(vc4_hdmi->hpd_gpio) ^
> > -   vc4_hdmi->hpd_active_low)
> > -   connected = true;
> > +   if (vc4_hdmi->hpd_gpio &&
> > +   gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio)) {
> > +   connected = true;
> > } else if (drm_probe_ddc(vc4_hdmi->ddc)) {
> > connected = true;
> > } else if (HDMI_READ(HDMI_HOTPLUG) & VC4_HDMI_HOTPLUG_CONNECTED) {
> > @@ -1993,7 +1992,6 @@ static int vc4_hdmi_bind(struct device *dev, struct 
> > device *master, void *data)
> > struct vc4_hdmi *vc4_hdmi;
> > struct drm_encoder *encoder;
> > struct device_node *ddc_node;
> > -   u32 value;
> > int ret;
> >  
> > vc4_hdmi = devm_kzalloc(dev, sizeof(*vc4_hdmi), GFP_KERNEL);
> > @@ -2031,18 +2029,10 @@ static int vc4_hdmi_bind(struct device *dev, struct 
> > device *master, void *data)
> > /* Only use the GPIO HPD pin if present in the DT, otherwise
> >  * we'll use the HDMI core's register.
> >  */
> > -   if (of_find_property(dev->of_node, "hpd-gpios", &value)) {
> > -   enum of_gpio_flags hpd_gpio_flags;
> > -
> > -   vc4_hdmi->hpd_gpio = of_get_named_gpio_flags(dev->of_node,
> > -"hpd-gpios", 0,
> > -&hpd_gpio_flags);
> > -   if (vc4_hdmi->hpd_gpio < 0) {
> > -   ret = vc4_hdmi->hpd_gpio;
> > -   goto err_put_ddc;
> > -   }
> > -
> > -   vc4_hdmi->hpd_active_low = hpd_gpio_flags & OF_GPIO_ACTIVE_LOW;
> > +   vc4_hdmi->hpd_gpio = devm_gpiod_get_optional(dev, "hpd", GPIOD_IN);
> > +   if (IS_ERR(vc4_hdmi->hpd_gpio)) {
> > +   ret = PTR_ERR(vc4_hdmi->hpd_gpio);
> > +   goto err_put_ddc;
> > }
> >  
> > vc4_hdmi->disable_wifi_frequencies =
> > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h b/drivers/gpu/drm/vc4/vc4_hdmi.h
> > index 060bcaefbeb5..2688a55461d6 100644
> > --- a/drivers/gpu/drm/vc4/vc4_hdmi.h
> > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.h
> > @@ -146,8 +146,7 @@ struct vc4_hdmi {
> > /* VC5 Only */
> > void __iomem *rm_regs;
> >  
> > -   int hpd_gpio;
> > -   bool hpd_active_low;
> > +   struct gpio_desc *hpd_gpio;
> >  
> > /*
> >  * On some systems (like the RPi4), some modes are in the same
> > -- 
> > 2.31.1
> 
> This patch as commit 6800234ceee0 ("drm/vc4: hdmi: Convert to gpiod")
> causes my Raspberry Pi 3 to lock up shortly after boot in combination
> with commit 411efa18e4b0 ("drm/vc4: hdmi: Move the HSM clock enable to
> runtime_pm"). The serial console and ssh are completely unresponsive and
> I do not see any messages in dmesg with "debug ignore_loglevel". The
> device is running with a 32-bit kernel (multi_v7_defconfig) with 32-bit
> userspace. If there is any further information that I can provide,
> please let me know.

Thanks for reporting this. The same bug has been reported on wednesday
on the RPi repo here:
https://github.com/raspberrypi/linux/pull/4418

More specifically, this commit should fix it:
https://github.com/raspberrypi/linux/pull/4418/commits/6d404373c20a794da3d6a7b4f1373903183bb5d0

Even though it's based on the 5.10 kernel, it should apply without any
warning on a mainline tree. Let me know if it fixes your issue too

Maxime
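The simplification the patch claims — gpiod returning the logical value so the driver-side XOR with the active-low flag disappears — can be modeled in plain C (a toy model of the polarity handling, not the kernel GPIO API):

```c
#include <assert.h>
#include <stdbool.h>

/* Legacy gpio_get_value_cansleep() returns the raw line state, so
 * vc4_hdmi_connector_detect() had to XOR it with hpd_active_low. */
static bool legacy_connected(int raw, bool active_low)
{
	return (bool)raw ^ active_low;
}

/* gpiod_get_value_cansleep() already applies the polarity parsed from
 * the DT flags, so the driver just uses the value as-is. */
static bool gpiod_connected(bool logical_value)
{
	return logical_value;
}
```

Both paths agree as long as gpiod applies the same polarity the driver used to apply by hand; dropping hpd_active_low from struct vc4_hdmi then follows naturally.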


signature.asc
Description: PGP signature


Re: [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+

2021-07-02 Thread Martin Peres

On 02/07/2021 16:06, Michal Wajdeczko wrote:



On 02.07.2021 10:13, Martin Peres wrote:

On 01/07/2021 21:24, Martin Peres wrote:
[...]





+    i915->params.enable_guc = ENABLE_GUC_LOAD_HUC;
+    return;
+    }
+
+    /* Default: enable HuC authentication and GuC submission */
+    i915->params.enable_guc = ENABLE_GUC_LOAD_HUC |
ENABLE_GUC_SUBMISSION;


This seems to be in contradiction with the GuC submission plan which
states:

"Not enabled by default on any current platforms but can be enabled via
modparam enable_guc".



I don't believe any current platform gets to this point where GuC
submission would be enabled by default. The first would be ADL-P, which
isn't out yet.


Isn't that exactly what the line above does?


In case you missed this crucial part of the review, please answer the
above question.


I guess there is some misunderstanding here, and I must admit I had a
similar doubt, but if you look beyond the patch diff and check the
function code, you will find that the very first condition is:

/* Don't enable GuC/HuC on pre-Gen12 */
if (GRAPHICS_VER(i915) < 12) {
i915->params.enable_guc = 0;
return;
}

so all pre-Gen12 platforms will continue to have GuC/HuC disabled.


Thanks Michal, but then the problem is the other way around: how can one
enable it on Gen11?


I like what Daniele was going for here: separating the capability from 
the user-requested value, but then it seems the patch stopped halfway. 
How about never touching the parameter, and having an AND between the 
two values to get the effective enable_guc?
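For what it's worth, the capability-vs-request split being discussed can be sketched standalone. Everything below is hypothetical illustration, not the real i915 code: the flag names mirror ENABLE_GUC_LOAD_HUC/ENABLE_GUC_SUBMISSION, and the platform rules are simplified from this thread. The point is that the modparam is never rewritten; the effective value is derived by ANDing what the platform supports with what the user (or the per-platform default) requests.

```c
#include <assert.h>

/* Hypothetical flags mirroring ENABLE_GUC_LOAD_HUC / ENABLE_GUC_SUBMISSION;
 * not the real i915 definitions. */
#define GUC_SUBMISSION	(1 << 0)
#define GUC_LOAD_HUC	(1 << 1)

/* What the platform supports: GuC/HuC exist on Gen11+ (simplified). */
static unsigned int guc_supported(int graphics_ver)
{
	return graphics_ver >= 11 ? (GUC_LOAD_HUC | GUC_SUBMISSION) : 0;
}

/* What is enabled by default: nothing before Gen12 (simplified). */
static unsigned int guc_default(int graphics_ver)
{
	return graphics_ver >= 12 ? (GUC_LOAD_HUC | GUC_SUBMISSION) : 0;
}

/* requested < 0 models enable_guc=-1 (auto). The modparam itself is
 * never modified; the effective value is computed on the fly. */
static unsigned int guc_effective(int graphics_ver, int requested)
{
	unsigned int want = requested < 0 ? guc_default(graphics_ver)
					  : (unsigned int)requested;
	return guc_supported(graphics_ver) & want;
}
```

With a split like this, Gen11 stays off by default but can still be enabled explicitly via the modparam, which is the question raised above.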


Right now, the code is really confusing :s

Thanks,
Martin



Thanks,
Michal



Re: [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+

2021-07-02 Thread Michal Wajdeczko



On 02.07.2021 10:13, Martin Peres wrote:
> On 01/07/2021 21:24, Martin Peres wrote:
> [...]
>>>

> +    i915->params.enable_guc = ENABLE_GUC_LOAD_HUC;
> +    return;
> +    }
> +
> +    /* Default: enable HuC authentication and GuC submission */
> +    i915->params.enable_guc = ENABLE_GUC_LOAD_HUC |
> ENABLE_GUC_SUBMISSION;

 This seems to be in contradiction with the GuC submission plan which
 states:

 "Not enabled by default on any current platforms but can be enabled via
 modparam enable_guc".

>>>
>>> I don't believe any current platform gets to this point where GuC
>>> submission would be enabled by default. The first would be ADL-P,
>>> which isn't out yet.
>>
>> Isn't that exactly what the line above does?
> 
> In case you missed this crucial part of the review, please answer the
> above question.

I guess there is some misunderstanding here, and I must admit I had a
similar doubt, but if you look beyond the patch diff and check the
function code, you will find that the very first condition is:

/* Don't enable GuC/HuC on pre-Gen12 */
if (GRAPHICS_VER(i915) < 12) {
i915->params.enable_guc = 0;
return;
}

so all pre-Gen12 platforms will continue to have GuC/HuC disabled.

Thanks,
Michal


[ANNOUNCE] libdrm 2.4.107

2021-07-02 Thread Bas Nieuwenhuizen
Alex Deucher (1):
  amdgpu: update marketing names

Andrey Grodzovsky (6):
  tests/amdgpu: Fix valgrind warning
  test/amdgpu: Add helper functions for hot unplug
  test/amdgpu/hotunplug: Add test suite for GPU unplug
  tests/amdgpu/hotunplug: Add unplug with cs test.
  tests/amdgpu/hotunplug: Add hotunplug with exported bo test
  tests/amdgpu/hotunplug: Add hotunplug with exported fence

Bas Nieuwenhuizen (2):
  amdgpu: Add vamgr for capture/replay.
  Bump version to 2.4.107

Eleni Maria Stea (3):
  include  in xf86drmMode when the OS is FreeBSD
  _WANT_KERNEL_ERRNO must be defined in FreeBSD for ERESTART to be used
  Conditionally include  and  on Linux, BSD

Lang Yu (1):
  Revert "tests/amdgpu: fix bo eviction test issue"

Marius Vlad (6):
  xf86drm: Add a human readable representation for format modifiers
  xf86drm: Add a vendor function to decode the format modifier
  xf86drm: Add support for decoding Nvidia format modifiers
  xf86drm: Add support for decoding AMD format modifiers
  xf86drm: Add support for decoding AMLOGIC format modifiers
  README.rst: Include some notes about syncing uapi headers

Rahul Kumar (1):
  amdgpu: Added product name for E9390,E9560 and E9565 dgpu

Tejas Upadhyay (1):
  intel: Add support for ADLP

git tag: libdrm-2.4.107

https://dri.freedesktop.org/libdrm/libdrm-2.4.107.tar.xz
SHA256: c554cef03b033636a975543eab363cc19081cb464595d3da1ec129f87370f888  
libdrm-2.4.107.tar.xz
SHA512: 
c7542ba15c4c934519a6a1f3cb1ec21effa820a805a030d0175313bb1cc796cd311f39596ead883f9f251679d701e262894c5a297d5cf45093c80a6cd818def0
  libdrm-2.4.107.tar.xz
PGP:  https://dri.freedesktop.org/libdrm/libdrm-2.4.107.tar.xz.sig



Re: [Intel-gfx] [PATCH 08/53] drm/i915/xehp: Extra media engines - Part 2 (interrupts)

2021-07-02 Thread Tvrtko Ursulin



On 01/07/2021 21:23, Matt Roper wrote:

From: John Harrison 

Xe_HP can have a lot of extra media engines. This patch adds the
interrupt handler support for them.

Cc: Tvrtko Ursulin 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: John Harrison 
Signed-off-by: Matt Roper 
---
  drivers/gpu/drm/i915/gt/intel_gt_irq.c | 13 -
  drivers/gpu/drm/i915/i915_reg.h|  3 +++
  2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c 
b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index c13462274fe8..b2de83be4d97 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -184,7 +184,13 @@ void gen11_gt_irq_reset(struct intel_gt *gt)
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK,~0);
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK,   ~0);
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK,   ~0);
+   if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
+   intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK,   ~0);
+   if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7))
+   intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK,   ~0);
intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~0);
+   if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3))
+   intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~0);
  
  	intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0);

intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK,  ~0);
@@ -218,8 +224,13 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~smask);
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~dmask);
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~dmask);
+   if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
+   intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~dmask);
+   if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7))
+   intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~dmask);
intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~dmask);


Poor 0-1 sandwiched between 4-7 and 2-3. ;) With hopefully order restored:

Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko


-
+   if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3))
+   intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~dmask);
/*
 * RPS interrupts will get enabled/disabled on demand when RPS itself
 * is enabled/disabled.
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d4546e871833..cb1716b6ce72 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8076,7 +8076,10 @@ enum {
  #define GEN11_BCS_RSVD_INTR_MASK  _MMIO(0x1900a0)
  #define GEN11_VCS0_VCS1_INTR_MASK _MMIO(0x1900a8)
  #define GEN11_VCS2_VCS3_INTR_MASK _MMIO(0x1900ac)
+#define GEN12_VCS4_VCS5_INTR_MASK  _MMIO(0x1900b0)
+#define GEN12_VCS6_VCS7_INTR_MASK  _MMIO(0x1900b4)
  #define GEN11_VECS0_VECS1_INTR_MASK   _MMIO(0x1900d0)
+#define GEN12_VECS2_VECS3_INTR_MASK_MMIO(0x1900d4)
  #define GEN11_GUC_SG_INTR_MASK_MMIO(0x1900e8)
  #define GEN11_GPM_WGBOXPERF_INTR_MASK _MMIO(0x1900ec)
  #define GEN11_CRYPTO_RSVD_INTR_MASK   _MMIO(0x1900f0)
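As an aside, the HAS_ENGINE()-style gating used throughout the hunks above is just a bit test against the GT's engine mask. A standalone model (simplified: the real macro takes a gt pointer, not a raw mask):

```c
#include <assert.h>

/* Engine IDs, mirroring the VCS0..VCS7 naming from the patch. */
enum engine_id { VCS0, VCS1, VCS2, VCS3, VCS4, VCS5, VCS6, VCS7 };

#define BIT(n)			(1u << (n))
#define HAS_ENGINE(mask, id)	(!!((mask) & BIT(id)))

/* Touch the shared VCS4/VCS5 interrupt mask register only when at least
 * one engine of the pair exists -- mirrors the checks the patch adds. */
static int should_write_vcs4_vcs5(unsigned int engine_mask)
{
	return HAS_ENGINE(engine_mask, VCS4) || HAS_ENGINE(engine_mask, VCS5);
}
```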



Re: [Intel-gfx] [PATCH 01/53] drm/i915: Add "release id" version

2021-07-02 Thread Tvrtko Ursulin



On 01/07/2021 21:23, Matt Roper wrote:

From: Lucas De Marchi 

Besides the arch version returned by GRAPHICS_VER(), new platforms
contain a "release id" to make clear the difference from one platform to
another. Although for the first ones we may use them as if they were a


What does "first ones" refer to here?


major/minor version, that is not true for all platforms: we may have a
`release_id == n` that is closer to `n - 2` than to `n - 1`.


Hm this is a bit confusing. Is the sentence simply trying to say that, 
as the release id number is growing, hw capabilities are not simply 
accumulating but can be removed as well? Otherwise I am not sure how the 
user of these macros is supposed to act on this sentence.



However the release id number is not defined by hardware until we start
using the GMD_ID register. For the platforms before that register is
useful we will set the values in software and we can set them as we
please. So the plan is to set them so we can group different features
under a single GRAPHICS_VER_FULL() check.

After GMD_ID is used, the usefulness of a "full version check" will be
greatly reduced and will be mostly used for deciding workarounds and a
few code paths. So it makes sense to keep it as a separate field from
graphics_ver.

Also, currently there is not much use for the release id in media and
display, so keep them out.

This is a mix of 2 independent changes: one by me and the other by Matt
Roper.

Cc: Matt Roper 
Signed-off-by: Lucas De Marchi 
Signed-off-by: Matt Roper 
---
  drivers/gpu/drm/i915/i915_drv.h  | 6 ++
  drivers/gpu/drm/i915/intel_device_info.c | 2 ++
  drivers/gpu/drm/i915/intel_device_info.h | 2 ++
  3 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6dff4ca01241..9639800485b9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1258,11 +1258,17 @@ static inline struct drm_i915_private 
*pdev_to_i915(struct pci_dev *pdev)
   */
  #define IS_GEN(dev_priv, n)   (GRAPHICS_VER(dev_priv) == (n))
  
+#define IP_VER(ver, release)		((ver) << 8 | (release))

+
  #define GRAPHICS_VER(i915)(INTEL_INFO(i915)->graphics_ver)
+#define GRAPHICS_VER_FULL(i915)
IP_VER(INTEL_INFO(i915)->graphics_ver, \
+  
INTEL_INFO(i915)->graphics_ver_release)
  #define IS_GRAPHICS_VER(i915, from, until) \
(GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until))
  
  #define MEDIA_VER(i915)			(INTEL_INFO(i915)->media_ver)

+#define MEDIA_VER_FULL(i915)   IP_VER(INTEL_INFO(i915)->media_ver, \
+  
INTEL_INFO(i915)->media_ver_release)
  #define IS_MEDIA_VER(i915, from, until) \
(MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until))
  
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c

index 7eaa92fee421..e8ad14f002c1 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -97,7 +97,9 @@ void intel_device_info_print_static(const struct 
intel_device_info *info,
struct drm_printer *p)
  {
drm_printf(p, "graphics_ver: %u\n", info->graphics_ver);
+   drm_printf(p, "graphics_ver_release: %u\n", info->graphics_ver_release);


I get the VER and VER_FULL in the macros but could 'ver' and 
'ver_release' here and in the code simply be renamed to 'ver'/'version' 
and 'release'? Maybe it is just me but don't think I encountered the 
term "version release" before.
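As a side note, the packing done by IP_VER() in the hunk above is easy to sanity-check in isolation. The 12.10 value below is purely illustrative; the macro body is copied from the patch:

```c
#include <assert.h>

/* Same packing as the patch: version in the high byte(s), release low. */
#define IP_VER(ver, release)	((ver) << 8 | (release))

/* Because the version dominates the high byte, plain integer comparison
 * orders full versions correctly -- which is what makes checks like
 * GRAPHICS_VER_FULL(i915) >= IP_VER(x, y) work. */
static int ip_ver_at_least(unsigned int full, unsigned int ver,
			   unsigned int release)
{
	return full >= IP_VER(ver, release);
}
```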


Regards,

Tvrtko


drm_printf(p, "media_ver: %u\n", info->media_ver);
+   drm_printf(p, "media_ver_release: %u\n", info->media_ver_release);
drm_printf(p, "display_ver: %u\n", info->display.ver);
drm_printf(p, "gt: %d\n", info->gt);
drm_printf(p, "iommu: %s\n", iommu_name());
diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
b/drivers/gpu/drm/i915/intel_device_info.h
index b326aff65cd6..944a5ff4df49 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -162,7 +162,9 @@ enum intel_ppgtt_type {
  
  struct intel_device_info {

u8 graphics_ver;
+   u8 graphics_ver_release;
u8 media_ver;
+   u8 media_ver_release;
  
  	u8 gt; /* GT number, 0 if undefined */

intel_engine_mask_t platform_engine_mask; /* Engines supported by the 
HW */



Re: [Intel-gfx] [PATCH 07/53] drm/i915/xehp: Extra media engines - Part 1 (engine definitions)

2021-07-02 Thread Tvrtko Ursulin



On 01/07/2021 21:23, Matt Roper wrote:

From: John Harrison 

Xe_HP can have a lot of extra media engines. This patch adds the basic
definitions for them.

Cc: Tvrtko Ursulin 
Signed-off-by: John Harrison 
Signed-off-by: Tomas Winkler 
Signed-off-by: Matt Roper 
---
  drivers/gpu/drm/i915/gt/gen8_engine_cs.c |  7 ++-
  drivers/gpu/drm/i915/gt/intel_engine_cs.c| 50 
  drivers/gpu/drm/i915/gt/intel_engine_types.h | 14 --
  drivers/gpu/drm/i915/i915_reg.h  |  6 +++
  4 files changed, 69 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 87b06572fd2e..35edc55720f4 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -279,7 +279,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
if (mode & EMIT_INVALIDATE)
aux_inv = rq->engine->mask & ~BIT(BCS0);
if (aux_inv)
-   cmd += 2 * hweight8(aux_inv) + 2;
+   cmd += 2 * hweight32(aux_inv) + 2;
  
  	cs = intel_ring_begin(rq, cmd);

if (IS_ERR(cs))
@@ -313,9 +313,8 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
struct intel_engine_cs *engine;
unsigned int tmp;
  
-		*cs++ = MI_LOAD_REGISTER_IMM(hweight8(aux_inv));

-   for_each_engine_masked(engine, rq->engine->gt,
-  aux_inv, tmp) {
+   *cs++ = MI_LOAD_REGISTER_IMM(hweight32(aux_inv));
+   for_each_engine_masked(engine, rq->engine->gt, aux_inv, tmp) {
*cs++ = i915_mmio_reg_offset(aux_inv_reg(engine));
*cs++ = AUX_INV;
}
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4ab2c9abb943..6e2aa1acc4d4 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -104,6 +104,38 @@ static const struct engine_info intel_engines[] = {
{ .graphics_ver = 11, .base = GEN11_BSD4_RING_BASE }
},
},
+   [VCS4] = {
+   .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
+   .class = VIDEO_DECODE_CLASS,
+   .instance = 4,
+   .mmio_bases = {
+   { .graphics_ver = 11, .base = XEHP_BSD5_RING_BASE }
+   },
+   },
+   [VCS5] = {
+   .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
+   .class = VIDEO_DECODE_CLASS,
+   .instance = 5,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHP_BSD6_RING_BASE }
+   },
+   },
+   [VCS6] = {
+   .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
+   .class = VIDEO_DECODE_CLASS,
+   .instance = 6,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHP_BSD7_RING_BASE }
+   },
+   },
+   [VCS7] = {
+   .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
+   .class = VIDEO_DECODE_CLASS,
+   .instance = 7,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHP_BSD8_RING_BASE }
+   },
+   },
[VECS0] = {
.hw_id = VECS0_HW,
.class = VIDEO_ENHANCEMENT_CLASS,
@@ -121,6 +153,22 @@ static const struct engine_info intel_engines[] = {
{ .graphics_ver = 11, .base = GEN11_VEBOX2_RING_BASE }
},
},
+   [VECS2] = {
+   .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
+   .class = VIDEO_ENHANCEMENT_CLASS,
+   .instance = 2,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHP_VEBOX3_RING_BASE }
+   },
+   },
+   [VECS3] = {
+   .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
+   .class = VIDEO_ENHANCEMENT_CLASS,
+   .instance = 3,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHP_VEBOX4_RING_BASE }
+   },
+   },
  };
  
  /**

@@ -269,6 +317,8 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
  
  	BUILD_BUG_ON(MAX_ENGINE_CLASS >= BIT(GEN11_ENGINE_CLASS_WIDTH));

BUILD_BUG_ON(MAX_ENGINE_INSTANCE >= BIT(GEN11_ENGINE_INSTANCE_WIDTH));
+   BUILD_BUG_ON(I915_MAX_VCS > (MAX_ENGINE_INSTANCE + 1));
+   BUILD_BUG_ON(I915_MAX_VECS > (MAX_ENGINE_INSTANCE + 1));
  
  	if (GEM_DEBUG_WARN_ON(id >= ARRAY_SIZE(gt->engine)))

return -EINVAL;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 5b91068ab277..b25f594a7e4b 100644
--- 

Re: [Intel-gfx] [drm-intel:drm-intel-gt-next 8/14] drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: uninitialized symbol 'rq'.

2021-07-02 Thread Dan Carpenter
On Fri, Jul 02, 2021 at 12:34:33PM +0100, Matthew Auld wrote:
> > > > cf586021642d80 Chris Wilson 2021-06-17   85 err = 
> > > > fn(migrate, , src, dst, );
> > > > cf586021642d80 Chris Wilson 2021-06-17   86 if (!err)
> > > > cf586021642d80 Chris Wilson 2021-06-17   87 
> > > > continue;
> > > >
> > > > Does fn() initialize "rq" on the success path?  Anyway Smatch would
> > > > complain anyway because it thinks the list could be empty or that we
> > > > might hit an early continue for everything.
> > >
> > > The fn() will always first initialize the rq to NULL. If it returns
> > > success then rq will always be a valid rq. If it returns an err then
> > > the rq might be NULL, or a valid rq depending on how far the copy/fn
> > > got.
> > >
> > > And for_i915_gem_ww() will always run at least once, since ww->loop =
> > > true, so this looks like a false positive?
> >
> > You don't think i915_gem_object_lock(), i915_gem_object_pin_map() or
> > i915_gem_object_pin_map() can fail?
> 
> Yeah, they can totally fail but then we most likely just hit the
> err_out. The for_i915_gem_ww() is a little strange since it's not
> really looping over anything; it's just about retrying the block if we
> see -EDEADLK (which involves dropping some locks). If we see any other
> error then the loop is terminated with ww->loop = false, which then
> hits the goto err_out.
> 

Ah, yeah, you're right.  False positive.

I hadn't looked at this code in context (I had only reviewed the email).
Now that I've pulled the tree and looked at the code, I'm sort of
surprised that Smatch generates a warning... I will investigate some
more. Thanks!
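To make the control flow concrete, here is a standalone model of the retry semantics described in the quoted explanation (simplified; the real for_i915_gem_ww() also manages ww lock backoff): the body always runs at least once, -EDEADLK retries, and anything else, including success, terminates the loop.

```c
#include <assert.h>
#include <errno.h>

/* Retry only on -EDEADLK (the real loop drops locks first); success or
 * any other error terminates the loop. */
static int ww_should_retry(int err)
{
	return err == -EDEADLK;
}

/* Run a modelled body that fails with 'first_err' on its first pass and
 * succeeds afterwards; return how many times the body ran. The do/while
 * shape is what guarantees at least one pass. */
static int ww_body_runs(int first_err)
{
	int attempts = 0;
	int err;

	do {
		err = (attempts == 0) ? first_err : 0;
		attempts++;
	} while (ww_should_retry(err));
	return attempts;
}
```

In this model an rq initialized to NULL by the body on every pass is always written at least once, which is why the warning is a false positive.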

regards,
dan carpenter



Re: [Intel-gfx] [PATCH 05/53] drm/i915/gen12: Use fuse info to enable SFC

2021-07-02 Thread Tvrtko Ursulin



On 01/07/2021 21:23, Matt Roper wrote:

From: Venkata Sandeep Dhanalakota 

In Gen12 there are various fuse combinations and in each configuration
vdbox engine may be connected to SFC depending on which engines are
available, so we need to set the SFC capability based on fuse value from
the hardware. Even numbered phyical instance always have SFC, odd


physical


numbered physical instances have SFC only if previous even instance is
fused off.


Just a few nits.


Bspec: 48028
Cc: Tvrtko Ursulin 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Venkata Sandeep Dhanalakota 
Signed-off-by: Matt Roper 
---
  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 30 ++-
  1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 151870d8fdd3..4ab2c9abb943 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -442,6 +442,28 @@ void intel_engines_free(struct intel_gt *gt)
}
  }
  
+static inline


Inline is not desired here.


+bool vdbox_has_sfc(struct drm_i915_private *i915, unsigned int physical_vdbox,
+  unsigned int logical_vdbox, u16 vdbox_mask)
+{


I'd be tempted to prefix the function name with gen11_ so it is clearer 
that it does not apply to earlier gens. Looking just at the diff out of 
context below, one can wonder whether there is a functional change or 
not. There isn't, because there is a bailout for gen < 11 early in 
init_engine_mask(), but perhaps a gen11 function name prefix would make 
this a bit more self-documenting.



+   /*
+* In Gen11, only even numbered logical VDBOXes are hooked
+* up to an SFC (Scaler & Format Converter) unit.
+* In Gen12, Even numbered phyical instance always are connected


physical


+* to an SFC. Odd numbered physical instances have SFC only if
+* previous even instance is fused off.
+*/
+   if (GRAPHICS_VER(i915) == 12) {
+   return (physical_vdbox % 2 == 0) ||
+   !(BIT(physical_vdbox - 1) & vdbox_mask);
+   } else if (GRAPHICS_VER(i915) == 11) {
+   return logical_vdbox % 2 == 0;
+   }


No need for curlies on these branches.


+
+   MISSING_CASE(GRAPHICS_VER(i915));
+   return false;
+}
+
  /*
   * Determine which engines are fused off in our particular hardware.
   * Note that we have a catch-22 situation where we need to be able to access
@@ -493,13 +515,9 @@ static intel_engine_mask_t init_engine_mask(struct 
intel_gt *gt)
continue;
}
  
-		/*

-* In Gen11, only even numbered logical VDBOXes are
-* hooked up to an SFC (Scaler & Format Converter) unit.
-* In TGL each VDBOX has access to an SFC.
-*/
-   if (GRAPHICS_VER(i915) >= 12 || logical_vdbox++ % 2 == 0)
+   if (vdbox_has_sfc(i915, i, logical_vdbox, vdbox_mask))
gt->info.vdbox_sfc_access |= BIT(i);
+   logical_vdbox++;
}
drm_dbg(>drm, "vdbox enable: %04x, instances: %04lx\n",
vdbox_mask, VDBOX_MASK(gt));



Regards,

Tvrtko
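The SFC assignment rule under review can be lifted out and exercised standalone. The function below is a simplified copy of the patch's logic (bool/bit helpers inlined, engine masks in the checks are illustrative):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified from the patch: does this VDBOX have SFC access?
 * Gen11 keys off the *logical* instance; Gen12 gives SFC to even
 * *physical* instances always, and to odd ones only when the preceding
 * even instance is fused off (absent from vdbox_mask). */
static bool vdbox_has_sfc(int graphics_ver, unsigned int physical_vdbox,
			  unsigned int logical_vdbox, unsigned int vdbox_mask)
{
	if (graphics_ver == 12)
		return (physical_vdbox % 2 == 0) ||
		       !((1u << (physical_vdbox - 1)) & vdbox_mask);
	else if (graphics_ver == 11)
		return logical_vdbox % 2 == 0;
	return false;
}
```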


[PATCH] drm/nouveau: Remove redundant error check on variable ret

2021-07-02 Thread Colin King
From: Colin Ian King 

The call to drm_dp_aux_init never returns an error code, and no error
return is ever assigned to the variable ret. The check for an error in
ret is always false since ret is still zero from the start of the
function, so the init error check and error message are redundant and
can be removed.

Addresses-Coverity: ("Logically dead code")
Fixes: fd43ad9d47e7 ("drm/nouveau/kms/nv50-: Move AUX adapter reg to connector 
late register/early unregister")
Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/nouveau/nouveau_connector.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c 
b/drivers/gpu/drm/nouveau/nouveau_connector.c
index 22b83a6577eb..f37e5f28a93f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -1362,12 +1362,6 @@ nouveau_connector_create(struct drm_device *dev,
 dcbe->hasht, dcbe->hashm);
nv_connector->aux.name = kstrdup(aux_name, GFP_KERNEL);
drm_dp_aux_init(_connector->aux);
-   if (ret) {
-   NV_ERROR(drm, "Failed to init AUX adapter for 
sor-%04x-%04x: %d\n",
-dcbe->hasht, dcbe->hashm, ret);
-   kfree(nv_connector);
-   return ERR_PTR(ret);
-   }
fallthrough;
default:
funcs = _connector_funcs;
-- 
2.31.1



Re: [Intel-gfx] [drm-intel:drm-intel-gt-next 8/14] drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: uninitialized symbol 'rq'.

2021-07-02 Thread Matthew Auld
On Fri, 2 Jul 2021 at 12:14, Dan Carpenter  wrote:
>
> On Fri, Jul 02, 2021 at 02:07:27PM +0300, Dan Carpenter wrote:
> > On Fri, Jul 02, 2021 at 11:32:45AM +0100, Matthew Auld wrote:
> > > On Fri, 2 Jul 2021 at 09:45, Dan Carpenter  
> > > wrote:
> > > > cf586021642d80 Chris Wilson 2021-06-17   84
> > > > cf586021642d80 Chris Wilson 2021-06-17   85 err = 
> > > > fn(migrate, , src, dst, );
> > > > cf586021642d80 Chris Wilson 2021-06-17   86 if (!err)
> > > > cf586021642d80 Chris Wilson 2021-06-17   87 
> > > > continue;
> > > >
> > > > Does fn() initialize "rq" on the success path?  Anyway Smatch would
> > > > complain anyway because it thinks the list could be empty or that we
> > > > might hit an early continue for everything.
> > >
> > > The fn() will always first initialize the rq to NULL. If it returns
> > > success then rq will always be a valid rq. If it returns an err then
> > > the rq might be NULL, or a valid rq depending on how far the copy/fn
> > > got.
> > >
> > > And for_i915_gem_ww() will always run at least once, since ww->loop =
> > > true, so this looks like a false positive?
> >
> > You don't think i915_gem_object_lock(), i915_gem_object_pin_map() or
> > i915_gem_object_pin_map() can fail?
>
> Btw, I sincerely hope that we will re-enable GCC's uninitialized
> variable checks.  Will GCC be able to verify that this is initialized?

34b07d47dd00 ("drm/i915: Enable -Wuninitialized")

GCC doesn't complain AFAIK.

>
> regards,
> dan carpenter
>


Re: [PATCH v2] drm/dbi: Print errors for mipi_dbi_command()

2021-07-02 Thread kernel test robot
Hi Linus,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.13 next-20210701]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Linus-Walleij/drm-dbi-Print-errors-for-mipi_dbi_command/20210702-180745
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
3dbdb38e286903ec220aaf1fb29a8d94297da246
config: m68k-allmodconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/42d93a52e398adbb1fe2dfbc895c649cc8d42780
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Linus-Walleij/drm-dbi-Print-errors-for-mipi_dbi_command/20210702-180745
git checkout 42d93a52e398adbb1fe2dfbc895c649cc8d42780
# save the attached .config to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
O=build_dir ARCH=m68k SHELL=/bin/bash drivers/gpu/drm/tiny/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   In file included from drivers/gpu/drm/tiny/st7586.c:25:
   drivers/gpu/drm/tiny/st7586.c: In function 'st7586_pipe_disable':
>> include/drm/drm_mipi_dbi.h:186:27: error: invalid type argument of '->' 
>> (have 'struct mipi_dbi')
 186 |  struct device *dev = >spi->dev; \
 |   ^~
   drivers/gpu/drm/tiny/st7586.c:260:2: note: in expansion of macro 
'mipi_dbi_command'
 260 |  mipi_dbi_command(>dbi, MIPI_DCS_SET_DISPLAY_OFF);
 |  ^~~~


vim +186 include/drm/drm_mipi_dbi.h

   160  
   161  u32 mipi_dbi_spi_cmd_max_speed(struct spi_device *spi, size_t len);
   162  int mipi_dbi_spi_transfer(struct spi_device *spi, u32 speed_hz,
   163u8 bpw, const void *buf, size_t len);
   164  
   165  int mipi_dbi_command_read(struct mipi_dbi *dbi, u8 cmd, u8 *val);
   166  int mipi_dbi_command_buf(struct mipi_dbi *dbi, u8 cmd, u8 *data, size_t 
len);
   167  int mipi_dbi_command_stackbuf(struct mipi_dbi *dbi, u8 cmd, const u8 
*data,
   168size_t len);
   169  int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer *fb,
   170struct drm_rect *clip, bool swap);
   171  /**
   172   * mipi_dbi_command - MIPI DCS command with optional parameter(s)
   173   * @dbi: MIPI DBI structure
   174   * @cmd: Command
   175   * @seq: Optional parameter(s)
   176   *
   177   * Send MIPI DCS command to the controller. Use mipi_dbi_command_read() 
for
   178   * get/read.
   179   *
   180   * Returns:
   181   * Zero on success, negative error code on failure.
   182   */
   183  #define mipi_dbi_command(dbi, cmd, seq...) \
   184  ({ \
   185  const u8 d[] = { seq }; \
 > 186  struct device *dev = >spi->dev; \
   187  int ret; \
   188  ret = mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
   189  if (ret) \
   190  dev_err_ratelimited(dev, "error %d when sending 
command\n", ret); \
   191  })
   192  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




Re: [PATCH v15 12/12] of: Add plumbing for restricted DMA pool

2021-07-02 Thread Robin Murphy

On 2021-07-02 04:08, Guenter Roeck wrote:

Hi,

On Thu, Jun 24, 2021 at 11:55:26PM +0800, Claire Chang wrote:

If a device is not behind an IOMMU, we look up the device node and set
up the restricted DMA when the restricted-dma-pool is presented.

Signed-off-by: Claire Chang 
Tested-by: Stefano Stabellini 
Tested-by: Will Deacon 


With this patch in place, all sparc and sparc64 qemu emulations
fail to boot. Symptom is that the root file system is not found.
Reverting this patch fixes the problem. Bisect log is attached.


Ah, OF_ADDRESS depends on !SPARC, so of_dma_configure_id() is presumably 
returning an unexpected -ENODEV from the of_dma_set_restricted_buffer() 
stub. That should probably be returning 0 instead, since either way it's 
not an error condition for it to simply do nothing.


Robin.
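For illustration only (names hypothetical, the kernel functions mocked as plain C), the failure mode Robin describes reduces to the stub's return value being treated as fatal by its caller:

```c
#include <assert.h>
#include <errno.h>

/* Stub as presumably compiled on !OF_ADDRESS configs such as sparc. */
static int set_restricted_buffer_broken(void) { return -ENODEV; }

/* Suggested fix: doing nothing is not an error condition. */
static int set_restricted_buffer_fixed(void)  { return 0; }

/* Simplified stand-in for of_dma_configure_id(): any nonzero return
 * from the restricted-buffer setup aborts DMA configuration, so the
 * device (and with it the root filesystem) never comes up. */
static int dma_configure(int (*set_restricted)(void))
{
	int ret = set_restricted();
	if (ret)
		return ret;
	return 0;
}
```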



Guenter

---
# bad: [fb0ca446157a86b75502c1636b0d81e642fe6bf1] Add linux-next specific files 
for 20210701
# good: [62fb9874f5da54fdb243003b386128037319b219] Linux 5.13
git bisect start 'HEAD' 'v5.13'
# bad: [f63c4fda987a19b1194cc45cb72fd5bf968d9d90] Merge remote-tracking branch 
'rdma/for-next'
git bisect bad f63c4fda987a19b1194cc45cb72fd5bf968d9d90
# good: [46bb5dd1d2a63e906e374e97dfd4a5e33934b1c4] Merge remote-tracking branch 
'ipsec/master'
git bisect good 46bb5dd1d2a63e906e374e97dfd4a5e33934b1c4
# good: [43ba6969cfb8185353a7a6fc79070f13b9e3d6d3] Merge remote-tracking branch 
'clk/clk-next'
git bisect good 43ba6969cfb8185353a7a6fc79070f13b9e3d6d3
# good: [1ca5eddcf8dca1d6345471c6404e7364af0d7019] Merge remote-tracking branch 
'fuse/for-next'
git bisect good 1ca5eddcf8dca1d6345471c6404e7364af0d7019
# good: [8f6d7b3248705920187263a4e7147b0752ec7dcf] Merge remote-tracking branch 
'pci/next'
git bisect good 8f6d7b3248705920187263a4e7147b0752ec7dcf
# good: [df1885a755784da3ef285f36d9230c1d090ef186] RDMA/rtrs_clt: Alloc less 
memory with write path fast memory registration
git bisect good df1885a755784da3ef285f36d9230c1d090ef186
# good: [93d31efb58c8ad4a66bbedbc2d082df458c04e45] Merge remote-tracking branch 
'cpufreq-arm/cpufreq/arm/linux-next'
git bisect good 93d31efb58c8ad4a66bbedbc2d082df458c04e45
# good: [46308965ae6fdc7c25deb2e8c048510ae51bbe66] RDMA/irdma: Check contents 
of user-space irdma_mem_reg_req object
git bisect good 46308965ae6fdc7c25deb2e8c048510ae51bbe66
# good: [6de7a1d006ea9db235492b288312838d6878385f] 
thermal/drivers/int340x/processor_thermal: Split enumeration and processing part
git bisect good 6de7a1d006ea9db235492b288312838d6878385f
# good: [081bec2577cda3d04f6559c60b6f4e2242853520] dt-bindings: of: Add 
restricted DMA pool
git bisect good 081bec2577cda3d04f6559c60b6f4e2242853520
# good: [bf95ac0bcd69979af146852f6a617a60285ebbc1] Merge remote-tracking branch 
'thermal/thermal/linux-next'
git bisect good bf95ac0bcd69979af146852f6a617a60285ebbc1
# good: [3d8287544223a3d2f37981c1f9ffd94d0b5e9ffc] RDMA/core: Always release 
restrack object
git bisect good 3d8287544223a3d2f37981c1f9ffd94d0b5e9ffc
# bad: [cff1f23fad6e0bd7d671acce0d15285c709f259c] Merge remote-tracking branch 
'swiotlb/linux-next'
git bisect bad cff1f23fad6e0bd7d671acce0d15285c709f259c
# bad: [b655006619b7bccd0dc1e055bd72de5d613e7b5c] of: Add plumbing for 
restricted DMA pool
git bisect bad b655006619b7bccd0dc1e055bd72de5d613e7b5c
# first bad commit: [b655006619b7bccd0dc1e055bd72de5d613e7b5c] of: Add plumbing 
for restricted DMA pool



Re: [Intel-gfx] [drm-intel:drm-intel-gt-next 8/14] drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: uninitialized symbol 'rq'.

2021-07-02 Thread Matthew Auld
On Fri, 2 Jul 2021 at 12:07, Dan Carpenter  wrote:
>
> On Fri, Jul 02, 2021 at 11:32:45AM +0100, Matthew Auld wrote:
> > On Fri, 2 Jul 2021 at 09:45, Dan Carpenter  wrote:
> > >
> > > tree:   git://anongit.freedesktop.org/drm-intel drm-intel-gt-next
> > > head:   5cd57f676bb946a00275408f0dd0d75dbc466d25
> > > commit: cf586021642d8017cde111b7dd1ba86224e9da51 [8/14] drm/i915/gt: 
> > > Pipelined page migration
> > > config: x86_64-randconfig-m001-20210630 (attached as .config)
> > > compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
> > >
> > > If you fix the issue, kindly add following tag as appropriate
> > > Reported-by: kernel test robot 
> > > Reported-by: Dan Carpenter 
> > >
> > > New smatch warnings:
> > > drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: 
> > > uninitialized symbol 'rq'.
> > > drivers/gpu/drm/i915/gt/selftest_migrate.c:113 copy() error: 
> > > uninitialized symbol 'vaddr'.
> > >
> > > Old smatch warnings:
> > > drivers/gpu/drm/i915/gem/i915_gem_object.h:182 __i915_gem_object_lock() 
> > > error: we previously assumed 'ww' could be null (see line 171)
> > >
> > > vim +/rq +102 drivers/gpu/drm/i915/gt/selftest_migrate.c
> > >
> > > cf586021642d80 Chris Wilson 2021-06-17   32  static int copy(struct 
> > > intel_migrate *migrate,
> > > cf586021642d80 Chris Wilson 2021-06-17   33 int (*fn)(struct 
> > > intel_migrate *migrate,
> > > cf586021642d80 Chris Wilson 2021-06-17   34   struct 
> > > i915_gem_ww_ctx *ww,
> > > cf586021642d80 Chris Wilson 2021-06-17   35   struct 
> > > drm_i915_gem_object *src,
> > > cf586021642d80 Chris Wilson 2021-06-17   36   struct 
> > > drm_i915_gem_object *dst,
> > > cf586021642d80 Chris Wilson 2021-06-17   37   struct 
> > > i915_request **out),
> > > cf586021642d80 Chris Wilson 2021-06-17   38 u32 sz, struct 
> > > rnd_state *prng)
> > > cf586021642d80 Chris Wilson 2021-06-17   39  {
> > > cf586021642d80 Chris Wilson 2021-06-17   40 struct drm_i915_private 
> > > *i915 = migrate->context->engine->i915;
> > > cf586021642d80 Chris Wilson 2021-06-17   41 struct 
> > > drm_i915_gem_object *src, *dst;
> > > cf586021642d80 Chris Wilson 2021-06-17   42 struct i915_request *rq;
> > > cf586021642d80 Chris Wilson 2021-06-17   43 struct i915_gem_ww_ctx ww;
> > > cf586021642d80 Chris Wilson 2021-06-17   44 u32 *vaddr;
> > > cf586021642d80 Chris Wilson 2021-06-17   45 int err = 0;
> > >
> > > One way to silence these warnings would be to initialize err = -EINVAL.
> > > Then Smatch would know that we goto err_out for an empty list.
> > >
> > > cf586021642d80 Chris Wilson 2021-06-17   46 int i;
> > > cf586021642d80 Chris Wilson 2021-06-17   47
> > > cf586021642d80 Chris Wilson 2021-06-17   48 src = 
> > > create_lmem_or_internal(i915, sz);
> > > cf586021642d80 Chris Wilson 2021-06-17   49 if (IS_ERR(src))
> > > cf586021642d80 Chris Wilson 2021-06-17   50 return 0;
> > > cf586021642d80 Chris Wilson 2021-06-17   51
> > > cf586021642d80 Chris Wilson 2021-06-17   52 dst = 
> > > i915_gem_object_create_internal(i915, sz);
> > > cf586021642d80 Chris Wilson 2021-06-17   53 if (IS_ERR(dst))
> > > cf586021642d80 Chris Wilson 2021-06-17   54 goto err_free_src;
> > > cf586021642d80 Chris Wilson 2021-06-17   55
> > > cf586021642d80 Chris Wilson 2021-06-17   56 for_i915_gem_ww(&ww, err, 
> > > true) {
> > > cf586021642d80 Chris Wilson 2021-06-17   57 err = 
> > > i915_gem_object_lock(src, &ww);
> > > cf586021642d80 Chris Wilson 2021-06-17   58 if (err)
> > > cf586021642d80 Chris Wilson 2021-06-17   59 continue;
> > > cf586021642d80 Chris Wilson 2021-06-17   60
> > > cf586021642d80 Chris Wilson 2021-06-17   61 err = 
> > > i915_gem_object_lock(dst, &ww);
> > > cf586021642d80 Chris Wilson 2021-06-17   62 if (err)
> > > cf586021642d80 Chris Wilson 2021-06-17   63 continue;
> > > cf586021642d80 Chris Wilson 2021-06-17   64
> > > cf586021642d80 Chris Wilson 2021-06-17   65 vaddr = 
> > > i915_gem_object_pin_map(src, I915_MAP_WC);
> > > cf586021642d80 Chris Wilson 2021-06-17   66 if 
> > > (IS_ERR(vaddr)) {
> > > cf586021642d80 Chris Wilson 2021-06-17   67 err = 
> > > PTR_ERR(vaddr);
> > > cf586021642d80 Chris Wilson 2021-06-17   68 continue;
> > > cf586021642d80 Chris Wilson 2021-06-17   69 }
> > > cf586021642d80 Chris Wilson 2021-06-17   70
> > > cf586021642d80 Chris Wilson 2021-06-17   71 for (i = 0; i < 
> > > sz / sizeof(u32); i++)
> > > cf586021642d80 Chris Wilson 2021-06-17   72 vaddr[i] 
> > > = i;
> > > cf586021642d80 Chris Wilson 2021-06-17   73 
> > > i915_gem_object_flush_map(src);
> > > cf586021642d80 Chris Wilson 2021-06-17   74
> > > cf586021642d80 Chris Wilson 2021-06-17   75 vaddr = 
> > 

[PATCH 4/4] drm/msm: always wait for the exclusive fence

2021-07-02 Thread Christian König
Drivers also need to sync to the exclusive fence when
a shared one is present.

Completely untested since the driver won't even compile on !ARM.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/msm/msm_gem.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index a94a43de95ef..72a07e311de3 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -817,17 +817,15 @@ int msm_gem_sync_object(struct drm_gem_object *obj,
struct dma_fence *fence;
int i, ret;
 
-   fobj = dma_resv_shared_list(obj->resv);
-   if (!fobj || (fobj->shared_count == 0)) {
-   fence = dma_resv_excl_fence(obj->resv);
-   /* don't need to wait on our own fences, since ring is fifo */
-   if (fence && (fence->context != fctx->context)) {
-   ret = dma_fence_wait(fence, true);
-   if (ret)
-   return ret;
-   }
+   fence = dma_resv_excl_fence(obj->resv);
+   /* don't need to wait on our own fences, since ring is fifo */
+   if (fence && (fence->context != fctx->context)) {
+   ret = dma_fence_wait(fence, true);
+   if (ret)
+   return ret;
}
 
+   fobj = dma_resv_shared_list(obj->resv);
if (!exclusive || !fobj)
return 0;
 
-- 
2.25.1



[PATCH 3/4] drm/nouveau: always wait for the exclusive fence

2021-07-02 Thread Christian König
Drivers also need to sync to the exclusive fence when
a shared one is present.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 6b43918035df..05d0b3eb3690 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -358,7 +358,7 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct 
nouveau_channel *chan, bool e
fobj = dma_resv_shared_list(resv);
fence = dma_resv_excl_fence(resv);
 
-   if (fence && (!exclusive || !fobj || !fobj->shared_count)) {
+   if (fence) {
struct nouveau_channel *prev = NULL;
bool must_wait = true;
 
-- 
2.25.1



[PATCH 2/4] dma-buf: fix dma_resv_test_signaled test_all handling v2

2021-07-02 Thread Christian König
As the name implies, if testing all fences is requested we
should indeed test all fences and not skip the exclusive
one just because we see shared ones.

v2: fix logic once more

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 33 -
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 4ab02b6c387a..18dd5a6ca06c 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -618,25 +618,21 @@ static inline int dma_resv_test_signaled_single(struct 
dma_fence *passed_fence)
  */
 bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 {
-   unsigned int seq, shared_count;
+   struct dma_fence *fence;
+   unsigned int seq;
int ret;
 
rcu_read_lock();
 retry:
ret = true;
-   shared_count = 0;
	seq = read_seqcount_begin(&obj->seq);
 
if (test_all) {
struct dma_resv_list *fobj = dma_resv_shared_list(obj);
-   unsigned int i;
-
-   if (fobj)
-   shared_count = fobj->shared_count;
+   unsigned int i, shared_count;
 
+   shared_count = fobj ? fobj->shared_count : 0;
for (i = 0; i < shared_count; ++i) {
-   struct dma_fence *fence;
-
fence = rcu_dereference(fobj->shared[i]);
ret = dma_resv_test_signaled_single(fence);
if (ret < 0)
@@ -644,24 +640,19 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool 
test_all)
else if (!ret)
break;
}
-
-   if (read_seqcount_retry(&obj->seq, seq))
-   goto retry;
}
 
-   if (!shared_count) {
-   struct dma_fence *fence_excl = dma_resv_excl_fence(obj);
-
-   if (fence_excl) {
-   ret = dma_resv_test_signaled_single(fence_excl);
-   if (ret < 0)
-   goto retry;
+   fence = dma_resv_excl_fence(obj);
+   if (ret && fence) {
+   ret = dma_resv_test_signaled_single(fence);
+   if (ret < 0)
+   goto retry;
 
-   if (read_seqcount_retry(>seq, seq))
-   goto retry;
-   }
}
 
+   if (read_seqcount_retry(&obj->seq, seq))
+   goto retry;
+
rcu_read_unlock();
return ret;
 }
-- 
2.25.1



[PATCH 1/4] dma-buf: add some more kerneldoc to dma_resv_add_shared_fence

2021-07-02 Thread Christian König
Explicitly document that code can't assume that shared fences
signal after the exclusive fence.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index f26c71747d43..4ab02b6c387a 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -235,7 +235,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
  * @fence: the shared fence to add
  *
  * Add a fence to a shared slot, obj->lock must be held, and
- * dma_resv_reserve_shared() has been called.
+ * dma_resv_reserve_shared() has been called. The shared fences can signal in
+ * any order and there is especially no guarantee that shared fences signal
+ * after the exclusive one. Code relying on any signaling order is broken and
+ * needs to be fixed.
  */
 void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
-- 
2.25.1



Start fixing the shared to exclusive fence dependencies.

2021-07-02 Thread Christian König
Hey Daniel,

even though you are not 100% done with the driver audit, I think we should push 
this patch set here to drm-misc-next now so that it can end up in 5.15.

Not having any dependency between the exclusive and the shared fence signaling 
order is just way more defensive than the current model.

As discussed I'm holding back any amdgpu and TTM workarounds which could be 
removed for now.

Thoughts?

Thanks,
Christian.




Re: [Intel-gfx] [drm-intel:drm-intel-gt-next 8/14] drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: uninitialized symbol 'rq'.

2021-07-02 Thread Dan Carpenter
On Fri, Jul 02, 2021 at 02:07:27PM +0300, Dan Carpenter wrote:
> On Fri, Jul 02, 2021 at 11:32:45AM +0100, Matthew Auld wrote:
> > On Fri, 2 Jul 2021 at 09:45, Dan Carpenter  wrote:
> > > cf586021642d80 Chris Wilson 2021-06-17   84
> > > cf586021642d80 Chris Wilson 2021-06-17   85 err = fn(migrate, 
> > > &ww, src, dst, &rq);
> > > cf586021642d80 Chris Wilson 2021-06-17   86 if (!err)
> > > cf586021642d80 Chris Wilson 2021-06-17   87 continue;
> > >
> > > Does fn() initialize "rq" on the success path?  Either way, Smatch would
> > > complain because it thinks the list could be empty or that we
> > > might hit an early continue for everything.
> > 
> > The fn() will always first initialize the rq to NULL. If it returns
> > success then rq will always be a valid rq. If it returns an err then
> > the rq might be NULL, or a valid rq depending on how far the copy/fn
> > got.
> > 
> > And for_i915_gem_ww() will always run at least once, since ww->loop =
> > true, so this looks like a false positive?
> 
> You don't think i915_gem_object_lock(), i915_gem_object_pin_map() or
> i915_gem_object_pin_map() can fail?

Btw, I sincerely hope that we will re-enable GCC's uninitialized
variable checks.  Will GCC be able to verify that this is initialized?

regards,
dan carpenter



Re: [Intel-gfx] [drm-intel:drm-intel-gt-next 8/14] drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: uninitialized symbol 'rq'.

2021-07-02 Thread Dan Carpenter
On Fri, Jul 02, 2021 at 11:32:45AM +0100, Matthew Auld wrote:
> On Fri, 2 Jul 2021 at 09:45, Dan Carpenter  wrote:
> >
> > tree:   git://anongit.freedesktop.org/drm-intel drm-intel-gt-next
> > head:   5cd57f676bb946a00275408f0dd0d75dbc466d25
> > commit: cf586021642d8017cde111b7dd1ba86224e9da51 [8/14] drm/i915/gt: 
> > Pipelined page migration
> > config: x86_64-randconfig-m001-20210630 (attached as .config)
> > compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot 
> > Reported-by: Dan Carpenter 
> >
> > New smatch warnings:
> > drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: uninitialized 
> > symbol 'rq'.
> > drivers/gpu/drm/i915/gt/selftest_migrate.c:113 copy() error: uninitialized 
> > symbol 'vaddr'.
> >
> > Old smatch warnings:
> > drivers/gpu/drm/i915/gem/i915_gem_object.h:182 __i915_gem_object_lock() 
> > error: we previously assumed 'ww' could be null (see line 171)
> >
> > vim +/rq +102 drivers/gpu/drm/i915/gt/selftest_migrate.c
> >
> > cf586021642d80 Chris Wilson 2021-06-17   32  static int copy(struct 
> > intel_migrate *migrate,
> > cf586021642d80 Chris Wilson 2021-06-17   33 int (*fn)(struct 
> > intel_migrate *migrate,
> > cf586021642d80 Chris Wilson 2021-06-17   34   struct 
> > i915_gem_ww_ctx *ww,
> > cf586021642d80 Chris Wilson 2021-06-17   35   struct 
> > drm_i915_gem_object *src,
> > cf586021642d80 Chris Wilson 2021-06-17   36   struct 
> > drm_i915_gem_object *dst,
> > cf586021642d80 Chris Wilson 2021-06-17   37   struct 
> > i915_request **out),
> > cf586021642d80 Chris Wilson 2021-06-17   38 u32 sz, struct 
> > rnd_state *prng)
> > cf586021642d80 Chris Wilson 2021-06-17   39  {
> > cf586021642d80 Chris Wilson 2021-06-17   40 struct drm_i915_private 
> > *i915 = migrate->context->engine->i915;
> > cf586021642d80 Chris Wilson 2021-06-17   41 struct drm_i915_gem_object 
> > *src, *dst;
> > cf586021642d80 Chris Wilson 2021-06-17   42 struct i915_request *rq;
> > cf586021642d80 Chris Wilson 2021-06-17   43 struct i915_gem_ww_ctx ww;
> > cf586021642d80 Chris Wilson 2021-06-17   44 u32 *vaddr;
> > cf586021642d80 Chris Wilson 2021-06-17   45 int err = 0;
> >
> > One way to silence these warnings would be to initialize err = -EINVAL.
> > Then Smatch would know that we goto err_out for an empty list.
> >
> > cf586021642d80 Chris Wilson 2021-06-17   46 int i;
> > cf586021642d80 Chris Wilson 2021-06-17   47
> > cf586021642d80 Chris Wilson 2021-06-17   48 src = 
> > create_lmem_or_internal(i915, sz);
> > cf586021642d80 Chris Wilson 2021-06-17   49 if (IS_ERR(src))
> > cf586021642d80 Chris Wilson 2021-06-17   50 return 0;
> > cf586021642d80 Chris Wilson 2021-06-17   51
> > cf586021642d80 Chris Wilson 2021-06-17   52 dst = 
> > i915_gem_object_create_internal(i915, sz);
> > cf586021642d80 Chris Wilson 2021-06-17   53 if (IS_ERR(dst))
> > cf586021642d80 Chris Wilson 2021-06-17   54 goto err_free_src;
> > cf586021642d80 Chris Wilson 2021-06-17   55
> > cf586021642d80 Chris Wilson 2021-06-17   56 for_i915_gem_ww(&ww, err, 
> > true) {
> > cf586021642d80 Chris Wilson 2021-06-17   57 err = 
> > i915_gem_object_lock(src, &ww);
> > cf586021642d80 Chris Wilson 2021-06-17   58 if (err)
> > cf586021642d80 Chris Wilson 2021-06-17   59 continue;
> > cf586021642d80 Chris Wilson 2021-06-17   60
> > cf586021642d80 Chris Wilson 2021-06-17   61 err = 
> > i915_gem_object_lock(dst, &ww);
> > cf586021642d80 Chris Wilson 2021-06-17   62 if (err)
> > cf586021642d80 Chris Wilson 2021-06-17   63 continue;
> > cf586021642d80 Chris Wilson 2021-06-17   64
> > cf586021642d80 Chris Wilson 2021-06-17   65 vaddr = 
> > i915_gem_object_pin_map(src, I915_MAP_WC);
> > cf586021642d80 Chris Wilson 2021-06-17   66 if (IS_ERR(vaddr)) {
> > cf586021642d80 Chris Wilson 2021-06-17   67 err = 
> > PTR_ERR(vaddr);
> > cf586021642d80 Chris Wilson 2021-06-17   68 continue;
> > cf586021642d80 Chris Wilson 2021-06-17   69 }
> > cf586021642d80 Chris Wilson 2021-06-17   70
> > cf586021642d80 Chris Wilson 2021-06-17   71 for (i = 0; i < sz 
> > / sizeof(u32); i++)
> > cf586021642d80 Chris Wilson 2021-06-17   72 vaddr[i] = 
> > i;
> > cf586021642d80 Chris Wilson 2021-06-17   73 
> > i915_gem_object_flush_map(src);
> > cf586021642d80 Chris Wilson 2021-06-17   74
> > cf586021642d80 Chris Wilson 2021-06-17   75 vaddr = 
> > i915_gem_object_pin_map(dst, I915_MAP_WC);
> > cf586021642d80 Chris Wilson 2021-06-17   76 if (IS_ERR(vaddr)) {
> > cf586021642d80 Chris Wilson 2021-06-17   77 err = 
> > PTR_ERR(vaddr);
> > cf586021642d80 Chris 

Questions over DSI within DRM.

2021-07-02 Thread Dave Stevenson
Hi All

I'm trying to get DSI devices working reliably on the Raspberry Pi,
but I'm hitting a number of places where it isn't clear as to the
expected behaviour within DRM.

Power on state. Many devices want the DSI clock and/or data lanes in
LP-11 state when they are powered up. With the normal calling sequence
of:
- panel/bridge pre_enable calls from connector towards the encoder.
- encoder enable which also enables video.
- panel/bridge enable calls from encoder to connector.
there is no point at which the DSI tx is initialised but not
transmitting video. What DSI states are expected to be adopted at each
point?

On a similar theme, some devices want the clock lane in HS mode early
so they can use it in place of an external oscillator, but the data
lanes still in LP-11. There appears to be no way for the
display/bridge to signal this requirement, or for it to be achieved.

host_transfer calls can supposedly be made at any time, however unless
MIPI_DSI_MSG_USE_LPM is set in the message then we're meant to send it
in high speed mode. If this is before a mode has been set, what
defines the link frequency parameters at this point? Adopting a random
default sounds like a good way to get undefined behaviour.

DSI burst mode needs to set the DSI link frequency independently of
the display mode. How is that meant to be configured? I would have
expected it to come from DT due to link frequency often being chosen
based on EMC restrictions, but I don't see such a thing in any
binding.

As a follow on, bridge devices can support burst mode (eg TI's
SN65DSI83 that's just been merged), so it needs to know the desired
panel timings for the output side of the bridge, but the DSI link
timings to set up the bridge's PLL. What's the correct way for
signalling that? drm_crtc_state->adjusted_mode vs
drm_crtc_state->mode? Except mode is userspace's request, not what has
been validated/updated by the panel/bridge.

vc4 has constraints that the DSI host interface is fed off an integer
divider from a typically 3GHz clock, so the host interface needs to
signal that burst mode is in use even if the panel/bridge doesn't need
to run in burst mode. (This does mean that displays that require a
very precise link frequency can not be supported).
It currently updates the adjusted_mode via drm_encoder_helper_funcs
mode_fixup, but is that the correct thing to do, or is there a better
solution?
I'd have expected the DSI tx to be responsible for configuring burst
mode parameters anyway, so the mechanism required would seem to be
just the normal approach for adopting burst mode if that is defined.

Some DSI host interfaces are implemented as bridges, others are
encoders. Pro's and con's of each? I suspect I'm just missing the
history here.

When it comes to the MIPI_DSI_MODE_* flags, which ones are mutually
exclusive, or are assumed based on others? Does a burst mode DSI sink
set both MIPI_DSI_MODE_VIDEO and MIPI_DSI_MODE_VIDEO_BURST, or just
the latter?
Presumably !MIPI_DSI_MODE_VIDEO signals the use of command mode for
conveying video. So looking at panel-ilitek-ili9881c, where it sets
just MIPI_DSI_MODE_VIDEO_SYNC_PULSE, does that mean command-mode video
with sync pulses? That sounds unlikely.

I have looked for any information that covers this, but failed to find
such, hence calling on all your expertise.

Many thanks for your time,
  Dave


Re: [PATCH v2] drm/dbi: Print errors for mipi_dbi_command()

2021-07-02 Thread Noralf Trønnes



Den 02.07.2021 12.04, skrev Linus Walleij:
> The macro mipi_dbi_command() does not report errors unless you wrap it
> in another macro to do the error reporting.
> 
> Report a rate-limited error so we know what is going on.
> 
> Drop the only user in DRM using mipi_dbi_command() and actually checking
> the error explicitly, let it use mipi_dbi_command_buf() directly
> instead.
> 
> After this any code wishing to send command arrays can rely on
> mipi_dbi_command() providing an appropriate error message if something
> goes wrong.
> 
> Suggested-by: Noralf Trønnes 
> Suggested-by: Douglas Anderson 
> Signed-off-by: Linus Walleij 
> ---
> ChangeLog v1->v2:
> - Fish out the struct device * from the DBI SPI client and use
>   that to print the errors associated with the SPI device.
> ---
>  drivers/gpu/drm/drm_mipi_dbi.c | 2 +-
>  include/drm/drm_mipi_dbi.h | 6 +-
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_mipi_dbi.c b/drivers/gpu/drm/drm_mipi_dbi.c
> index 3854fb9798e9..c7c1b75df190 100644
> --- a/drivers/gpu/drm/drm_mipi_dbi.c
> +++ b/drivers/gpu/drm/drm_mipi_dbi.c
> @@ -645,7 +645,7 @@ static int mipi_dbi_poweron_reset_conditional(struct 
> mipi_dbi_dev *dbidev, bool
>   return 1;
>  
>   mipi_dbi_hw_reset(dbi);
> - ret = mipi_dbi_command(dbi, MIPI_DCS_SOFT_RESET);
> + ret = mipi_dbi_command_buf(dbi, MIPI_DCS_SOFT_RESET, NULL, 0);
>   if (ret) {
>   DRM_DEV_ERROR(dev, "Failed to send reset command (%d)\n", ret);
>   if (dbidev->regulator)
> diff --git a/include/drm/drm_mipi_dbi.h b/include/drm/drm_mipi_dbi.h
> index f543d6e3e822..f00cb9690cf2 100644
> --- a/include/drm/drm_mipi_dbi.h
> +++ b/include/drm/drm_mipi_dbi.h
> @@ -183,7 +183,11 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer 
> *fb,
>  #define mipi_dbi_command(dbi, cmd, seq...) \
>  ({ \
>   const u8 d[] = { seq }; \
> - mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
> + struct device *dev = &dbi->spi->dev; \
> + int ret; \
> + ret = mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
> + if (ret) \
> + dev_err_ratelimited(dev, "error %d when sending command\n", 
> ret); \

Nit: Printing the failing command would have been useful, like you did
in the driver macro.

>  })

I would have preferred if mipi_dbi_command could have returned the error
code. This indicates that it should be possible:
https://stackoverflow.com/questions/3532621/using-and-returning-output-in-c-macro

But I can live with this, but if drivers want to start checking the
error code we might have to rethink this.

But this works as things are now:

Reviewed-by: Noralf Trønnes 


Re: [PATCH v2 4/7] drm/panfrost: Add the ability to create submit queues

2021-07-02 Thread Daniel Stone
Hi,

On Fri, 2 Jul 2021 at 11:55, Steven Price  wrote:
> On 02/07/2021 11:43, Boris Brezillon wrote:
> > On Fri, 2 Jul 2021 10:56:29 +0100
> > Steven Price  wrote:
> >> My Vulkan knowledge is limited so I'm not sure whether this is the right
> >> approach or not. In particular is it correct that an application can
> >> create a high priority queue which could affect other (normal priority)
> >> applications?
> >
> > That's what msm does (with no extra CAPS check AFAICT), and the
> > freedreno driver can already create high priority queues if
> > PIPE_CONTEXT_HIGH_PRIORITY is passed. Not saying that's okay to allow
> > userspace to tweak the priority, but if that's a problem, other drivers
> > are in trouble too ;-).
>
> Oh well I guess if others are doing the same ;) I have to admit kbase
> has always struggled with how to identify a "privileged" process - it's
> something that makes a bit of sense on Android but for other userspaces
> there really doesn't seem to be a good way of identifying what should or
> should not be allowed to create high priority queues.

Yeah, it's a platform-specific question. Some might want to say
compositor-only, some might want to let foreground apps ramp, etc.

Thankfully, Vulkan is pretty clear that it's just a hint and the
results might be anything or nothing.

> >> Also does it really make sense to allow user space to create an
> >> unlimited number of queues? It feels like an ideal way for a malicious
> >> application to waste kernel memory.
> >
> > Same here, I see no limit on the number of queues the msm driver can
> > create. I can definitely pick an arbitrary limit of 2^16 (or 2^8 if
> > 2^16 is too high) if you prefer, but I feel like there's plenty of ways
> > to force kernel allocations already, like allocating a gazillion of 4k
> > GEM buffers (cgroup can probably limit the total amount of memory
> > allocated, but you'd still have all gem-buf meta data in kernel memory).
>
> I guess the real problem is picking a sensible limit ;) My main concern
> here is that there doesn't appear to be any memory accounted against the
> process. For GEM buffers at least there is some cost to the application
> - so an unbounded allocation isn't possible, even if the bounds are
> likely to be very high.
>
> With kbase we found that syzcaller was good at finding ways of using up
> all the memory on the platform - and if it wasn't accounted to the right
> process that meant the OOM-killer knocked out the wrong process and
> produced a bug report to investigate. Perhaps I'm just scarred by that
> history ;)

Yep, cgroup accounting and restriction is still very much unsolved.
GEM buffers let you make an outsize impact on the whole system at
little to no cost to yourself. You can also create a million syncobjs
if you want. Oh well.

Cheers,
Daniel


Re: [PATCH v2 4/7] drm/panfrost: Add the ability to create submit queues

2021-07-02 Thread Boris Brezillon
On Fri, 2 Jul 2021 11:58:34 +0100
Steven Price  wrote:

> On 02/07/2021 11:52, Boris Brezillon wrote:
> > On Fri, 2 Jul 2021 11:08:58 +0100
> > Steven Price  wrote:
> >   
> >> On 01/07/2021 10:12, Boris Brezillon wrote:  
> >>> Needed to keep VkQueues isolated from each other.
> >>
> >> One more comment I noticed when I tried this out:
> >>
> >> [...]  
> >>> +struct panfrost_submitqueue *
> >>> +panfrost_submitqueue_create(struct panfrost_file_priv *ctx,
> >>> + enum panfrost_submitqueue_priority priority,
> >>> + u32 flags)
> >>> +{
> >>> + struct panfrost_submitqueue *queue;
> >>> + enum drm_sched_priority sched_prio;
> >>> + int ret, i;
> >>> +
> >>> + if (flags || priority >= PANFROST_SUBMITQUEUE_PRIORITY_COUNT)
> >>> + return ERR_PTR(-EINVAL);
> >>> +
> >>> + queue = kzalloc(sizeof(*queue), GFP_KERNEL);
> >>> + if (!queue)
> >>> + return ERR_PTR(-ENOMEM);
> >>> +
> >>> + queue->pfdev = ctx->pfdev;
> >>> + sched_prio = to_sched_prio(priority);
> >>> + for (i = 0; i < NUM_JOB_SLOTS; i++) {
> >>> + struct drm_gpu_scheduler *sched;
> >>> +
> >>> + sched = panfrost_job_get_sched(ctx->pfdev, i);
> >>> + ret = drm_sched_entity_init(&queue->sched_entity[i],
> >>> + sched_prio, &sched, 1, NULL);
> >>> + if (ret)
> >>> + break;
> >>> + }
> >>> +
> >>> + if (ret) {
> >>> + for (i--; i >= 0; i--)
> >>> + drm_sched_entity_destroy(&queue->sched_entity[i]);
> >>> +
> >>> + return ERR_PTR(ret);
> >>> + }
> >>> +
> >>> + kref_init(&queue->refcount);
> >>> + idr_lock(&ctx->queues);
> >>> + ret = idr_alloc(&ctx->queues, queue, 0, INT_MAX, GFP_KERNEL);
> >>
> >> This makes lockdep complain. idr_lock() is a spinlock and GFP_KERNEL can
> >> sleep. So either we need to bring our own mutex here or not use GFP_KERNEL.
> >>  
> > 
> > Ouch! I wonder why I don't see that (I have lockdep enabled, and the
> > igt tests should have exercised this path).  
> 
> Actually I'm not sure it's technically lockdep - have you got
> CONFIG_DEBUG_ATOMIC_SLEEP set?

Nope, I was missing that one :-/.


Re: [PATCH v2 4/7] drm/panfrost: Add the ability to create submit queues

2021-07-02 Thread Steven Price
On 02/07/2021 11:52, Boris Brezillon wrote:
> On Fri, 2 Jul 2021 11:08:58 +0100
> Steven Price  wrote:
> 
>> On 01/07/2021 10:12, Boris Brezillon wrote:
>>> Needed to keep VkQueues isolated from each other.  
>>
>> One more comment I noticed when I tried this out:
>>
>> [...]
>>> +struct panfrost_submitqueue *
>>> +panfrost_submitqueue_create(struct panfrost_file_priv *ctx,
>>> +   enum panfrost_submitqueue_priority priority,
>>> +   u32 flags)
>>> +{
>>> +   struct panfrost_submitqueue *queue;
>>> +   enum drm_sched_priority sched_prio;
>>> +   int ret, i;
>>> +
>>> +   if (flags || priority >= PANFROST_SUBMITQUEUE_PRIORITY_COUNT)
>>> +   return ERR_PTR(-EINVAL);
>>> +
>>> +   queue = kzalloc(sizeof(*queue), GFP_KERNEL);
>>> +   if (!queue)
>>> +   return ERR_PTR(-ENOMEM);
>>> +
>>> +   queue->pfdev = ctx->pfdev;
>>> +   sched_prio = to_sched_prio(priority);
>>> +   for (i = 0; i < NUM_JOB_SLOTS; i++) {
>>> +   struct drm_gpu_scheduler *sched;
>>> +
>>> +   sched = panfrost_job_get_sched(ctx->pfdev, i);
>>> +   ret = drm_sched_entity_init(&queue->sched_entity[i],
>>> +   sched_prio, &sched, 1, NULL);
>>> +   if (ret)
>>> +   break;
>>> +   }
>>> +
>>> +   if (ret) {
>>> +   for (i--; i >= 0; i--)
>>> +   drm_sched_entity_destroy(&queue->sched_entity[i]);
>>> +
>>> +   return ERR_PTR(ret);
>>> +   }
>>> +
>>> +   kref_init(&queue->refcount);
>>> +   idr_lock(&ctx->queues);
>>> +   ret = idr_alloc(&ctx->queues, queue, 0, INT_MAX, GFP_KERNEL);  
>>
>> This makes lockdep complain. idr_lock() is a spinlock and GFP_KERNEL can
>> sleep. So either we need to bring our own mutex here or not use GFP_KERNEL.
>>
> 
> Ouch! I wonder why I don't see that (I have lockdep enabled, and the
> igt tests should have exercised this path).

Actually I'm not sure it's technically lockdep - have you got
CONFIG_DEBUG_ATOMIC_SLEEP set?

Steve
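For reference, the usual way to keep the sleeping allocation out of the spinlocked section while preserving GFP_KERNEL semantics is the IDR preload pattern. A kernel-side sketch only (untested, reusing the names from the patch):

```c
/* Preallocate with GFP_KERNEL outside the lock, then perform the
 * actual idr_alloc() atomically with GFP_NOWAIT under the spinlock. */
idr_preload(GFP_KERNEL);
idr_lock(&ctx->queues);
ret = idr_alloc(&ctx->queues, queue, 0, INT_MAX, GFP_NOWAIT);
idr_unlock(&ctx->queues);
idr_preload_end();
```

This avoids introducing a separate mutex just for the allocation.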


Re: [PATCH v2 4/7] drm/panfrost: Add the ability to create submit queues

2021-07-02 Thread Steven Price
On 02/07/2021 11:43, Boris Brezillon wrote:
> On Fri, 2 Jul 2021 10:56:29 +0100
> Steven Price  wrote:
> 
>> On 01/07/2021 10:12, Boris Brezillon wrote:
>>> Needed to keep VkQueues isolated from each other.
>>>
>>> Signed-off-by: Boris Brezillon   
>>
>> My Vulkan knowledge is limited so I'm not sure whether this is the right
>> approach or not. In particular is it correct that an application can
>> create a high priority queue which could affect other (normal priority)
>> applications?
> 
> That's what msm does (with no extra CAPS check AFAICT), and the
> freedreno driver can already create high priority queues if
> PIPE_CONTEXT_HIGH_PRIORITY is passed. Not saying that's okay to allow
> userspace to tweak the priority, but if that's a problem, other drivers
> are in trouble too ;-).

Oh well I guess if others are doing the same ;) I have to admit kbase
has always struggled with how to identify a "privileged" process - it's
something that makes a bit of sense on Android but for other userspaces
there really doesn't seem to be a good way of identifying what should or
should not be allowed to create high priority queues.

>>
>> Also does it really make sense to allow user space to create an
>> unlimited number of queues? It feels like an ideal way for a malicious
>> application to waste kernel memory.
> 
> Same here, I see no limit on the number of queues the msm driver can
> create. I can definitely pick an arbitrary limit of 2^16 (or 2^8 if
> 2^16 is too high) if you prefer, but I feel like there's plenty of ways
> to force kernel allocations already, like allocating a gazillion of 4k
> GEM buffers (cgroup can probably limit the total amount of memory
> allocated, but you'd still have all gem-buf meta data in kernel memory).

I guess the real problem is picking a sensible limit ;) My main concern
here is that there doesn't appear to be any memory accounted against the
process. For GEM buffers at least there is some cost to the application
- so an unbounded allocation isn't possible, even if the bounds are
likely to be very high.

With kbase we found that syzcaller was good at finding ways of using up
all the memory on the platform - and if it wasn't accounted to the right
process that meant the OOM-killer knocked out the wrong process and
produced a bug report to investigate. Perhaps I'm just scarred by that
history ;)

Steve

>>
>> In terms of implementation it looks correct, but one comment below
>>
>>> ---
>>>  drivers/gpu/drm/panfrost/Makefile |   3 +-
>>>  drivers/gpu/drm/panfrost/panfrost_device.h|   2 +-
>>>  drivers/gpu/drm/panfrost/panfrost_drv.c   |  69 --
>>>  drivers/gpu/drm/panfrost/panfrost_job.c   |  47 ++-
>>>  drivers/gpu/drm/panfrost/panfrost_job.h   |   9 +-
>>>  .../gpu/drm/panfrost/panfrost_submitqueue.c   | 130 ++
>>>  .../gpu/drm/panfrost/panfrost_submitqueue.h   |  27 
>>>  include/uapi/drm/panfrost_drm.h   |  17 +++
>>>  8 files changed, 258 insertions(+), 46 deletions(-)
>>>  create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.c
>>>  create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.h
>>>   
>> [...]
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_submitqueue.c 
>>> b/drivers/gpu/drm/panfrost/panfrost_submitqueue.c
>>> new file mode 100644
>>> index ..98050f7690df
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_submitqueue.c
>>> @@ -0,0 +1,130 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/* Copyright 2021 Collabora ltd. */
>>> +
>>> +#include 
>>> +
>>> +#include "panfrost_device.h"
>>> +#include "panfrost_job.h"
>>> +#include "panfrost_submitqueue.h"
>>> +
>>> +static enum drm_sched_priority
>>> +to_sched_prio(enum panfrost_submitqueue_priority priority)
>>> +{
>>> +   switch (priority) {
>>> +   case PANFROST_SUBMITQUEUE_PRIORITY_LOW:
>>> +   return DRM_SCHED_PRIORITY_MIN;
>>> +   case PANFROST_SUBMITQUEUE_PRIORITY_MEDIUM:
>>> +   return DRM_SCHED_PRIORITY_NORMAL;
>>> +   case PANFROST_SUBMITQUEUE_PRIORITY_HIGH:
>>> +   return DRM_SCHED_PRIORITY_HIGH;
>>> +   default:
>>> +   break;
>>> +   }
>>> +
>>> +   return DRM_SCHED_PRIORITY_UNSET;
>>> +}
>>> +
>>> +static void
>>> +panfrost_submitqueue_cleanup(struct kref *ref)
>>> +{
>>> +   struct panfrost_submitqueue *queue;
>>> +   unsigned int i;
>>> +
>>> +   queue = container_of(ref, struct panfrost_submitqueue, refcount);
>>> +
>>> +   for (i = 0; i < NUM_JOB_SLOTS; i++)
>>> +   drm_sched_entity_destroy(&queue->sched_entity[i]);
>>> +
>>> +   /* Kill in-flight jobs */
>>> +   panfrost_job_kill_queue(queue);
>>> +
>>> +   kfree(queue);
>>> +}
>>> +
>>> +void panfrost_submitqueue_put(struct panfrost_submitqueue *queue)
>>> +{
>>> +   if (!IS_ERR_OR_NULL(queue))
>>> +   kref_put(&queue->refcount, panfrost_submitqueue_cleanup);
>>> +}
>>> +
>>> +struct panfrost_submitqueue *
>>> +panfrost_submitqueue_create(struct panfrost_file_priv *ctx,
>>> +   
