Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
Hi Will and Robin, On Fri, Jul 02, 2021 at 04:13:50PM +0100, Robin Murphy wrote: > On 2021-07-02 14:58, Will Deacon wrote: > > Hi Nathan, > > > > On Thu, Jul 01, 2021 at 12:52:20AM -0700, Nathan Chancellor wrote: > > > On 7/1/2021 12:40 AM, Will Deacon wrote: > > > > On Wed, Jun 30, 2021 at 08:56:51AM -0700, Nathan Chancellor wrote: > > > > > On Wed, Jun 30, 2021 at 12:43:48PM +0100, Will Deacon wrote: > > > > > > On Wed, Jun 30, 2021 at 05:17:27PM +0800, Claire Chang wrote: > > > > > > > `BUG: unable to handle page fault for address: 003a8290` > > > > > > > and > > > > > > > the fact it crashed at `_raw_spin_lock_irqsave` look like the > > > > > > > memory > > > > > > > (maybe dev->dma_io_tlb_mem) was corrupted? > > > > > > > The dev->dma_io_tlb_mem should be set here > > > > > > > (https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/pci/probe.c#n2528) > > > > > > > through device_initialize. > > > > > > > > > > > > I'm less sure about this. 'dma_io_tlb_mem' should be pointing at > > > > > > 'io_tlb_default_mem', which is a page-aligned allocation from > > > > > > memblock. 
> > > > > > The spinlock is at offset 0x24 in that structure, and looking at the > > > > > > register dump from the crash: > > > > > > > > > > > > Jun 29 18:28:42 hp-4300G kernel: RSP: 0018:adb4013db9e8 EFLAGS: > > > > > > 00010006 > > > > > > Jun 29 18:28:42 hp-4300G kernel: RAX: 003a8290 RBX: > > > > > > RCX: 8900572ad580 > > > > > > Jun 29 18:28:42 hp-4300G kernel: RDX: 89005653f024 RSI: > > > > > > 000c RDI: 1d17 > > > > > > Jun 29 18:28:42 hp-4300G kernel: RBP: 0a20d000 R08: > > > > > > 000c R09: > > > > > > Jun 29 18:28:42 hp-4300G kernel: R10: 0a20d000 R11: > > > > > > 89005653f000 R12: 0212 > > > > > > Jun 29 18:28:42 hp-4300G kernel: R13: 1000 R14: > > > > > > 0002 R15: 0020 > > > > > > Jun 29 18:28:42 hp-4300G kernel: FS: 7f1f8898ea40() > > > > > > GS:89005728() knlGS: > > > > > > Jun 29 18:28:42 hp-4300G kernel: CS: 0010 DS: ES: CR0: > > > > > > 80050033 > > > > > > Jun 29 18:28:42 hp-4300G kernel: CR2: 003a8290 CR3: > > > > > > 0001020d CR4: 00350ee0 > > > > > > Jun 29 18:28:42 hp-4300G kernel: Call Trace: > > > > > > Jun 29 18:28:42 hp-4300G kernel: _raw_spin_lock_irqsave+0x39/0x50 > > > > > > Jun 29 18:28:42 hp-4300G kernel: swiotlb_tbl_map_single+0x12b/0x4c0 > > > > > > > > > > > > Then that correlates with R11 holding the 'dma_io_tlb_mem' pointer > > > > > > and > > > > > > RDX pointing at the spinlock. Yet RAX is holding junk :/ > > > > > > > > > > > > I agree that enabling KASAN would be a good idea, but I also think > > > > > > we > > > > > > probably need to get some more information out of > > > > > > swiotlb_tbl_map_single() > > > > > > to see what exactly is going wrong in there. > > > > > I can certainly enable KASAN and if there is any debug print I can add > > > > > or dump anything, let me know! > > > > I bit the bullet and took v5.13 with swiotlb/for-linus-5.14 merged in, > > > > built > > > > x86 defconfig and ran it on my laptop. However, it seems to work fine! > > > > > > > > Please can you share your .config? 
> > > Sure thing, it is attached. It is just Arch Linux's config run through > > > olddefconfig. The original is below in case you need to diff it. > > > > > > https://raw.githubusercontent.com/archlinux/svntogit-packages/9045405dc835527164f3034b3ceb9a67c7a53cd4/trunk/config > > > > > > If there is anything more that I can provide, please let me know. > > > > I eventually got this booting (for some reason it was causing LD to SEGV > > trying to link it for a while...) and sadly it works fine on my laptop. Hmm. Seems like it might be something specific to the amdgpu module? > > Did you manage to try again with KASAN? Yes, it took a few times to reproduce the issue but I did manage to get a dmesg, please find it attached. I built from commit 7d31f1c65cc9 ("swiotlb: fix implicit debugfs declarations") in Konrad's tree. > > It might also be worth taking the IOMMU out of the equation, since that > > interfaces differently with SWIOTLB and I couldn't figure out the code path > > from the log you provided. What happens if you boot with "amd_iommu=off > > swiotlb=force"? > > Oh, now there's a thing... the chat from the IOMMU API in the boot log > implies that the IOMMU *should* be in the picture - we see that default > domains are IOMMU_DOMAIN_DMA default and the GPU :0c:00.0 was added to a > group. That means dev->dma_ops should be set and DMA API calls should be > going through iommu-dma, yet the callstack in the crash says we've gone > straight from dma_map_page_attrs() to swiotlb_map(), implying the inline > dma_direct_map_page() path. > > If dev->dma_ops didn't
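As an aside for readers following the diagnosis: the dispatch being debated above can be reduced to a toy model (plain C, hypothetical names mirroring the kernel's — this is not kernel code). When dev->dma_ops is unset, dma_map_page_attrs() takes the inline dma-direct path, which is the one that may bounce through SWIOTLB; with an IOMMU attached, dma_ops should route the call through iommu-dma instead, which is why the backtrace is suspicious:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures under discussion. */
struct dma_map_ops {
	int (*map_page)(void);
};

struct device {
	const struct dma_map_ops *dma_ops;
};

static int iommu_dma_map_page(void) { return 1; }  /* iommu-dma path */
static int dma_direct_map_page(void) { return 2; } /* may bounce via SWIOTLB */

/* Reduced model of the dispatch: with no ops set, the inline direct
 * path runs -- which is what the crash backtrace
 * (dma_map_page_attrs -> swiotlb_map) implies happened, even though
 * the boot log says the IOMMU claimed the device. */
static int dma_map_page_attrs_model(struct device *dev)
{
	if (dev->dma_ops)
		return dev->dma_ops->map_page();
	return dma_direct_map_page();
}
```

In other words, if the crash really went through the direct path, dev->dma_ops was NULL (or bypassed) at map time despite the IOMMU group assignment, which is the inconsistency Robin is pointing at.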
Re: [PATCH 2/2] drm/vc4: hdmi: Convert to gpiod
On Fri, Jul 02, 2021 at 03:16:46PM +0200, Maxime Ripard wrote: > Hi Nathan, > > On Thu, Jul 01, 2021 at 08:29:34PM -0700, Nathan Chancellor wrote: > > On Mon, May 24, 2021 at 03:18:52PM +0200, Maxime Ripard wrote: > > > The new gpiod interface takes care of parsing the GPIO flags and to > > > return the logical value when accessing an active-low GPIO, so switching > > > to it simplifies a lot the driver. > > > > > > Signed-off-by: Maxime Ripard > > > --- > > > drivers/gpu/drm/vc4/vc4_hdmi.c | 24 +++- > > > drivers/gpu/drm/vc4/vc4_hdmi.h | 3 +-- > > > 2 files changed, 8 insertions(+), 19 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c > > > b/drivers/gpu/drm/vc4/vc4_hdmi.c > > > index ccc6c8079dc6..34622c59f6a7 100644 > > > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c > > > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c > > > @@ -159,10 +159,9 @@ vc4_hdmi_connector_detect(struct drm_connector > > > *connector, bool force) > > > struct vc4_hdmi *vc4_hdmi = connector_to_vc4_hdmi(connector); > > > bool connected = false; > > > > > > - if (vc4_hdmi->hpd_gpio) { > > > - if (gpio_get_value_cansleep(vc4_hdmi->hpd_gpio) ^ > > > - vc4_hdmi->hpd_active_low) > > > - connected = true; > > > + if (vc4_hdmi->hpd_gpio && > > > + gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio)) { > > > + connected = true; > > > } else if (drm_probe_ddc(vc4_hdmi->ddc)) { > > > connected = true; > > > } else if (HDMI_READ(HDMI_HOTPLUG) & VC4_HDMI_HOTPLUG_CONNECTED) { > > > @@ -1993,7 +1992,6 @@ static int vc4_hdmi_bind(struct device *dev, struct > > > device *master, void *data) > > > struct vc4_hdmi *vc4_hdmi; > > > struct drm_encoder *encoder; > > > struct device_node *ddc_node; > > > - u32 value; > > > int ret; > > > > > > vc4_hdmi = devm_kzalloc(dev, sizeof(*vc4_hdmi), GFP_KERNEL); > > > @@ -2031,18 +2029,10 @@ static int vc4_hdmi_bind(struct device *dev, > > > struct device *master, void *data) > > > /* Only use the GPIO HPD pin if present in the DT, otherwise > > >* we'll use the HDMI core's 
register. > > >*/ > > > - if (of_find_property(dev->of_node, "hpd-gpios", &value)) { > > > - enum of_gpio_flags hpd_gpio_flags; > > > - > > > - vc4_hdmi->hpd_gpio = of_get_named_gpio_flags(dev->of_node, > > > - "hpd-gpios", 0, > > > - &hpd_gpio_flags); > > > - if (vc4_hdmi->hpd_gpio < 0) { > > > - ret = vc4_hdmi->hpd_gpio; > > > - goto err_put_ddc; > > > - } > > > - > > > - vc4_hdmi->hpd_active_low = hpd_gpio_flags & OF_GPIO_ACTIVE_LOW; > > > + vc4_hdmi->hpd_gpio = devm_gpiod_get_optional(dev, "hpd", GPIOD_IN); > > > + if (IS_ERR(vc4_hdmi->hpd_gpio)) { > > > + ret = PTR_ERR(vc4_hdmi->hpd_gpio); > > > + goto err_put_ddc; > > > } > > > > > > vc4_hdmi->disable_wifi_frequencies = > > > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h > > > b/drivers/gpu/drm/vc4/vc4_hdmi.h > > > index 060bcaefbeb5..2688a55461d6 100644 > > > --- a/drivers/gpu/drm/vc4/vc4_hdmi.h > > > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.h > > > @@ -146,8 +146,7 @@ struct vc4_hdmi { > > > /* VC5 Only */ > > > void __iomem *rm_regs; > > > > > > - int hpd_gpio; > > > - bool hpd_active_low; > > > + struct gpio_desc *hpd_gpio; > > > > > > /* > > >* On some systems (like the RPi4), some modes are in the same > > > -- > > > 2.31.1 > > > > This patch as commit 6800234ceee0 ("drm/vc4: hdmi: Convert to gpiod") > > causes my Raspberry Pi 3 to lock up shortly after boot in combination > > with commit 411efa18e4b0 ("drm/vc4: hdmi: Move the HSM clock enable to > > runtime_pm"). The serial console and ssh are completely unresponsive and > > I do not see any messages in dmesg with "debug ignore_loglevel". The > > device is running with a 32-bit kernel (multi_v7_defconfig) with 32-bit > > userspace. If there is any further information that I can provide, > > please let me know. > > Thanks for reporting this. 
The same bug was reported on Wednesday > on the RPi repo here: > https://github.com/raspberrypi/linux/pull/4418 > > More specifically, this commit should fix it: > https://github.com/raspberrypi/linux/pull/4418/commits/6d404373c20a794da3d6a7b4f1373903183bb5d0 > > Even though it's based on the 5.10 kernel, it should apply without any > warning on a mainline tree. Let me know if it fixes your issue too Thank you for the links and the quick reply. Unfortunately, I applied this patch on top of commit d6b63b5b7d7f ("Merge tag 'sound-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound") in mainline, which still reproduces this issue, and it did not fix it. In fact, it did not even get to the raspberrypi login prompt before it locked up, yet again without any real output in the serial console except for maybe this message? [7.582480] vc4-drm
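For context on why the conversion above preserves active-low polarity: the legacy code XORed the raw line value with its own hpd_active_low bookkeeping, while gpiod_get_value_cansleep() returns the logical value with the DT-parsed active-low flag already applied. A standalone sketch of that equivalence (toy functions, not the kernel API):

```c
#include <assert.h>
#include <stdbool.h>

/* Legacy style: the driver reads the raw line and applies the
 * active-low flag itself, as the removed vc4 code did. */
static bool legacy_hpd_connected(int raw_value, bool active_low)
{
	return (bool)(raw_value ^ (int)active_low);
}

/* gpiod style: the GPIO core applies the active-low flag parsed
 * from the DT, so the driver just consumes the logical value. */
static int gpiod_get_value_model(int raw_value, bool active_low)
{
	return active_low ? !raw_value : raw_value;
}
```

The two agree for every combination of raw level and polarity, which is why the driver-side XOR and the hpd_active_low field can both be dropped.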
Re: Start fixing the shared to exclusive fence dependencies.
On Fri, Jul 02, 2021 at 01:16:38PM +0200, Christian König wrote: > Hey Daniel, > > even when you are not 100% done with the driver audit I think we should > push that patch set here to drm-misc-next now so that it can end up in > 5.15. So I think I got them all, just need to type up some good docs all over the place next week and send it out. -Daniel > > Not having any dependency between the exclusive and the shared fence > signaling order is just way more defensive than the current model. > > As discussed I'm holding back any amdgpu and TTM workarounds which could > be removed for now. > > Thoughts? > > Thanks, > Christian. > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
[PATCH] drm/amdgpu: Return error if no RAS
In amdgpu_ras_query_error_count(), return an error if the device doesn't support RAS. This relieves the function of having to unconditionally set the values of its integer out-parameters regardless of whether RAS is supported, eliminating that side effect. Also, if no pointers are set, don't count, since we've no way of reporting the counts. Also, give this function a kernel-doc. Cc: Alexander Deucher Cc: John Clements Cc: Hawking Zhang Reported-by: Tom Rix Fixes: a46751fbcde505 ("drm/amdgpu: Fix RAS function interface") Signed-off-by: Luben Tuikov --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 49 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 6 +-- 2 files changed, 38 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index c6ae63893dbdb2..ed698b2be79023 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -813,7 +813,7 @@ static int amdgpu_ras_enable_all_features(struct amdgpu_device *adev, /* query/inject/cure begin */ int amdgpu_ras_query_error_status(struct amdgpu_device *adev, - struct ras_query_if *info) + struct ras_query_if *info) { struct ras_manager *obj = amdgpu_ras_find_obj(adev, &info->head); struct ras_err_data err_data = {0, 0, 0, NULL}; @@ -1047,17 +1047,32 @@ int amdgpu_ras_error_inject(struct amdgpu_device *adev, return ret; } -/* get the total error counts on all IPs */ -void amdgpu_ras_query_error_count(struct amdgpu_device *adev, - unsigned long *ce_count, - unsigned long *ue_count) +/** + * amdgpu_ras_query_error_count -- Get error counts of all IPs + * @adev: pointer to AMD GPU device + * @ce_count: pointer to an integer to be set to the count of correctable errors. + * @ue_count: pointer to an integer to be set to the count of uncorrectable + * errors. 
+ * + * If @ce_count or @ue_count is set, count and return the corresponding + * error counts through those integer pointers. Return 0 if the device + * supports RAS. Return -EINVAL if the device doesn't support RAS. + */ +int amdgpu_ras_query_error_count(struct amdgpu_device *adev, +unsigned long *ce_count, +unsigned long *ue_count) { struct amdgpu_ras *con = amdgpu_ras_get_context(adev); struct ras_manager *obj; unsigned long ce, ue; if (!adev->ras_enabled || !con) - return; + return -EINVAL; + + /* Don't count since no reporting. +*/ + if (!ce_count && !ue_count) + return 0; ce = 0; ue = 0; @@ -1065,9 +1080,11 @@ void amdgpu_ras_query_error_count(struct amdgpu_device *adev, struct ras_query_if info = { .head = obj->head, }; + int res; - if (amdgpu_ras_query_error_status(adev, &info)) - return; + res = amdgpu_ras_query_error_status(adev, &info); + if (res) + return res; ce += info.ce_count; ue += info.ue_count; @@ -1078,6 +1095,8 @@ void amdgpu_ras_query_error_count(struct amdgpu_device *adev, if (ue_count) *ue_count = ue; + + return 0; } /* query/inject/cure end */ @@ -2145,9 +2164,10 @@ static void amdgpu_ras_counte_dw(struct work_struct *work) /* Cache new values. */ - amdgpu_ras_query_error_count(adev, &ce_count, &ue_count); - atomic_set(&con->ras_ce_count, ce_count); - atomic_set(&con->ras_ue_count, ue_count); + if (amdgpu_ras_query_error_count(adev, &ce_count, &ue_count) == 0) { + atomic_set(&con->ras_ce_count, ce_count); + atomic_set(&con->ras_ue_count, ue_count); + } pm_runtime_mark_last_busy(dev->dev); Out: @@ -2320,9 +2340,10 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev, /* Those are the cached values at init. 
*/ - amdgpu_ras_query_error_count(adev, &ce_count, &ue_count); - atomic_set(&con->ras_ce_count, ce_count); - atomic_set(&con->ras_ue_count, ue_count); + if (amdgpu_ras_query_error_count(adev, &ce_count, &ue_count) == 0) { + atomic_set(&con->ras_ce_count, ce_count); + atomic_set(&con->ras_ue_count, ue_count); + } return 0; cleanup: diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h index 283afd791db107..4d9c63f2f37718 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h @@ -491,9 +491,9 @@ int amdgpu_ras_request_reset_on_boot(struct amdgpu_device *adev, void amdgpu_ras_resume(struct amdgpu_device *adev); void
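The calling convention this patch moves to can be summarised in a standalone sketch (toy names, not the amdgpu code): the query reports failure instead of silently leaving its out-parameters untouched, and callers publish the cached counts only on success:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct toy_dev {
	bool ras_enabled;
	unsigned long ce_errors, ue_errors; /* per-IP totals, pre-summed */
	long cached_ce, cached_ue;          /* stands in for the atomics */
};

/* Model of the reworked query: error out early when RAS is not
 * supported, so callers can tell "zero errors" from "no answer". */
static int query_error_count(struct toy_dev *dev,
			     unsigned long *ce_count, unsigned long *ue_count)
{
	if (!dev->ras_enabled)
		return -EINVAL;
	if (!ce_count && !ue_count)
		return 0; /* nowhere to report into, don't bother counting */
	if (ce_count)
		*ce_count = dev->ce_errors;
	if (ue_count)
		*ue_count = dev->ue_errors;
	return 0;
}

/* Model of the call sites: publish the values only on success, so the
 * cache is never fed uninitialized locals. */
static void cache_counts(struct toy_dev *dev)
{
	unsigned long ce, ue;

	if (query_error_count(dev, &ce, &ue) == 0) {
		dev->cached_ce = (long)ce;
		dev->cached_ue = (long)ue;
	}
}
```

This is the same uninitialized-value hazard the static-analysis report in the next message flags, resolved at the caller instead of by pre-zeroing the out-parameters.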
Re: [PATCH 4/4] drm/msm: always wait for the exclusive fence
On Fri, Jul 02, 2021 at 01:16:42PM +0200, Christian König wrote: > Drivers also need to sync to the exclusive fence when > a shared one is present. > > Completely untested since the driver won't even compile on !ARM. It's really not that hard to set up a cross-compiler; reasonable distros have packages for that now. It does explain, though, why you tend to break the arm build with drm-misc patches. Please fix this. > Signed-off-by: Christian König Reviewed-by: Daniel Vetter > --- > drivers/gpu/drm/msm/msm_gem.c | 16 +++- > 1 file changed, 7 insertions(+), 9 deletions(-) > > diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c > index a94a43de95ef..72a07e311de3 100644 > --- a/drivers/gpu/drm/msm/msm_gem.c > +++ b/drivers/gpu/drm/msm/msm_gem.c > @@ -817,17 +817,15 @@ int msm_gem_sync_object(struct drm_gem_object *obj, > struct dma_fence *fence; > int i, ret; > > - fobj = dma_resv_shared_list(obj->resv); > - if (!fobj || (fobj->shared_count == 0)) { > - fence = dma_resv_excl_fence(obj->resv); > - /* don't need to wait on our own fences, since ring is fifo */ > - if (fence && (fence->context != fctx->context)) { > - ret = dma_fence_wait(fence, true); > - if (ret) > - return ret; > - } > + fence = dma_resv_excl_fence(obj->resv); > + /* don't need to wait on our own fences, since ring is fifo */ > + if (fence && (fence->context != fctx->context)) { > + ret = dma_fence_wait(fence, true); > + if (ret) > + return ret; > } > > + fobj = dma_resv_shared_list(obj->resv); > if (!exclusive || !fobj) > return 0; > > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 3/4] drm/nouveau: always wait for the exclusive fence
On Fri, Jul 02, 2021 at 01:16:41PM +0200, Christian König wrote: > Drivers also need to sync to the exclusive fence when > a shared one is present. > > Signed-off-by: Christian König Reviewed-by: Daniel Vetter > --- > drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c > b/drivers/gpu/drm/nouveau/nouveau_fence.c > index 6b43918035df..05d0b3eb3690 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_fence.c > +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c > @@ -358,7 +358,7 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct > nouveau_channel *chan, bool e > fobj = dma_resv_shared_list(resv); > fence = dma_resv_excl_fence(resv); > > - if (fence && (!exclusive || !fobj || !fobj->shared_count)) { > + if (fence) { > struct nouveau_channel *prev = NULL; > bool must_wait = true; > > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] drm/amdgpu: initialize amdgpu_ras_query_error_count() error count parameters
That's a good find, but I'd rather functions have no side effects. I'll follow up with a patch which correctly fixes this. Regards, Luben On 2021-07-02 3:52 p.m., t...@redhat.com wrote: > From: Tom Rix > > Static analysis reports this problem > amdgpu_ras.c:2324:2: warning: 2nd function call argument is an > uninitialized value > atomic_set(&con->ras_ce_count, ce_count); > ^~~~ > > ce_count is normally set by the earlier call to > amdgpu_ras_query_error_count(). But amdgpu_ras_query_error_count() > can return early without setting, leaving its error count parameters > in a garbage state. > > Initialize the error count parameters earlier. > > Fixes: a46751fbcde5 ("drm/amdgpu: Fix RAS function interface") > Signed-off-by: Tom Rix > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > index 875874ea745ec..c80fa545aa2b8 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c > @@ -1056,6 +1056,12 @@ void amdgpu_ras_query_error_count(struct amdgpu_device > *adev, > struct ras_manager *obj; > unsigned long ce, ue; > > + if (ce_count) > + *ce_count = 0; > + > + if (ue_count) > + *ue_count = 0; > + > if (!adev->ras_enabled || !con) > return; >
Re: [PATCH 2/4] dma-buf: fix dma_resv_test_signaled test_all handling v2
On Fri, Jul 02, 2021 at 01:16:40PM +0200, Christian König wrote: > As the name implies, if testing all fences is requested we > should indeed test all fences and not skip the exclusive > one because we see shared ones. > > v2: fix logic once more > > Signed-off-by: Christian König Reviewed-by: Daniel Vetter > --- > drivers/dma-buf/dma-resv.c | 33 - > 1 file changed, 12 insertions(+), 21 deletions(-) > > diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c > index 4ab02b6c387a..18dd5a6ca06c 100644 > --- a/drivers/dma-buf/dma-resv.c > +++ b/drivers/dma-buf/dma-resv.c > @@ -618,25 +618,21 @@ static inline int dma_resv_test_signaled_single(struct > dma_fence *passed_fence) > */ > bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all) > { > - unsigned int seq, shared_count; > + struct dma_fence *fence; > + unsigned int seq; > int ret; > > rcu_read_lock(); > retry: > ret = true; > - shared_count = 0; > seq = read_seqcount_begin(&obj->seq); > > if (test_all) { > struct dma_resv_list *fobj = dma_resv_shared_list(obj); > - unsigned int i; > - > - if (fobj) > - shared_count = fobj->shared_count; > + unsigned int i, shared_count; > > + shared_count = fobj ? 
fobj->shared_count : 0; > for (i = 0; i < shared_count; ++i) { > - struct dma_fence *fence; > - fence = rcu_dereference(fobj->shared[i]); > ret = dma_resv_test_signaled_single(fence); > if (ret < 0) > @@ -644,24 +640,19 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool > test_all) > else if (!ret) > break; > } > - > - if (read_seqcount_retry(&obj->seq, seq)) > - goto retry; > } > > - if (!shared_count) { > - struct dma_fence *fence_excl = dma_resv_excl_fence(obj); > - > - if (fence_excl) { > - ret = dma_resv_test_signaled_single(fence_excl); > - if (ret < 0) > - goto retry; > + fence = dma_resv_excl_fence(obj); > + if (ret && fence) { > + ret = dma_resv_test_signaled_single(fence); > + if (ret < 0) > + goto retry; > > - if (read_seqcount_retry(&obj->seq, seq)) > - goto retry; > - } } > > + if (read_seqcount_retry(&obj->seq, seq)) > + goto retry; > + > rcu_read_unlock(); > return ret; > } > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
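To make the semantic change concrete, here is a freestanding model of the old and new behaviour (toy types and names; the real function also deals with RCU and seqcount retries, omitted here): previously test_all skipped the exclusive fence whenever shared fences were present, while now the exclusive fence is always considered:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_fence { bool signaled; };

/* Old logic: a non-zero shared_count meant the exclusive fence was
 * never consulted, even with test_all -- the bug being fixed. */
static bool test_signaled_old(struct toy_fence *excl,
			      struct toy_fence **shared, unsigned n, bool test_all)
{
	bool ret = true;
	unsigned shared_count = test_all ? n : 0;

	for (unsigned i = 0; i < shared_count; i++)
		if (!(ret = shared[i]->signaled))
			break;
	if (!shared_count && excl)
		ret = excl->signaled;
	return ret;
}

/* New logic: with test_all, "all" really means all -- shared fences
 * first, then the exclusive one if everything so far is signaled. */
static bool test_signaled_new(struct toy_fence *excl,
			      struct toy_fence **shared, unsigned n, bool test_all)
{
	bool ret = true;

	if (test_all)
		for (unsigned i = 0; i < n; i++)
			if (!(ret = shared[i]->signaled))
				break;
	if (ret && excl)
		ret = excl->signaled;
	return ret;
}
```

The interesting case is an unsignaled exclusive fence alongside a signaled shared one: the old code reported the object as fully signaled, the new code does not.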
Re: [PATCH 1/4] dma-buf: add some more kerneldoc to dma_resv_add_shared_fence
On Fri, Jul 02, 2021 at 01:16:39PM +0200, Christian König wrote: > Explicitly document that code can't assume that shared fences > signal after the exclusive fence. > > Signed-off-by: Christian König > --- > drivers/dma-buf/dma-resv.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c > index f26c71747d43..4ab02b6c387a 100644 > --- a/drivers/dma-buf/dma-resv.c > +++ b/drivers/dma-buf/dma-resv.c > @@ -235,7 +235,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max); > * @fence: the shared fence to add > * > * Add a fence to a shared slot, obj->lock must be held, and > - * dma_resv_reserve_shared() has been called. > + * dma_resv_reserve_shared() has been called. The shared fences can signal in > + * any order and there is especially no guarantee that shared fences signal > + * after the exclusive one. Code relying on any signaling order is broken and > + * needs to be fixed. This feels like the last place I'd go look for how I should handle dependencies. It's the function for adding shared fences after all, has absolutely nothing to do with whether we should wait for them. I'll type up something else. -Daniel > */ > void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence) > { > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
[PATCH v2 09/11] drm/gem: Delete gem array fencing helpers
Integrated into the scheduler now and all users converted over. Signed-off-by: Daniel Vetter Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: David Airlie Cc: Daniel Vetter Cc: Sumit Semwal Cc: "Christian König" Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org --- drivers/gpu/drm/drm_gem.c | 96 --- include/drm/drm_gem.h | 5 -- 2 files changed, 101 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index 68deb1de8235..24d49a2636e0 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -1294,99 +1294,3 @@ drm_gem_unlock_reservations(struct drm_gem_object **objs, int count, ww_acquire_fini(acquire_ctx); } EXPORT_SYMBOL(drm_gem_unlock_reservations); - -/** - * drm_gem_fence_array_add - Adds the fence to an array of fences to be - * waited on, deduplicating fences from the same context. - * - * @fence_array: array of dma_fence * for the job to block on. - * @fence: the dma_fence to add to the list of dependencies. - * - * This functions consumes the reference for @fence both on success and error - * cases. - * - * Returns: - * 0 on success, or an error on failing to expand the array. - */ -int drm_gem_fence_array_add(struct xarray *fence_array, - struct dma_fence *fence) -{ - struct dma_fence *entry; - unsigned long index; - u32 id = 0; - int ret; - - if (!fence) - return 0; - - /* Deduplicate if we already depend on a fence from the same context. -* This lets the size of the array of deps scale with the number of -* engines involved, rather than the number of BOs. 
-*/ - xa_for_each(fence_array, index, entry) { - if (entry->context != fence->context) - continue; - - if (dma_fence_is_later(fence, entry)) { - dma_fence_put(entry); - xa_store(fence_array, index, fence, GFP_KERNEL); - } else { - dma_fence_put(fence); - } - return 0; - } - - ret = xa_alloc(fence_array, &id, fence, xa_limit_32b, GFP_KERNEL); - if (ret != 0) - dma_fence_put(fence); - - return ret; -} -EXPORT_SYMBOL(drm_gem_fence_array_add); - -/** - * drm_gem_fence_array_add_implicit - Adds the implicit dependencies tracked - * in the GEM object's reservation object to an array of dma_fences for use in - * scheduling a rendering job. - * - * This should be called after drm_gem_lock_reservations() on your array of - * GEM objects used in the job but before updating the reservations with your - * own fences. - * - * @fence_array: array of dma_fence * for the job to block on. - * @obj: the gem object to add new dependencies from. - * @write: whether the job might write the object (so we need to depend on - * shared fences in the reservation object). 
- */ -int drm_gem_fence_array_add_implicit(struct xarray *fence_array, -struct drm_gem_object *obj, -bool write) -{ - int ret; - struct dma_fence **fences; - unsigned int i, fence_count; - - if (!write) { - struct dma_fence *fence = - dma_resv_get_excl_unlocked(obj->resv); - - return drm_gem_fence_array_add(fence_array, fence); - } - - ret = dma_resv_get_fences(obj->resv, NULL, - &fence_count, &fences); - if (ret || !fence_count) - return ret; - - for (i = 0; i < fence_count; i++) { - ret = drm_gem_fence_array_add(fence_array, fences[i]); - if (ret) - break; - } - - for (; i < fence_count; i++) - dma_fence_put(fences[i]); - kfree(fences); - return ret; -} -EXPORT_SYMBOL(drm_gem_fence_array_add_implicit); diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 240049566592..6d5e33b89074 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -409,11 +409,6 @@ int drm_gem_lock_reservations(struct drm_gem_object **objs, int count, struct ww_acquire_ctx *acquire_ctx); void drm_gem_unlock_reservations(struct drm_gem_object **objs, int count, struct ww_acquire_ctx *acquire_ctx); -int drm_gem_fence_array_add(struct xarray *fence_array, - struct dma_fence *fence); -int drm_gem_fence_array_add_implicit(struct xarray *fence_array, -struct drm_gem_object *obj, -bool write); int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev, u32 handle, u64 *offset); -- 2.32.0.rc2
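For reference, the deduplication rule these deleted helpers implemented — and which lives on in the scheduler's dependency tracking — is: keep only the latest fence per context, since fences on one context signal in order. A freestanding sketch using a plain array in place of the xarray (toy types, not the DRM API):

```c
#include <assert.h>
#include <stddef.h>

struct toy_fence { unsigned long context; unsigned seqno; };

/* "Later" on the same timeline: on one context, a higher seqno
 * signals last, so it implies all earlier fences on that context. */
static int fence_is_later(const struct toy_fence *a, const struct toy_fence *b)
{
	return a->seqno > b->seqno;
}

/* Model of the add helper: the deps array scales with the number of
 * contexts (engines) involved, rather than the number of BOs. */
static size_t dep_array_add(struct toy_fence *deps, size_t n,
			    struct toy_fence fence)
{
	for (size_t i = 0; i < n; i++) {
		if (deps[i].context != fence.context)
			continue;
		if (fence_is_later(&fence, &deps[i]))
			deps[i] = fence; /* replace the earlier fence */
		return n;            /* at most one entry per context */
	}
	deps[n] = fence; /* first fence seen from this context */
	return n + 1;
}
```

Adding many fences from the same context therefore leaves a single entry holding the latest of them.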
[PATCH v2 10/11] drm/sched: Don't store self-dependencies
This is essentially part of drm_sched_dependency_optimized(), which only amdgpu seems to make use of. Use it a bit more. This would mean that as-is amdgpu can't use the dependency helpers, at least not with the current approach amdgpu has for deciding whether a vm_flush is needed. Since amdgpu also has very special rules around implicit fencing it can't use those helpers either, and adding a drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too onerous. That way the special-case handling for amdgpu sticks out even more, and we have a higher chance that reviewers who go across all drivers won't miss it. Reviewed-by: Lucas Stach Signed-off-by: Daniel Vetter Cc: "Christian König" Cc: Daniel Vetter Cc: Luben Tuikov Cc: Andrey Grodzovsky Cc: Alex Deucher Cc: Jack Zhang --- drivers/gpu/drm/scheduler/sched_main.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 12d533486518..de76f7e14e0d 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -651,6 +651,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job, if (!fence) return 0; + /* if it's a fence from us it's guaranteed to be earlier */ + if (fence->context == job->entity->fence_context || + fence->context == job->entity->fence_context + 1) { + dma_fence_put(fence); + return 0; + } + /* Deduplicate if we already depend on a fence from the same context. * This lets the size of the array of deps scale with the number of * engines involved, rather than the number of BOs. -- 2.32.0.rc2
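The fast path added here can be sketched standalone (toy types; the assumption, as in the scheduler, is that an entity owns two fence contexts — fence_context for its scheduled fences and fence_context + 1 for its finished fences): a fence from the job's own entity is guaranteed to signal earlier because the ring is FIFO, so it need not be stored as a dependency:

```c
#include <assert.h>
#include <stdbool.h>

struct toy_fence { unsigned long context; };
struct toy_entity { unsigned long fence_context; };
struct toy_job { struct toy_entity *entity; };

/* Model of the self-dependency check: drop fences coming from either
 * of our own entity's two fence contexts. */
static bool is_own_fence(const struct toy_job *job,
			 const struct toy_fence *fence)
{
	unsigned long base = job->entity->fence_context;

	return fence->context == base || fence->context == base + 1;
}
```

In the real helper, a match means the fence reference is dropped and 0 is returned without touching the dependency array.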
[PATCH v2 11/11] drm/sched: Check locking in drm_sched_job_await_implicit
You really need to hold the reservation here or all kinds of funny things can happen between grabbing the dependencies and inserting the new fences. Signed-off-by: Daniel Vetter Cc: "Christian König" Cc: Daniel Vetter Cc: Luben Tuikov Cc: Andrey Grodzovsky Cc: Alex Deucher Cc: Jack Zhang --- drivers/gpu/drm/scheduler/sched_main.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index de76f7e14e0d..47f869aff335 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -705,6 +705,8 @@ int drm_sched_job_await_implicit(struct drm_sched_job *job, struct dma_fence **fences; unsigned int i, fence_count; + dma_resv_assert_held(obj->resv); + if (!write) { struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv); -- 2.32.0.rc2
[PATCH v2 06/11] drm/v3d: Move drm_sched_job_init to v3d_job_init
Prep work for using the scheduler dependency handling. We need to call drm_sched_job_init earlier so we can use the new drm_sched_job_await* functions for dependency handling here. v2: Slightly better commit message and rebase to include the drm_sched_job_arm() call (Emma). v3: Cleanup jobs under construction correctly (Emma) Signed-off-by: Daniel Vetter Cc: Emma Anholt --- drivers/gpu/drm/v3d/v3d_drv.h | 1 + drivers/gpu/drm/v3d/v3d_gem.c | 88 ++--- drivers/gpu/drm/v3d/v3d_sched.c | 15 +++--- 3 files changed, 44 insertions(+), 60 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 8a390738d65b..1d870261eaac 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -332,6 +332,7 @@ int v3d_submit_csd_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_wait_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +void v3d_job_cleanup(struct v3d_job *job); void v3d_job_put(struct v3d_job *job); void v3d_reset(struct v3d_dev *v3d); void v3d_invalidate_caches(struct v3d_dev *v3d); diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c index 69ac20e11b09..5eccd3658938 100644 --- a/drivers/gpu/drm/v3d/v3d_gem.c +++ b/drivers/gpu/drm/v3d/v3d_gem.c @@ -392,6 +392,12 @@ v3d_render_job_free(struct kref *ref) v3d_job_free(ref); } +void v3d_job_cleanup(struct v3d_job *job) +{ + drm_sched_job_cleanup(&job->base); + v3d_job_put(job); +} + void v3d_job_put(struct v3d_job *job) { kref_put(&job->refcount, job->free); @@ -433,9 +439,10 @@ v3d_wait_bo_ioctl(struct drm_device *dev, void *data, static int v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, struct v3d_job *job, void (*free)(struct kref *ref), -u32 in_sync) +u32 in_sync, enum v3d_queue queue) { struct dma_fence *in_fence = NULL; + struct v3d_file_priv *v3d_priv = file_priv->driver_priv; int ret; job->v3d = v3d; @@ -446,35 +453,33 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, 
return ret; xa_init_flags(&job->deps, XA_FLAGS_ALLOC); + ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], +v3d_priv); + if (ret) + goto fail; ret = drm_syncobj_find_fence(file_priv, in_sync, 0, 0, &in_fence); if (ret == -EINVAL) - goto fail; + goto fail_job; ret = drm_gem_fence_array_add(&job->deps, in_fence); if (ret) - goto fail; + goto fail_job; kref_init(&job->refcount); return 0; +fail_job: + drm_sched_job_cleanup(&job->base); fail: xa_destroy(&job->deps); pm_runtime_put_autosuspend(v3d->drm.dev); return ret; } -static int -v3d_push_job(struct v3d_file_priv *v3d_priv, -struct v3d_job *job, enum v3d_queue queue) +static void +v3d_push_job(struct v3d_job *job) { - int ret; - - ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], -v3d_priv); - if (ret) - return ret; - drm_sched_job_arm(&job->base); job->done_fence = dma_fence_get(&job->base.s_fence->finished); @@ -483,8 +488,6 @@ v3d_push_job(struct v3d_file_priv *v3d_priv, kref_get(&job->refcount); drm_sched_entity_push_job(&job->base); - - return 0; } static void @@ -530,7 +533,6 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) { struct v3d_dev *v3d = to_v3d_dev(dev); - struct v3d_file_priv *v3d_priv = file_priv->driver_priv; struct drm_v3d_submit_cl *args = data; struct v3d_bin_job *bin = NULL; struct v3d_render_job *render; @@ -556,7 +558,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, INIT_LIST_HEAD(&render->unref_list); ret = v3d_job_init(v3d, file_priv, &render->base, - v3d_render_job_free, args->in_sync_rcl); + v3d_render_job_free, args->in_sync_rcl, V3D_RENDER); if (ret) { kfree(render); return ret; } @@ -570,7 +572,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, } ret = v3d_job_init(v3d, file_priv, &bin->base, - v3d_job_free, args->in_sync_bcl); + v3d_job_free, args->in_sync_bcl, V3D_BIN); if (ret) { v3d_job_put(&render->base); kfree(bin); @@ -592,7 +594,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, goto fail; } - ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0); + ret = 
v3d_job_init(v3d, file_priv, clean_job,
[PATCH v2 07/11] drm/v3d: Use scheduler dependency handling
With the prep work out of the way this isn't tricky anymore. Aside: The chaining of the various jobs is a bit awkward, with the possibility of failure in bad places. I think with the drm_sched_job_init/arm split and maybe preloading the job->dependencies xarray this should be fixable. Signed-off-by: Daniel Vetter --- drivers/gpu/drm/v3d/v3d_drv.h | 5 - drivers/gpu/drm/v3d/v3d_gem.c | 25 - drivers/gpu/drm/v3d/v3d_sched.c | 29 + 3 files changed, 9 insertions(+), 50 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 1d870261eaac..f80f4ff1f7aa 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -192,11 +192,6 @@ struct v3d_job { struct drm_gem_object **bo; u32 bo_count; - /* Array of struct dma_fence * to block on before submitting this job. - */ - struct xarray deps; - unsigned long last_dep; - /* v3d fence to be signaled by IRQ handler when the job is complete. */ struct dma_fence *irq_fence; diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c index 5eccd3658938..42b07ffbea5e 100644 --- a/drivers/gpu/drm/v3d/v3d_gem.c +++ b/drivers/gpu/drm/v3d/v3d_gem.c @@ -257,8 +257,8 @@ v3d_lock_bo_reservations(struct v3d_job *job, return ret; for (i = 0; i < job->bo_count; i++) { - ret = drm_gem_fence_array_add_implicit(&job->deps, - job->bo[i], true); + ret = drm_sched_job_await_implicit(&job->base, + job->bo[i], true); if (ret) { drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx); @@ -354,8 +354,6 @@ static void v3d_job_free(struct kref *ref) { struct v3d_job *job = container_of(ref, struct v3d_job, refcount); - unsigned long index; - struct dma_fence *fence; int i; for (i = 0; i < job->bo_count; i++) { @@ -364,11 +362,6 @@ v3d_job_free(struct kref *ref) } kvfree(job->bo); - xa_for_each(&job->deps, index, fence) { - dma_fence_put(fence); - } - xa_destroy(&job->deps); - dma_fence_put(job->irq_fence); dma_fence_put(job->done_fence); @@ -452,7 +445,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, if (ret < 0) return ret; - xa_init_flags(&job->deps, XA_FLAGS_ALLOC); ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], v3d_priv); if (ret) @@ -462,7 +454,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, if (ret == -EINVAL) goto fail_job; - ret = drm_gem_fence_array_add(&job->deps, in_fence); + ret = drm_sched_job_await_fence(&job->base, in_fence); if (ret) goto fail_job; @@ -472,7 +464,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, fail_job: drm_sched_job_cleanup(&job->base); fail: - xa_destroy(&job->deps); pm_runtime_put_autosuspend(v3d->drm.dev); return ret; } @@ -619,8 +610,8 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, if (bin) { v3d_push_job(&bin->base); - ret = drm_gem_fence_array_add(&render->base.deps, - dma_fence_get(bin->base.done_fence)); + ret = drm_sched_job_await_fence(&render->base.base, + dma_fence_get(bin->base.done_fence)); if (ret) goto fail_unreserve; } @@ -630,7 +621,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, if (clean_job) { struct dma_fence *render_fence = dma_fence_get(render->base.done_fence); - ret = drm_gem_fence_array_add(&clean_job->deps, render_fence); + ret = drm_sched_job_await_fence(&clean_job->base, render_fence); if (ret) goto fail_unreserve; v3d_push_job(clean_job); @@ -820,8 +811,8 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, mutex_lock(&v3d->sched_lock); v3d_push_job(&job->base); - ret = drm_gem_fence_array_add(&clean_job->deps, - dma_fence_get(job->base.done_fence)); + ret = drm_sched_job_await_fence(&clean_job->base, + dma_fence_get(job->base.done_fence)); if (ret) goto fail_unreserve; diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 3f352d73af9c..f0de584f452c 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -13,7 +13,7 @@ * jobs when bulk background jobs are queued up, we submit a new job *
[PATCH v2 08/11] drm/etnaviv: Use scheduler dependency handling
We need to pull the drm_sched_job_init much earlier, but that's very minor surgery. Signed-off-by: Daniel Vetter Cc: Lucas Stach Cc: Russell King Cc: Christian Gmeiner Cc: Sumit Semwal Cc: "Christian König" Cc: etna...@lists.freedesktop.org Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org --- drivers/gpu/drm/etnaviv/etnaviv_gem.h| 5 +- drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 32 +- drivers/gpu/drm/etnaviv/etnaviv_sched.c | 61 +--- drivers/gpu/drm/etnaviv/etnaviv_sched.h | 3 +- 4 files changed, 20 insertions(+), 81 deletions(-) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.h b/drivers/gpu/drm/etnaviv/etnaviv_gem.h index 98e60df882b6..63688e6e4580 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.h @@ -80,9 +80,6 @@ struct etnaviv_gem_submit_bo { u64 va; struct etnaviv_gem_object *obj; struct etnaviv_vram_mapping *mapping; - struct dma_fence *excl; - unsigned int nr_shared; - struct dma_fence **shared; }; /* Created per submit-ioctl, to track bo's and cmdstream bufs, etc, @@ -95,7 +92,7 @@ struct etnaviv_gem_submit { struct etnaviv_file_private *ctx; struct etnaviv_gpu *gpu; struct etnaviv_iommu_context *mmu_context, *prev_mmu_context; - struct dma_fence *out_fence, *in_fence; + struct dma_fence *out_fence; int out_fence_id; struct list_head node; /* GPU active submit list */ struct etnaviv_cmdbuf cmdbuf; diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c index 4dd7d9d541c0..92478a50a580 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c @@ -188,16 +188,10 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit) if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT) continue; - if (bo->flags & ETNA_SUBMIT_BO_WRITE) { - ret = dma_resv_get_fences(robj, &bo->excl, - &bo->nr_shared, - &bo->shared); - if (ret) - return ret; - } else { - bo->excl = dma_resv_get_excl_unlocked(robj); - } - + ret = drm_sched_job_await_implicit(&submit->sched_job, &bo->obj->base, + bo->flags & ETNA_SUBMIT_BO_WRITE); + if (ret) + return ret; } return ret; @@ -403,8 +397,6 @@ static void submit_cleanup(struct kref *kref) wake_up_all(&submit->gpu->fence_event); - if (submit->in_fence) - dma_fence_put(submit->in_fence); if (submit->out_fence) { /* first remove from IDR, so fence can not be found anymore */ mutex_lock(&submit->gpu->fence_lock); @@ -537,6 +529,12 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, submit->exec_state = args->exec_state; submit->flags = args->flags; + ret = drm_sched_job_init(&submit->sched_job, + &ctx->sched_entity[args->pipe], + submit->ctx); + if (ret) + goto err_submit_objects; + ret = submit_lookup_objects(submit, file, bos, args->nr_bos); if (ret) goto err_submit_objects; @@ -549,11 +547,15 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, } if (args->flags & ETNA_SUBMIT_FENCE_FD_IN) { - submit->in_fence = sync_file_get_fence(args->fence_fd); - if (!submit->in_fence) { + struct dma_fence *in_fence = sync_file_get_fence(args->fence_fd); + if (!in_fence) { ret = -EINVAL; goto err_submit_objects; } + + ret = drm_sched_job_await_fence(&submit->sched_job, in_fence); + if (ret) + goto err_submit_objects; } ret = submit_pin_objects(submit); @@ -579,7 +581,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, if (ret) goto err_submit_objects; - ret = etnaviv_sched_push_job(&ctx->sched_entity[args->pipe], submit); + ret = etnaviv_sched_push_job(submit); if (ret) goto err_submit_objects; diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c index 180bb633d5c5..c98d67320be3 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c @@ -17,58 +17,6 @@ module_param_named(job_hang_limit, etnaviv_job_hang_limit, int, 0444); static int etnaviv_hw_jobs_limit = 4; module_param_named(hw_job_limit, etnaviv_hw_jobs_limit, int, 0444); -static struct
[PATCH v2 05/11] drm/lima: use scheduler dependency tracking
Nothing special going on here. Aside reviewing the code, it seems like drm_sched_job_arm() should be moved into lima_sched_context_queue_task and put under some mutex together with drm_sched_push_job(). See the kerneldoc for drm_sched_push_job(). Signed-off-by: Daniel Vetter --- drivers/gpu/drm/lima/lima_gem.c | 4 ++-- drivers/gpu/drm/lima/lima_sched.c | 21 - drivers/gpu/drm/lima/lima_sched.h | 3 --- 3 files changed, 2 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c index c528f40981bb..e54a88d5037a 100644 --- a/drivers/gpu/drm/lima/lima_gem.c +++ b/drivers/gpu/drm/lima/lima_gem.c @@ -267,7 +267,7 @@ static int lima_gem_sync_bo(struct lima_sched_task *task, struct lima_bo *bo, if (explicit) return 0; - return drm_gem_fence_array_add_implicit(&task->deps, &bo->base.base, write); + return drm_sched_job_await_implicit(&task->base, &bo->base.base, write); } static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit) @@ -285,7 +285,7 @@ static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit) if (err) return err; - err = drm_gem_fence_array_add(&submit->task->deps, fence); + err = drm_sched_job_await_fence(&submit->task->base, fence); if (err) { dma_fence_put(fence); return err; diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c index e968b5a8f0b0..99d5f6f1a882 100644 --- a/drivers/gpu/drm/lima/lima_sched.c +++ b/drivers/gpu/drm/lima/lima_sched.c @@ -134,24 +134,15 @@ int lima_sched_task_init(struct lima_sched_task *task, task->num_bos = num_bos; task->vm = lima_vm_get(vm); - xa_init_flags(&task->deps, XA_FLAGS_ALLOC); - return 0; } void lima_sched_task_fini(struct lima_sched_task *task) { - struct dma_fence *fence; - unsigned long index; int i; drm_sched_job_cleanup(&task->base); - xa_for_each(&task->deps, index, fence) { - dma_fence_put(fence); - } - xa_destroy(&task->deps); - if (task->bos) { for (i = 0; i < task->num_bos; i++) drm_gem_object_put(&task->bos[i]->base.base); @@ -186,17 +177,6 @@ struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task) return fence; } -static struct dma_fence *lima_sched_dependency(struct drm_sched_job *job, - struct drm_sched_entity *entity) -{ - struct lima_sched_task *task = to_lima_task(job); - - if (!xa_empty(&task->deps)) - return xa_erase(&task->deps, task->last_dep++); - - return NULL; -} - static int lima_pm_busy(struct lima_device *ldev) { int ret; @@ -472,7 +452,6 @@ static void lima_sched_free_job(struct drm_sched_job *job) } static const struct drm_sched_backend_ops lima_sched_ops = { - .dependency = lima_sched_dependency, .run_job = lima_sched_run_job, .timedout_job = lima_sched_timedout_job, .free_job = lima_sched_free_job, diff --git a/drivers/gpu/drm/lima/lima_sched.h b/drivers/gpu/drm/lima/lima_sched.h index ac70006b0e26..6a11764d87b3 100644 --- a/drivers/gpu/drm/lima/lima_sched.h +++ b/drivers/gpu/drm/lima/lima_sched.h @@ -23,9 +23,6 @@ struct lima_sched_task { struct lima_vm *vm; void *frame; - struct xarray deps; - unsigned long last_dep; - struct lima_bo **bos; int num_bos; -- 2.32.0.rc2
[PATCH v2 04/11] drm/panfrost: use scheduler dependency tracking
Just deletes some code that's now more shared. Note that thanks to the split into drm_sched_job_init/arm we can now easily pull the _init() part from under the submission lock way ahead where we're adding the sync file in-fences as dependencies. v2: Correctly clean up the partially set up job, now that job_init() and job_arm() are apart (Emma). Reviewed-by: Steven Price (v1) Signed-off-by: Daniel Vetter Cc: Rob Herring Cc: Tomeu Vizoso Cc: Steven Price Cc: Alyssa Rosenzweig Cc: Sumit Semwal Cc: "Christian König" Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org --- drivers/gpu/drm/panfrost/panfrost_drv.c | 16 --- drivers/gpu/drm/panfrost/panfrost_job.c | 37 +++-- drivers/gpu/drm/panfrost/panfrost_job.h | 5 +--- 3 files changed, 17 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c index 1ffaef5ec5ff..9f53bea07d61 100644 --- a/drivers/gpu/drm/panfrost/panfrost_drv.c +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c @@ -218,7 +218,7 @@ panfrost_copy_in_sync(struct drm_device *dev, if (ret) goto fail; - ret = drm_gem_fence_array_add(&job->deps, fence); + ret = drm_sched_job_await_fence(&job->base, fence); if (ret) goto fail; @@ -236,7 +236,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data, struct drm_panfrost_submit *args = data; struct drm_syncobj *sync_out = NULL; struct panfrost_job *job; - int ret = 0; + int ret = 0, slot; if (!args->jc) return -EINVAL; @@ -258,14 +258,20 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data, kref_init(&job->refcount); - xa_init_flags(&job->deps, XA_FLAGS_ALLOC); - job->pfdev = pfdev; job->jc = args->jc; job->requirements = args->requirements; job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev); job->file_priv = file->driver_priv; + slot = panfrost_job_get_slot(job); + + ret = drm_sched_job_init(&job->base, + &job->file_priv->sched_entity[slot], + NULL); + if (ret) + goto fail_job_put; + ret = panfrost_copy_in_sync(dev, file, args, job); if (ret) goto fail_job; @@ -283,6 +289,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data, drm_syncobj_replace_fence(sync_out, job->render_done_fence); fail_job: + drm_sched_job_cleanup(&job->base); +fail_job_put: panfrost_job_put(job); fail_out_sync: if (sync_out) diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index 4bc962763e1f..86c843d8822e 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -102,7 +102,7 @@ static struct dma_fence *panfrost_fence_create(struct panfrost_device *pfdev, in return &fence->base; } -static int panfrost_job_get_slot(struct panfrost_job *job) +int panfrost_job_get_slot(struct panfrost_job *job) { /* JS0: fragment jobs. * JS1: vertex/tiler jobs @@ -242,13 +242,13 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js) static int panfrost_acquire_object_fences(struct drm_gem_object **bos, int bo_count, - struct xarray *deps) + struct drm_sched_job *job) { int i, ret; for (i = 0; i < bo_count; i++) { /* panfrost always uses write mode in its current uapi */ - ret = drm_gem_fence_array_add_implicit(deps, bos[i], true); + ret = drm_sched_job_await_implicit(job, bos[i], true); if (ret) return ret; } @@ -269,31 +269,21 @@ static void panfrost_attach_object_fences(struct drm_gem_object **bos, int panfrost_job_push(struct panfrost_job *job) { struct panfrost_device *pfdev = job->pfdev; - int slot = panfrost_job_get_slot(job); - struct drm_sched_entity *entity = &job->file_priv->sched_entity[slot]; struct ww_acquire_ctx acquire_ctx; int ret = 0; - ret = drm_gem_lock_reservations(job->bos, job->bo_count, &acquire_ctx); if (ret) return ret; mutex_lock(&pfdev->sched_lock); - - ret = drm_sched_job_init(&job->base, entity, NULL); - if (ret) { - mutex_unlock(&pfdev->sched_lock); - goto unlock; - } - drm_sched_job_arm(&job->base); job->render_done_fence = dma_fence_get(&job->base.s_fence->finished); ret = panfrost_acquire_object_fences(job->bos, job->bo_count, - &job->deps); +
[PATCH v2 02/11] drm/sched: Add dependency tracking
Instead of just a callback we can just glue in the gem helpers that panfrost, v3d and lima currently use. There's really not that many ways to skin this cat. On the naming bikeshed: The idea for using _await_ to denote adding dependencies to a job comes from i915, where that's used quite extensively all over the place, in lots of datastructures. v2: Rebased. Reviewed-by: Steven Price (v1) Signed-off-by: Daniel Vetter Cc: David Airlie Cc: Daniel Vetter Cc: Sumit Semwal Cc: "Christian König" Cc: Andrey Grodzovsky Cc: Lee Jones Cc: Nirmoy Das Cc: Boris Brezillon Cc: Luben Tuikov Cc: Alex Deucher Cc: Jack Zhang Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org --- drivers/gpu/drm/scheduler/sched_entity.c | 18 +++- drivers/gpu/drm/scheduler/sched_main.c | 103 +++ include/drm/gpu_scheduler.h | 31 ++- 3 files changed, 146 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index f7347c284886..b6f72fafd504 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -211,6 +211,19 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, job->sched->ops->free_job(job); } +static struct dma_fence * +drm_sched_job_dependency(struct drm_sched_job *job, + struct drm_sched_entity *entity) +{ + if (!xa_empty(&job->dependencies)) + return xa_erase(&job->dependencies, job->last_dependency++); + + if (job->sched->ops->dependency) + return job->sched->ops->dependency(job, entity); + + return NULL; +} + /** * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed * @@ -229,7 +242,7 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity) struct drm_sched_fence *s_fence = job->s_fence; /* Wait for all dependencies to avoid data corruptions */ - while ((f = job->sched->ops->dependency(job, entity))) + while ((f = drm_sched_job_dependency(job, entity))) dma_fence_wait(f, false); drm_sched_fence_scheduled(s_fence); @@ -419,7 +432,6 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity) */ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity) { - struct drm_gpu_scheduler *sched = entity->rq->sched; struct drm_sched_job *sched_job; sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); @@ -427,7 +439,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity) return NULL; while ((entity->dependency = - sched->ops->dependency(sched_job, entity))) { + drm_sched_job_dependency(sched_job, entity))) { trace_drm_sched_job_wait_dep(sched_job, entity->dependency); if (drm_sched_entity_add_dependency_cb(entity)) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 5e84e1500c32..12d533486518 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -605,6 +605,8 @@ int drm_sched_job_init(struct drm_sched_job *job, INIT_LIST_HEAD(&job->list); + xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC); + return 0; } EXPORT_SYMBOL(drm_sched_job_init); @@ -628,6 +630,98 @@ void drm_sched_job_arm(struct drm_sched_job *job) } EXPORT_SYMBOL(drm_sched_job_arm); +/** + * drm_sched_job_await_fence - adds the fence as a job dependency + * @job: scheduler job to add the dependencies to + * @fence: the dma_fence to add to the list of dependencies. + * + * Note that @fence is consumed in both the success and error cases. + * + * Returns: + * 0 on success, or an error on failing to expand the array. + */ +int drm_sched_job_await_fence(struct drm_sched_job *job, + struct dma_fence *fence) +{ + struct dma_fence *entry; + unsigned long index; + u32 id = 0; + int ret; + + if (!fence) + return 0; + + /* Deduplicate if we already depend on a fence from the same context. + * This lets the size of the array of deps scale with the number of + * engines involved, rather than the number of BOs. + */ + xa_for_each(&job->dependencies, index, entry) { + if (entry->context != fence->context) + continue; + + if (dma_fence_is_later(fence, entry)) { + dma_fence_put(entry); + xa_store(&job->dependencies, index, fence, GFP_KERNEL); + } else { + dma_fence_put(fence); + } + return 0; + } + + ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL); + if (ret != 0) +
[PATCH v2 03/11] drm/sched: drop entity parameter from drm_sched_push_job
Originally a job was only bound to the queue when we pushed this, but now that's done in drm_sched_job_init, making that parameter entirely redundant. Remove it. The same applies to the context parameter in lima_sched_context_queue_task, simplify that too. Reviewed-by: Steven Price (v1) Signed-off-by: Daniel Vetter Cc: Lucas Stach Cc: Russell King Cc: Christian Gmeiner Cc: Qiang Yu Cc: Rob Herring Cc: Tomeu Vizoso Cc: Steven Price Cc: Alyssa Rosenzweig Cc: Emma Anholt Cc: David Airlie Cc: Daniel Vetter Cc: Sumit Semwal Cc: "Christian König" Cc: Alex Deucher Cc: Nirmoy Das Cc: Dave Airlie Cc: Chen Li Cc: Lee Jones Cc: Deepak R Varma Cc: Kevin Wang Cc: Luben Tuikov Cc: "Marek Olšák" Cc: Maarten Lankhorst Cc: Andrey Grodzovsky Cc: Dennis Li Cc: Boris Brezillon Cc: etna...@lists.freedesktop.org Cc: l...@lists.freedesktop.org Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 +- drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +- drivers/gpu/drm/lima/lima_gem.c | 3 +-- drivers/gpu/drm/lima/lima_sched.c| 5 ++--- drivers/gpu/drm/lima/lima_sched.h| 3 +-- drivers/gpu/drm/panfrost/panfrost_job.c | 2 +- drivers/gpu/drm/scheduler/sched_entity.c | 6 ++ drivers/gpu/drm/v3d/v3d_gem.c| 2 +- include/drm/gpu_scheduler.h | 3 +-- 10 files changed, 12 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index a4ec092af9a7..18f63567fb69 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1267,7 +1267,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, trace_amdgpu_cs_ioctl(job); amdgpu_vm_bo_trace_cs(&fpriv->vm, &p->ticket); - drm_sched_entity_push_job(&job->base, entity); + drm_sched_entity_push_job(&job->base); amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 5ddb955d2315..b86099c1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -174,7 +174,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity, *f = dma_fence_get(&job->base.s_fence->finished); amdgpu_job_free_resources(job); - drm_sched_entity_push_job(&job->base, entity); + drm_sched_entity_push_job(&job->base); return 0; } diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c index 05f412204118..180bb633d5c5 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c @@ -178,7 +178,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity, /* the scheduler holds on to the job now */ kref_get(&submit->refcount); - drm_sched_entity_push_job(&submit->sched_job, sched_entity); + drm_sched_entity_push_job(&submit->sched_job); out_unlock: mutex_unlock(&submit->gpu->fence_lock); diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c index de62966243cd..c528f40981bb 100644 --- a/drivers/gpu/drm/lima/lima_gem.c +++ b/drivers/gpu/drm/lima/lima_gem.c @@ -359,8 +359,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit) goto err_out2; } - fence = lima_sched_context_queue_task( - submit->ctx->context + submit->pipe, submit->task); + fence = lima_sched_context_queue_task(submit->task); for (i = 0; i < submit->nr_bos; i++) { if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE) diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c index 38f755580507..e968b5a8f0b0 100644 --- a/drivers/gpu/drm/lima/lima_sched.c +++ b/drivers/gpu/drm/lima/lima_sched.c @@ -177,13 +177,12 @@ void lima_sched_context_fini(struct lima_sched_pipe *pipe, drm_sched_entity_fini(&context->base); } -struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context *context, - struct lima_sched_task *task) +struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task) { struct dma_fence *fence = dma_fence_get(&task->base.s_fence->finished); trace_lima_task_submit(task); - drm_sched_entity_push_job(&task->base, &context->base); + drm_sched_entity_push_job(&task->base); return fence; } diff --git a/drivers/gpu/drm/lima/lima_sched.h b/drivers/gpu/drm/lima/lima_sched.h index 90f03c48ef4a..ac70006b0e26 100644 --- a/drivers/gpu/drm/lima/lima_sched.h +++ b/drivers/gpu/drm/lima/lima_sched.h @@ -98,8 +98,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe, atomic_t *guilty); void lima_sched_context_fini(struct lima_sched_pipe *pipe, struct lima_sched_context
[PATCH v2 01/11] drm/sched: Split drm_sched_job_init
This is a very confusingly named function, because not just does it init an object, it arms it and provides a point of no return for pushing a job into the scheduler. It would be nice if that's a bit clearer in the interface. But the real reason is that I want to push the dependency tracking helpers into the scheduler code, and that means drm_sched_job_init must be called a lot earlier, without arming the job. v2: - don't change .gitignore (Steven) - don't forget v3d (Emma) v3: Emma noticed that I leak the memory allocated in drm_sched_job_init if we bail out before the point of no return in subsequent driver patches. To be able to fix this change drm_sched_job_cleanup() so it can handle being called both before and after drm_sched_job_arm(). Also improve the kerneldoc for this. Acked-by: Steven Price (v2) Signed-off-by: Daniel Vetter Cc: Lucas Stach Cc: Russell King Cc: Christian Gmeiner Cc: Qiang Yu Cc: Rob Herring Cc: Tomeu Vizoso Cc: Steven Price Cc: Alyssa Rosenzweig Cc: David Airlie Cc: Daniel Vetter Cc: Sumit Semwal Cc: "Christian König" Cc: Masahiro Yamada Cc: Kees Cook Cc: Adam Borowski Cc: Nick Terrell Cc: Mauro Carvalho Chehab Cc: Paul Menzel Cc: Sami Tolvanen Cc: Viresh Kumar Cc: Alex Deucher Cc: Dave Airlie Cc: Nirmoy Das Cc: Deepak R Varma Cc: Lee Jones Cc: Kevin Wang Cc: Chen Li Cc: Luben Tuikov Cc: "Marek Olšák" Cc: Dennis Li Cc: Maarten Lankhorst Cc: Andrey Grodzovsky Cc: Sonny Jiang Cc: Boris Brezillon Cc: Tian Tao Cc: Jack Zhang Cc: etna...@lists.freedesktop.org Cc: l...@lists.freedesktop.org Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org Cc: Emma Anholt --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 ++ drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 ++ drivers/gpu/drm/lima/lima_sched.c| 2 ++ drivers/gpu/drm/panfrost/panfrost_job.c | 2 ++ drivers/gpu/drm/scheduler/sched_entity.c | 6 ++-- drivers/gpu/drm/scheduler/sched_fence.c | 17 + drivers/gpu/drm/scheduler/sched_main.c | 46 +--- 
drivers/gpu/drm/v3d/v3d_gem.c| 2 ++ include/drm/gpu_scheduler.h | 7 +++- 10 files changed, 74 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index c5386d13eb4a..a4ec092af9a7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, if (r) goto error_unlock; + drm_sched_job_arm(&job->base); + /* No memory allocation is allowed while holding the notifier lock. * The lock is held until amdgpu_cs_submit is finished and fence is * added to BOs. diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index d33e6d97cc89..5ddb955d2315 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity, if (r) return r; + drm_sched_job_arm(&job->base); + *f = dma_fence_get(&job->base.s_fence->finished); amdgpu_job_free_resources(job); drm_sched_entity_push_job(&job->base, entity); diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c index feb6da1b6ceb..05f412204118 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity, if (ret) goto out_unlock; + drm_sched_job_arm(&submit->sched_job); + submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished); submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr, submit->out_fence, 0, diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c index dba8329937a3..38f755580507 100644 --- a/drivers/gpu/drm/lima/lima_sched.c +++ b/drivers/gpu/drm/lima/lima_sched.c @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task, return err; } + drm_sched_job_arm(&task->base); + task->num_bos = num_bos; task->vm = lima_vm_get(vm); diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index 71a72fb50e6b..2992dc85325f 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -288,6 +288,8 @@ int panfrost_job_push(struct panfrost_job *job) goto unlock; } + drm_sched_job_arm(&job->base); + job->render_done_fence = dma_fence_get(&job->base.s_fence->finished); ret = panfrost_acquire_object_fences(job->bos, job->bo_count, diff --git a/drivers/gpu/drm/scheduler/sched_entity.c
[PATCH v2 00/11] drm/scheduler dependency tracking
Hi all 2nd major round of my scheduler dependency handling patches. Emma noticed a big fumble in that I just didn't bother cleaning up between drm_sched_job_init() and drm_sched_job_arm(). This here should fix it now. Review and testing very much welcome. Cheers, Daniel Daniel Vetter (11): drm/sched: Split drm_sched_job_init drm/sched: Add dependency tracking drm/sched: drop entity parameter from drm_sched_push_job drm/panfrost: use scheduler dependency tracking drm/lima: use scheduler dependency tracking drm/v3d: Move drm_sched_job_init to v3d_job_init drm/v3d: Use scheduler dependency handling drm/etnaviv: Use scheduler dependency handling drm/gem: Delete gem array fencing helpers drm/sched: Don't store self-dependencies drm/sched: Check locking in drm_sched_job_await_implicit drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 4 +- drivers/gpu/drm/drm_gem.c| 96 --- drivers/gpu/drm/etnaviv/etnaviv_gem.h| 5 +- drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 32 ++-- drivers/gpu/drm/etnaviv/etnaviv_sched.c | 63 +--- drivers/gpu/drm/etnaviv/etnaviv_sched.h | 3 +- drivers/gpu/drm/lima/lima_gem.c | 7 +- drivers/gpu/drm/lima/lima_sched.c| 28 +--- drivers/gpu/drm/lima/lima_sched.h| 6 +- drivers/gpu/drm/panfrost/panfrost_drv.c | 16 +- drivers/gpu/drm/panfrost/panfrost_job.c | 39 + drivers/gpu/drm/panfrost/panfrost_job.h | 5 +- drivers/gpu/drm/scheduler/sched_entity.c | 30 ++-- drivers/gpu/drm/scheduler/sched_fence.c | 17 +- drivers/gpu/drm/scheduler/sched_main.c | 158 ++- drivers/gpu/drm/v3d/v3d_drv.h| 6 +- drivers/gpu/drm/v3d/v3d_gem.c| 115 ++ drivers/gpu/drm/v3d/v3d_sched.c | 44 +- include/drm/drm_gem.h| 5 - include/drm/gpu_scheduler.h | 41 - 21 files changed, 330 insertions(+), 394 deletions(-) -- 2.32.0.rc2
Re: [PATCH] drm/vc4: dsi: Only register our component once a DSI device is attached
Hi Dave, On Fri, Jul 02, 2021 at 06:44:22PM +0100, Dave Stevenson wrote: > On Fri, 2 Jul 2021 at 17:47, Laurent Pinchart wrote: > > On Mon, Jun 21, 2021 at 04:59:51PM +0300, Laurent Pinchart wrote: > >> On Mon, Jun 21, 2021 at 04:09:05PM +0300, Laurent Pinchart wrote: > >>> On Mon, Jun 21, 2021 at 03:56:16PM +0300, Laurent Pinchart wrote: > On Mon, Jun 21, 2021 at 12:49:14PM +0100, Dave Stevenson wrote: > > On Sun, 20 Jun 2021 at 23:49, Laurent Pinchart wrote: > >> On Sun, Jun 20, 2021 at 09:42:27PM +0300, Laurent Pinchart wrote: > >>> On Sun, Jun 20, 2021 at 03:29:03PM +0100, Dave Stevenson wrote: > On Sun, 20 Jun 2021 at 04:26, Laurent Pinchart wrote: > > > > Hi Maxime, > > > > I'm testing this, and I'm afraid it causes an issue with all the > > I2C-controlled bridges. I'm focussing on the newly merged > > ti-sn65dsi83 > > driver at the moment, but other are affected the same way. > > > > With this patch, the DSI component is only added when the DSI > > device is > > attached to the host with mipi_dsi_attach(). In the ti-sn65dsi83 > > driver, > > this happens in the bridge attach callback, which is called when the > > bridge is attached by a call to drm_bridge_attach() in > > vc4_dsi_bind(). > > This creates a circular dependency, and the DRM/KMS device is never > > created. > > > > How should this be solved ? Dave, I think you have shown an > > interest in > > the sn65dsi83 recently, any help would be appreciated. On a side > > note, > > I've tested the ti-sn65dsi83 driver on a v5.10 RPi kernel, without > > much > > success (on top of commit e1499baa0b0c I get a very weird frame > > rate - > > 147 fps of 99 fps instead of 60 fps - and nothing on the screen, > > and on > > top of the latest v5.10 RPi branch, I get lock-related warnings at > > every > > page flip), which is why I tried v5.12 and noticed this patch. 
Is it > > worth trying to bring up the display on the v5.10 RPi kernel in > > parallel > > to fixing the issue introduced in this patch, or is DSI known to be > > broken there ? > > I've been looking at SN65DSI83/4, but as I don't have any hardware > I've largely been suggesting things to try to those on the forums who > do [1]. > > My branch at > https://github.com/6by9/linux/tree/rpi-5.10.y-sn65dsi8x-marek > is the latest one I've worked on. It's rpi-5.10.y with Marek's driver > cherry-picked, and an overlay and simple-panel definition by others. > It also has a rework for vc4_dsi to use pm_runtime, instead of > breaking up the DSI bridge chain (which is flawed as it never calls > the bridge mode_set or mode_valid functions which sn65dsi83 relies > on). > >> > >> I've looked at that, and I'm afraid it doesn't go in the right > >> direction. The drm_encoder.crtc field is deprecated and documented as > >> only meaningful for non-atomic drivers. You're not introducing its > >> usage, but moving the configuration code from .enable() to the runtime > >> PM resume handler will make it impossible to fix this. The driver should > >> instead move to the .atomic_enable() function. If you need > >> enable/pre_enable in the DSI encoder, then you should turn it into a > >> drm_bridge. > > > > Is this something you're looking at by any chance ? I'm testing the > > ti-sn65dsi83 driver with VC4. I've spent a couple of hours debugging, > > only to realise that the vc4_dsi driver (before the rework you mention > > above) doesn't call .mode_set() on the bridges... Applying my sn65dsi83 > > series that removes .mode_set() didn't help much as vc4_dsi doesn't call > > the atomic operations either :-) I'll test your branch now. > > This is one of the reasons for my email earlier today - thank you for > your reply. 
> > The current mainline vc4_dsi driver deliberately breaks the bridge > chain so that it gets called before the panel/bridge pre_enable and > can power everything up, therefore pre_enable can call host_transfer > to configure the panel/bridge over the DSI interface. > However we've both noted that it doesn't forward on the mode_set and > mode_valid calls, and my investigations say that it doesn't have > enough information to make those calls. > > My branch returns the chain to normal, and tries to use pm_runtime to > power up the PHY at the first usage (host_transfer or _enable). The > PHY enable needs to know the link frequency to use, hence my question > over how that should be determined. > Currently it's coming from drm_encoder.crtc, but you say that's > deprecated. If a mode hasn't been set then we have no clock > information and bad things will happen. To make sure
[PATCH] drm/i915: Improve debug Kconfig texts a bit
We're not consistently recommending these for developers only. I stumbled over this due to DRM_I915_LOW_LEVEL_TRACEPOINTS, which was added in

commit 354d036fcf70654cff2e2cbdda54a835d219b9d2
Author: Tvrtko Ursulin
Date: Tue Feb 21 11:01:42 2017 +

    drm/i915/tracepoints: Add request submit and execute tracepoints

to "alleviate the performance impact concerns." Which is nonsense. Tvrtko and Joonas pointed out on irc that the real (but undocumented) reason was stable abi concerns for tracepoints, see https://lwn.net/Articles/705270/ and the specific change that was blocked around tracepoints: https://lwn.net/Articles/442113/

Anyway, to make it a notch clearer why we have this Kconfig option, consistently add "Recommended for driver developers only." to it and all the other debug options we have.

Cc: Tvrtko Ursulin
Cc: Joonas Lahtinen
Cc: Matthew Brost
Signed-off-by: Daniel Vetter
---
 drivers/gpu/drm/i915/Kconfig.debug | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index 2ca88072d30f..f27c0b5873f7 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -215,6 +215,8 @@ config DRM_I915_LOW_LEVEL_TRACEPOINTS
 	  This provides the ability to precisely monitor engine utilisation
 	  and also analyze the request dependency resolving timeline.

+	  Recommended for driver developers only.
+
 	  If in doubt, say "N".

 config DRM_I915_DEBUG_VBLANK_EVADE
@@ -228,6 +230,8 @@ config DRM_I915_DEBUG_VBLANK_EVADE
 	  is exceeded, even if there isn't an actual risk of missing
 	  the vblank.

+	  Recommended for driver developers only.
+
 	  If in doubt, say "N".

 config DRM_I915_DEBUG_RUNTIME_PM
@@ -240,4 +244,6 @@ config DRM_I915_DEBUG_RUNTIME_PM
 	  runtime PM functionality. This may introduce overhead during
 	  driver loading, suspend and resume operations.

+	  Recommended for driver developers only.
+
 	  If in doubt, say "N"
-- 
2.32.0.rc2
[PATCH] drm/amdgpu: initialize amdgpu_ras_query_error_count() error count parameters
From: Tom Rix

Static analysis reports this problem

amdgpu_ras.c:2324:2: warning: 2nd function call argument is an uninitialized value
	atomic_set(>ras_ce_count, ce_count);
	^~~~

ce_count is normally set by the earlier call to amdgpu_ras_query_error_count(). But amdgpu_ras_query_error_count() can return early without setting, leaving its error count parameters in a garbage state. Initialize the error count parameters earlier.

Fixes: a46751fbcde5 ("drm/amdgpu: Fix RAS function interface")
Signed-off-by: Tom Rix
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 875874ea745ec..c80fa545aa2b8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1056,6 +1056,12 @@ void amdgpu_ras_query_error_count(struct amdgpu_device *adev,
 	struct ras_manager *obj;
 	unsigned long ce, ue;

+	if (ce_count)
+		*ce_count = 0;
+
+	if (ue_count)
+		*ue_count = 0;
+
 	if (!adev->ras_enabled || !con)
 		return;
-- 
2.26.3
Re: [PATCH v5 0/2] drm/i915: IRQ fixes
On Thu, Jul 01, 2021 at 07:36:16PM +0200, Thomas Zimmermann wrote: > Fix a bug in the usage of IRQs and cleanup references to the DRM > IRQ midlayer. > > Preferably this patchset would be merged through drm-misc-next. > > v5: > * go back to _hardirq() after CI tests reported atomic > context in PCI probe; add rsp comment > v4: > * switch IRQ code to intel_synchronize_irq() (Daniel) > v3: > * also use intel_synchronize_hardirq() from other callsite > v2: > * split patch > * also fix comment > * add intel_synchronize_hardirq() (Ville) > * update Fixes tag (Daniel) Ok now I actually pushed the right patch set. -Daniel > > Thomas Zimmermann (2): > drm/i915: Use the correct IRQ during resume > drm/i915: Drop all references to DRM IRQ midlayer > > drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +- > drivers/gpu/drm/i915/gt/intel_ring_submission.c | 7 +-- > drivers/gpu/drm/i915/i915_drv.c | 1 - > drivers/gpu/drm/i915/i915_irq.c | 10 +- > drivers/gpu/drm/i915/i915_irq.h | 1 + > 5 files changed, 12 insertions(+), 9 deletions(-) > > > base-commit: 67f5a18128770817e4218a9e496d2bf5047c51e8 > prerequisite-patch-id: c2b2f08f0eccc9f5df0c0da49fa1d36267deb11d > prerequisite-patch-id: c67e5d886a47b7d0266d81100837557fda34cb24 > prerequisite-patch-id: 0cca17365e65370fa95d193ed2f1c88917ee1aef > prerequisite-patch-id: 12b9894350a0b56579d29542943465ef5134751c > prerequisite-patch-id: 3e1c37d3425f4820fe36ea3da57c65e166fe0ee5 > prerequisite-patch-id: 1017c860a0bf95ce370d82b8db1745f5548fb321 > prerequisite-patch-id: dcc022baab7c172978de9809702c2f4f54323047 > prerequisite-patch-id: 0d05ee247042b43d5ab8f3af216e708a8e09bee8 > prerequisite-patch-id: 110c411161bed6072c32185940fcd052d0bdb09a > prerequisite-patch-id: d2d1aeccffdfadf2b951487b8605f59c795d84cf > prerequisite-patch-id: 85fe31e27ca13adc0d1bcc7c19b1ce238a77ee6a > prerequisite-patch-id: c61fdacbe035ba5c17f1ff393bc9087f16aaea7b > prerequisite-patch-id: c4821af5dbba4d121769f1da85d91fbb53020ec0 > prerequisite-patch-id: 
0b20ef3302abfe6dc123dbc54b9dd087865f935b > prerequisite-patch-id: d34eb96cbbdeb91870ace4250ea75920b1653dc2 > prerequisite-patch-id: 7f64fce347d15232134d7636ca7a8d9f5bf1a3a0 > prerequisite-patch-id: c83be7a285eb6682cdae0df401ab5d4c208f036b > prerequisite-patch-id: eb1a44d2eb2685cea154dd3f17f5f463dfafd39a > prerequisite-patch-id: 92a8c37dae4b8394fd6702f4af58ac7815ac3069 > prerequisite-patch-id: f0237988fe4ae6eba143432d1ace8beb52d935f8 > prerequisite-patch-id: bcf4d29437ed7cb78225dec4c99249eb40c18302 > prerequisite-patch-id: 6407b4c7f1b80af8d329d5f796b30da11959e936 > prerequisite-patch-id: 4a69e6e49d691b555f0e0874d638cd204dcb0c48 > prerequisite-patch-id: be09cfa8a67dd435a25103b85bd4b1649c5190a3 > prerequisite-patch-id: 813ecc9f94251c3d669155faf64c0c9e6a458393 > prerequisite-patch-id: beb2b5000a1682cbd74a7e2ab1566fcae5bccbf0 > prerequisite-patch-id: 754c8878611864475a0b75fd49ff38e71a21c795 > prerequisite-patch-id: d7d4bac3c19f94ba9593143b3c147d83d82cb71f > prerequisite-patch-id: 983d1efbe060743f5951e474961fa431d886d757 > prerequisite-patch-id: 3c78b20c3b9315cd39e0ae9ea1510c6121bf9ca9 > -- > 2.32.0 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v2] drm/dp_mst: Fix return code on sideband message failure
JFYI: will try to take a look at this at the start of next week On Tue, 2021-06-29 at 16:07 -0700, Kuogee Hsieh wrote: > From: Rajkumar Subbiah > > Commit 2f015ec6eab6 ("drm/dp_mst: Add sideband down request tracing + > selftests") added some debug code for sideband message tracing. But > it seems to have unintentionally changed the behavior on sideband message > failure. It catches and returns failure only if DRM_UT_DP is enabled. > Otherwise it ignores the error code and returns success. So on an MST > unplug, the caller is unaware that the clear payload message failed and > ends up waiting for 4 seconds for the response. Fixes the issue by > returning the proper error code. > > Changes in V2: > -- Revise commit text as review comment > -- add Fixes text > > Fixes: 2f015ec6eab6 ("drm/dp_mst: Add sideband down request tracing + > selftests") > > Signed-off-by: Rajkumar Subbiah > Signed-off-by: Kuogee Hsieh > > Reviewed-by: Stephen Boyd > --- > drivers/gpu/drm/drm_dp_mst_topology.c | 10 ++ > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c > b/drivers/gpu/drm/drm_dp_mst_topology.c > index 1590144..8d97430 100644 > --- a/drivers/gpu/drm/drm_dp_mst_topology.c > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c > @@ -2887,11 +2887,13 @@ static int process_single_tx_qlock(struct > drm_dp_mst_topology_mgr *mgr, > idx += tosend + 1; > > ret = drm_dp_send_sideband_msg(mgr, up, chunk, idx); > - if (unlikely(ret) && drm_debug_enabled(DRM_UT_DP)) { > - struct drm_printer p = drm_debug_printer(DBG_PREFIX); > + if (unlikely(ret)) { > + if (drm_debug_enabled(DRM_UT_DP)) { > + struct drm_printer p = > drm_debug_printer(DBG_PREFIX); > > - drm_printf(, "sideband msg failed to send\n"); > - drm_dp_mst_dump_sideband_msg_tx(, txmsg); > + drm_printf(, "sideband msg failed to send\n"); > + drm_dp_mst_dump_sideband_msg_tx(, txmsg); > + } > return ret; > } > -- Cheers, Lyude Paul (she/her) Software Engineer at Red Hat
Re: [PATCH] dma-buf: fix and rework dma_buf_poll v5
On Fri, Jul 02, 2021 at 12:31:43PM +0200, Christian König wrote: > Daniel pointed me towards this function and there are multiple obvious > problems > in the implementation. > > First of all the retry loop is not working as intended. In general the retry > makes only sense if you grab the reference first and then check the sequence > values. > > Then we should always also wait for the exclusive fence. > > It's also good practice to keep the reference around when installing callbacks > to fences you don't own. > > And last the whole implementation was unnecessary complex and rather hard to > understand which could lead to probably unexpected behavior of the IOCTL. > > Fix all this by reworking the implementation from scratch. Dropping the > whole RCU approach and taking the lock instead. > > Only mildly tested and needs a thoughtful review of the code. > > v2: fix the reference counting as well > v3: keep the excl fence handling as is for stable > v4: back to testing all fences, drop RCU > v5: handle in and out separately > > Signed-off-by: Christian König > CC: sta...@vger.kernel.org > --- > drivers/dma-buf/dma-buf.c | 152 +- > include/linux/dma-buf.h | 2 +- > 2 files changed, 68 insertions(+), 86 deletions(-) > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > index eadd1eaa2fb5..439e2379e1cb 100644 > --- a/drivers/dma-buf/dma-buf.c > +++ b/drivers/dma-buf/dma-buf.c > @@ -72,7 +72,7 @@ static void dma_buf_release(struct dentry *dentry) >* If you hit this BUG() it means someone dropped their ref to the >* dma-buf while still having pending operation to the buffer. 
>*/ > - BUG_ON(dmabuf->cb_shared.active || dmabuf->cb_excl.active); > + BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); > > dmabuf->ops->release(dmabuf); > > @@ -202,16 +202,57 @@ static void dma_buf_poll_cb(struct dma_fence *fence, > struct dma_fence_cb *cb) > wake_up_locked_poll(dcb->poll, dcb->active); > dcb->active = 0; > spin_unlock_irqrestore(>poll->lock, flags); > + dma_fence_put(fence); > +} > + > +static bool dma_buf_poll_shared(struct dma_resv *resv, > + struct dma_buf_poll_cb_t *dcb) > +{ > + struct dma_resv_list *fobj = dma_resv_get_list(resv); > + struct dma_fence *fence; > + int i, r; > + > + if (!fobj) > + return false; > + > + for (i = 0; i < fobj->shared_count; ++i) { > + fence = rcu_dereference_protected(fobj->shared[i], > + dma_resv_held(resv)); > + dma_fence_get(fence); > + r = dma_fence_add_callback(fence, >cb, dma_buf_poll_cb); > + if (!r) > + return true; > + dma_fence_put(fence); > + } > + > + return false; > +} > + > +static bool dma_buf_poll_excl(struct dma_resv *resv, > + struct dma_buf_poll_cb_t *dcb) > +{ > + struct dma_fence *fence = dma_resv_get_excl(resv); > + int r; > + > + if (!fence) > + return false; > + > + dma_fence_get(fence); > + r = dma_fence_add_callback(fence, >cb, dma_buf_poll_cb); > + if (!r) > + return true; > + dma_fence_put(fence); > + > + return false; > } > > static __poll_t dma_buf_poll(struct file *file, poll_table *poll) > { > struct dma_buf *dmabuf; > struct dma_resv *resv; > - struct dma_resv_list *fobj; > - struct dma_fence *fence_excl; > + unsigned shared_count; > __poll_t events; > - unsigned shared_count, seq; > + int r, i; > > dmabuf = file->private_data; > if (!dmabuf || !dmabuf->resv) > @@ -225,101 +266,42 @@ static __poll_t dma_buf_poll(struct file *file, > poll_table *poll) > if (!events) > return 0; > > -retry: > - seq = read_seqcount_begin(>seq); > - rcu_read_lock(); > - > - fobj = rcu_dereference(resv->fence); > - if (fobj) > - shared_count = fobj->shared_count; > - else > - shared_count = 0; 
> - fence_excl = rcu_dereference(resv->fence_excl); > - if (read_seqcount_retry(>seq, seq)) { > - rcu_read_unlock(); > - goto retry; > - } > - > - if (fence_excl && (!(events & EPOLLOUT) || shared_count == 0)) { > - struct dma_buf_poll_cb_t *dcb = >cb_excl; > - __poll_t pevents = EPOLLIN; > + dma_resv_lock(resv, NULL); > > - if (shared_count == 0) > - pevents |= EPOLLOUT; > + if (events & EPOLLOUT) { > + struct dma_buf_poll_cb_t *dcb = >cb_out; > > + /* Check that callback isn't busy */ > spin_lock_irq(>poll.lock); > - if (dcb->active) { > - dcb->active |= pevents; > - events &= ~pevents; > - } else > - dcb->active = pevents; > +
Re: [Intel-gfx] [PATCH v2 3/3] drm/i915/uapi: reject set_domain for discrete
On Fri, Jul 02, 2021 at 03:31:08PM +0100, Tvrtko Ursulin wrote: > > On 01/07/2021 16:10, Matthew Auld wrote: > > The CPU domain should be static for discrete, and on DG1 we don't need > > any flushing since everything is already coherent, so really all this > > Knowledge of the write combine buffer is assumed to be had by anyone involved? > > > does is an object wait, for which we have an ioctl. Longer term the > > desired caching should be an immutable creation time property for the > > BO, which can be set with something like gem_create_ext. > > > > One other user is iris + userptr, which uses the set_domain to probe all > > the pages to check if the GUP succeeds, however keeping the set_domain > > around just for that seems rather scuffed. We could equally just submit > > a dummy batch, which should hopefully be good enough, otherwise adding a > > new creation time flag for userptr might be an option. Although longer > > term we will also have vm_bind, which should also be a nice fit for > > this, so adding a whole new flag is likely overkill. > > Execbuf sounds horrible. But it all reminds me of past work by Chris which is > surprisingly hard to find in the archives. Patches like: > > commit 7706a433388016983052a27c0fd74a64b1897ae7 > Author: Chris Wilson > Date: Wed Nov 8 17:04:07 2017 + > > drm/i915/userptr: Probe existence of backing struct pages upon creation > Jason Ekstrand requested a more efficient method than userptr+set-domain > to determine if the userptr object was backed by a complete set of pages > upon creation. To be more efficient than simply populating the userptr > using get_user_pages() (as done by the call to set-domain or execbuf), > we can walk the tree of vm_area_struct and check for gaps or vma not > backed by struct page (VM_PFNMAP). The question is how to handle > VM_MIXEDMAP which may be either struct page or pfn backed... 
> > commit 7ca21d3390eec23db99b8131ed18bc036efaba18 > Author: Chris Wilson > Date: Wed Nov 8 17:48:22 2017 + > > drm/i915/userptr: Add a flag to populate the userptr on creation > Acquiring the backing struct pages for the userptr range is not free; > the first client for userptr would insist on frequently creating userptr > objects ahead of time and not use them. For that first client, deferring > the cost of populating the userptr (calling get_user_pages()) to the > actual execbuf was a substantial improvement. However, not all clients > are the same, and most would like to validate that the userptr is valid > and backed by struct pages upon creation, so offer a > I915_USERPTR_POPULATE flag to do just that. > Note that big difference between I915_USERPTR_POPULATE and the deferred > scheme is that POPULATE is guaranteed to be synchronous, the result is > known before the ioctl returns (and the handle exposed). However, due to > system memory pressure, the object may be paged out before use, > requiring them to be paged back in on execbuf (as may always happen). > > At least with the first one I think I was skeptical, since probing at > point A makes a weak test versus userptr getting used at point B. > Populate is kind of same really when user controls the backing store. At > least these two arguments I think stand if we are trying to sell these > flags as validation. But if the idea is limited to pure preload, with no > guarantees that it keeps working by time of real use, then I guess it > may be passable. Well we've thrown this out again because there was no userspace. But if this is requested by mesa, then the _PROBE flag should be entirely sufficient. Since I don't want to hold up dg1 pciids on this it'd be nice if we could just go ahead with the dummy batch, if Ken/Jordan don't object - iris is the only umd that needs this. > Disclaimer that I haven't been following the story on why it is > desirable to abandon set domain. 
Only judging from this series, mmap > caching mode is implied from the object? Should set domain availability > be driven by the object backing store instead of outright rejection? In theory yes. In practice umd have allowed and all the api are now allocating objects with static properties, and the only reason we ever call set_domain is due to slightly outdated buffer caching schemes dating back to og libdrm from 12+ years ago. The other practical reason is that clflush is simply the slowest way to upload data of all the ones we have :-) So even when this comes back I don't expect this ioctl will come back. > > Regards, > > Tvrtko > > Suggested-by: Daniel Vetter > > Signed-off-by: Matthew Auld > > Cc: Thomas Hellström > > Cc: Maarten Lankhorst > > Cc: Jordan Justen > > Cc: Kenneth Graunke > > Cc: Jason Ekstrand > > Cc: Daniel Vetter > > Cc: Ramalingam C > > --- > > drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git
Re: [PATCH 2/3] drm/i915/uapi: reject caching ioctls for discrete
On Thu, Jul 01, 2021 at 03:36:49PM +0100, Matthew Auld wrote: > It's a noop on DG1, and in the future when need to support other devices > which let us control the coherency, then it should be an immutable > creation time property for the BO. > > Suggested-by: Daniel Vetter > Signed-off-by: Matthew Auld > Cc: Thomas Hellström > Cc: Maarten Lankhorst > Cc: Kenneth Graunke > Cc: Jason Ekstrand > Cc: Daniel Vetter > Cc: Ramalingam C For this and the next can you pls add kerneldoc for the uapi structs and then add a note there that on dgfx they're disallowed? Same for the next one. At least I'd like if we can document uapi here as we go, so that we have something to point people to when they as "what has changed? what should I do in my userspace driver?". Also please make sure these two have acks from mesa devs before you land them. Thanks, Daniel > --- > drivers/gpu/drm/i915/gem/i915_gem_domain.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c > b/drivers/gpu/drm/i915/gem/i915_gem_domain.c > index 7d1400b13429..43004bef55cb 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c > @@ -268,6 +268,9 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, > void *data, > struct drm_i915_gem_object *obj; > int err = 0; > > + if (IS_DGFX(to_i915(dev))) > + return -ENODEV; > + > rcu_read_lock(); > obj = i915_gem_object_lookup_rcu(file, args->handle); > if (!obj) { > @@ -303,6 +306,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, > void *data, > enum i915_cache_level level; > int ret = 0; > > + if (IS_DGFX(i915)) > + return -ENODEV; > + > switch (args->caching) { > case I915_CACHING_NONE: > level = I915_CACHE_NONE; > -- > 2.26.3 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v4 0/2] drm/i915: IRQ fixes
On Thu, Jul 01, 2021 at 10:58:31AM +0200, Thomas Zimmermann wrote: > Fix a bug in the usage of IRQs and cleanup references to the DRM > IRQ midlayer. > > Preferably this patchset would be merged through drm-misc-next. > > v4: > * switch IRQ code to intel_synchronize_irq() (Daniel) > v3: > * also use intel_synchronize_hardirq() from other callsite > v2: > * split patch > * also fix comment > * add intel_synchronize_hardirq() (Ville) > * update Fixes tag (Daniel) > > Thomas Zimmermann (2): > drm/i915: Use the correct IRQ during resume > drm/i915: Drop all references to DRM IRQ midlayer Both pushed to drm-intel-gt-next, thanks for your patches. -Daniel > > drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +- > drivers/gpu/drm/i915/gt/intel_ring_submission.c | 2 +- > drivers/gpu/drm/i915/i915_drv.c | 1 - > drivers/gpu/drm/i915/i915_irq.c | 5 - > 4 files changed, 2 insertions(+), 8 deletions(-) > > > base-commit: 67f5a18128770817e4218a9e496d2bf5047c51e8 > -- > 2.32.0 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] drm: mxsfb: Enable recovery on underflow
On 7/1/21 12:49 AM, Marek Vasut wrote: On 6/21/21 2:13 PM, Laurent Pinchart wrote: Hi Marek, Thank you for the patch. On Mon, Jun 21, 2021 at 12:47:01AM +0200, Marek Vasut wrote: There is some sort of corner case behavior of the controller, which could rarely be triggered at least on i.MX6SX connected to 800x480 DPI panel and i.MX8MM connected to DPI->DSI->LVDS bridged 1920x1080 panel (and likely on other setups too), where the image on the panel shifts to the right and wraps around. This happens either when the controller is enabled on boot or even later during run time. The condition does not correct itself automatically, i.e. the display image remains shifted. It seems this problem is known and is due to sporadic underflows of the LCDIF FIFO. While the LCDIF IP does have underflow/overflow IRQs, neither of the IRQs trigger and neither IRQ status bit is asserted when this condition occurs. All known revisions of the LCDIF IP have CTRL1 RECOVER_ON_UNDERFLOW bit, which is described in the reference manual since i.MX23 as " Set this bit to enable the LCDIF block to recover in the next field/frame if there was an underflow in the current field/frame. " Enable this bit to mitigate the sporadic underflows. Fixes: 45d59d704080 ("drm: Add new driver for MXSFB controller") Signed-off-by: Marek Vasut Cc: Daniel Abrecht Cc: Emil Velikov Cc: Laurent Pinchart Cc: Lucas Stach Cc: Stefan Agner --- drivers/gpu/drm/mxsfb/mxsfb_kms.c | 29 + drivers/gpu/drm/mxsfb/mxsfb_regs.h | 1 + 2 files changed, 30 insertions(+) diff --git a/drivers/gpu/drm/mxsfb/mxsfb_kms.c b/drivers/gpu/drm/mxsfb/mxsfb_kms.c index 300e7bab0f43..01e0f525360f 100644 --- a/drivers/gpu/drm/mxsfb/mxsfb_kms.c +++ b/drivers/gpu/drm/mxsfb/mxsfb_kms.c @@ -115,6 +115,35 @@ static void mxsfb_enable_controller(struct mxsfb_drm_private *mxsfb) reg |= VDCTRL4_SYNC_SIGNALS_ON; writel(reg, mxsfb->base + LCDC_VDCTRL4); + /* + * Enable recovery on underflow. 
+ * + * There is some sort of corner case behavior of the controller, + * which could rarely be triggered at least on i.MX6SX connected + * to 800x480 DPI panel and i.MX8MM connected to DPI->DSI->LVDS + * bridged 1920x1080 panel (and likely on other setups too), where + * the image on the panel shifts to the right and wraps around. + * This happens either when the controller is enabled on boot or + * even later during run time. The condition does not correct + * itself automatically, i.e. the display image remains shifted. + * + * It seems this problem is known and is due to sporadic underflows + * of the LCDIF FIFO. While the LCDIF IP does have underflow/overflow + * IRQs, neither of the IRQs trigger and neither IRQ status bit is + * asserted when this condition occurs. + * + * All known revisions of the LCDIF IP have CTRL1 RECOVER_ON_UNDERFLOW + * bit, which is described in the reference manual since i.MX23 as + * " + * Set this bit to enable the LCDIF block to recover in the next + * field/frame if there was an underflow in the current field/frame. + * " + * Enable this bit to mitigate the sporadic underflows. + */ + reg = readl(mxsfb->base + LCDC_CTRL1); + reg |= CTRL1_RECOVER_ON_UNDERFLOW; + writel(reg, mxsfb->base + LCDC_CTRL1); Looks good to me. Thanks for the detailed explanation. Reviewed-by: Laurent Pinchart So who do I CC to pick it? Robert ? There are a few more mxsfb fixes which are RB'd and would be nice if they were picked too. +CC Daniel, can those RB'd mxsfb patches be picked ?
Re: [PATCH -next] drm: vmwgfx: add header file for ttm_range_manager
On Wed, Jun 30, 2021 at 08:36:29PM +, Zack Rusin wrote: > > > > On Jun 30, 2021, at 16:32, Randy Dunlap wrote: > > > > Add a header file for ttm_range_manager function prototypes to > > eliminate build errors: > > > > ../drivers/gpu/drm/vmwgfx/vmwgfx_drv.c: In function ‘vmw_vram_manager_init’: > > ../drivers/gpu/drm/vmwgfx/vmwgfx_drv.c:678:8: error: implicit declaration > > of function ‘ttm_range_man_init’; did you mean ‘ttm_tt_mgr_init’? > > [-Werror=implicit-function-declaration] > > ret = ttm_range_man_init(_priv->bdev, TTM_PL_VRAM, false, > > ../drivers/gpu/drm/vmwgfx/vmwgfx_drv.c: In function ‘vmw_vram_manager_fini’: > > ../drivers/gpu/drm/vmwgfx/vmwgfx_drv.c:690:2: error: implicit declaration > > of function ‘ttm_range_man_fini’; did you mean ‘ttm_pool_mgr_fini’? > > [-Werror=implicit-function-declaration] > > ttm_range_man_fini(_priv->bdev, TTM_PL_VRAM); > > > > Fixes: 9c3006a4cc1b ("drm/ttm: remove available_caching") > > Fixes: a343160235f5 ("drm/vmwgfx/ttm: fix the non-THP cleanup path.") > > Signed-off-by: Randy Dunlap > > Cc: "VMware Graphics" > > Cc: Roland Scheidegger > > Cc: Zack Rusin > > Cc: dri-devel@lists.freedesktop.org > > Cc: Dave Airlie > > Cc: Christian König > > Thank you. That change has been part of drm-misc for a few weeks now: > https://cgit.freedesktop.org/drm/drm-misc/commit/?id=352a81b71ea0a3ce8f929aa60afe369d738a0c6a > I think it should be part of the next merge of drm-misc to linux-next. If not > I’ll port it to drm-misc-fixes. It should probably be in drm-misc-next-fixes. drm-misc-next is for 5.15. drm-misc-fixes was for 5.14 and will only reopen after -rc1. See https://drm.pages.freedesktop.org/maintainer-tools/committer-drm-misc.html#where-do-i-apply-my-patch Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v7 0/5] drm: address potential UAF bugs with drm_master ptrs
On Fri, Jul 02, 2021 at 12:53:53AM +0800, Desmond Cheong Zhi Xi wrote: > This patch series addresses potential use-after-free errors when > dereferencing pointers to struct drm_master. These were identified after one > such bug was caught by Syzbot in drm_getunique(): > https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f803 > > The series is broken up into five patches: > > 1. Move a call to drm_is_current_master() out from a section locked by > >mode_config.mutex in drm_mode_getconnector(). This patch does not apply > to stable. > > 2. Move a call to _drm_lease_held() out from the section locked by > >mode_config.idr_mutex in __drm_mode_object_find(). > > 3. Implement a locked version of drm_is_current_master() function that's used > within drm_auth.c. > > 4. Serialize drm_file.master by introducing a new lock that's held whenever > the value of drm_file.master changes. > > 5. Identify areas in drm_lease.c where pointers to struct drm_master are > dereferenced, and ensure that the master pointers are not freed during use. > > Changes in v6 -> v7: > - Patch 2: > Modify code alignment as suggested by the intel-gfx CI. > > Update commit message based on the changes to patch 5. > > - Patch 4: > Add patch 4 to the series. This patch adds a new lock to serialize > drm_file.master, in response to the lockdep splat by the intel-gfx CI. > > - Patch 5: > Move kerneldoc comment about protecting drm_file.master with > drm_device.master_mutex into patch 4. > > Update drm_file_get_master to use the new drm_file.master_lock instead of > drm_device.master_mutex, in response to the lockdep splat by the intel-gfx CI. So there's another one now because master->leases is protected by the mode_config.idr_mutex, and that's a bit awkward to untangle. Also I'm really surprised that there was no lockdep splat through the atomic code anywhere.
The reason seems to be that somehow CI rebooted first before it managed to run any of the kms_atomic tests, and we can only hit this when we go through the atomic kms ioctl; the legacy kms ioctls don't have that specific issue.

Anyway I think this approach doesn't look too workable, and we need something new. But first things first: Are you still on board working on this? You started with a simple patch to fix a UAF bug, now we're deep into reworking tricky locking ... If you feel like you want out I'm totally fine with that.

Anyway, I think we need to split drm_device->master_mutex up into two parts:

- One part that protects the actual access/changes, which I think for simplicity we'll just leave as the current lock. That lock is a very inner lock, since for the drm_lease.c stuff it has to nest within mode_config.idr_mutex even.

- Now the issue with checking master status/leases/whatever as an innermost lock is that you can race, it's a classic time-of-check vs time-of-use race: by the time we actually use the thing we validated we're allowed to use, we might not have access anymore. There's two reasons for that:

  * DROPMASTER ioctl could remove the master rights, which removes access rights also for all leases

  * REVOKE_LEASE ioctl can do the same but only for a specific lease

This is the thing we're trying to protect against in fbcon code, but that's very spotty protection because all the ioctls by other users aren't actually protected against this. So I think for this we need some kind of big reader lock.

Now for the implementation, there's a few things:

- I think the best option for this big reader lock would be to just use srcu. We only need to flush out all current readers when we drop master or revoke a lease, so synchronize_srcu is perfectly good enough for this purpose.

- The fbdev code would switch over to srcu in drm_master_internal_acquire() and drm_master_internal_release().
Ofc within drm_master_internal_acquire we'd still need to check master status with the normal master_mutex. - While we revamp all this we should fix the ioctl checks in drm_ioctl.c. Just noticed that drm_ioctl_permit() could and should be unexported; its last user was removed. Within drm_ioctl_kernel we'd then replace the check for drm_is_current_master with the drm_master_internal_acquire/release. - This alone does nothing, we still need to make sure that the dropmaster and revoke_lease ioctls flush out all other access before they return to userspace. We can't just call synchronize_srcu, because due to the ioctl code in drm_ioctl_kernel we're in that srcu section ourselves; we'd need to add a DRM_MASTER_FLUSH ioctl flag which we'd check only when DRM_MASTER is set, and use to call synchronize_srcu. Maybe wrap that in a drm_master_flush or so, or perhaps a drm_master_internal_release_flush. - Also maybe we should drop the _internal_ from that name. Feels a bit wrong when we're also going to use this in the ioctl handler. Thoughts? Totally silly and overkill? Cheers, Daniel > Changes in v5 -> v6: > - Patch 2: >
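As a rough userspace sketch of the "big reader lock" idea above — this is not real kernel SRCU, just a toy grace-period model built from a reader count, with hypothetical function names — the key property is that the flush only waits for readers that are already inside their critical section:

```c
#include <pthread.h>

/* Toy model of the proposed big reader lock: every ioctl enters a
 * read-side section; the DROPMASTER/REVOKE_LEASE paths call
 * master_flush() and return only once all current readers are done.
 * Real kernel code would use srcu_read_lock()/srcu_read_unlock() and
 * synchronize_srcu() instead of this mutex/condvar pair. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t no_readers = PTHREAD_COND_INITIALIZER;
static int readers;

static void master_access_begin(void)
{
	pthread_mutex_lock(&lock);
	readers++;
	pthread_mutex_unlock(&lock);
}

static void master_access_end(void)
{
	pthread_mutex_lock(&lock);
	if (--readers == 0)
		pthread_cond_broadcast(&no_readers);
	pthread_mutex_unlock(&lock);
}

/* Revoking path: blocks until every in-flight read-side section exits. */
static void master_flush(void)
{
	pthread_mutex_lock(&lock);
	while (readers > 0)
		pthread_cond_wait(&no_readers, &lock);
	pthread_mutex_unlock(&lock);
}
```

This also illustrates why an ioctl flagged DRM_MASTER_FLUSH couldn't call the flush from inside its own read-side section: with itself counted as a reader, the wait would never finish.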
Re: [Freedreno] [RFC 2/6] drm/msm/dpu: support setting up two independent DSI connectors
On 2021-07-02 10:10, Dmitry Baryshkov wrote: On 02/07/2021 18:52, abhin...@codeaurora.org wrote: On 2021-07-02 02:20, Dmitry Baryshkov wrote: On 02/07/2021 00:12, abhin...@codeaurora.org wrote: On 2021-06-09 14:17, Dmitry Baryshkov wrote: Move setting up encoders from set_encoder_mode to _dpu_kms_initialize_dsi() / _dpu_kms_initialize_displayport(). This allows us to support not only "single DSI" and "dual DSI" but also "two independent DSI" configurations. In future this would also help adding support for multiple DP connectors. Signed-off-by: Dmitry Baryshkov I will have to see Bjorn's changes to check why it was dependent on this cleanup. Is the plan to call _dpu_kms_initialize_displayport() twice? Yes. He needs to initialize several displayport interfaces. With the current code he has to map ids in the set_encoder_mode, using encoder ids (to fill up the info.h_tile_instance, which is hardcoded to 0 for DP in the current code). But still I am not able to put together where is the dependency on that series with this one. Can you please elaborate on that a little bit? It is possible to support independent outputs with the current code. I did that for DSI, Bjorn did for DP. However it results in quite an ugly code to map received encoder in set_encoder_mode back to the DSI (DP) instances to fill the h_tiles. If we drop the whole set_encoder_mode story and call dpu_encoder_setup right from the _dpu_kms_initialize_dsi() (or _dpu_kms_initialize_displayport()), supporting multiple outputs becomes an easy task. Okay got it, I think it will become more clear once he posts. 
--- drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 89 - 1 file changed, 44 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c index 1d3a4f395e74..b63e1c948ff2 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c @@ -471,30 +471,55 @@ static int _dpu_kms_initialize_dsi(struct drm_device *dev, struct dpu_kms *dpu_kms) { struct drm_encoder *encoder = NULL; + struct msm_display_info info; int i, rc = 0; if (!(priv->dsi[0] || priv->dsi[1])) return rc; - /*TODO: Support two independent DSI connectors */ - encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI); - if (IS_ERR(encoder)) { - DPU_ERROR("encoder init failed for dsi display\n"); - return PTR_ERR(encoder); - } - - priv->encoders[priv->num_encoders++] = encoder; - for (i = 0; i < ARRAY_SIZE(priv->dsi); i++) { if (!priv->dsi[i]) continue; + if (!encoder) { + encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI); + if (IS_ERR(encoder)) { + DPU_ERROR("encoder init failed for dsi display\n"); + return PTR_ERR(encoder); + } + + priv->encoders[priv->num_encoders++] = encoder; + + memset(&info, 0, sizeof(info)); + info.intf_type = encoder->encoder_type; + info.capabilities = msm_dsi_is_cmd_mode(priv->dsi[i]) ? + MSM_DISPLAY_CAP_CMD_MODE : + MSM_DISPLAY_CAP_VID_MODE; + } + rc = msm_dsi_modeset_init(priv->dsi[i], dev, encoder); if (rc) { DPU_ERROR("modeset_init failed for dsi[%d], rc = %d\n", i, rc); break; } + + info.h_tile_instance[info.num_of_h_tiles++] = i; + + if (!msm_dsi_is_dual_dsi(priv->dsi[i])) { I would like to clarify the terminology of dual_dsi in the current DSI driver before the rest of the reviews. Today IS_DUAL_DSI() means that two DSIs are driving the same display and the two DSIs are operating in master-slave mode and are being driven by the same PLL. 
Yes Usually, dual independent DSI means two DSIs driving two separate panels using two separate PLLs ( DSI0 with PLL0 and DSI1 with PLL1) Let's stop calling it 'dual'. I'd suggest to continue using what was there in the source file: 'two independent DSI'. I assume that's happening due to the following logic and both DSI PHYs are operating in STANDALONE mode: if (!IS_DUAL_DSI()) { ret = msm_dsi_host_register(msm_dsi->host, true); if (ret) return ret; msm_dsi_phy_set_usecase(msm_dsi->phy, MSM_DSI_PHY_STANDALONE); ret = msm_dsi_host_set_src_pll(msm_dsi->host, msm_dsi->phy); Yes. If we have two independent DSI outputs, we'd like them to work in STANDALONE mode. + rc = dpu_encoder_setup(dev, encoder, &info); + if (rc) + DPU_ERROR("failed to setup DPU encoder %d: rc:%d\n", + encoder->base.id, rc); + encoder = NULL; + } + } + + if (encoder) { We will hit this case only for split-DSI right? ( that is two DSIs driving the same panel ). Yes, only in this case. Even single DSI will
[Bug 212469] plymouth animation freezes during shutdown
https://bugzilla.kernel.org/show_bug.cgi?id=212469 Norbert (asteri...@gmx.de) changed: CC added: asteri...@gmx.de --- Comment #8 from Norbert (asteri...@gmx.de) --- I tried plymouth_0.9.5git20210323-0ubuntu1_amd64 without any change. -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches
On Fri, 2 Jul 2021 12:49:55 -0400 Alyssa Rosenzweig wrote: > > > ``` > > > > #define PANFROST_BO_REF_EXCLUSIVE 0x1 > > > > +#define PANFROST_BO_REF_NO_IMPLICIT_DEP 0x2 > > > ``` > > > > > > This seems logically backwards. NO_IMPLICIT_DEP makes sense if we're > > > trying to keep backwards compatibility, but here you're crafting a new > > > interface totally from scratch. If anything, isn't BO_REF_IMPLICIT_DEP > > > the flag you'd want? > > > > AFAICT, all other drivers make the no-implicit-dep an opt-in, and I > > didn't want to do things differently in panfrost. But if that's really > > an issue, I can make it an opt-out. > > I don't have strong feelings either way. I was just under the > impression other drivers did this for b/w compat reasons which don't > apply here. Okay, I think I'll keep it like that unless there's a strong reason to make no-implicit-dep the default. It's safer to oversync than to skip the synchronization, so it does feel like something the user should explicitly enable. > > > > Hmm. I'm not /opposed/ and I know kbase uses strides but it seems like > > > somewhat unwarranted complexity, and there is a combinatoric explosion > > > here (if jobs, bo refs, and syncobj refs use 3 different versions, as > > > this encoding permits... as opposed to just specifying a UABI version or > > > something like that) > > > > Sounds like a good idea. I'll add a version field and map that > > to a tuple. > > Cc Steven, does this make sense? I have this approach working, and I must admit I prefer it to the per-object stride field passed to the submit struct.
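The opt-out semantics being agreed on above can be stated in a couple of lines. The flag values come from the quoted patch; the helper function is a hypothetical illustration, not code from the series: implicit dependencies (fences attached to the BO) are honored by default, and userspace must set the flag per BO reference to skip them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as quoted in the patch under discussion. */
#define PANFROST_BO_REF_EXCLUSIVE       0x1
#define PANFROST_BO_REF_NO_IMPLICIT_DEP 0x2

/* Hypothetical helper: implicit dependencies apply unless the
 * submitter explicitly opts out for this BO reference. */
static bool bo_needs_implicit_dep(uint32_t flags)
{
	return !(flags & PANFROST_BO_REF_NO_IMPLICIT_DEP);
}
```

This is the "safer to oversync" default: forgetting the flag costs performance, not correctness.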
Re: [PATCH] drm/stm: ltdc: improve pm_runtime to stop clocks
On 7/2/21 11:23 AM, Raphael Gallais-Pou wrote: Hello Marek, Hi, Sorry for the late answer. No worries, take your time On 6/30/21 2:35 AM, Marek Vasut wrote: On 6/29/21 1:58 PM, Raphael GALLAIS-POU - foss wrote: [...] +++ b/drivers/gpu/drm/stm/ltdc.c @@ -425,10 +425,17 @@ static void ltdc_crtc_atomic_enable(struct drm_crtc *crtc, { struct ltdc_device *ldev = crtc_to_ltdc(crtc); struct drm_device *ddev = crtc->dev; + int ret; DRM_DEBUG_DRIVER("\n"); - pm_runtime_get_sync(ddev->dev); + if (!pm_runtime_active(ddev->dev)) { + ret = pm_runtime_get_sync(ddev->dev); All these if (!pm_runtime_active()) then pm_runtime_get_sync() calls look like a workaround for some larger issue. Shouldn't the pm_runtime do some refcounting on its own, so this shouldn't be needed? This problem purely comes from the driver internals, so I don't think it is a workaround. Because the "ltdc_crtc_mode_set_nofb" function does not have any "symmetrical" call, such as the enable/disable functions, there were two calls to pm_runtime_get_sync against one call to pm_runtime_put_sync. This imbalance resulted in the LTDC clocks being always enabled, even when the peripheral was disabled. This could be seen in the clk_summary as explained in the patch summary among other things. By doing so, we first check if the clocks are not already activated, and in that case we call pm_runtime_get_sync. I just have to wonder, how come other drivers don't need these if (!pm_runtime_active()) pm_runtime_get_sync() conditions. I think they just get/put the runtime PM within a call itself, not across function calls. Maybe that could be the right fix here too ?
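The imbalance being described is easy to see with a toy usage counter (hypothetical names; the real code uses the kernel's runtime-PM usage count): two gets against one put leave the count above zero, so the clocks never gate.

```c
/* Minimal stand-in for the runtime-PM usage counter. */
static int usage_count;

static void fake_pm_get(void) { usage_count++; }  /* models pm_runtime_get_sync() */
static void fake_pm_put(void) { usage_count--; }  /* models pm_runtime_put_sync() */
static int clocks_enabled(void) { return usage_count > 0; }

/* Unbalanced sequence from the bug report: mode_set_nofb and
 * atomic_enable both take a reference, but only atomic_disable
 * drops one, so the count ends at 1 and the clocks stay on. */
static void buggy_sequence(void)
{
	fake_pm_get();  /* ltdc_crtc_mode_set_nofb() */
	fake_pm_get();  /* ltdc_crtc_atomic_enable() */
	fake_pm_put();  /* ltdc_crtc_atomic_disable() */
}
```

Marek's suggested fix amounts to making every function that calls get also call put before returning, so each sequence is balanced and no `pm_runtime_active()` check is needed.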
Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches
On Fri, 2 Jul 2021 12:49:55 -0400 Alyssa Rosenzweig wrote: > > > Why is there padding instead of putting point first? > > > > We can move the point field first, but we need to keep the explicit > > padding: the struct has to be 64bit aligned because of the __u64 field > > (which the compiler takes care of) but if we don't have an explicit > > padding, the unused 32bits are undefined, which might cause trouble if > > we extend the struct at some point, since we sort of expect that old > > userspace keep this unused 32bit slot to 0, while new users set > > non-zero values if they have to. > > Makes sense. Reordering still probably makes sense. Actually, I can't re-order if I want the new in_syncs parser to work with the old ioctl(), which you and Steven asked me to do :-).
Re: [PATCH] drm/vc4: dsi: Only register our component once a DSI device is attached
Hi Laurent On Fri, 2 Jul 2021 at 17:47, Laurent Pinchart wrote: > > Hi Dave, > > On Mon, Jun 21, 2021 at 04:59:51PM +0300, Laurent Pinchart wrote: > > On Mon, Jun 21, 2021 at 04:09:05PM +0300, Laurent Pinchart wrote: > > > On Mon, Jun 21, 2021 at 03:56:16PM +0300, Laurent Pinchart wrote: > > > > On Mon, Jun 21, 2021 at 12:49:14PM +0100, Dave Stevenson wrote: > > > > > On Sun, 20 Jun 2021 at 23:49, Laurent Pinchart wrote: > > > > > > On Sun, Jun 20, 2021 at 09:42:27PM +0300, Laurent Pinchart wrote: > > > > > > > On Sun, Jun 20, 2021 at 03:29:03PM +0100, Dave Stevenson wrote: > > > > > > > > On Sun, 20 Jun 2021 at 04:26, Laurent Pinchart wrote: > > > > > > > > > > > > > > > > > > Hi Maxime, > > > > > > > > > > > > > > > > > > I'm testing this, and I'm afraid it causes an issue with all > > > > > > > > > the > > > > > > > > > I2C-controlled bridges. I'm focussing on the newly merged > > > > > > > > > ti-sn65dsi83 > > > > > > > > > driver at the moment, but other are affected the same way. > > > > > > > > > > > > > > > > > > With this patch, the DSI component is only added when the DSI > > > > > > > > > device is > > > > > > > > > attached to the host with mipi_dsi_attach(). In the > > > > > > > > > ti-sn65dsi83 driver, > > > > > > > > > this happens in the bridge attach callback, which is called > > > > > > > > > when the > > > > > > > > > bridge is attached by a call to drm_bridge_attach() in > > > > > > > > > vc4_dsi_bind(). > > > > > > > > > This creates a circular dependency, and the DRM/KMS device is > > > > > > > > > never > > > > > > > > > created. > > > > > > > > > > > > > > > > > > How should this be solved ? Dave, I think you have shown an > > > > > > > > > interest in > > > > > > > > > the sn65dsi83 recently, any help would be appreciated. 
On a > > > > > > > > > side note, > > > > > > > > > I've tested the ti-sn65dsi83 driver on a v5.10 RPi kernel, > > > > > > > > > without much > > > > > > > > > success (on top of commit e1499baa0b0c I get a very weird > > > > > > > > > frame rate - > > > > > > > > > 147 fps of 99 fps instead of 60 fps - and nothing on the > > > > > > > > > screen, and on > > > > > > > > > top of the latest v5.10 RPi branch, I get lock-related > > > > > > > > > warnings at every > > > > > > > > > page flip), which is why I tried v5.12 and noticed this > > > > > > > > > patch. Is it > > > > > > > > > worth trying to bring up the display on the v5.10 RPi kernel > > > > > > > > > in parallel > > > > > > > > > to fixing the issue introduced in this patch, or is DSI known > > > > > > > > > to be > > > > > > > > > broken there ? > > > > > > > > > > > > > > > > I've been looking at SN65DSI83/4, but as I don't have any > > > > > > > > hardware > > > > > > > > I've largely been suggesting things to try to those on the > > > > > > > > forums who > > > > > > > > do [1]. > > > > > > > > > > > > > > > > My branch at > > > > > > > > https://github.com/6by9/linux/tree/rpi-5.10.y-sn65dsi8x-marek > > > > > > > > is the latest one I've worked on. It's rpi-5.10.y with Marek's > > > > > > > > driver > > > > > > > > cherry-picked, and an overlay and simple-panel definition by > > > > > > > > others. > > > > > > > > It also has a rework for vc4_dsi to use pm_runtime, instead of > > > > > > > > breaking up the DSI bridge chain (which is flawed as it never > > > > > > > > calls > > > > > > > > the bridge mode_set or mode_valid functions which sn65dsi83 > > > > > > > > relies > > > > > > > > on). > > > > I've looked at that, and I'm afraid it doesn't go in the right > > direction. The drm_encoder.crtc field is deprecated and documented as > > only meaningful for non-atomic drivers. 
You're not introducing its > > usage, but moving the configuration code from .enable() to the runtime > > PM resume handler will make it impossible to fix this. The driver should > > instead move to the .atomic_enable() function. If you need > > enable/pre_enable in the DSI encoder, then you should turn it into a > > drm_bridge. > > Is this something you're looking at by any chance ? I'm testing the > ti-sn65dsi83 driver with VC4. I've spent a couple of hours debugging, > only to realise that the vc4_dsi driver (before the rework you mention > above) doesn't call .mode_set() on the bridges... Applying my sn65dsi83 > series that removes .mode_set() didn't help much as vc4_dsi doesn't call > the atomic operations either :-) I'll test your branch now. This is one of the reasons for my email earlier today - thank you for your reply. The current mainline vc4_dsi driver deliberately breaks the bridge chain so that it gets called before the panel/bridge pre_enable and can power everything up, therefore pre_enable can call host_transfer to configure the panel/bridge over the DSI interface. However we've both noted that it doesn't forward on the mode_set and mode_valid calls, and my investigations say that it doesn't have enough information to make those calls. My branch returns the chain to normal, and
Re: [Freedreno] [RFC 2/6] drm/msm/dpu: support setting up two independent DSI connectors
On 02/07/2021 18:52, abhin...@codeaurora.org wrote: On 2021-07-02 02:20, Dmitry Baryshkov wrote: On 02/07/2021 00:12, abhin...@codeaurora.org wrote: On 2021-06-09 14:17, Dmitry Baryshkov wrote: Move setting up encoders from set_encoder_mode to _dpu_kms_initialize_dsi() / _dpu_kms_initialize_displayport(). This allows us to support not only "single DSI" and "dual DSI" but also "two independent DSI" configurations. In future this would also help adding support for multiple DP connectors. Signed-off-by: Dmitry Baryshkov I will have to see Bjorn's changes to check why it was dependent on this cleanup. Is the plan to call _dpu_kms_initialize_displayport() twice? Yes. He needs to initialize several displayport interfaces. With the current code he has to map ids in the set_encoder_mode, using encoder ids (to fill up the info.h_tile_instance, which is hardcoded to 0 for DP in the current code). But still I am not able to put together where is the dependency on that series with this one. Can you please elaborate on that a little bit? It is possible to support independent outputs with the current code. I did that for DSI, Bjorn did for DP. However it results in quite an ugly code to map received encoder in set_encoder_mode back to the DSI (DP) instances to fill the h_tiles. If we drop the whole set_encoder_mode story and call dpu_encoder_setup right from the _dpu_kms_initialize_dsi() (or _dpu_kms_initialize_displayport()), supporting multiple outputs becomes an easy task. Okay got it, I think it will become more clear once he posts. 
--- drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 89 - 1 file changed, 44 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c index 1d3a4f395e74..b63e1c948ff2 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c @@ -471,30 +471,55 @@ static int _dpu_kms_initialize_dsi(struct drm_device *dev, struct dpu_kms *dpu_kms) { struct drm_encoder *encoder = NULL; + struct msm_display_info info; int i, rc = 0; if (!(priv->dsi[0] || priv->dsi[1])) return rc; - /*TODO: Support two independent DSI connectors */ - encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI); - if (IS_ERR(encoder)) { - DPU_ERROR("encoder init failed for dsi display\n"); - return PTR_ERR(encoder); - } - - priv->encoders[priv->num_encoders++] = encoder; - for (i = 0; i < ARRAY_SIZE(priv->dsi); i++) { if (!priv->dsi[i]) continue; + if (!encoder) { + encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI); + if (IS_ERR(encoder)) { + DPU_ERROR("encoder init failed for dsi display\n"); + return PTR_ERR(encoder); + } + + priv->encoders[priv->num_encoders++] = encoder; + + memset(&info, 0, sizeof(info)); + info.intf_type = encoder->encoder_type; + info.capabilities = msm_dsi_is_cmd_mode(priv->dsi[i]) ? + MSM_DISPLAY_CAP_CMD_MODE : + MSM_DISPLAY_CAP_VID_MODE; + } + rc = msm_dsi_modeset_init(priv->dsi[i], dev, encoder); if (rc) { DPU_ERROR("modeset_init failed for dsi[%d], rc = %d\n", i, rc); break; } + + info.h_tile_instance[info.num_of_h_tiles++] = i; + + if (!msm_dsi_is_dual_dsi(priv->dsi[i])) { I would like to clarify the terminology of dual_dsi in the current DSI driver before the rest of the reviews. Today IS_DUAL_DSI() means that two DSIs are driving the same display and the two DSIs are operating in master-slave mode and are being driven by the same PLL. 
Yes Usually, dual independent DSI means two DSIs driving two separate panels using two separate PLLs ( DSI0 with PLL0 and DSI1 with PLL1) Let's stop calling it 'dual'. I'd suggest to continue using what was there in the source file: 'two independent DSI'. I assume that's happening due to the following logic and both DSI PHYs are operating in STANDALONE mode: if (!IS_DUAL_DSI()) { ret = msm_dsi_host_register(msm_dsi->host, true); if (ret) return ret; msm_dsi_phy_set_usecase(msm_dsi->phy, MSM_DSI_PHY_STANDALONE); ret = msm_dsi_host_set_src_pll(msm_dsi->host, msm_dsi->phy); Yes. If we have two independent DSI outputs, we'd like them to work in STANDALONE mode. + rc = dpu_encoder_setup(dev, encoder, &info); + if (rc) + DPU_ERROR("failed to setup DPU encoder %d: rc:%d\n", + encoder->base.id, rc); + encoder = NULL; + } + } + + if (encoder) { We will hit this case only for split-DSI right? ( that is two DSIs driving the same panel ). Yes, only in this case. Even single DSI will be created in the above loop now. So this looks a bit confusing
Re: [PATCH 1/2] drm/i915/gem: Correct the locking and pin pattern for dma-buf
On Thu, Jul 1, 2021 at 4:24 PM Michael J. Ruhl wrote: > > From: Thomas Hellström > > If our exported dma-bufs are imported by another instance of our driver, > that instance will typically have the imported dma-bufs locked during > dma_buf_map_attachment(). But the exporter also locks the same reservation > object in the map_dma_buf() callback, which leads to recursive locking. > > So taking the lock inside _pin_pages_unlocked() is incorrect. > > Additionally, the current pinning code path is contrary to the defined > way that pinning should occur. > > Remove the explicit pin/unpin from the map/umap functions and move them > to the attach/detach allowing correct locking to occur, and to match > the static dma-buf drm_prime pattern. > > Add a live selftest to exercise both dynamic and non-dynamic > exports. > > v2: > - Extend the selftest with a fake dynamic importer. > - Provide real pin and unpin callbacks to not abuse the interface. > v3: (ruhl) > - Remove the dynamic export support and move the pinning into the > attach/detach path. > > Reported-by: Michael J. Ruhl > Signed-off-by: Thomas Hellström > Signed-off-by: Michael J. Ruhl CI splat is because I got the locking rules wrong, I thought ->attach/detach are called under the dma_resv_lock, because when we used the old dma_buf->lock those calls were protected by that lock under the same critical section as adding/removing from the list. But we changed that in f45f57cce584 ("dma-buf: stop using the dmabuf->lock so much v2") 15fd552d186c ("dma-buf: change DMA-buf locking convention v3") Because keeping dma_resv_lock over ->attach/detach would go boom on all the ttm drivers, which pin/unpin the buffer in there. Iow we need the unlocked version there, but also having this split up is a bit awkward and might be good to patch up so that it's atomic again. Would mean updating a bunch of drivers. Christian, any thoughts? Mike, for now I'd just keep using the _unlocked variants and we should be fine. 
-Daniel > --- > drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c| 46 ++-- > .../drm/i915/gem/selftests/i915_gem_dmabuf.c | 111 +- > 2 files changed, 143 insertions(+), 14 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c > b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c > index 616c3a2f1baf..00338c8d3739 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c > @@ -12,6 +12,8 @@ > #include "i915_gem_object.h" > #include "i915_scatterlist.h" > > +I915_SELFTEST_DECLARE(static bool force_different_devices;) > + > static struct drm_i915_gem_object *dma_buf_to_obj(struct dma_buf *buf) > { > return to_intel_bo(buf->priv); > @@ -25,15 +27,11 @@ static struct sg_table *i915_gem_map_dma_buf(struct > dma_buf_attachment *attachme > struct scatterlist *src, *dst; > int ret, i; > > - ret = i915_gem_object_pin_pages_unlocked(obj); > - if (ret) > - goto err; > - > /* Copy sg so that we make an independent mapping */ > st = kmalloc(sizeof(struct sg_table), GFP_KERNEL); > if (st == NULL) { > ret = -ENOMEM; > - goto err_unpin_pages; > + goto err; > } > > ret = sg_alloc_table(st, obj->mm.pages->nents, GFP_KERNEL); > @@ -58,8 +56,6 @@ static struct sg_table *i915_gem_map_dma_buf(struct > dma_buf_attachment *attachme > sg_free_table(st); > err_free: > kfree(st); > -err_unpin_pages: > - i915_gem_object_unpin_pages(obj); > err: > return ERR_PTR(ret); > } > @@ -68,13 +64,9 @@ static void i915_gem_unmap_dma_buf(struct > dma_buf_attachment *attachment, >struct sg_table *sg, >enum dma_data_direction dir) > { > - struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf); > - > dma_unmap_sgtable(attachment->dev, sg, dir, DMA_ATTR_SKIP_CPU_SYNC); > sg_free_table(sg); > kfree(sg); > - > - i915_gem_object_unpin_pages(obj); > } > > static int i915_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct dma_buf_map > *map) > @@ -168,7 +160,32 @@ static int i915_gem_end_cpu_access(struct dma_buf > *dma_buf, enum dma_data_direct > 
return err; > } > > +/** > + * i915_gem_dmabuf_attach - Do any extra attach work necessary > + * @dmabuf: imported dma-buf > + * @attach: new attach to do work on > + * > + */ > +static int i915_gem_dmabuf_attach(struct dma_buf *dmabuf, > + struct dma_buf_attachment *attach) > +{ > + struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf); > + > + assert_object_held(obj); > + return i915_gem_object_pin_pages(obj); > +} > + > +static void i915_gem_dmabuf_detach(struct dma_buf *dmabuf, > + struct dma_buf_attachment *attach) > +{ > + struct drm_i915_gem_object *obj =
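The recursive-locking hazard in the commit message above can be modeled in a few lines of userspace C. This is a toy sketch only — a pthread mutex stands in for the shared dma_resv lock, and the function names are hypothetical: the importer holds the reservation lock across the map call, so if the exporter's map callback tries to take the same (non-recursive) lock, a trylock reports it is already held, which with a plain lock would be a deadlock.

```c
#include <errno.h>
#include <pthread.h>

/* Stand-in for the reservation object shared by importer and exporter. */
static pthread_mutex_t resv = PTHREAD_MUTEX_INITIALIZER;

/* Exporter's map callback: must not take the reservation lock itself,
 * because the caller may already hold it. */
static int exporter_map(void)
{
	if (pthread_mutex_trylock(&resv) != 0)
		return -EBUSY;  /* a blocking lock here would deadlock */
	pthread_mutex_unlock(&resv);
	return 0;
}

/* Importer side of dma_buf_map_attachment(): holds the lock across the
 * exporter callback, as described in the commit message. */
static int importer_map_attachment(void)
{
	int ret;

	pthread_mutex_lock(&resv);
	ret = exporter_map();
	pthread_mutex_unlock(&resv);
	return ret;
}
```

Moving the pin to attach/detach, as the patch does, keeps the exporter's map callback lock-free, which is exactly what sidesteps this pattern.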
Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches
> > What is handle? What is point? > > Handle is a syncobj handle, point is the point in a syncobj timeline. > I'll document those fields. OK. > > Why is there padding instead of putting point first? > > We can move the point field first, but we need to keep the explicit > padding: the struct has to be 64bit aligned because of the __u64 field > (which the compiler takes care of) but if we don't have an explicit > padding, the unused 32bits are undefined, which might cause trouble if > we extend the struct at some point, since we sort of expect that old > userspace keeps this unused 32bit slot at 0, while new users set > non-zero values if they have to. Makes sense. Reordering still probably makes sense. > > ``` > > > #define PANFROST_BO_REF_EXCLUSIVE 0x1 > > > +#define PANFROST_BO_REF_NO_IMPLICIT_DEP 0x2 > > ``` > > > > This seems logically backwards. NO_IMPLICIT_DEP makes sense if we're > > trying to keep backwards compatibility, but here you're crafting a new > > interface totally from scratch. If anything, isn't BO_REF_IMPLICIT_DEP > > the flag you'd want? > > AFAICT, all other drivers make the no-implicit-dep an opt-in, and I > didn't want to do things differently in panfrost. But if that's really > an issue, I can make it an opt-out. I don't have strong feelings either way. I was just under the > impression other drivers did this for b/w compat reasons which don't > apply here. > > Hmm. I'm not /opposed/ and I know kbase uses strides but it seems like > somewhat unwarranted complexity, and there is a combinatoric explosion > here (if jobs, bo refs, and syncobj refs use 3 different versions, as > this encoding permits... as opposed to just specifying a UABI version or > something like that) > > Sounds like a good idea. I'll add a version field and map that > to a tuple. Cc Steven, does this make sense? > > > + /** > > > + * If the submission fails, this encodes the index of the job > > > + * failed. 
> > > + */ > > > + __u32 fail_idx; > > ``` > > > > What if multiple jobs fail? > > We stop at the first failure. Note that it's not an execution failure, > but a submission failure (AKA, userspace passed wrong params, like > invalid BO or synobj handles). I see, ok.
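The padding point above can be checked directly. A hedged sketch follows — the struct names are hypothetical, modeled on the syncobj item being discussed, and it assumes an ABI where a 64-bit integer is 8-byte aligned (e.g. x86-64): with a `__u64` member after a lone `__u32`, the compiler inserts 4 bytes of padding either way, so spelling the hole out only makes it a named field that userspace can be required to zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Implicit padding: 4 *undefined* bytes sit between handle and point,
 * so old userspace gives no guarantee about their contents. */
struct syncobj_ref_implicit {
	uint32_t handle;
	uint64_t point;
};

/* Explicit padding: identical layout, but the hole is a named field
 * that the UABI can require to be zero, keeping those 32 bits
 * available for future extensions. */
struct syncobj_ref_explicit {
	uint32_t handle;
	uint32_t pad;    /* must be 0 */
	uint64_t point;
};
```

Both structs occupy 16 bytes with `point` at offset 8 under the stated alignment assumption; the only difference is whether the middle 4 bytes have defined contents.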
Re: [PATCH] drm/vc4: dsi: Only register our component once a DSI device is attached
Hi Dave, On Mon, Jun 21, 2021 at 04:59:51PM +0300, Laurent Pinchart wrote: > On Mon, Jun 21, 2021 at 04:09:05PM +0300, Laurent Pinchart wrote: > > On Mon, Jun 21, 2021 at 03:56:16PM +0300, Laurent Pinchart wrote: > > > On Mon, Jun 21, 2021 at 12:49:14PM +0100, Dave Stevenson wrote: > > > > On Sun, 20 Jun 2021 at 23:49, Laurent Pinchart wrote: > > > > > On Sun, Jun 20, 2021 at 09:42:27PM +0300, Laurent Pinchart wrote: > > > > > > On Sun, Jun 20, 2021 at 03:29:03PM +0100, Dave Stevenson wrote: > > > > > > > On Sun, 20 Jun 2021 at 04:26, Laurent Pinchart wrote: > > > > > > > > > > > > > > > > Hi Maxime, > > > > > > > > > > > > > > > > I'm testing this, and I'm afraid it causes an issue with all the > > > > > > > > I2C-controlled bridges. I'm focussing on the newly merged > > > > > > > > ti-sn65dsi83 > > > > > > > > driver at the moment, but others are affected the same way. > > > > > > > > > > > > > > > > With this patch, the DSI component is only added when the DSI > > > > > > > > device is > > > > > > > > attached to the host with mipi_dsi_attach(). In the > > > > > > > > ti-sn65dsi83 driver, > > > > > > > > this happens in the bridge attach callback, which is called > > > > > > > > when the > > > > > > > > bridge is attached by a call to drm_bridge_attach() in > > > > > > > > vc4_dsi_bind(). > > > > > > > > This creates a circular dependency, and the DRM/KMS device is > > > > > > > > never > > > > > > > > created. > > > > > > > > > > > > > > > > How should this be solved ? Dave, I think you have shown an > > > > > > > > interest in > > > > > > > > the sn65dsi83 recently, any help would be appreciated. 
On a > > > > > > > > side note, > > > > > > > > I've tested the ti-sn65dsi83 driver on a v5.10 RPi kernel, > > > > > > > > without much > > > > > > > > success (on top of commit e1499baa0b0c I get a very weird frame > > > > > > > > rate - > > > > > > > > 147 fps of 99 fps instead of 60 fps - and nothing on the > > > > > > > > screen, and on > > > > > > > > top of the latest v5.10 RPi branch, I get lock-related warnings > > > > > > > > at every > > > > > > > > page flip), which is why I tried v5.12 and noticed this patch. > > > > > > > > Is it > > > > > > > > worth trying to bring up the display on the v5.10 RPi kernel in > > > > > > > > parallel > > > > > > > > to fixing the issue introduced in this patch, or is DSI known > > > > > > > > to be > > > > > > > > broken there ? > > > > > > > > > > > > > > I've been looking at SN65DSI83/4, but as I don't have any hardware > > > > > > > I've largely been suggesting things to try to those on the forums > > > > > > > who > > > > > > > do [1]. > > > > > > > > > > > > > > My branch at > > > > > > > https://github.com/6by9/linux/tree/rpi-5.10.y-sn65dsi8x-marek > > > > > > > is the latest one I've worked on. It's rpi-5.10.y with Marek's > > > > > > > driver > > > > > > > cherry-picked, and an overlay and simple-panel definition by > > > > > > > others. > > > > > > > It also has a rework for vc4_dsi to use pm_runtime, instead of > > > > > > > breaking up the DSI bridge chain (which is flawed as it never > > > > > > > calls > > > > > > > the bridge mode_set or mode_valid functions which sn65dsi83 relies > > > > > > > on). > > I've looked at that, and I'm afraid it doesn't go in the right > direction. The drm_encoder.crtc field is deprecated and documented as > only meaningful for non-atomic drivers. You're not introducing its > usage, but moving the configuration code from .enable() to the runtime > PM resume handler will make it impossible to fix this. The driver should > instead move to the .atomic_enable() function. 
If you need > enable/pre_enable in the DSI encoder, then you should turn it into a > drm_bridge. Is this something you're looking at by any chance ? I'm testing the ti-sn65dsi83 driver with VC4. I've spent a couple of hours debugging, only to realise that the vc4_dsi driver (before the rework you mention above) doesn't call .mode_set() on the bridges... Applying my sn65dsi83 series that removes .mode_set() didn't help much as vc4_dsi doesn't call the atomic operations either :-) I'll test your branch now. > > > > > > > I ran it on Friday in the lab and encountered an issue with > > > > > > > vc4_dsi > > > > > > > should vc4_dsi_encoder_mode_fixup wish for a divider of 7 > > > > > > > (required > > > > > > > for this 800x1280 panel over 4 lanes) where it resulted in an > > > > > > > invalid > > > > > > > mode configuration. That resulted in patch [2] which then gave me > > > > > > > sensible numbers. > > I have that commit in my branch, but still get 125 fps instead of 60 fps > with kmstest --flip (after reverting commit 1c3834201272 "drm/vc4: > Increase the core clock based on HVS load"). I'm not sure if [2] is the > cause of this, but there seems to be an improvement: in my previous > tests, the mode was fixed up every time I would start the application, > with the timings getting more and more bizarre at every run :-) > > > > >
Re: Questions over DSI within DRM.
Hi Dave, (Expanding the CC list a bit) On Fri, Jul 02, 2021 at 12:03:31PM +0100, Dave Stevenson wrote: > Hi All > > I'm trying to get DSI devices working reliably on the Raspberry Pi, > but I'm hitting a number of places where it isn't clear as to the > expected behaviour within DRM. Not a surprise. I dread reading the rest of this e-mail though :-) > Power on state. Many devices want the DSI clock and/or data lanes in > LP-11 state when they are powered up. When they are powered up, or when they are enabled ? > With the normal calling sequence of: > - panel/bridge pre_enable calls from connector towards the encoder. > - encoder enable which also enables video. > - panel/bridge enable calls from encoder to connector. > there is no point at which the DSI tx is initialised but not > transmitting video. What DSI states are expected to be adopted at each > point? That's undefined I'm afraid, and it should be documented. The upside is that you can propose the behaviour that you need :-) > On a similar theme, some devices want the clock lane in HS mode early > so they can use it in place of an external oscillator, but the data > lanes still in LP-11. There appears to be no way for the > display/bridge to signal this requirement or it be achieved. You're right. A long time ago, the omapdrm driver had an internal infrastructure that didn't use drm_bridge or drm_panel and instead required omapdrm-specific drivers for those components. It used to model the display pipeline in a different way than drm_bridge, with the sink explicitly setting the source state. A DSI sink could thus control its enable sequence, interleaving programming of the sink with control of the source. Migrating omapdrm to the drm_bridge model took a really large effort, which makes me believe that transitioning the whole subsystem to sink-controlled sources would be close to impossible. We could add DSI-specific operations, or add another enable bridge operation (post_pre_enable ? :-D). 
Neither would scale, but it may be enough. > host_transfer calls can supposedly be made at any time, however unless > MIPI_DSI_MSG_USE_LPM is set in the message then we're meant to send it > in high speed mode. If this is before a mode has been set, what > defines the link frequency parameters at this point? Adopting a random > default sounds like a good way to get undefined behaviour. > > DSI burst mode needs to set the DSI link frequency independently of > the display mode. How is that meant to be configured? I would have > expected it to come from DT due to link frequency often being chosen > based on EMC restrictions, but I don't see such a thing in any > binding. Undefined too. DSI support was added to DRM without any design effort, it's more a hack than a real solution. The issue with devices that can be controlled over both DSI and I2C is completely unhandled. So far nobody has really cared about implementing DSI right as far as I can tell. > As a follow on, bridge devices can support burst mode (eg TI's > SN65DSI83 that's just been merged), so it needs to know the desired > panel timings for the output side of the bridge, but the DSI link > timings to set up the bridge's PLL. What's the correct way for > signalling that? drm_crtc_state->adjusted_mode vs > drm_crtc_state->mode? Except mode is userspace's request, not what has > been validated/updated by the panel/bridge. adjusted_mode is also a bit of a hack, it solves very specific issues, and its design assumes a single encoder in the chain with no extra bridges. We should instead add modes to the bridge state, and negotiate modes along the pipeline the same way we negotiate formats. > vc4 has constraints that the DSI host interface is fed off an integer > divider from a typically 3GHz clock, so the host interface needs to > signal that burst mode is in use even if the panel/bridge doesn't need > to run in burst mode. 
(This does mean that displays that require a > very precise link frequency can not be supported). > It currently updates the adjusted_mode via drm_encoder_helper_funcs > mode_fixup, but is that the correct thing to do, or is there a better > solution? > I'd have expected the DSI tx to be responsible for configuring burst > mode parameters anyway, so the mechanism required would seem to be > just the normal approach for adopting burst mode if that is defined. > > Some DSI host interfaces are implemented as bridges, others are > encoders. Pro's and con's of each? I suspect I'm just missing the > history here. It's indeed history. drm_encoder can't go away as it has been erroneously exposed to userspace, but going forward, everything should be a bridge. The drm_encoder will still be required, but should just be a dummy, representing the chain of bridges. > When it comes to the MIPI_DSI_MODE_* flags, which ones are mutually > exclusive, or are assumed based on others? Does a burst mode DSI sink > set both MIPI_DSI_MODE_VIDEO and MIPI_DSI_MODE_VIDEO_BURST, or just > the
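To make the ordering question discussed at the start of this thread concrete, here is a small standalone sketch of the atomic enable sequence for a chain encoder -> bridge 0 -> ... -> bridge N-1 -> connector. This is an illustration only, not the actual DRM core implementation (the real code lives in the drm_atomic_helper / drm_bridge helpers):

```c
/*
 * Sketch of the enable ordering: pre_enable runs from the bridge
 * closest to the connector back towards the encoder, then the encoder
 * enables (and starts video), then enable runs from the encoder
 * towards the connector. Note there is no defined point where the DSI
 * host is initialised (e.g. LP-11) but not yet transmitting video.
 */
#include <assert.h>
#include <stddef.h>

#define NUM_BRIDGES 2

enum phase { PRE_ENABLE, ENCODER_ENABLE, ENABLE };

static enum phase call_log[2 * NUM_BRIDGES + 1];
static size_t log_idx;

static void record(enum phase p)
{
	call_log[log_idx++] = p;
}

static void atomic_enable_chain(void)
{
	int i;

	/* pre_enable: connector side towards the encoder. */
	for (i = NUM_BRIDGES - 1; i >= 0; i--)
		record(PRE_ENABLE);

	/* encoder enable: the DSI host starts sending video here. */
	record(ENCODER_ENABLE);

	/* enable: encoder towards the connector, video already running. */
	for (i = 0; i < NUM_BRIDGES; i++)
		record(ENABLE);
}
```

A panel that needs the link up-but-idle before video starts has no hook between `PRE_ENABLE` and `ENCODER_ENABLE` in this model, which is exactly the gap being discussed.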
Re: [PATCH v3 4/7] drm/panfrost: Add the ability to create submit queues
On Fri, 2 Jul 2021 17:49:10 +0200 Boris Brezillon wrote: > On Fri, 2 Jul 2021 16:05:30 +0100 > Steven Price wrote: > > > On 02/07/2021 15:32, Boris Brezillon wrote: > > > Needed to keep VkQueues isolated from each other. > > > > > > v3: > > > * Limit the number of submitqueue per context to 16 > > > * Fix a deadlock > > > > > > Signed-off-by: Boris Brezillon > > > > 16 ought to be enough for anyone ;) > > > > Reviewed-by: Steven Price > > Oops, forgot to change the submitqueue_get() prototype. Will address > that in v4. I meant submitqueue_create().
Re: [Freedreno] [RFC 2/6] drm/msm/dpu: support setting up two independent DSI connectors
On 2021-07-02 02:20, Dmitry Baryshkov wrote: On 02/07/2021 00:12, abhin...@codeaurora.org wrote: On 2021-06-09 14:17, Dmitry Baryshkov wrote: Move setting up encoders from set_encoder_mode to _dpu_kms_initialize_dsi() / _dpu_kms_initialize_displayport(). This allows us to support not only "single DSI" and "dual DSI" but also "two independent DSI" configurations. In future this would also help adding support for multiple DP connectors. Signed-off-by: Dmitry Baryshkov I will have to see Bjorn's changes to check why it was dependent on this cleanup. Is the plan to call _dpu_kms_initialize_displayport() twice? Yes. He needs to initialize several displayport interfaces. With the current code he has to map ids in the set_encoder_mode, using encoder ids (to fill up the info.h_tile_instance, which is hardcoded to 0 for DP in the current code). But still I am not able to put together where is the dependency on that series with this one. Can you please elaborate on that a little bit? It is possible to support independent outputs with the current code. I did that for DSI, Bjorn did for DP. However it results in quite an ugly code to map received encoder in set_encoder_mode back to the DSI (DP) instances to fill the h_tiles. If we drop the whole set_encoder_mode story and call dpu_encoder_setup right from the _dpu_kms_initialize_dsi() (or _dpu_kms_initialize_displayport()), supporting multiple outputs becomes an easy task. Okay got it, I think it will become more clear once he posts. 
--- drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 89 - 1 file changed, 44 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c index 1d3a4f395e74..b63e1c948ff2 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c @@ -471,30 +471,55 @@ static int _dpu_kms_initialize_dsi(struct drm_device *dev, struct dpu_kms *dpu_kms) { struct drm_encoder *encoder = NULL; + struct msm_display_info info; int i, rc = 0; if (!(priv->dsi[0] || priv->dsi[1])) return rc; - /*TODO: Support two independent DSI connectors */ - encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI); - if (IS_ERR(encoder)) { - DPU_ERROR("encoder init failed for dsi display\n"); - return PTR_ERR(encoder); - } - - priv->encoders[priv->num_encoders++] = encoder; - for (i = 0; i < ARRAY_SIZE(priv->dsi); i++) { if (!priv->dsi[i]) continue; + if (!encoder) { + encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI); + if (IS_ERR(encoder)) { + DPU_ERROR("encoder init failed for dsi display\n"); + return PTR_ERR(encoder); + } + + priv->encoders[priv->num_encoders++] = encoder; + + memset(&info, 0, sizeof(info)); + info.intf_type = encoder->encoder_type; + info.capabilities = msm_dsi_is_cmd_mode(priv->dsi[i]) ? + MSM_DISPLAY_CAP_CMD_MODE : + MSM_DISPLAY_CAP_VID_MODE; + } + rc = msm_dsi_modeset_init(priv->dsi[i], dev, encoder); if (rc) { DPU_ERROR("modeset_init failed for dsi[%d], rc = %d\n", i, rc); break; } + + info.h_tile_instance[info.num_of_h_tiles++] = i; + + if (!msm_dsi_is_dual_dsi(priv->dsi[i])) { I would like to clarify the terminology of dual_dsi in the current DSI driver before the rest of the reviews. Today IS_DUAL_DSI() means that two DSIs are driving the same display and the two DSIs are operating in master-slave mode and are being driven by the same PLL. 
Yes Usually, dual independent DSI means two DSIs driving two separate panels using two separate PLLs ( DSI0 with PLL0 and DSI1 with PLL1) Let's stop calling it 'dual'. I'd suggest to continue using what was there in the source file: 'two independent DSI'. I assume that's happening due to the following logic and both DSI PHYs are operating in STANDALONE mode: if (!IS_DUAL_DSI()) { ret = msm_dsi_host_register(msm_dsi->host, true); if (ret) return ret; msm_dsi_phy_set_usecase(msm_dsi->phy, MSM_DSI_PHY_STANDALONE); ret = msm_dsi_host_set_src_pll(msm_dsi->host, msm_dsi->phy); Yes. If we have two independent DSI outputs, we'd like them to work in STANDALONE mode. + rc = dpu_encoder_setup(dev, encoder, &info); + if (rc) + DPU_ERROR("failed to setup DPU encoder %d: rc:%d\n", + encoder->base.id, rc); + encoder = NULL; + } + } + + if (encoder) { We will hit this case only for split-DSI right? ( that is two DSIs driving the same panel ). Yes, only in this case. Even single DSI will be created in the above loop now. So this looks a bit confusing at the moment. What is so confusing? I can
Re: [PATCH v3 4/7] drm/panfrost: Add the ability to create submit queues
On Fri, 2 Jul 2021 16:05:30 +0100 Steven Price wrote: > On 02/07/2021 15:32, Boris Brezillon wrote: > > Needed to keep VkQueues isolated from each other. > > > > v3: > > * Limit the number of submitqueue per context to 16 > > * Fix a deadlock > > > > Signed-off-by: Boris Brezillon > > 16 ought to be enough for anyone ;) > > Reviewed-by: Steven Price Oops, forgot to change the submitqueue_get() prototype. Will address that in v4.
Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches
On Fri, 2 Jul 2021 11:13:16 -0400 Alyssa Rosenzweig wrote: > ``` > > +/* Syncobj reference passed at job submission time to encode explicit > > + * input/output fences. > > + */ > > +struct drm_panfrost_syncobj_ref { > > + __u32 handle; > > + __u32 pad; > > + __u64 point; > > +}; > ``` > > What is handle? What is point? Handle is a syncobj handle, point is the point in a syncobj timeline. I'll document those fields. > Why is there padding instead of putting point first? We can move the point field first, but we need to keep the explicit padding: the struct has to be 64bit aligned because of the __u64 field (which the compiler takes care of) but if we don't have an explicit padding, the unused 32bits are undefined, which might cause trouble if we extend the struct at some point, since we sort of expect that old userspace keeps this unused 32bit slot set to 0, while new userspace sets non-zero values if it has to. > > ``` > > #define PANFROST_BO_REF_EXCLUSIVE 0x1 > > +#define PANFROST_BO_REF_NO_IMPLICIT_DEP 0x2 > ``` > > This seems logically backwards. NO_IMPLICIT_DEP makes sense if we're > trying to keep backwards compatibility, but here you're crafting a new > interface totally from scratch. If anything, isn't BO_REF_IMPLICIT_DEP > the flag you'd want? AFAICT, all other drivers make the no-implicit-dep an opt-in, and I didn't want to do things differently in panfrost. But if that's really an issue, I can make it an opt-out. > > ``` > > + /** > > +* Stride of the jobs array (needed to ease extension of the > > +* BATCH_SUBMIT ioctl). Should be set to > > +* sizeof(struct drm_panfrost_job). > > +*/ > > + __u32 job_stride; > ... > > + /** > > +* Stride of the BO and syncobj reference arrays (needed to ease > > +* extension of the BATCH_SUBMIT ioctl). Should be set to > > +* sizeof(struct drm_panfrost_bo_ref). > > +*/ > > + __u32 bo_ref_stride; > > + __u32 syncobj_ref_stride; > ``` > > Hmm. 
I'm not /opposed/ and I know kbase uses strides but it seems like > somewhat unwarranted complexity, and there is a combinatoric explosion > here (if jobs, bo refs, and syncobj refs use 3 different versions, as > this encoding permits... as opposed to just specifying a UABI version or > something like that) Sounds like a good idea. I'll add a version field and map that to a tuple. > > ``` > > + /** > > +* If the submission fails, this encodes the index of the job > > +* failed. > > +*/ > > + __u32 fail_idx; > ``` > > What if multiple jobs fail? We stop at the first failure. Note that it's not an execution failure, but a submission failure (AKA, userspace passed wrong params, like invalid BO or syncobj handles).
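The alignment issue Boris describes can be demonstrated with a small userspace-style test (hypothetical struct names, not the final UAPI; the point is only the layout):

```c
/*
 * Why the explicit 'pad' field matters for a UAPI struct that
 * contains a __u64: the compiler aligns 'point' to the u64 alignment,
 * so without a named pad field there would be implicit padding bytes
 * with undefined contents, which the kernel could never later reuse
 * as a new field (old userspace leaves garbage there, not zeroes).
 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct syncobj_ref {
	uint32_t handle;
	uint32_t pad;	/* explicit: userspace is expected to zero it */
	uint64_t point;
};

struct syncobj_ref_no_pad {
	uint32_t handle;
	/* implicit, undefined padding inserted here by the compiler
	 * on ABIs where uint64_t is 8-byte aligned */
	uint64_t point;
};
```

With the explicit pad, the layout is fixed and the slot is defined to be zero, so a future flag field can safely be added there.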
Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches
> Better, but I was hoping we can mostly delete panfrost_ioctl_submit(), > leaving something along the lines of: > > static int panfrost_ioctl_submit(struct drm_device *dev, void *data, > struct drm_file *file) > { > struct panfrost_submitqueue *queue; > struct drm_panfrost_submit *args = data; > struct drm_panfrost_job submit_args = { > .head = args->jc, > .bos = args->bo_handles, > .in_syncs = args->in_syncs, > .out_syncs = &args->out_sync, // FIXME > .in_sync_count = args->in_sync_count, > .out_sync_count = args->out_sync > 0 ? 1 : 0, > .bo_count = args->bo_handle_count, > .requirements = args->requirements > }; > int ret; > > queue = panfrost_submitqueue_get(file->driver_priv, 0); > > ret = panfrost_submit_job(dev, file, queue, &submit_args, > sizeof(u32), ...); > > return ret; > } > > But obviously the out_sync part needs special handling as we can't just > pass a kernel pointer in like that ;) This, a dozen times this.
Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches
On 02/07/2021 15:32, Boris Brezillon wrote: > This should help limit the number of ioctls when submitting multiple > jobs. The new ioctl also supports syncobj timelines and BO access flags. > > v3: > * Re-use panfrost_get_job_bos() and panfrost_get_job_in_syncs() in the > old submit path > > Signed-off-by: Boris Brezillon Better, but I was hoping we can mostly delete panfrost_ioctl_submit(), leaving something along the lines of: static int panfrost_ioctl_submit(struct drm_device *dev, void *data, struct drm_file *file) { struct panfrost_submitqueue *queue; struct drm_panfrost_submit *args = data; struct drm_panfrost_job submit_args = { .head = args->jc, .bos = args->bo_handles, .in_syncs = args->in_syncs, .out_syncs = &args->out_sync, // FIXME .in_sync_count = args->in_sync_count, .out_sync_count = args->out_sync > 0 ? 1 : 0, .bo_count = args->bo_handle_count, .requirements = args->requirements }; int ret; queue = panfrost_submitqueue_get(file->driver_priv, 0); ret = panfrost_submit_job(dev, file, queue, &submit_args, sizeof(u32), ...); return ret; } But obviously the out_sync part needs special handling as we can't just pass a kernel pointer in like that ;) I'd like the above to avoid the duplication of things like this: > + kref_init(&job->refcount); > + > + job->pfdev = pfdev; > + job->jc = args->head; > + job->requirements = args->requirements; > + job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev); > + job->file_priv = file_priv->driver_priv; > + xa_init_flags(&job->deps, XA_FLAGS_ALLOC); As otherwise someone is going to mess up in the future and this is going to diverge between the two ioctls. 
Steve > --- > drivers/gpu/drm/panfrost/panfrost_drv.c | 366 +++- > drivers/gpu/drm/panfrost/panfrost_job.c | 3 + > include/uapi/drm/panfrost_drm.h | 84 ++ > 3 files changed, 375 insertions(+), 78 deletions(-) > > diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c > b/drivers/gpu/drm/panfrost/panfrost_drv.c > index 6529e5972b47..e2897de6e77d 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_drv.c > +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c > @@ -138,111 +138,95 @@ panfrost_get_job_mappings(struct drm_file *file_priv, > struct panfrost_job *job) > return 0; > } > > -/** > - * panfrost_lookup_bos() - Sets up job->bo[] with the GEM objects > - * referenced by the job. > - * @dev: DRM device > - * @file_priv: DRM file for this fd > - * @args: IOCTL args > - * @job: job being set up > - * > - * Resolve handles from userspace to BOs and attach them to job. > - * > - * Note that this function doesn't need to unreference the BOs on > - * failure, because that will happen at panfrost_job_cleanup() time. 
> - */ > +#define PANFROST_BO_REF_ALLOWED_FLAGS \ > + (PANFROST_BO_REF_EXCLUSIVE | PANFROST_BO_REF_NO_IMPLICIT_DEP) > + > static int > -panfrost_lookup_bos(struct drm_device *dev, > - struct drm_file *file_priv, > - struct drm_panfrost_submit *args, > - struct panfrost_job *job) > +panfrost_get_job_bos(struct drm_file *file_priv, > + u64 refs, u32 ref_stride, u32 count, > + struct panfrost_job *job) > { > + void __user *in = u64_to_user_ptr(refs); > unsigned int i; > - int ret; > > - job->bo_count = args->bo_handle_count; > + job->bo_count = count; > > - if (!job->bo_count) > + if (!count) > return 0; > > + job->bos = kvmalloc_array(job->bo_count, sizeof(*job->bos), > + GFP_KERNEL | __GFP_ZERO); > job->bo_flags = kvmalloc_array(job->bo_count, > sizeof(*job->bo_flags), > GFP_KERNEL | __GFP_ZERO); > - if (!job->bo_flags) > + if (!job->bos || !job->bo_flags) > return -ENOMEM; > > - for (i = 0; i < job->bo_count; i++) > - job->bo_flags[i] = PANFROST_BO_REF_EXCLUSIVE; > + for (i = 0; i < count; i++) { > + struct drm_panfrost_bo_ref ref = { }; > + int ret; > > - ret = drm_gem_objects_lookup(file_priv, > - (void __user *)(uintptr_t)args->bo_handles, > - job->bo_count, &job->bos); > - if (ret) > - return ret; > + ret = copy_struct_from_user(&ref, sizeof(ref), > + in + (i * ref_stride), > + ref_stride); > + if (ret) > + return ret; > > - return panfrost_get_job_mappings(file_priv, job); > + /* Prior to the BATCH_SUBMIT ioctl all accessed BOs were > + * treated as exclusive. > +
Re: [PATCH v3] drm/dbi: Print errors for mipi_dbi_command()
On 02.07.2021 15.56, Linus Walleij wrote: > The macro mipi_dbi_command() does not report errors unless you wrap it > in another macro to do the error reporting. > > Report a rate-limited error so we know what is going on. > > Drop the only user in DRM using mipi_dbi_command() and actually checking > the error explicitly, let it use mipi_dbi_command_buf() directly > instead. You forgot to remove this section. With that fixed: Reviewed-by: Noralf Trønnes > > After this any code wishing to send command arrays can rely on > mipi_dbi_command() providing an appropriate error message if something > goes wrong. > > Suggested-by: Noralf Trønnes > Suggested-by: Douglas Anderson > Signed-off-by: Linus Walleij > --- > ChangeLog v2->v3: > - Make the macro actually return the error value if need be, by > putting a single ret; at the end of the macro. (Neat trick from > StackOverflow!) > - Switch the site where I switched mipi_dbi_command() to > mipi_dbi_command_buf() back to what it was. > - Print the failed command in the error message. > - Put the dbi in (parens) since drivers/gpu/drm/tiny/st7586.c was > passing >dbi as parameter to mipi_dbi_command() > and this would expand to > struct device *dev = &>dbi->spi->dev > which can't be parsed but > struct device *dev = &(>dbi)->spi->dev; > should work. I hope. > ChangeLog v1->v2: > - Fish out the struct device * from the DBI SPI client and use > that to print the errors associated with the SPI device. > --- > include/drm/drm_mipi_dbi.h | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/include/drm/drm_mipi_dbi.h b/include/drm/drm_mipi_dbi.h > index f543d6e3e822..05e194958265 100644 > --- a/include/drm/drm_mipi_dbi.h > +++ b/include/drm/drm_mipi_dbi.h > @@ -183,7 +183,12 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer > *fb, > #define mipi_dbi_command(dbi, cmd, seq...) 
\ > ({ \ > const u8 d[] = { seq }; \ > - mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \ > + struct device *dev = &(dbi)->spi->dev; \ > + int ret; \ > + ret = mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \ > + if (ret) \ > + dev_err_ratelimited(dev, "error %d when sending command > %#02x\n", ret, cmd); \ > + ret; \ > }) > > #ifdef CONFIG_DEBUG_FS >
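The "single ret; at the end of the macro" trick in the patch above is a GNU C statement expression: in `({ ...; last; })` the value of the last statement becomes the value of the whole expression. A minimal standalone illustration (not the DRM code itself; `do_send` and `send_cmd` are made-up names):

```c
/*
 * GNU C statement expression: the macro both reports the error and
 * hands the return code back to the caller, exactly the shape of the
 * reworked mipi_dbi_command() with its trailing 'ret;'.
 * Requires GCC or Clang (statement expressions are a GNU extension).
 */
#include <assert.h>
#include <stdio.h>

static int do_send(int cmd)
{
	return cmd < 0 ? -22 : 0;	/* -EINVAL-style failure */
}

#define send_cmd(cmd)						\
	({							\
		int ret = do_send(cmd);				\
		if (ret)					\
			fprintf(stderr,				\
				"error %d when sending command %#02x\n", \
				ret, cmd);			\
		ret;	/* value of the whole ({ ... }) */	\
	})
```

Callers that ignore the result get the error message for free; callers that check it can still do `if (send_cmd(0x29)) ...`, which is the whole point of the rework.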
Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
On 2021-07-02 14:58, Will Deacon wrote: Hi Nathan, On Thu, Jul 01, 2021 at 12:52:20AM -0700, Nathan Chancellor wrote: On 7/1/2021 12:40 AM, Will Deacon wrote: On Wed, Jun 30, 2021 at 08:56:51AM -0700, Nathan Chancellor wrote: On Wed, Jun 30, 2021 at 12:43:48PM +0100, Will Deacon wrote: On Wed, Jun 30, 2021 at 05:17:27PM +0800, Claire Chang wrote: `BUG: unable to handle page fault for address: 003a8290` and the fact it crashed at `_raw_spin_lock_irqsave` look like the memory (maybe dev->dma_io_tlb_mem) was corrupted? The dev->dma_io_tlb_mem should be set here (https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/pci/probe.c#n2528) through device_initialize. I'm less sure about this. 'dma_io_tlb_mem' should be pointing at 'io_tlb_default_mem', which is a page-aligned allocation from memblock. The spinlock is at offset 0x24 in that structure, and looking at the register dump from the crash: Jun 29 18:28:42 hp-4300G kernel: RSP: 0018:adb4013db9e8 EFLAGS: 00010006 Jun 29 18:28:42 hp-4300G kernel: RAX: 003a8290 RBX: RCX: 8900572ad580 Jun 29 18:28:42 hp-4300G kernel: RDX: 89005653f024 RSI: 000c RDI: 1d17 Jun 29 18:28:42 hp-4300G kernel: RBP: 0a20d000 R08: 000c R09: Jun 29 18:28:42 hp-4300G kernel: R10: 0a20d000 R11: 89005653f000 R12: 0212 Jun 29 18:28:42 hp-4300G kernel: R13: 1000 R14: 0002 R15: 0020 Jun 29 18:28:42 hp-4300G kernel: FS: 7f1f8898ea40() GS:89005728() knlGS: Jun 29 18:28:42 hp-4300G kernel: CS: 0010 DS: ES: CR0: 80050033 Jun 29 18:28:42 hp-4300G kernel: CR2: 003a8290 CR3: 0001020d CR4: 00350ee0 Jun 29 18:28:42 hp-4300G kernel: Call Trace: Jun 29 18:28:42 hp-4300G kernel: _raw_spin_lock_irqsave+0x39/0x50 Jun 29 18:28:42 hp-4300G kernel: swiotlb_tbl_map_single+0x12b/0x4c0 Then that correlates with R11 holding the 'dma_io_tlb_mem' pointer and RDX pointing at the spinlock. 
Yet RAX is holding junk :/ I agree that enabling KASAN would be a good idea, but I also think we probably need to get some more information out of swiotlb_tbl_map_single() to see what exactly is going wrong in there. I can certainly enable KASAN and if there is any debug print I can add or dump anything, let me know! I bit the bullet and took v5.13 with swiotlb/for-linus-5.14 merged in, built x86 defconfig and ran it on my laptop. However, it seems to work fine! Please can you share your .config? Sure thing, it is attached. It is just Arch Linux's config run through olddefconfig. The original is below in case you need to diff it. https://raw.githubusercontent.com/archlinux/svntogit-packages/9045405dc835527164f3034b3ceb9a67c7a53cd4/trunk/config If there is anything more that I can provide, please let me know. I eventually got this booting (for some reason it was causing LD to SEGV trying to link it for a while...) and sadly it works fine on my laptop. Hmm. Did you manage to try again with KASAN? It might also be worth taking the IOMMU out of the equation, since that interfaces differently with SWIOTLB and I couldn't figure out the code path from the log you provided. What happens if you boot with "amd_iommu=off swiotlb=force"? Oh, now there's a thing... the chat from the IOMMU API in the boot log implies that the IOMMU *should* be in the picture - we see that default domains are IOMMU_DOMAIN_DMA default and the GPU :0c:00.0 was added to a group. That means dev->dma_ops should be set and DMA API calls should be going through iommu-dma, yet the callstack in the crash says we've gone straight from dma_map_page_attrs() to swiotlb_map(), implying the inline dma_direct_map_page() path. If dev->dma_ops didn't look right in the first place, it's perhaps less surprising that dev->dma_io_tlb_mem might be wild as well. 
It doesn't seem plausible that we should have a race between initialising the device and probing its driver, so maybe the whole dev pointer is getting trampled earlier in the callchain (or is fundamentally wrong to begin with, but from a quick skim of the amdgpu code it did look like adev->dev and adev->pdev are appropriately set early on by amdgpu_pci_probe()). (although word of warning here: i915 dies horribly on my laptop if I pass swiotlb=force, even with the distro 5.10 kernel) FWIW I'd imagine you probably need to massively increase the SWIOTLB buffer size to have hope of that working. Robin.
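The dispatch Robin is reasoning about can be sketched as follows. This is a heavily simplified model, not the real kernel code (which lives in kernel/dma/mapping.c): the point is only that a non-NULL dev->dma_ops routes the call through the installed ops (e.g. iommu-dma), and only a NULL dev->dma_ops falls through to dma_direct_map_page() and thence swiotlb_map().

```c
/*
 * Simplified model of the DMA mapping dispatch discussed above.
 * With an IOMMU attached, dev->dma_ops should be non-NULL, so the
 * direct (and hence swiotlb) path should never be taken - which is
 * why a crash inside swiotlb_map() for an IOMMU-managed device
 * suggests dev itself is corrupted.
 */
#include <assert.h>
#include <stddef.h>

struct dma_map_ops { int dummy; };

struct device {
	const struct dma_map_ops *dma_ops;
};

enum dma_path { PATH_IOMMU_OPS, PATH_DIRECT_SWIOTLB };

static enum dma_path dma_map_page_attrs_path(const struct device *dev)
{
	if (dev->dma_ops)
		return PATH_IOMMU_OPS;	/* e.g. iommu-dma ops */
	return PATH_DIRECT_SWIOTLB;	/* dma_direct_map_page() */
}
```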
Re: [PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches
``` > +/* Syncobj reference passed at job submission time to encode explicit > + * input/output fences. > + */ > +struct drm_panfrost_syncobj_ref { > + __u32 handle; > + __u32 pad; > + __u64 point; > +}; ``` What is handle? What is point? Why is there padding instead of putting point first? ``` > #define PANFROST_BO_REF_EXCLUSIVE 0x1 > +#define PANFROST_BO_REF_NO_IMPLICIT_DEP 0x2 ``` This seems logically backwards. NO_IMPLICIT_DEP makes sense if we're trying to keep backwards compatibility, but here you're crafting a new interface totally from scratch. If anything, isn't BO_REF_IMPLICIT_DEP the flag you'd want? ``` > + /** > + * Stride of the jobs array (needed to ease extension of the > + * BATCH_SUBMIT ioctl). Should be set to > + * sizeof(struct drm_panfrost_job). > + */ > + __u32 job_stride; ... > + /** > + * Stride of the BO and syncobj reference arrays (needed to ease > + * extension of the BATCH_SUBMIT ioctl). Should be set to > + * sizeof(struct drm_panfrost_bo_ref). > + */ > + __u32 bo_ref_stride; > + __u32 syncobj_ref_stride; ``` Hmm. I'm not /opposed/ and I know kbase uses strides but it seems like somewhat unwarranted complexity, and there is a combinatoric explosion here (if jobs, bo refs, and syncobj refs use 3 different versions, as this encoding permits... as opposed to just specifying a UABI version or something like that) ``` > + /** > + * If the submission fails, this encodes the index of the job > + * failed. > + */ > + __u32 fail_idx; ``` What if multiple jobs fail? ``` > + /** > + * ID of the queue to submit those jobs to. 0 is the default > + * submit queue and should always exists. If you need a dedicated > + * queue, create it with DRM_IOCTL_PANFROST_CREATE_SUBMITQUEUE. > + */ > + __u32 queue; ``` s/exists/exist/
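The stride scheme being debated here builds on the semantics of the kernel's copy_struct_from_user(): copy min(ksize, usize) bytes and zero-fill the rest, so older userspace (smaller struct) keeps working against a newer kernel, and the kernel can grow the struct without new ioctl numbers. A minimal userspace model of that behaviour (not the panfrost code; `job_v2` and `copy_struct` are illustrative names):

```c
/*
 * Minimal model of the copy_struct_from_user() semantics behind the
 * job_stride/bo_ref_stride fields: copy min(ksize, usize) bytes and
 * zero-extend when userspace's struct is older/smaller. (The real
 * kernel helper additionally rejects non-zero trailing bytes when
 * userspace's struct is newer/larger.)
 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct job_v2 {			/* "kernel" view: v1 plus a new field */
	uint32_t head;
	uint32_t flags;		/* added in v2, must default to 0 */
};

static void copy_struct(void *dst, size_t ksize,
			const void *src, size_t usize)
{
	size_t n = usize < ksize ? usize : ksize;

	memset(dst, 0, ksize);	/* fields userspace doesn't know get 0 */
	memcpy(dst, src, n);
}
```

Alyssa's combinatoric-explosion point still stands: with three independent strides the kernel must handle every (job, bo_ref, syncobj_ref) size combination, whereas a single version field pins one tuple per version.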
Re: [Intel-gfx] [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+
On 02.07.2021 10:09, Martin Peres wrote: > On 02/07/2021 10:29, Pekka Paalanen wrote: >> On Thu, 1 Jul 2021 21:28:06 +0200 >> Daniel Vetter wrote: >> >>> On Thu, Jul 1, 2021 at 8:27 PM Martin Peres >>> wrote: On 01/07/2021 11:14, Pekka Paalanen wrote: > On Wed, 30 Jun 2021 11:58:25 -0700 > John Harrison wrote: > >> On 6/30/2021 01:22, Martin Peres wrote: >>> On 24/06/2021 10:05, Matthew Brost wrote: From: Daniele Ceraolo Spurio Unblock GuC submission on Gen11+ platforms. Signed-off-by: Michal Wajdeczko Signed-off-by: Daniele Ceraolo Spurio Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gt/uc/intel_guc.h | 1 + drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 8 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h | 3 +-- drivers/gpu/drm/i915/gt/uc/intel_uc.c | 14 +- 4 files changed, 19 insertions(+), 7 deletions(-) > > ... > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c index 7a69c3c027e9..61be0aa81492 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c @@ -34,8 +34,15 @@ static void uc_expand_default_options(struct intel_uc *uc) return; } - /* Default: enable HuC authentication only */ - i915->params.enable_guc = ENABLE_GUC_LOAD_HUC; + /* Intermediate platforms are HuC authentication only */ + if (IS_DG1(i915) || IS_ALDERLAKE_S(i915)) { + drm_dbg(>drm, "Disabling GuC only due to old platform\n"); >>> >>> This comment does not seem accurate, given that DG1 is barely >>> out, and >>> ADL is not out yet. How about: >>> >>> "Disabling GuC on untested platforms"? >>> >> Just because something is not in the shops yet does not mean it is >> new. >> Technology is always obsolete by the time it goes on sale. > > That is a very good reason to not use terminology like "new", "old", > "current", "modern" etc. at all. > > End users like me definitely do not share your interpretation of > "old". Yep, old and new is relative. 
In the end, what matters is the validation effort, which is why I was proposing "untested platforms". Also, remember that you are not writing these messages for Intel engineers, but instead are writing for Linux *users*. >>> >>> It's drm_dbg. Users don't read this stuff, at least not users with no >>> clue what the driver does and stuff like that. >> >> If I had a problem, I would read it, and I have no clue what anything >> of that is. > > Exactly. > > This level of defense for what is clearly a bad *debug* message (at the > very least, the grammar) makes no sense at all! > > I don't want to hear arguments like "Not my patch" from a developer > literally sending the patch to the ML and who added his SoB to the > patch, playing with words, or minimizing the problem of having such a > message. Agree that 'not my patch' is never a good excuse, but equally we can't blame the original patch author, as the patch was updated a few times since then. Maybe to avoid confusion and simplify review, we could split this patch into two smaller ones: a first one that really unblocks GuC submission on all Gen11+ (see __guc_submission_supported) and a second one that updates defaults for Gen12+ (see uc_expand_default_options), as the original patch (from ~2019) evolved beyond what the title/commit message says. Then we can fix all messaging and make sure it's clear and understood. Thanks, Michal > > All of the above are just clear signals for the community to get off > your playground, which is frankly unacceptable. Your email address does > not matter. > > In the spirit of collaboration, your response should have been "Good > catch, how about or ?". This would not have wasted everyone's > time in an attempt to just have it your way. > > My level of confidence in this GuC transition was already low, but you > guys are working hard to shoot yourself in the foot. Trust should be > earned! 
> > Martin > >> >> >> Thanks, >> pq >> > ___ > Intel-gfx mailing list > intel-...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [PATCH v3 4/7] drm/panfrost: Add the ability to create submit queues
On 02/07/2021 15:32, Boris Brezillon wrote:
> Needed to keep VkQueues isolated from each other.
>
> v3:
> * Limit the number of submitqueue per context to 16
> * Fix a deadlock
>
> Signed-off-by: Boris Brezillon

16 ought to be enough for anyone ;)

Reviewed-by: Steven Price

[...]
[PATCH v3 4/7] drm/panfrost: Add the ability to create submit queues
Needed to keep VkQueues isolated from each other.

v3:
* Limit the number of submitqueue per context to 16
* Fix a deadlock

Signed-off-by: Boris Brezillon
---
 drivers/gpu/drm/panfrost/Makefile             |   3 +-
 drivers/gpu/drm/panfrost/panfrost_device.h    |   2 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c       |  69 +++--
 drivers/gpu/drm/panfrost/panfrost_job.c       |  47 ++
 drivers/gpu/drm/panfrost/panfrost_job.h       |   9 +-
 .../gpu/drm/panfrost/panfrost_submitqueue.c   | 136 ++
 .../gpu/drm/panfrost/panfrost_submitqueue.h   |  27
 include/uapi/drm/panfrost_drm.h               |  17 +++
 8 files changed, 264 insertions(+), 46 deletions(-)
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.h

diff --git a/drivers/gpu/drm/panfrost/Makefile b/drivers/gpu/drm/panfrost/Makefile
index b71935862417..e99192b66ec9 100644
--- a/drivers/gpu/drm/panfrost/Makefile
+++ b/drivers/gpu/drm/panfrost/Makefile
@@ -9,6 +9,7 @@ panfrost-y := \
         panfrost_gpu.o \
         panfrost_job.o \
         panfrost_mmu.o \
-        panfrost_perfcnt.o
+        panfrost_perfcnt.o \
+        panfrost_submitqueue.o
 
 obj-$(CONFIG_DRM_PANFROST) += panfrost.o
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
index 8b25278f34c8..51c0ba4e50f5 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -137,7 +137,7 @@ struct panfrost_mmu {
 struct panfrost_file_priv {
         struct panfrost_device *pfdev;
 
-        struct drm_sched_entity sched_entity[NUM_JOB_SLOTS];
+        struct idr queues;
 
         struct panfrost_mmu *mmu;
 };
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index b6b5997c9366..6529e5972b47 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -19,6 +19,7 @@
 #include "panfrost_job.h"
 #include "panfrost_gpu.h"
 #include "panfrost_perfcnt.h"
+#include "panfrost_submitqueue.h"
 
 static bool unstable_ioctls;
 module_param_unsafe(unstable_ioctls, bool, 0600);
@@ -250,6 +251,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
         struct panfrost_device *pfdev = dev->dev_private;
         struct drm_panfrost_submit *args = data;
         struct drm_syncobj *sync_out = NULL;
+        struct panfrost_submitqueue *queue;
         struct panfrost_job *job;
         int ret = 0;
 
@@ -259,10 +261,16 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
         if (args->requirements && args->requirements != PANFROST_JD_REQ_FS)
                 return -EINVAL;
 
+        queue = panfrost_submitqueue_get(file->driver_priv, 0);
+        if (IS_ERR(queue))
+                return PTR_ERR(queue);
+
         if (args->out_sync > 0) {
                 sync_out = drm_syncobj_find(file, args->out_sync);
-                if (!sync_out)
-                        return -ENODEV;
+                if (!sync_out) {
+                        ret = -ENODEV;
+                        goto fail_put_queue;
+                }
         }
 
         job = kzalloc(sizeof(*job), GFP_KERNEL);
@@ -289,7 +297,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
         if (ret)
                 goto fail_job;
 
-        ret = panfrost_job_push(job);
+        ret = panfrost_job_push(queue, job);
         if (ret)
                 goto fail_job;
 
@@ -302,6 +310,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
 fail_out_sync:
         if (sync_out)
                 drm_syncobj_put(sync_out);
+fail_put_queue:
+        panfrost_submitqueue_put(queue);
 
         return ret;
 }
@@ -451,6 +461,36 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, void *data,
         return ret;
 }
 
+static int
+panfrost_ioctl_create_submitqueue(struct drm_device *dev, void *data,
+                                  struct drm_file *file_priv)
+{
+        struct panfrost_file_priv *priv = file_priv->driver_priv;
+        struct drm_panfrost_create_submitqueue *args = data;
+        struct panfrost_submitqueue *queue;
+
+        queue = panfrost_submitqueue_create(priv, args->priority, args->flags);
+        if (IS_ERR(queue))
+                return PTR_ERR(queue);
+
+        args->id = queue->id;
+        return 0;
+}
+
+static int
+panfrost_ioctl_destroy_submitqueue(struct drm_device *dev, void *data,
+                                   struct drm_file *file_priv)
+{
+        struct panfrost_file_priv *priv = file_priv->driver_priv;
+        u32 id = *((u32 *)data);
+
+        /* Default queue can't be destroyed. */
+        if (!id)
+                return -ENOENT;
+
+        return panfrost_submitqueue_destroy(priv, id);
+}
+
 int panfrost_unstable_ioctl_check(void)
 {
         if (!unstable_ioctls)
@@ -465,6 +505,7 @@ panfrost_open(struct drm_device *dev, struct drm_file *file)
         int ret;
[PATCH v3 6/7] drm/panfrost: Advertise the SYNCOBJ_TIMELINE feature
Now that we have a new SUBMIT ioctl dealing with timelined syncobjs we
can advertise the feature.

Signed-off-by: Boris Brezillon
Reviewed-by: Steven Price
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index e2897de6e77d..242a16246d79 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -791,7 +791,8 @@ DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops);
  * - 1.2 - adds AFBC_FEATURES query
  */
 static const struct drm_driver panfrost_drm_driver = {
-        .driver_features        = DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ,
+        .driver_features        = DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
+                                  DRIVER_SYNCOBJ_TIMELINE,
         .open                   = panfrost_open,
         .postclose              = panfrost_postclose,
         .ioctls                 = panfrost_drm_driver_ioctls,
-- 
2.31.1
[PATCH v3 7/7] drm/panfrost: Bump minor version to reflect the feature additions
We now have a new ioctl that allows submitting multiple jobs at once
(among other things) and we support timelined syncobjs. Bump the minor
version number to reflect those changes.

Signed-off-by: Boris Brezillon
Reviewed-by: Steven Price
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 242a16246d79..33cd34a1213c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -789,6 +789,8 @@ DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops);
  * - 1.0 - initial interface
  * - 1.1 - adds HEAP and NOEXEC flags for CREATE_BO
  * - 1.2 - adds AFBC_FEATURES query
+ * - 1.3 - adds the BATCH_SUBMIT, CREATE_SUBMITQUEUE, DESTROY_SUBMITQUEUE
+ *         ioctls and advertises the SYNCOBJ_TIMELINE feature
  */
 static const struct drm_driver panfrost_drm_driver = {
         .driver_features        = DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
@@ -802,7 +804,7 @@ static const struct drm_driver panfrost_drm_driver = {
         .desc                   = "panfrost DRM",
         .date                   = "20180908",
         .major                  = 1,
-        .minor                  = 2,
+        .minor                  = 3,
 
         .gem_create_object      = panfrost_gem_create_object,
         .prime_handle_to_fd     = drm_gem_prime_handle_to_fd,
-- 
2.31.1
[PATCH v3 5/7] drm/panfrost: Add a new ioctl to submit batches
This should help limit the number of ioctls when submitting multiple
jobs. The new ioctl also supports syncobj timelines and BO access
flags.

v3:
* Re-use panfrost_get_job_bos() and panfrost_get_job_in_syncs() in the
  old submit path

Signed-off-by: Boris Brezillon
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 366 +++-
 drivers/gpu/drm/panfrost/panfrost_job.c |   3 +
 include/uapi/drm/panfrost_drm.h         |  84 ++
 3 files changed, 375 insertions(+), 78 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 6529e5972b47..e2897de6e77d 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -138,111 +138,95 @@ panfrost_get_job_mappings(struct drm_file *file_priv, struct panfrost_job *job)
         return 0;
 }
 
-/**
- * panfrost_lookup_bos() - Sets up job->bo[] with the GEM objects
- * referenced by the job.
- * @dev: DRM device
- * @file_priv: DRM file for this fd
- * @args: IOCTL args
- * @job: job being set up
- *
- * Resolve handles from userspace to BOs and attach them to job.
- *
- * Note that this function doesn't need to unreference the BOs on
- * failure, because that will happen at panfrost_job_cleanup() time.
- */
+#define PANFROST_BO_REF_ALLOWED_FLAGS \
+        (PANFROST_BO_REF_EXCLUSIVE | PANFROST_BO_REF_NO_IMPLICIT_DEP)
+
 static int
-panfrost_lookup_bos(struct drm_device *dev,
-                    struct drm_file *file_priv,
-                    struct drm_panfrost_submit *args,
-                    struct panfrost_job *job)
+panfrost_get_job_bos(struct drm_file *file_priv,
+                     u64 refs, u32 ref_stride, u32 count,
+                     struct panfrost_job *job)
 {
+        void __user *in = u64_to_user_ptr(refs);
         unsigned int i;
-        int ret;
 
-        job->bo_count = args->bo_handle_count;
+        job->bo_count = count;
 
-        if (!job->bo_count)
+        if (!count)
                 return 0;
 
+        job->bos = kvmalloc_array(job->bo_count, sizeof(*job->bos),
+                                  GFP_KERNEL | __GFP_ZERO);
         job->bo_flags = kvmalloc_array(job->bo_count,
                                        sizeof(*job->bo_flags),
                                        GFP_KERNEL | __GFP_ZERO);
-        if (!job->bo_flags)
+        if (!job->bos || !job->bo_flags)
                 return -ENOMEM;
 
-        for (i = 0; i < job->bo_count; i++)
-                job->bo_flags[i] = PANFROST_BO_REF_EXCLUSIVE;
+        for (i = 0; i < count; i++) {
+                struct drm_panfrost_bo_ref ref = { };
+                int ret;
 
-        ret = drm_gem_objects_lookup(file_priv,
-                                     (void __user *)(uintptr_t)args->bo_handles,
-                                     job->bo_count, &job->bos);
-        if (ret)
-                return ret;
+                ret = copy_struct_from_user(&ref, sizeof(ref),
+                                            in + (i * ref_stride),
+                                            ref_stride);
+                if (ret)
+                        return ret;
 
-        return panfrost_get_job_mappings(file_priv, job);
+                /* Prior to the BATCH_SUBMIT ioctl all accessed BOs were
+                 * treated as exclusive.
+                 */
+                if (ref_stride == sizeof(u32))
+                        ref.flags = PANFROST_BO_REF_EXCLUSIVE;
+
+                if ((ref.flags & ~PANFROST_BO_REF_ALLOWED_FLAGS))
+                        return -EINVAL;
+
+                job->bos[i] = drm_gem_object_lookup(file_priv, ref.handle);
+                if (!job->bos[i])
+                        return -EINVAL;
+
+                job->bo_flags[i] = ref.flags;
+        }
+
+        return 0;
 }
 
-/**
- * panfrost_copy_in_sync() - Sets up job->deps with the sync objects
- * referenced by the job.
- * @dev: DRM device
- * @file_priv: DRM file for this fd
- * @args: IOCTL args
- * @job: job being set up
- *
- * Resolve syncobjs from userspace to fences and attach them to job.
- *
- * Note that this function doesn't need to unreference the fences on
- * failure, because that will happen at panfrost_job_cleanup() time.
- */
 static int
-panfrost_copy_in_sync(struct drm_device *dev,
-                      struct drm_file *file_priv,
-                      struct drm_panfrost_submit *args,
-                      struct panfrost_job *job)
+panfrost_get_job_in_syncs(struct drm_file *file_priv,
+                          u64 refs, u32 ref_stride,
+                          u32 count, struct panfrost_job *job)
 {
-        u32 *handles;
-        int ret = 0;
-        int i, in_fence_count;
+        const void __user *in = u64_to_user_ptr(refs);
+        unsigned int i;
+        int ret;
 
-        in_fence_count = args->in_sync_count;
-
-        if (!in_fence_count)
+        if (!count)
                 return 0;
 
-        handles = kvmalloc_array(in_fence_count, sizeof(u32), GFP_KERNEL);
-        if (!handles) {
-                ret = -ENOMEM;
-
[PATCH v3 3/7] drm/panfrost: Add BO access flags to relax dependencies between jobs
Jobs reading from the same BO should not be serialized. Add access
flags so we can relax the implicit dependencies in that case. We force
exclusive access for now to keep the behavior unchanged, but a new
SUBMIT ioctl taking explicit access flags will be introduced.

Signed-off-by: Boris Brezillon
Reviewed-by: Steven Price
---
 drivers/gpu/drm/panfrost/panfrost_drv.c |  9 +
 drivers/gpu/drm/panfrost/panfrost_job.c | 23 +++
 drivers/gpu/drm/panfrost/panfrost_job.h |  1 +
 include/uapi/drm/panfrost_drm.h         |  2 ++
 4 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 9bbc9e78cc85..b6b5997c9366 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -164,6 +164,15 @@ panfrost_lookup_bos(struct drm_device *dev,
         if (!job->bo_count)
                 return 0;
 
+        job->bo_flags = kvmalloc_array(job->bo_count,
+                                       sizeof(*job->bo_flags),
+                                       GFP_KERNEL | __GFP_ZERO);
+        if (!job->bo_flags)
+                return -ENOMEM;
+
+        for (i = 0; i < job->bo_count; i++)
+                job->bo_flags[i] = PANFROST_BO_REF_EXCLUSIVE;
+
         ret = drm_gem_objects_lookup(file_priv,
                                      (void __user *)(uintptr_t)args->bo_handles,
                                      job->bo_count, &job->bos);
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index fdc1bd7ecf12..152245b122be 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -245,8 +245,16 @@ static int panfrost_acquire_object_fences(struct panfrost_job *job)
         int i, ret;
 
         for (i = 0; i < job->bo_count; i++) {
-                /* panfrost always uses write mode in its current uapi */
-                ret = drm_gem_fence_array_add_implicit(&job->deps, job->bos[i], true);
+                bool exclusive = job->bo_flags[i] & PANFROST_BO_REF_EXCLUSIVE;
+
+                if (!exclusive) {
+                        ret = dma_resv_reserve_shared(job->bos[i]->resv, 1);
+                        if (ret)
+                                return ret;
+                }
+
+                ret = drm_gem_fence_array_add_implicit(&job->deps, job->bos[i],
+                                                       exclusive);
                 if (ret)
                         return ret;
         }
@@ -258,8 +266,14 @@ static void panfrost_attach_object_fences(struct panfrost_job *job)
 {
         int i;
 
-        for (i = 0; i < job->bo_count; i++)
-                dma_resv_add_excl_fence(job->bos[i]->resv, job->render_done_fence);
+        for (i = 0; i < job->bo_count; i++) {
+                struct dma_resv *robj = job->bos[i]->resv;
+
+                if (job->bo_flags[i] & PANFROST_BO_REF_EXCLUSIVE)
+                        dma_resv_add_excl_fence(robj, job->render_done_fence);
+                else
+                        dma_resv_add_shared_fence(robj, job->render_done_fence);
+        }
 }
 
 int panfrost_job_push(struct panfrost_job *job)
@@ -340,6 +354,7 @@ static void panfrost_job_cleanup(struct kref *ref)
                 kvfree(job->bos);
         }
 
+        kvfree(job->bo_flags);
         kfree(job);
 }
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h
index 82306a03b57e..1cbc3621b663 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.h
+++ b/drivers/gpu/drm/panfrost/panfrost_job.h
@@ -32,6 +32,7 @@ struct panfrost_job {
 
         struct panfrost_gem_mapping **mappings;
         struct drm_gem_object **bos;
+        u32 *bo_flags;
         u32 bo_count;
 
         /* Fence to be signaled by drm-sched once its done with the job */
diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index 061e700dd06c..45d6c600475c 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -224,6 +224,8 @@ struct drm_panfrost_madvise {
         __u32 retained;       /* out, whether backing store still exists */
 };
 
+#define PANFROST_BO_REF_EXCLUSIVE 0x1
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.31.1
[PATCH v3 2/7] drm/panfrost: Move the mappings collection out of panfrost_lookup_bos()
So we can re-use it from elsewhere.

Signed-off-by: Boris Brezillon
Reviewed-by: Steven Price
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 52 ++---
 1 file changed, 29 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 1ffaef5ec5ff..9bbc9e78cc85 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -109,6 +109,34 @@ static int panfrost_ioctl_create_bo(struct drm_device *dev, void *data,
         return 0;
 }
 
+static int
+panfrost_get_job_mappings(struct drm_file *file_priv, struct panfrost_job *job)
+{
+        struct panfrost_file_priv *priv = file_priv->driver_priv;
+        unsigned int i;
+
+        job->mappings = kvmalloc_array(job->bo_count,
+                                       sizeof(*job->mappings),
+                                       GFP_KERNEL | __GFP_ZERO);
+        if (!job->mappings)
+                return -ENOMEM;
+
+        for (i = 0; i < job->bo_count; i++) {
+                struct panfrost_gem_mapping *mapping;
+                struct panfrost_gem_object *bo;
+
+                bo = to_panfrost_bo(job->bos[i]);
+                mapping = panfrost_gem_mapping_get(bo, priv);
+                if (!mapping)
+                        return -EINVAL;
+
+                atomic_inc(&bo->gpu_usecount);
+                job->mappings[i] = mapping;
+        }
+
+        return 0;
+}
+
 /**
  * panfrost_lookup_bos() - Sets up job->bo[] with the GEM objects
  * referenced by the job.
@@ -128,8 +156,6 @@ panfrost_lookup_bos(struct drm_device *dev,
                     struct drm_panfrost_submit *args,
                     struct panfrost_job *job)
 {
-        struct panfrost_file_priv *priv = file_priv->driver_priv;
-        struct panfrost_gem_object *bo;
         unsigned int i;
         int ret;
 
@@ -144,27 +170,7 @@ panfrost_lookup_bos(struct drm_device *dev,
         if (ret)
                 return ret;
 
-        job->mappings = kvmalloc_array(job->bo_count,
-                                       sizeof(struct panfrost_gem_mapping *),
-                                       GFP_KERNEL | __GFP_ZERO);
-        if (!job->mappings)
-                return -ENOMEM;
-
-        for (i = 0; i < job->bo_count; i++) {
-                struct panfrost_gem_mapping *mapping;
-
-                bo = to_panfrost_bo(job->bos[i]);
-                mapping = panfrost_gem_mapping_get(bo, priv);
-                if (!mapping) {
-                        ret = -EINVAL;
-                        break;
-                }
-
-                atomic_inc(&bo->gpu_usecount);
-                job->mappings[i] = mapping;
-        }
-
-        return ret;
+        return panfrost_get_job_mappings(file_priv, job);
 }
 
 /**
-- 
2.31.1
[PATCH v3 1/7] drm/panfrost: Pass a job to panfrost_{acquire,attach}_object_fences()
So we don't have to change the prototype if we extend the function.

v3:
* Fix subject

Signed-off-by: Boris Brezillon
Reviewed-by: Steven Price
---
 drivers/gpu/drm/panfrost/panfrost_job.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 71a72fb50e6b..fdc1bd7ecf12 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -240,15 +240,13 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
         spin_unlock(&pfdev->js->job_lock);
 }
 
-static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
-                                          int bo_count,
-                                          struct xarray *deps)
+static int panfrost_acquire_object_fences(struct panfrost_job *job)
 {
         int i, ret;
 
-        for (i = 0; i < bo_count; i++) {
+        for (i = 0; i < job->bo_count; i++) {
                 /* panfrost always uses write mode in its current uapi */
-                ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
+                ret = drm_gem_fence_array_add_implicit(&job->deps, job->bos[i], true);
                 if (ret)
                         return ret;
         }
@@ -256,14 +254,12 @@ static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
         return 0;
 }
 
-static void panfrost_attach_object_fences(struct drm_gem_object **bos,
-                                          int bo_count,
-                                          struct dma_fence *fence)
+static void panfrost_attach_object_fences(struct panfrost_job *job)
 {
         int i;
 
-        for (i = 0; i < bo_count; i++)
-                dma_resv_add_excl_fence(bos[i]->resv, fence);
+        for (i = 0; i < job->bo_count; i++)
+                dma_resv_add_excl_fence(job->bos[i]->resv, job->render_done_fence);
 }
 
 int panfrost_job_push(struct panfrost_job *job)
@@ -290,8 +286,7 @@ int panfrost_job_push(struct panfrost_job *job)
 
         job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
 
-        ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
-                                             &job->deps);
+        ret = panfrost_acquire_object_fences(job);
         if (ret) {
                 mutex_unlock(&pfdev->sched_lock);
                 goto unlock;
         }
@@ -303,8 +298,7 @@ int panfrost_job_push(struct panfrost_job *job)
 
         mutex_unlock(&pfdev->sched_lock);
 
-        panfrost_attach_object_fences(job->bos, job->bo_count,
-                                      job->render_done_fence);
+        panfrost_attach_object_fences(job);
 
 unlock:
         drm_gem_unlock_reservations(job->bos, job->bo_count, &acquire_ctx);
-- 
2.31.1
[PATCH v3 0/7] drm/panfrost: Add a new submit ioctl
Hello,

This is an attempt at providing a new submit ioctl that's more
Vulkan-friendly than the existing one. This ioctl:

1/ allows passing several out syncobjs, so we can easily update several
   fences/semaphores in a single ioctl() call
2/ allows passing several jobs, so we don't have to issue one ioctl per
   job-chain recorded in the command buffer
3/ supports disabling implicit dependencies as well as non-exclusive
   access to BOs, thus removing unnecessary synchronization

I've also been looking at adding {IN,OUT}_FENCE_FD support (allowing one
to pass at most one sync_file object in input and/or creating a
sync_file FD embedding the render out fence), but it's not entirely
clear to me when that's useful. Indeed, we can already do the
sync_file <-> syncobj conversion using the
SYNCOBJ_{FD_TO_HANDLE,HANDLE_TO_FD} ioctls if we have to. Note that,
unlike Turnip, PanVk is using syncobjs to implement vkQueueWaitIdle(),
so the syncobj -> sync_file conversion doesn't have to happen for each
submission, but maybe there's a good reason to use sync_files for that
too. Any feedback on that aspect would be useful, I guess.

Any feedback on this new ioctl is welcome; in particular, do you think
other things are missing or would be nice to have for Vulkan?

Regards,

Boris

P.S.: basic igt tests for these new ioctls are available there [1]

[1] https://gitlab.freedesktop.org/bbrezillon/igt-gpu-tools/-/tree/panfrost-batch-submit

Changes in v3:
* Fix a deadlock in the submitqueue logic
* Limit the number of submitqueue per context to 16

Boris Brezillon (7):
  drm/panfrost: Pass a job to panfrost_{acquire,attach}_object_fences()
  drm/panfrost: Move the mappings collection out of panfrost_lookup_bos()
  drm/panfrost: Add BO access flags to relax dependencies between jobs
  drm/panfrost: Add the ability to create submit queues
  drm/panfrost: Add a new ioctl to submit batches
  drm/panfrost: Advertise the SYNCOBJ_TIMELINE feature
  drm/panfrost: Bump minor version to reflect the feature additions

 drivers/gpu/drm/panfrost/Makefile             |   3 +-
 drivers/gpu/drm/panfrost/panfrost_device.h    |   2 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c       | 463 ++
 drivers/gpu/drm/panfrost/panfrost_job.c       |  89 ++--
 drivers/gpu/drm/panfrost/panfrost_job.h       |  10 +-
 .../gpu/drm/panfrost/panfrost_submitqueue.c   | 136 +
 .../gpu/drm/panfrost/panfrost_submitqueue.h   |  27 +
 include/uapi/drm/panfrost_drm.h               | 103
 8 files changed, 689 insertions(+), 144 deletions(-)
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.h

-- 
2.31.1
Re: [Intel-gfx] [PATCH v2 3/3] drm/i915/uapi: reject set_domain for discrete
On 01/07/2021 16:10, Matthew Auld wrote:
> The CPU domain should be static for discrete, and on DG1 we don't need
> any flushing since everything is already coherent, so really all this

Knowledge of the write combine buffer is assumed to be had by anyone
involved?

> does is an object wait, for which we have an ioctl. Longer term the
> desired caching should be an immutable creation time property for the
> BO, which can be set with something like gem_create_ext.
>
> One other user is iris + userptr, which uses the set_domain to probe
> all the pages to check if the GUP succeeds, however keeping the
> set_domain around just for that seems rather scuffed. We could equally
> just submit a dummy batch, which should hopefully be good enough,
> otherwise adding a new creation time flag for userptr might be an
> option. Although longer term we will also have vm_bind, which should
> also be a nice fit for this, so adding a whole new flag is likely
> overkill.

Execbuf sounds horrible. But it all reminds me of past work by Chris
which is surprisingly hard to find in the archives. Patches like:

commit 7706a433388016983052a27c0fd74a64b1897ae7
Author: Chris Wilson
Date:   Wed Nov 8 17:04:07 2017 +

    drm/i915/userptr: Probe existence of backing struct pages upon creation

    Jason Ekstrand requested a more efficient method than userptr+set-domain
    to determine if the userptr object was backed by a complete set of
    pages upon creation. To be more efficient than simply populating the
    userptr using get_user_pages() (as done by the call to set-domain or
    execbuf), we can walk the tree of vm_area_struct and check for gaps or
    vma not backed by struct page (VM_PFNMAP). The question is how to
    handle VM_MIXEDMAP which may be either struct page or pfn backed...

commit 7ca21d3390eec23db99b8131ed18bc036efaba18
Author: Chris Wilson
Date:   Wed Nov 8 17:48:22 2017 +

    drm/i915/userptr: Add a flag to populate the userptr on creation

    Acquiring the backing struct pages for the userptr range is not free;
    the first client for userptr would insist on frequently creating
    userptr objects ahead of time and not use them. For that first client,
    deferring the cost of populating the userptr (calling
    get_user_pages()) to the actual execbuf was a substantial improvement.
    However, not all clients are the same, and most would like to validate
    that the userptr is valid and backed by struct pages upon creation, so
    offer a I915_USERPTR_POPULATE flag to do just that.

    Note that big difference between I915_USERPTR_POPULATE and the
    deferred scheme is that POPULATE is guaranteed to be synchronous, the
    result is known before the ioctl returns (and the handle exposed).
    However, due to system memory pressure, the object may be paged out
    before use, requiring them to be paged back in on execbuf (as may
    always happen).

At least with the first one I think I was skeptical, since probing at
point A makes a weak test versus userptr getting used at point B.
Populate is kind of same really when user controls the backing store.
At least these two arguments I think stand if we are trying to sell
these flags as validation. But if the idea is limited to pure preload,
with no guarantees that it keeps working by time of real use, then I
guess it may be passable.

Disclaimer that I haven't been following the story on why it is
desirable to abandon set domain. Only judging from this series, mmap
caching mode is implied from the object? Should set domain availability
be driven by the object backing store instead of outright rejection?

Regards,

Tvrtko

> Suggested-by: Daniel Vetter
> Signed-off-by: Matthew Auld
> Cc: Thomas Hellström
> Cc: Maarten Lankhorst
> Cc: Jordan Justen
> Cc: Kenneth Graunke
> Cc: Jason Ekstrand
> Cc: Daniel Vetter
> Cc: Ramalingam C
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> index 43004bef55cb..b684a62bf3b0 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> @@ -490,6 +490,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
>          u32 write_domain = args->write_domain;
>          int err;
>
> +        if (IS_DGFX(to_i915(dev)))
> +                return -ENODEV;
> +
>          /* Only handle setting domains to types used by the CPU. */
>          if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS)
>                  return -EINVAL;
Re: [PATCH v2] drm/dbi: Print errors for mipi_dbi_command()
Hi Linus,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.13 next-20210701]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Linus-Walleij/drm-dbi-Print-errors-for-mipi_dbi_command/20210702-180745
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 3dbdb38e286903ec220aaf1fb29a8d94297da246
config: arm64-randconfig-r001-20210702 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 9eb613b2de3163686b1a4bd1160f15ac56a4b083)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm64 cross compiling tool for clang build
        # apt-get install binutils-aarch64-linux-gnu
        # https://github.com/0day-ci/linux/commit/42d93a52e398adbb1fe2dfbc895c649cc8d42780
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Linus-Walleij/drm-dbi-Print-errors-for-mipi_dbi_command/20210702-180745
        git checkout 42d93a52e398adbb1fe2dfbc895c649cc8d42780
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross O=build_dir ARCH=arm64 SHELL=/bin/bash arch/arm64/kvm/ drivers/gpu/drm/tiny/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/tiny/st7586.c:260:2: error: member reference type 'struct mipi_dbi' is not a pointer; did you mean to use '.'?
           mipi_dbi_command(&dbidev->dbi, MIPI_DCS_SET_DISPLAY_OFF);
           ^                ~~~
   include/drm/drm_mipi_dbi.h:186:27: note: expanded from macro 'mipi_dbi_command'
           struct device *dev = &dbi->spi->dev; \
                                 ~~~^
>> drivers/gpu/drm/tiny/st7586.c:260:2: error: cannot take the address of an rvalue of type 'struct device *'
           mipi_dbi_command(&dbidev->dbi, MIPI_DCS_SET_DISPLAY_OFF);
           ^~~~
   include/drm/drm_mipi_dbi.h:186:23: note: expanded from macro 'mipi_dbi_command'
           struct device *dev = &dbi->spi->dev; \
                                ^~
   2 errors generated.

vim +260 drivers/gpu/drm/tiny/st7586.c

eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  246
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  247  static void st7586_pipe_disable(struct drm_simple_display_pipe *pipe)
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  248  {
84137b866e834a drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-07-22  249          struct mipi_dbi_dev *dbidev = drm_to_mipi_dbi_dev(pipe->crtc.dev);
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  250
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  251          /*
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  252           * This callback is not protected by drm_dev_enter/exit since we want to
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  253           * turn off the display on regular driver unload. It's highly unlikely
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  254           * that the underlying SPI controller is gone should this be called after
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  255           * unplug.
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  256           */
9d5645ad1b979c drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-02-25  257
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  258          DRM_DEBUG_KMS("\n");
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  259
84137b866e834a drivers/gpu/drm/tinydrm/st7586.c Noralf Trønnes 2019-07-22 @260          mipi_dbi_command(&dbidev->dbi, MIPI_DCS_SET_DISPLAY_OFF);
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  261  }
eac99d4a2013d9 drivers/gpu/drm/tinydrm/st7586.c David Lechner  2017-08-07  262

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [PATCH v2 4/7] drm/panfrost: Add the ability to create submit queues
On Fri, 2 Jul 2021 09:58:06 -0400 Alyssa Rosenzweig wrote:

> > > My Vulkan knowledge is limited so I'm not sure whether this is the
> > > right approach or not. In particular is it correct that an
> > > application can create a high priority queue which could affect
> > > other (normal priority) applications?
> >
> > That's what msm does (with no extra CAPS check AFAICT), and the
> > freedreno driver can already create high priority queues if
> > PIPE_CONTEXT_HIGH_PRIORITY is passed. Not saying that's okay to
> > allow userspace to tweak the priority, but if that's a problem,
> > other drivers are in trouble too ;-).
>
> Speaking of, how will PIPE_CONTEXT_HIGH_PRIORITY be implemented with
> the new ioctl()? I envisioned something much simpler (for the old
> ioctl), just adding a "high priority?" flag to the submit and
> internally creating the two queues of normal/high priority for
> drm_sched to work out. Is this juggling now moved to userspace?

That's what freedreno does. I guess we could create 2 default queues
(one normal and one high prio) and extend the old submit ioctl() to do
what you suggest if you see a good reason to not switch to the new
ioctl() directly. I mean, we'll have to keep support for both anyway,
but switching to the new ioctl() shouldn't be that hard (I can prepare
a MR transitioning the gallium driver to BATCH_SUBMIT if you want).
Re: [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+
On 02.07.2021 15:12, Martin Peres wrote: > On 02/07/2021 16:06, Michal Wajdeczko wrote: >> >> >> On 02.07.2021 10:13, Martin Peres wrote: >>> On 01/07/2021 21:24, Martin Peres wrote: >>> [...] > >> >>> + i915->params.enable_guc = ENABLE_GUC_LOAD_HUC; >>> + return; >>> + } >>> + >>> + /* Default: enable HuC authentication and GuC submission */ >>> + i915->params.enable_guc = ENABLE_GUC_LOAD_HUC | >>> ENABLE_GUC_SUBMISSION; >> >> This seems to be in contradiction with the GuC submission plan which >> states: >> >> "Not enabled by default on any current platforms but can be >> enabled via >> modparam enable_guc". >> > > I don't believe any current platform gets this point where GuC > submission would be enabled by default. The first would be ADL-P which > isn't out yet. Isn't that exactly what the line above does? >>> >>> In case you missed this crucial part of the review. Please answer the >>> above question. >> >> I guess there is some misunderstanding here, and I must admit I had >> similar doubt, but if you look beyond patch diff and check function code >> you will find that the very condition is: >> >> /* Don't enable GuC/HuC on pre-Gen12 */ >> if (GRAPHICS_VER(i915) < 12) { >> i915->params.enable_guc = 0; >> return; >> } >> >> so all pre-Gen12 platforms will continue to have GuC/HuC disabled. > > Thanks Michal, but then the problem is the other way: how can one enable > it on gen11? this code here converts default GuC auto mode (enable_guc=-1) into per platform desired (tested) GuC/HuC enables. to override that default, you may still use enable_guc=1 to explicitly enable GuC submission and since we also have this code: +static bool __guc_submission_supported(struct intel_guc *guc) +{ + /* GuC submission is unavailable for pre-Gen11 */ + return intel_guc_is_supported(guc) && + INTEL_GEN(guc_to_gt(guc)->i915) >= 11; +} it should work on any Gen11+. 
Michal > > I like what Daniele was going for here: separating the capability from > the user-requested value, but then it seems the patch stopped half way. > How about never touching the parameter, and having an AND between the two > values to get the effective enable_guc? > > Right now, the code is really confusing :s > > Thanks, > Martin > >> >> Thanks, >> Michal >>
Re: [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
Hi Nathan, On Thu, Jul 01, 2021 at 12:52:20AM -0700, Nathan Chancellor wrote: > On 7/1/2021 12:40 AM, Will Deacon wrote: > > On Wed, Jun 30, 2021 at 08:56:51AM -0700, Nathan Chancellor wrote: > > > On Wed, Jun 30, 2021 at 12:43:48PM +0100, Will Deacon wrote: > > > > On Wed, Jun 30, 2021 at 05:17:27PM +0800, Claire Chang wrote: > > > > > `BUG: unable to handle page fault for address: 003a8290` and > > > > > the fact it crashed at `_raw_spin_lock_irqsave` look like the memory > > > > > (maybe dev->dma_io_tlb_mem) was corrupted? > > > > > The dev->dma_io_tlb_mem should be set here > > > > > (https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/pci/probe.c#n2528) > > > > > through device_initialize. > > > > > > > > I'm less sure about this. 'dma_io_tlb_mem' should be pointing at > > > > 'io_tlb_default_mem', which is a page-aligned allocation from memblock. > > > > The spinlock is at offset 0x24 in that structure, and looking at the > > > > register dump from the crash: > > > > > > > > Jun 29 18:28:42 hp-4300G kernel: RSP: 0018:adb4013db9e8 EFLAGS: > > > > 00010006 > > > > Jun 29 18:28:42 hp-4300G kernel: RAX: 003a8290 RBX: > > > > RCX: 8900572ad580 > > > > Jun 29 18:28:42 hp-4300G kernel: RDX: 89005653f024 RSI: > > > > 000c RDI: 1d17 > > > > Jun 29 18:28:42 hp-4300G kernel: RBP: 0a20d000 R08: > > > > 000c R09: > > > > Jun 29 18:28:42 hp-4300G kernel: R10: 0a20d000 R11: > > > > 89005653f000 R12: 0212 > > > > Jun 29 18:28:42 hp-4300G kernel: R13: 1000 R14: > > > > 0002 R15: 0020 > > > > Jun 29 18:28:42 hp-4300G kernel: FS: 7f1f8898ea40() > > > > GS:89005728() knlGS: > > > > Jun 29 18:28:42 hp-4300G kernel: CS: 0010 DS: ES: CR0: > > > > 80050033 > > > > Jun 29 18:28:42 hp-4300G kernel: CR2: 003a8290 CR3: > > > > 0001020d CR4: 00350ee0 > > > > Jun 29 18:28:42 hp-4300G kernel: Call Trace: > > > > Jun 29 18:28:42 hp-4300G kernel: _raw_spin_lock_irqsave+0x39/0x50 > > > > Jun 29 18:28:42 hp-4300G kernel: swiotlb_tbl_map_single+0x12b/0x4c0 > > > > > 
> > > Then that correlates with R11 holding the 'dma_io_tlb_mem' pointer and > > > > RDX pointing at the spinlock. Yet RAX is holding junk :/ > > > > > > > > I agree that enabling KASAN would be a good idea, but I also think we > > > > probably need to get some more information out of > > > > swiotlb_tbl_map_single() > > > > to see what exactly is going wrong in there. > > > > > > I can certainly enable KASAN and if there is any debug print I can add > > > or dump anything, let me know! > > > > I bit the bullet and took v5.13 with swiotlb/for-linus-5.14 merged in, built > > x86 defconfig and ran it on my laptop. However, it seems to work fine! > > > > Please can you share your .config? > > Sure thing, it is attached. It is just Arch Linux's config run through > olddefconfig. The original is below in case you need to diff it. > > https://raw.githubusercontent.com/archlinux/svntogit-packages/9045405dc835527164f3034b3ceb9a67c7a53cd4/trunk/config > > If there is anything more that I can provide, please let me know. I eventually got this booting (for some reason it was causing LD to SEGV trying to link it for a while...) and sadly it works fine on my laptop. Hmm. Did you manage to try again with KASAN? It might also be worth taking the IOMMU out of the equation, since that interfaces differently with SWIOTLB and I couldn't figure out the code path from the log you provided. What happens if you boot with "amd_iommu=off swiotlb=force"? (although word of warning here: i915 dies horribly on my laptop if I pass swiotlb=force, even with the distro 5.10 kernel) Will
Re: [PATCH v2 4/7] drm/panfrost: Add the ability to create submit queues
> > My Vulkan knowledge is limited so I'm not sure whether this is the right > > approach or not. In particular is it correct that an application can > > create a high priority queue which could affect other (normal priority) > > applications? > > That's what msm does (with no extra CAPS check AFAICT), and the > freedreno driver can already create high priority queues if > PIPE_CONTEXT_HIGH_PRIORITY is passed. Not saying that's okay to allow > userspace to tweak the priority, but if that's a problem, other drivers > are in trouble too ;-). Speaking of, how will PIPE_CONTEXT_HIGH_PRIORITY be implemented with the new ioctl()? I envisioned something much simpler (for the old ioctl), just adding a "high priority?" flag to the submit and internally creating the two queues of normal/high priority for drm_sched to work out. Is this juggling now moved to userspace?
[PATCH v3] drm/dbi: Print errors for mipi_dbi_command()
The macro mipi_dbi_command() does not report errors unless you wrap it in another macro to do the error reporting. Report a rate-limited error so we know what is going on. Drop the only user in DRM using mipi_dbi_command() and actually checking the error explicitly, let it use mipi_dbi_command_buf() directly instead. After this any code wishing to send command arrays can rely on mipi_dbi_command() providing an appropriate error message if something goes wrong. Suggested-by: Noralf Trønnes Suggested-by: Douglas Anderson Signed-off-by: Linus Walleij --- ChangeLog v2->v3: - Make the macro actually return the error value if need be, by putting a single ret; at the end of the macro. (Neat trick from StackOverflow!) - Switch the site where I switched mipi_dbi_command() to mipi_dbi_command_buf() back to what it was. - Print the failed command in the error message. - Put the dbi in (parens) since drivers/gpu/drm/tiny/st7586.c was passing >dbi as parameter to mipi_dbi_command() and this would expand to struct device *dev = &>dbi->spi->dev which can't be parsed but struct device *dev = &(>dbi)->spi-dev; should work. I hope. ChangeLog v1->v2: - Fish out the struct device * from the DBI SPI client and use that to print the errors associated with the SPI device. --- include/drm/drm_mipi_dbi.h | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/include/drm/drm_mipi_dbi.h b/include/drm/drm_mipi_dbi.h index f543d6e3e822..05e194958265 100644 --- a/include/drm/drm_mipi_dbi.h +++ b/include/drm/drm_mipi_dbi.h @@ -183,7 +183,12 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer *fb, #define mipi_dbi_command(dbi, cmd, seq...) 
 \
 ({ \
 	const u8 d[] = { seq }; \
-	mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
+	struct device *dev = &(dbi)->spi->dev; \
+	int ret; \
+	ret = mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
+	if (ret) \
+		dev_err_ratelimited(dev, "error %d when sending command %#02x\n", ret, cmd); \
+	ret; \
 })
 
 #ifdef CONFIG_DEBUG_FS
-- 
2.31.1
Re: [PATCH v15 12/12] of: Add plumbing for restricted DMA pool
On 7/2/21 6:18 AM, Will Deacon wrote: On Fri, Jul 02, 2021 at 12:39:41PM +0100, Robin Murphy wrote: On 2021-07-02 04:08, Guenter Roeck wrote: On Thu, Jun 24, 2021 at 11:55:26PM +0800, Claire Chang wrote: If a device is not behind an IOMMU, we look up the device node and set up the restricted DMA when the restricted-dma-pool is presented. Signed-off-by: Claire Chang Tested-by: Stefano Stabellini Tested-by: Will Deacon With this patch in place, all sparc and sparc64 qemu emulations fail to boot. Symptom is that the root file system is not found. Reverting this patch fixes the problem. Bisect log is attached. Ah, OF_ADDRESS depends on !SPARC, so of_dma_configure_id() is presumably returning an unexpected -ENODEV from the of_dma_set_restricted_buffer() stub. That should probably be returning 0 instead, since either way it's not an error condition for it to simply do nothing. Something like below? Yes, that does the trick. Will --->8 From 4d9dcb9210c1f37435b6088284e04b6b36ee8c4d Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Fri, 2 Jul 2021 14:13:28 +0100 Subject: [PATCH] of: Return success from of_dma_set_restricted_buffer() when !OF_ADDRESS When CONFIG_OF_ADDRESS=n, of_dma_set_restricted_buffer() returns -ENODEV and breaks the boot for sparc[64] machines. Return 0 instead, since the function is essentially a glorified NOP in this configuration. 
Cc: Claire Chang
Cc: Konrad Rzeszutek Wilk
Reported-by: Guenter Roeck
Suggested-by: Robin Murphy
Link: https://lore.kernel.org/r/20210702030807.ga2685...@roeck-us.net
Signed-off-by: Will Deacon
Tested-by: Guenter Roeck
---
 drivers/of/of_private.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index 8fde97565d11..34dd548c5eac 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -173,7 +173,8 @@ static inline int of_dma_get_range(struct device_node *np,
 static inline int of_dma_set_restricted_buffer(struct device *dev,
 					       struct device_node *np)
 {
-	return -ENODEV;
+	/* Do nothing, successfully. */
+	return 0;
 }
 #endif
Re: [PATCH v2 3/3] drm/i915/uapi: reject set_domain for discrete
On Thu, 1 Jul 2021 at 16:10, Matthew Auld wrote: > > The CPU domain should be static for discrete, and on DG1 we don't need > any flushing since everything is already coherent, so really all this > does is an object wait, for which we have an ioctl. Longer term the > desired caching should be an immutable creation time property for the > BO, which can be set with something like gem_create_ext. > > One other user is iris + userptr, which uses the set_domain to probe all > the pages to check if the GUP succeeds, however keeping the set_domain > around just for that seems rather scuffed. We could equally just submit > a dummy batch, which should hopefully be good enough, otherwise adding a > new creation time flag for userptr might be an option. Although longer > term we will also have vm_bind, which should also be a nice fit for > this, so adding a whole new flag is likely overkill. Kenneth, do you have a preference for the iris + userptr use case? Adding the flag shouldn't be much work, if you feel the dummy batch is too ugly. I don't mind either way. > > Suggested-by: Daniel Vetter > Signed-off-by: Matthew Auld > Cc: Thomas Hellström > Cc: Maarten Lankhorst > Cc: Jordan Justen > Cc: Kenneth Graunke > Cc: Jason Ekstrand > Cc: Daniel Vetter > Cc: Ramalingam C > --- > drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c > b/drivers/gpu/drm/i915/gem/i915_gem_domain.c > index 43004bef55cb..b684a62bf3b0 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c > @@ -490,6 +490,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void > *data, > u32 write_domain = args->write_domain; > int err; > > + if (IS_DGFX(to_i915(dev))) > + return -ENODEV; > + > /* Only handle setting domains to types used by the CPU. */ > if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS) > return -EINVAL; > -- > 2.26.3 >
Re: [PATCH v15 12/12] of: Add plumbing for restricted DMA pool
On Fri, Jul 02, 2021 at 12:39:41PM +0100, Robin Murphy wrote: > On 2021-07-02 04:08, Guenter Roeck wrote: > > On Thu, Jun 24, 2021 at 11:55:26PM +0800, Claire Chang wrote: > > > If a device is not behind an IOMMU, we look up the device node and set > > > up the restricted DMA when the restricted-dma-pool is presented. > > > > > > Signed-off-by: Claire Chang > > > Tested-by: Stefano Stabellini > > > Tested-by: Will Deacon > > > > With this patch in place, all sparc and sparc64 qemu emulations > > fail to boot. Symptom is that the root file system is not found. > > Reverting this patch fixes the problem. Bisect log is attached. > > Ah, OF_ADDRESS depends on !SPARC, so of_dma_configure_id() is presumably > returning an unexpected -ENODEV from the of_dma_set_restricted_buffer() > stub. That should probably be returning 0 instead, since either way it's not > an error condition for it to simply do nothing. Something like below? Will --->8 >From 4d9dcb9210c1f37435b6088284e04b6b36ee8c4d Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Fri, 2 Jul 2021 14:13:28 +0100 Subject: [PATCH] of: Return success from of_dma_set_restricted_buffer() when !OF_ADDRESS When CONFIG_OF_ADDRESS=n, of_dma_set_restricted_buffer() returns -ENODEV and breaks the boot for sparc[64] machines. Return 0 instead, since the function is essentially a glorified NOP in this configuration. 
Cc: Claire Chang
Cc: Konrad Rzeszutek Wilk
Reported-by: Guenter Roeck
Suggested-by: Robin Murphy
Link: https://lore.kernel.org/r/20210702030807.ga2685...@roeck-us.net
Signed-off-by: Will Deacon
---
 drivers/of/of_private.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index 8fde97565d11..34dd548c5eac 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -173,7 +173,8 @@ static inline int of_dma_get_range(struct device_node *np,
 static inline int of_dma_set_restricted_buffer(struct device *dev,
 					       struct device_node *np)
 {
-	return -ENODEV;
+	/* Do nothing, successfully. */
+	return 0;
 }
 #endif
-- 
2.32.0.93.g670b81a890-goog
Re: [PATCH 2/2] drm/vc4: hdmi: Convert to gpiod
Hi Nathan, On Thu, Jul 01, 2021 at 08:29:34PM -0700, Nathan Chancellor wrote: > On Mon, May 24, 2021 at 03:18:52PM +0200, Maxime Ripard wrote: > > The new gpiod interface takes care of parsing the GPIO flags and to > > return the logical value when accessing an active-low GPIO, so switching > > to it simplifies a lot the driver. > > > > Signed-off-by: Maxime Ripard > > --- > > drivers/gpu/drm/vc4/vc4_hdmi.c | 24 +++- > > drivers/gpu/drm/vc4/vc4_hdmi.h | 3 +-- > > 2 files changed, 8 insertions(+), 19 deletions(-) > > > > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c > > index ccc6c8079dc6..34622c59f6a7 100644 > > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c > > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c > > @@ -159,10 +159,9 @@ vc4_hdmi_connector_detect(struct drm_connector > > *connector, bool force) > > struct vc4_hdmi *vc4_hdmi = connector_to_vc4_hdmi(connector); > > bool connected = false; > > > > - if (vc4_hdmi->hpd_gpio) { > > - if (gpio_get_value_cansleep(vc4_hdmi->hpd_gpio) ^ > > - vc4_hdmi->hpd_active_low) > > - connected = true; > > + if (vc4_hdmi->hpd_gpio && > > + gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio)) { > > + connected = true; > > } else if (drm_probe_ddc(vc4_hdmi->ddc)) { > > connected = true; > > } else if (HDMI_READ(HDMI_HOTPLUG) & VC4_HDMI_HOTPLUG_CONNECTED) { > > @@ -1993,7 +1992,6 @@ static int vc4_hdmi_bind(struct device *dev, struct > > device *master, void *data) > > struct vc4_hdmi *vc4_hdmi; > > struct drm_encoder *encoder; > > struct device_node *ddc_node; > > - u32 value; > > int ret; > > > > vc4_hdmi = devm_kzalloc(dev, sizeof(*vc4_hdmi), GFP_KERNEL); > > @@ -2031,18 +2029,10 @@ static int vc4_hdmi_bind(struct device *dev, struct > > device *master, void *data) > > /* Only use the GPIO HPD pin if present in the DT, otherwise > > * we'll use the HDMI core's register. 
> > */ > > - if (of_find_property(dev->of_node, "hpd-gpios", )) { > > - enum of_gpio_flags hpd_gpio_flags; > > - > > - vc4_hdmi->hpd_gpio = of_get_named_gpio_flags(dev->of_node, > > -"hpd-gpios", 0, > > -_gpio_flags); > > - if (vc4_hdmi->hpd_gpio < 0) { > > - ret = vc4_hdmi->hpd_gpio; > > - goto err_put_ddc; > > - } > > - > > - vc4_hdmi->hpd_active_low = hpd_gpio_flags & OF_GPIO_ACTIVE_LOW; > > + vc4_hdmi->hpd_gpio = devm_gpiod_get_optional(dev, "hpd", GPIOD_IN); > > + if (IS_ERR(vc4_hdmi->hpd_gpio)) { > > + ret = PTR_ERR(vc4_hdmi->hpd_gpio); > > + goto err_put_ddc; > > } > > > > vc4_hdmi->disable_wifi_frequencies = > > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h b/drivers/gpu/drm/vc4/vc4_hdmi.h > > index 060bcaefbeb5..2688a55461d6 100644 > > --- a/drivers/gpu/drm/vc4/vc4_hdmi.h > > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.h > > @@ -146,8 +146,7 @@ struct vc4_hdmi { > > /* VC5 Only */ > > void __iomem *rm_regs; > > > > - int hpd_gpio; > > - bool hpd_active_low; > > + struct gpio_desc *hpd_gpio; > > > > /* > > * On some systems (like the RPi4), some modes are in the same > > -- > > 2.31.1 > > This patch as commit 6800234ceee0 ("drm/vc4: hdmi: Convert to gpiod") > causes my Raspberry Pi 3 to lock up shortly after boot in combination > with commit 411efa18e4b0 ("drm/vc4: hdmi: Move the HSM clock enable to > runtime_pm"). The serial console and ssh are completely unresponsive and > I do not see any messages in dmesg with "debug ignore_loglevel". The > device is running with a 32-bit kernel (multi_v7_defconfig) with 32-bit > userspace. If there is any further information that I can provide, > please let me know. Thanks for reporting this. 
The same bug has been reported on wednesday on the RPi repo here: https://github.com/raspberrypi/linux/pull/4418 More specifically, this commit should fix it: https://github.com/raspberrypi/linux/pull/4418/commits/6d404373c20a794da3d6a7b4f1373903183bb5d0 Even though it's based on the 5.10 kernel, it should apply without any warning on a mainline tree. Let me know if it fixes your issue too Maxime signature.asc Description: PGP signature
Re: [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+
On 02/07/2021 16:06, Michal Wajdeczko wrote: On 02.07.2021 10:13, Martin Peres wrote: On 01/07/2021 21:24, Martin Peres wrote: [...] + i915->params.enable_guc = ENABLE_GUC_LOAD_HUC; + return; + } + + /* Default: enable HuC authentication and GuC submission */ + i915->params.enable_guc = ENABLE_GUC_LOAD_HUC | ENABLE_GUC_SUBMISSION; This seems to be in contradiction with the GuC submission plan which states: "Not enabled by default on any current platforms but can be enabled via modparam enable_guc". I don't believe any current platform gets to this point where GuC submission would be enabled by default. The first would be ADL-P which isn't out yet. Isn't that exactly what the line above does? In case you missed this crucial part of the review. Please answer the above question. I guess there is some misunderstanding here, and I must admit I had a similar doubt, but if you look beyond the patch diff and check function code you will find that the very condition is: /* Don't enable GuC/HuC on pre-Gen12 */ if (GRAPHICS_VER(i915) < 12) { i915->params.enable_guc = 0; return; } so all pre-Gen12 platforms will continue to have GuC/HuC disabled. Thanks Michal, but then the problem is the other way: how can one enable it on gen11? I like what Daniele was going for here: separating the capability from the user-requested value, but then it seems the patch stopped half way. How about never touching the parameter, and having an AND between the two values to get the effective enable_guc? Right now, the code is really confusing :s Thanks, Martin Thanks, Michal
Re: [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+
On 02.07.2021 10:13, Martin Peres wrote: > On 01/07/2021 21:24, Martin Peres wrote: > [...] >>> > + i915->params.enable_guc = ENABLE_GUC_LOAD_HUC; > + return; > + } > + > + /* Default: enable HuC authentication and GuC submission */ > + i915->params.enable_guc = ENABLE_GUC_LOAD_HUC | > ENABLE_GUC_SUBMISSION; This seems to be in contradiction with the GuC submission plan which states: "Not enabled by default on any current platforms but can be enabled via modparam enable_guc". >>> >>> I don't believe any current platform gets to this point where GuC >>> submission would be enabled by default. The first would be ADL-P which >>> isn't out yet. >> >> Isn't that exactly what the line above does? > > In case you missed this crucial part of the review. Please answer the > above question. I guess there is some misunderstanding here, and I must admit I had a similar doubt, but if you look beyond the patch diff and check function code you will find that the very condition is: /* Don't enable GuC/HuC on pre-Gen12 */ if (GRAPHICS_VER(i915) < 12) { i915->params.enable_guc = 0; return; } so all pre-Gen12 platforms will continue to have GuC/HuC disabled. Thanks, Michal
[ANNOUNCE] libdrm 2.4.107
Alex Deucher (1): amdgpu: update marketing names Andrey Grodzovsky (6): tests/amdgpu: Fix valgrind warning test/amdgpu: Add helper functions for hot unplug test/amdgpu/hotunplug: Add test suite for GPU unplug tests/amdgpu/hotunplug: Add unplug with cs test. tests/amdgpu/hotunplug: Add hotunplug with exported bo test tests/amdgpu/hotunplug: Add hotunplug with exported fence Bas Nieuwenhuizen (2): amdgpu: Add vamgr for capture/replay. Bump version to 2.4.107 Eleni Maria Stea (3): include in xf86drmMode when the OS is FreeBSD _WANT_KERNEL_ERRNO must be defined in FreeBSD for ERESTART to be used Conditionally include and on Linux, BSD Lang Yu (1): Revert "tests/amdgpu: fix bo eviction test issue" Marius Vlad (6): xf86drm: Add a human readable representation for format modifiers xf86drm: Add a vendor function to decode the format modifier xf86drm: Add support for decoding Nvidia format modifiers xf86drm: Add support for decoding AMD format modifiers xf86drm: Add support for decoding AMLOGIC format modifiers README.rst: Include some notes about syncing uapi headers Rahul Kumar (1): amdgpu: Added product name for E9390,E9560 and E9565 dgpu Tejas Upadhyay (1): intel: Add support for ADLP git tag: libdrm-2.4.107 https://dri.freedesktop.org/libdrm/libdrm-2.4.107.tar.xz SHA256: c554cef03b033636a975543eab363cc19081cb464595d3da1ec129f87370f888 libdrm-2.4.107.tar.xz SHA512: c7542ba15c4c934519a6a1f3cb1ec21effa820a805a030d0175313bb1cc796cd311f39596ead883f9f251679d701e262894c5a297d5cf45093c80a6cd818def0 libdrm-2.4.107.tar.xz PGP: https://dri.freedesktop.org/libdrm/libdrm-2.4.107.tar.xz.sig -BEGIN PGP SIGNATURE- iQIzBAEBCAAdFiEEiZqBCQC4FYB3QubYlaZ3ojCsSqkFAmDfDLAACgkQlaZ3ojCs Sqlsrw/+MnflXdeAGkMJYbCDb/mMhItR0zWh7KFTML2q+qgnCckAziEgmV0GNPYn ahw64WfDg0HMyYecZFYRJug0+gja2jWFBXirSAM6GbFwO5aEegOAwUa0D+LpQm0E BfJyccW52XT926shsWdIi+hJreboPzgPh7N9cs/8lz5NhrTQWUCVFyeQBW/1nlI/ rvXzJFQPKwgPOlUQiCub9dHSf4EcIMj2hCRukjj5g0hxBINQrJUxmashMnIoBaho 
SMx4AeVofIWwXOEDqJ68aF9R1NqL2b97FCOV60w8vmjvM5w8aJn2PW1LHsUZsSr8 Ztxh0FTTm9QBghVHu+7JYOFIy5kqdN3PRVUd9hjSmxf2dAq/wjDiogr8RL5lKT83 PD63aCj0guF8rQgNMLN+g/lfpv462l+eeWiiO/2ci6nFh9e7nusu2jE0ZiUPcGll M2UoJ1agJcI6TM3zVm6iYiGuE50rz+7ZKXnHpkwMYQQeIXRprUbOP1d+NepFCVt3 bf2Ad7FebtYnduwJfq4gEC4FoJEVDV26tuJy9je30n55/KjmiD+M/HSbIDCElTrp CzOMvrUcChm6VWZ1jkjfn2cUijxWlIzoeFrir9ci2quGWcjhHl5TBUI+nvi7lmWg EdjoYgthFxjJ7HGwJNkB+oKfuTD7r32DRZJXaIdwxxQBo0XpzlE= =48d6 -END PGP SIGNATURE-
Re: [Intel-gfx] [PATCH 08/53] drm/i915/xehp: Extra media engines - Part 2 (interrupts)
On 01/07/2021 21:23, Matt Roper wrote: From: John Harrison Xe_HP can have a lot of extra media engines. This patch adds the interrupt handler support for them. Cc: Tvrtko Ursulin Cc: Daniele Ceraolo Spurio Signed-off-by: John Harrison Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt_irq.c | 13 - drivers/gpu/drm/i915/i915_reg.h| 3 +++ 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c index c13462274fe8..b2de83be4d97 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c @@ -184,7 +184,13 @@ void gen11_gt_irq_reset(struct intel_gt *gt) intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK,~0); intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~0); intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~0); + if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5)) + intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~0); + if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7)) + intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~0); intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~0); + if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3)) + intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~0); intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0); intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK, ~0); @@ -218,8 +224,13 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt) intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~smask); intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~dmask); intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~dmask); + if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5)) + intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~dmask); + if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7)) + intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~dmask); intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~dmask); Poor 0-1 sandwiched between 4-7 and 2-3. 
;) With hopefully order restored: Reviewed-by: Tvrtko Ursulin Regards, Tvrtko - + if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3)) + intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~dmask); /* * RPS interrupts will get enabled/disabled on demand when RPS itself * is enabled/disabled. diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index d4546e871833..cb1716b6ce72 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -8076,7 +8076,10 @@ enum { #define GEN11_BCS_RSVD_INTR_MASK _MMIO(0x1900a0) #define GEN11_VCS0_VCS1_INTR_MASK _MMIO(0x1900a8) #define GEN11_VCS2_VCS3_INTR_MASK _MMIO(0x1900ac) +#define GEN12_VCS4_VCS5_INTR_MASK _MMIO(0x1900b0) +#define GEN12_VCS6_VCS7_INTR_MASK _MMIO(0x1900b4) #define GEN11_VECS0_VECS1_INTR_MASK _MMIO(0x1900d0) +#define GEN12_VECS2_VECS3_INTR_MASK_MMIO(0x1900d4) #define GEN11_GUC_SG_INTR_MASK_MMIO(0x1900e8) #define GEN11_GPM_WGBOXPERF_INTR_MASK _MMIO(0x1900ec) #define GEN11_CRYPTO_RSVD_INTR_MASK _MMIO(0x1900f0)
Re: [Intel-gfx] [PATCH 01/53] drm/i915: Add "release id" version
On 01/07/2021 21:23, Matt Roper wrote: From: Lucas De Marchi Besides the arch version returned by GRAPHICS_VER(), new platforms contain a "release id" to make clear the difference from one platform to another. Although for the first ones we may use them as if they were a What does "first ones" refer to here? major/minor version, that is not true for all platforms: we may have a `release_id == n` that is closer to `n - 2` than to `n - 1`. Hm this is a bit confusing. Is the sentence simply trying to say that, as the release id number is growing, hw capabilities are not simply accumulating but can be removed as well? Otherwise I am not sure how the user of these macros is supposed to act on this sentence. However the release id number is not defined by hardware until we start using the GMD_ID register. For the platforms before that register is useful we will set the values in software and we can set them as we please. So the plan is to set them so we can group different features under a single GRAPHICS_VER_FULL() check. After GMD_ID is used, the usefulness of a "full version check" will be greatly reduced and will be mostly used for deciding workarounds and a few code paths. So it makes sense to keep it as a separate field from graphics_ver. Also, currently there is not much use for the release id in media and display, so keep them out. This is a mix of 2 independent changes: one by me and the other by Matt Roper. 
Cc: Matt Roper Signed-off-by: Lucas De Marchi Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/i915_drv.h | 6 ++ drivers/gpu/drm/i915/intel_device_info.c | 2 ++ drivers/gpu/drm/i915/intel_device_info.h | 2 ++ 3 files changed, 10 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6dff4ca01241..9639800485b9 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1258,11 +1258,17 @@ static inline struct drm_i915_private *pdev_to_i915(struct pci_dev *pdev) */ #define IS_GEN(dev_priv, n) (GRAPHICS_VER(dev_priv) == (n)) +#define IP_VER(ver, release) ((ver) << 8 | (release)) + #define GRAPHICS_VER(i915)(INTEL_INFO(i915)->graphics_ver) +#define GRAPHICS_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->graphics_ver, \ + INTEL_INFO(i915)->graphics_ver_release) #define IS_GRAPHICS_VER(i915, from, until) \ (GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until)) #define MEDIA_VER(i915) (INTEL_INFO(i915)->media_ver) +#define MEDIA_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->media_ver, \ + INTEL_INFO(i915)->media_ver_release) #define IS_MEDIA_VER(i915, from, until) \ (MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until)) diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index 7eaa92fee421..e8ad14f002c1 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ b/drivers/gpu/drm/i915/intel_device_info.c @@ -97,7 +97,9 @@ void intel_device_info_print_static(const struct intel_device_info *info, struct drm_printer *p) { drm_printf(p, "graphics_ver: %u\n", info->graphics_ver); + drm_printf(p, "graphics_ver_release: %u\n", info->graphics_ver_release); I get the VER and VER_FULL in the macros but could 'ver' and 'ver_release' here and in the code simply be renamed to 'ver'/'version' and 'release'? Maybe it is just me but don't think I encountered the term "version release" before. 
Regards, Tvrtko drm_printf(p, "media_ver: %u\n", info->media_ver); + drm_printf(p, "media_ver_release: %u\n", info->media_ver_release); drm_printf(p, "display_ver: %u\n", info->display.ver); drm_printf(p, "gt: %d\n", info->gt); drm_printf(p, "iommu: %s\n", iommu_name()); diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index b326aff65cd6..944a5ff4df49 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -162,7 +162,9 @@ enum intel_ppgtt_type { struct intel_device_info { u8 graphics_ver; + u8 graphics_ver_release; u8 media_ver; + u8 media_ver_release; u8 gt; /* GT number, 0 if undefined */ intel_engine_mask_t platform_engine_mask; /* Engines supported by the HW */
Re: [Intel-gfx] [PATCH 07/53] drm/i915/xehp: Extra media engines - Part 1 (engine definitions)
On 01/07/2021 21:23, Matt Roper wrote: From: John Harrison Xe_HP can have a lot of extra media engines. This patch adds the basic definitions for them. Cc: Tvrtko Ursulin Signed-off-by: John Harrison Signed-off-by: Tomas Winkler Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 7 ++- drivers/gpu/drm/i915/gt/intel_engine_cs.c| 50 drivers/gpu/drm/i915/gt/intel_engine_types.h | 14 -- drivers/gpu/drm/i915/i915_reg.h | 6 +++ 4 files changed, 69 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c index 87b06572fd2e..35edc55720f4 100644 --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c @@ -279,7 +279,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode) if (mode & EMIT_INVALIDATE) aux_inv = rq->engine->mask & ~BIT(BCS0); if (aux_inv) - cmd += 2 * hweight8(aux_inv) + 2; + cmd += 2 * hweight32(aux_inv) + 2; cs = intel_ring_begin(rq, cmd); if (IS_ERR(cs)) @@ -313,9 +313,8 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode) struct intel_engine_cs *engine; unsigned int tmp; - *cs++ = MI_LOAD_REGISTER_IMM(hweight8(aux_inv)); - for_each_engine_masked(engine, rq->engine->gt, - aux_inv, tmp) { + *cs++ = MI_LOAD_REGISTER_IMM(hweight32(aux_inv)); + for_each_engine_masked(engine, rq->engine->gt, aux_inv, tmp) { *cs++ = i915_mmio_reg_offset(aux_inv_reg(engine)); *cs++ = AUX_INV; } diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 4ab2c9abb943..6e2aa1acc4d4 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -104,6 +104,38 @@ static const struct engine_info intel_engines[] = { { .graphics_ver = 11, .base = GEN11_BSD4_RING_BASE } }, }, + [VCS4] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_DECODE_CLASS, + .instance = 4, + .mmio_bases = { + { .graphics_ver = 11, .base = 
XEHP_BSD5_RING_BASE } + }, + }, + [VCS5] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_DECODE_CLASS, + .instance = 5, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_BSD6_RING_BASE } + }, + }, + [VCS6] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_DECODE_CLASS, + .instance = 6, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_BSD7_RING_BASE } + }, + }, + [VCS7] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_DECODE_CLASS, + .instance = 7, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_BSD8_RING_BASE } + }, + }, [VECS0] = { .hw_id = VECS0_HW, .class = VIDEO_ENHANCEMENT_CLASS, @@ -121,6 +153,22 @@ static const struct engine_info intel_engines[] = { { .graphics_ver = 11, .base = GEN11_VEBOX2_RING_BASE } }, }, + [VECS2] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_ENHANCEMENT_CLASS, + .instance = 2, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_VEBOX3_RING_BASE } + }, + }, + [VECS3] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_ENHANCEMENT_CLASS, + .instance = 3, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_VEBOX4_RING_BASE } + }, + }, }; /** @@ -269,6 +317,8 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id) BUILD_BUG_ON(MAX_ENGINE_CLASS >= BIT(GEN11_ENGINE_CLASS_WIDTH)); BUILD_BUG_ON(MAX_ENGINE_INSTANCE >= BIT(GEN11_ENGINE_INSTANCE_WIDTH)); + BUILD_BUG_ON(I915_MAX_VCS > (MAX_ENGINE_INSTANCE + 1)); + BUILD_BUG_ON(I915_MAX_VECS > (MAX_ENGINE_INSTANCE + 1)); if (GEM_DEBUG_WARN_ON(id >= ARRAY_SIZE(gt->engine))) return -EINVAL; diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 5b91068ab277..b25f594a7e4b 100644 ---
Re: [Intel-gfx] [drm-intel:drm-intel-gt-next 8/14] drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: uninitialized symbol 'rq'.
On Fri, Jul 02, 2021 at 12:34:33PM +0100, Matthew Auld wrote: > > > > cf586021642d80 Chris Wilson 2021-06-17 85 err = > > > > fn(migrate, &ww, src, dst, &rq); > > > > cf586021642d80 Chris Wilson 2021-06-17 86 if (!err) > > > > cf586021642d80 Chris Wilson 2021-06-17 87 > > > > continue; > > > > > > > > Does fn() initialize "rq" on the success path? Anyway, Smatch would > > > > complain either way because it thinks the list could be empty or that we > > > > might hit an early continue for everything. > > > > > > The fn() will always first initialize the rq to NULL. If it returns > > > success then rq will always be a valid rq. If it returns an err then > > > the rq might be NULL, or a valid rq depending on how far the copy/fn > > > got. > > > > > > And for_i915_gem_ww() will always run at least once, since ww->loop = > > > true, so this looks like a false positive? > > > > You don't think i915_gem_object_lock(), i915_gem_object_pin_map() or > > i915_gem_object_pin_map() can fail? > > Yeah, they can totally fail but then we most likely just hit the > err_out. The for_i915_gem_ww() is a little strange since it's not > really looping over anything, it's just about retrying the block if we > see -EDEADLK (which involves dropping some locks), if we see any other > error then the loop is terminated with ww->loop = false, which then > hits the goto err_out. > Ah, yeah, you're right. False positive. I hadn't looked at this code in context (I had only reviewed the email). Now that I've pulled the tree and looked at the code, I'm sort of surprised that Smatch generates a warning... I will investigate some more. Thanks! regards, dan carpenter
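[For reference, the initialization Dan suggested earlier in the thread can be sketched outside the kernel. This is a minimal model, not the i915 selftest code; `copy_model`, the iteration count, and the `-22` stand-in for -EINVAL are illustrative only.]

```c
#include <assert.h>

/*
 * Minimal model of the pattern Smatch worries about: if the loop body
 * never runs (empty list / immediate error), anything assigned only
 * inside the body stays uninitialized.  Pre-initializing err (here to
 * -22, standing in for -EINVAL) makes the zero-iteration path
 * well-defined, which is the suggested way to silence the warning.
 */
static int copy_model(int iterations)
{
	int err = -22;	/* well-defined even if the loop never runs */
	int i;

	for (i = 0; i < iterations; ++i)
		err = 0;	/* stands in for the real fn() work */

	return err;
}
```

With zero iterations the caller sees the pre-initialized error instead of garbage, which is exactly what makes the empty-loop path analyzable.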
Re: [Intel-gfx] [PATCH 05/53] drm/i915/gen12: Use fuse info to enable SFC
On 01/07/2021 21:23, Matt Roper wrote: From: Venkata Sandeep Dhanalakota In Gen12 there are various fuse combinations, and in each configuration a vdbox engine may be connected to an SFC depending on which engines are available, so we need to set the SFC capability based on the fuse value read from the hardware. Even numbered physical instances always have an SFC; odd numbered physical instances have an SFC only if the previous even instance is fused off. Just a few nits. Bspec: 48028 Cc: Tvrtko Ursulin Cc: Daniele Ceraolo Spurio Signed-off-by: Venkata Sandeep Dhanalakota Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 30 ++- 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 151870d8fdd3..4ab2c9abb943 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -442,6 +442,28 @@ void intel_engines_free(struct intel_gt *gt) } } +static inline Inline is not desired here. +bool vdbox_has_sfc(struct drm_i915_private *i915, unsigned int physical_vdbox, + unsigned int logical_vdbox, u16 vdbox_mask) +{ I'd be tempted to prefix the function name with gen11_ so it is clearer it does not apply to earlier gens. Because if looking just at the diff out of context below, one can wonder if there is a functional change or not. There isn't, because there is a bailout for gen < 11 early in init_engine_mask(), but perhaps a gen11 function name prefix would make this a bit more self-documenting. + /* +* In Gen11, only even numbered logical VDBOXes are hooked +* up to an SFC (Scaler & Format Converter) unit. +* In Gen12, even numbered physical instances are always connected +* to an SFC. Odd numbered physical instances have an SFC only if the +* previous even instance is fused off. 
+*/ + if (GRAPHICS_VER(i915) == 12) { + return (physical_vdbox % 2 == 0) || + !(BIT(physical_vdbox - 1) & vdbox_mask); + } else if (GRAPHICS_VER(i915) == 11) { + return logical_vdbox % 2 == 0; + } No need for curlies on these branches. + + MISSING_CASE(GRAPHICS_VER(i915)); + return false; +} + /* * Determine which engines are fused off in our particular hardware. * Note that we have a catch-22 situation where we need to be able to access @@ -493,13 +515,9 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt) continue; } - /* -* In Gen11, only even numbered logical VDBOXes are -* hooked up to an SFC (Scaler & Format Converter) unit. -* In TGL each VDBOX has access to an SFC. -*/ - if (GRAPHICS_VER(i915) >= 12 || logical_vdbox++ % 2 == 0) + if (vdbox_has_sfc(i915, i, logical_vdbox, vdbox_mask)) gt->info.vdbox_sfc_access |= BIT(i); + logical_vdbox++; } drm_dbg(&i915->drm, "vdbox enable: %04x, instances: %04lx\n", vdbox_mask, VDBOX_MASK(gt)); Regards, Tvrtko
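[As a side note, the gen11/gen12 rules the patch describes can be modelled in isolation. This is a hedged sketch, not the driver code: the function name, plain unsigned ints, and the bit-shift in place of the kernel's BIT() macro are all illustrative.]

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the SFC rules from the patch: on gen11 only even-numbered
 * *logical* vdboxes have an SFC; on gen12 even-numbered *physical*
 * instances always have one, and an odd instance has one only when the
 * preceding even instance is fused off (i.e. absent from vdbox_mask).
 */
static bool vdbox_has_sfc_model(int graphics_ver, unsigned int physical_vdbox,
				unsigned int logical_vdbox,
				unsigned int vdbox_mask)
{
	if (graphics_ver == 12)
		return (physical_vdbox % 2 == 0) ||
		       !((1u << (physical_vdbox - 1)) & vdbox_mask);
	else if (graphics_ver == 11)
		return logical_vdbox % 2 == 0;

	return false;	/* earlier gens bail out before this path */
}
```

For example, on a gen12 part with vdbox mask 0x2 (instance 0 fused off), physical instance 1 does get an SFC, while with mask 0x3 it does not.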
[PATCH] drm/nouveau: Remove redundant error check on variable ret
From: Colin Ian King The call to drm_dp_aux_init never returns an error code and there is no error return being assigned to variable ret. The check for an error in ret is always false since ret is still zero from the start of the function, so the init error check and error message are redundant and can be removed. Addresses-Coverity: ("Logically dead code") Fixes: fd43ad9d47e7 ("drm/nouveau/kms/nv50-: Move AUX adapter reg to connector late register/early unregister") Signed-off-by: Colin Ian King --- drivers/gpu/drm/nouveau/nouveau_connector.c | 6 -- 1 file changed, 6 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c index 22b83a6577eb..f37e5f28a93f 100644 --- a/drivers/gpu/drm/nouveau/nouveau_connector.c +++ b/drivers/gpu/drm/nouveau/nouveau_connector.c @@ -1362,12 +1362,6 @@ nouveau_connector_create(struct drm_device *dev, dcbe->hasht, dcbe->hashm); nv_connector->aux.name = kstrdup(aux_name, GFP_KERNEL); drm_dp_aux_init(&nv_connector->aux); - if (ret) { - NV_ERROR(drm, "Failed to init AUX adapter for sor-%04x-%04x: %d\n", - dcbe->hasht, dcbe->hashm, ret); - kfree(nv_connector); - return ERR_PTR(ret); - } fallthrough; default: funcs = &nouveau_connector_funcs; -- 2.31.1
Re: [Intel-gfx] [drm-intel:drm-intel-gt-next 8/14] drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: uninitialized symbol 'rq'.
On Fri, 2 Jul 2021 at 12:14, Dan Carpenter wrote: > > On Fri, Jul 02, 2021 at 02:07:27PM +0300, Dan Carpenter wrote: > > On Fri, Jul 02, 2021 at 11:32:45AM +0100, Matthew Auld wrote: > > > On Fri, 2 Jul 2021 at 09:45, Dan Carpenter > > > wrote: > > > > cf586021642d80 Chris Wilson 2021-06-17 84 > > > > cf586021642d80 Chris Wilson 2021-06-17 85 err = > > > > fn(migrate, &ww, src, dst, &rq); > > > > cf586021642d80 Chris Wilson 2021-06-17 86 if (!err) > > > > cf586021642d80 Chris Wilson 2021-06-17 87 > > > > continue; > > > > > > > > Does fn() initialize "rq" on the success path? Anyway, Smatch would > > > > complain either way because it thinks the list could be empty or that we > > > > might hit an early continue for everything. > > > > > > The fn() will always first initialize the rq to NULL. If it returns > > > success then rq will always be a valid rq. If it returns an err then > > > the rq might be NULL, or a valid rq depending on how far the copy/fn > > > got. > > > > > > And for_i915_gem_ww() will always run at least once, since ww->loop = > > > true, so this looks like a false positive? > > > > You don't think i915_gem_object_lock(), i915_gem_object_pin_map() or > > i915_gem_object_pin_map() can fail? > > Btw, I sincerely hope that we will re-enable GCC's uninitialized > variable checks. Will GCC be able to verify that this is initialized? 34b07d47dd00 ("drm/i915: Enable -Wuninitialized") GCC doesn't complain AFAIK. > > regards, > dan carpenter >
Re: [PATCH v2] drm/dbi: Print errors for mipi_dbi_command()
Hi Linus, I love your patch! Yet something to improve: [auto build test ERROR on linus/master] [also build test ERROR on v5.13 next-20210701] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Linus-Walleij/drm-dbi-Print-errors-for-mipi_dbi_command/20210702-180745 base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 3dbdb38e286903ec220aaf1fb29a8d94297da246 config: m68k-allmodconfig (attached as .config) compiler: m68k-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/42d93a52e398adbb1fe2dfbc895c649cc8d42780 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Linus-Walleij/drm-dbi-Print-errors-for-mipi_dbi_command/20210702-180745 git checkout 42d93a52e398adbb1fe2dfbc895c649cc8d42780 # save the attached .config to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross O=build_dir ARCH=m68k SHELL=/bin/bash drivers/gpu/drm/tiny/ If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>): In file included from drivers/gpu/drm/tiny/st7586.c:25: drivers/gpu/drm/tiny/st7586.c: In function 'st7586_pipe_disable': >> include/drm/drm_mipi_dbi.h:186:27: error: invalid type argument of '->' >> (have 'struct mipi_dbi') 186 | struct device *dev = &dbi->spi->dev; \ | ^~ drivers/gpu/drm/tiny/st7586.c:260:2: note: in expansion of macro 'mipi_dbi_command' 260 | mipi_dbi_command(&dbidev->dbi, MIPI_DCS_SET_DISPLAY_OFF); | ^~~~ vim +186 include/drm/drm_mipi_dbi.h 160 161 u32 mipi_dbi_spi_cmd_max_speed(struct spi_device *spi, size_t len); 162 int mipi_dbi_spi_transfer(struct spi_device 
*spi, u32 speed_hz, 163 u8 bpw, const void *buf, size_t len); 164 165 int mipi_dbi_command_read(struct mipi_dbi *dbi, u8 cmd, u8 *val); 166 int mipi_dbi_command_buf(struct mipi_dbi *dbi, u8 cmd, u8 *data, size_t len); 167 int mipi_dbi_command_stackbuf(struct mipi_dbi *dbi, u8 cmd, const u8 *data, 168 size_t len); 169 int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer *fb, 170 struct drm_rect *clip, bool swap); 171 /** 172 * mipi_dbi_command - MIPI DCS command with optional parameter(s) 173 * @dbi: MIPI DBI structure 174 * @cmd: Command 175 * @seq: Optional parameter(s) 176 * 177 * Send MIPI DCS command to the controller. Use mipi_dbi_command_read() for 178 * get/read. 179 * 180 * Returns: 181 * Zero on success, negative error code on failure. 182 */ 183 #define mipi_dbi_command(dbi, cmd, seq...) \ 184 ({ \ 185 const u8 d[] = { seq }; \ > 186 struct device *dev = &dbi->spi->dev; \ 187 int ret; \ 188 ret = mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \ 189 if (ret) \ 190 dev_err_ratelimited(dev, "error %d when sending command\n", ret); \ 191 }) 192 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
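[The underlying C issue here is macro-argument precedence: `->` binds tighter than unary `&`, so an unparenthesized `&dbi->spi->dev` expands badly when a caller passes something like `&dbidev->dbi`. A minimal illustration with toy types (not the drm code; all names below are invented):]

```c
#include <assert.h>

struct spi_device_like { int dev; };
struct dbi_like { struct spi_device_like *spi; };
struct dbi_dev_like { struct dbi_like dbi; };

/*
 * Parenthesizing the macro argument makes &(dbi)->spi->dev group
 * correctly even for arguments such as &x->dbi or &x.dbi.  The
 * unparenthesized form &dbi->spi->dev would expand to
 * &&x->dbi->spi->dev, where x->dbi->spi is applied to a struct (not a
 * pointer) and fails with exactly the "invalid type argument of '->'"
 * error in the report above.
 */
#define DBI_TO_DEV(dbi) (&(dbi)->spi->dev)
```

This is the usual reason function-like macros wrap every use of a parameter in parentheses.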
Re: [PATCH v15 12/12] of: Add plumbing for restricted DMA pool
On 2021-07-02 04:08, Guenter Roeck wrote: Hi, On Thu, Jun 24, 2021 at 11:55:26PM +0800, Claire Chang wrote: If a device is not behind an IOMMU, we look up the device node and set up the restricted DMA when the restricted-dma-pool is presented. Signed-off-by: Claire Chang Tested-by: Stefano Stabellini Tested-by: Will Deacon With this patch in place, all sparc and sparc64 qemu emulations fail to boot. Symptom is that the root file system is not found. Reverting this patch fixes the problem. Bisect log is attached. Ah, OF_ADDRESS depends on !SPARC, so of_dma_configure_id() is presumably returning an unexpected -ENODEV from the of_dma_set_restricted_buffer() stub. That should probably be returning 0 instead, since either way it's not an error condition for it to simply do nothing. Robin. Guenter --- # bad: [fb0ca446157a86b75502c1636b0d81e642fe6bf1] Add linux-next specific files for 20210701 # good: [62fb9874f5da54fdb243003b386128037319b219] Linux 5.13 git bisect start 'HEAD' 'v5.13' # bad: [f63c4fda987a19b1194cc45cb72fd5bf968d9d90] Merge remote-tracking branch 'rdma/for-next' git bisect bad f63c4fda987a19b1194cc45cb72fd5bf968d9d90 # good: [46bb5dd1d2a63e906e374e97dfd4a5e33934b1c4] Merge remote-tracking branch 'ipsec/master' git bisect good 46bb5dd1d2a63e906e374e97dfd4a5e33934b1c4 # good: [43ba6969cfb8185353a7a6fc79070f13b9e3d6d3] Merge remote-tracking branch 'clk/clk-next' git bisect good 43ba6969cfb8185353a7a6fc79070f13b9e3d6d3 # good: [1ca5eddcf8dca1d6345471c6404e7364af0d7019] Merge remote-tracking branch 'fuse/for-next' git bisect good 1ca5eddcf8dca1d6345471c6404e7364af0d7019 # good: [8f6d7b3248705920187263a4e7147b0752ec7dcf] Merge remote-tracking branch 'pci/next' git bisect good 8f6d7b3248705920187263a4e7147b0752ec7dcf # good: [df1885a755784da3ef285f36d9230c1d090ef186] RDMA/rtrs_clt: Alloc less memory with write path fast memory registration git bisect good df1885a755784da3ef285f36d9230c1d090ef186 # good: [93d31efb58c8ad4a66bbedbc2d082df458c04e45] Merge 
remote-tracking branch 'cpufreq-arm/cpufreq/arm/linux-next' git bisect good 93d31efb58c8ad4a66bbedbc2d082df458c04e45 # good: [46308965ae6fdc7c25deb2e8c048510ae51bbe66] RDMA/irdma: Check contents of user-space irdma_mem_reg_req object git bisect good 46308965ae6fdc7c25deb2e8c048510ae51bbe66 # good: [6de7a1d006ea9db235492b288312838d6878385f] thermal/drivers/int340x/processor_thermal: Split enumeration and processing part git bisect good 6de7a1d006ea9db235492b288312838d6878385f # good: [081bec2577cda3d04f6559c60b6f4e2242853520] dt-bindings: of: Add restricted DMA pool git bisect good 081bec2577cda3d04f6559c60b6f4e2242853520 # good: [bf95ac0bcd69979af146852f6a617a60285ebbc1] Merge remote-tracking branch 'thermal/thermal/linux-next' git bisect good bf95ac0bcd69979af146852f6a617a60285ebbc1 # good: [3d8287544223a3d2f37981c1f9ffd94d0b5e9ffc] RDMA/core: Always release restrack object git bisect good 3d8287544223a3d2f37981c1f9ffd94d0b5e9ffc # bad: [cff1f23fad6e0bd7d671acce0d15285c709f259c] Merge remote-tracking branch 'swiotlb/linux-next' git bisect bad cff1f23fad6e0bd7d671acce0d15285c709f259c # bad: [b655006619b7bccd0dc1e055bd72de5d613e7b5c] of: Add plumbing for restricted DMA pool git bisect bad b655006619b7bccd0dc1e055bd72de5d613e7b5c # first bad commit: [b655006619b7bccd0dc1e055bd72de5d613e7b5c] of: Add plumbing for restricted DMA pool
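[The shape of the fix Robin suggests can be sketched outside the kernel. Hedged: the function names mirror the thread but the model itself is illustrative, and the error value is a plain stand-in for -ENODEV.]

```c
#include <assert.h>

/*
 * Model of Robin's point: on configs without OF_ADDRESS (e.g. sparc),
 * the of_dma_set_restricted_buffer() stub simply does nothing, so it
 * should report success rather than an error.  Returning -ENODEV made
 * of_dma_configure_id() treat a harmless no-op as a failure, which is
 * why the root filesystem disappeared on the qemu emulations.
 */
static int restricted_buffer_stub(void)
{
	return 0;	/* nothing to do is not an error */
}

static int dma_configure_model(void)
{
	int ret = restricted_buffer_stub();

	return ret;	/* 0: device usable; negative: probe failure */
}
```

With the stub returning 0, DMA configuration proceeds normally on platforms that simply lack restricted-pool support.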
Re: [Intel-gfx] [drm-intel:drm-intel-gt-next 8/14] drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: uninitialized symbol 'rq'.
On Fri, 2 Jul 2021 at 12:07, Dan Carpenter wrote: > > On Fri, Jul 02, 2021 at 11:32:45AM +0100, Matthew Auld wrote: > > On Fri, 2 Jul 2021 at 09:45, Dan Carpenter wrote: > > > > > > tree: git://anongit.freedesktop.org/drm-intel drm-intel-gt-next > > > head: 5cd57f676bb946a00275408f0dd0d75dbc466d25 > > > commit: cf586021642d8017cde111b7dd1ba86224e9da51 [8/14] drm/i915/gt: > > > Pipelined page migration > > > config: x86_64-randconfig-m001-20210630 (attached as .config) > > > compiler: gcc-9 (Debian 9.3.0-22) 9.3.0 > > > > > > If you fix the issue, kindly add following tag as appropriate > > > Reported-by: kernel test robot > > > Reported-by: Dan Carpenter > > > > > > New smatch warnings: > > > drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: > > > uninitialized symbol 'rq'. > > > drivers/gpu/drm/i915/gt/selftest_migrate.c:113 copy() error: > > > uninitialized symbol 'vaddr'. > > > > > > Old smatch warnings: > > > drivers/gpu/drm/i915/gem/i915_gem_object.h:182 __i915_gem_object_lock() > > > error: we previously assumed 'ww' could be null (see line 171) > > > > > > vim +/rq +102 drivers/gpu/drm/i915/gt/selftest_migrate.c > > > > > > cf586021642d80 Chris Wilson 2021-06-17 32 static int copy(struct > > > intel_migrate *migrate, > > > cf586021642d80 Chris Wilson 2021-06-17 33 int (*fn)(struct > > > intel_migrate *migrate, > > > cf586021642d80 Chris Wilson 2021-06-17 34 struct > > > i915_gem_ww_ctx *ww, > > > cf586021642d80 Chris Wilson 2021-06-17 35 struct > > > drm_i915_gem_object *src, > > > cf586021642d80 Chris Wilson 2021-06-17 36 struct > > > drm_i915_gem_object *dst, > > > cf586021642d80 Chris Wilson 2021-06-17 37 struct > > > i915_request **out), > > > cf586021642d80 Chris Wilson 2021-06-17 38 u32 sz, struct > > > rnd_state *prng) > > > cf586021642d80 Chris Wilson 2021-06-17 39 { > > > cf586021642d80 Chris Wilson 2021-06-17 40 struct drm_i915_private > > > *i915 = migrate->context->engine->i915; > > > cf586021642d80 Chris Wilson 2021-06-17 41 struct > > > drm_i915_gem_object *src, *dst; > > > cf586021642d80 Chris Wilson 2021-06-17 42 struct i915_request *rq; > > > cf586021642d80 Chris Wilson 2021-06-17 43 struct i915_gem_ww_ctx ww; > > > cf586021642d80 Chris Wilson 2021-06-17 44 u32 *vaddr; > > > cf586021642d80 Chris Wilson 2021-06-17 45 int err = 0; > > > > > > One way to silence these warnings would be to initialize err = -EINVAL. > > > Then Smatch would know that we goto err_out for an empty list. > > > > > > cf586021642d80 Chris Wilson 2021-06-17 46 int i; > > > cf586021642d80 Chris Wilson 2021-06-17 47 > > > cf586021642d80 Chris Wilson 2021-06-17 48 src = > > > create_lmem_or_internal(i915, sz); > > > cf586021642d80 Chris Wilson 2021-06-17 49 if (IS_ERR(src)) > > > cf586021642d80 Chris Wilson 2021-06-17 50 return 0; > > > cf586021642d80 Chris Wilson 2021-06-17 51 > > > cf586021642d80 Chris Wilson 2021-06-17 52 dst = > > > i915_gem_object_create_internal(i915, sz); > > > cf586021642d80 Chris Wilson 2021-06-17 53 if (IS_ERR(dst)) > > > cf586021642d80 Chris Wilson 2021-06-17 54 goto err_free_src; > > > cf586021642d80 Chris Wilson 2021-06-17 55 > > > cf586021642d80 Chris Wilson 2021-06-17 56 for_i915_gem_ww(&ww, err, > > > true) { > > > cf586021642d80 Chris Wilson 2021-06-17 57 err = > > > i915_gem_object_lock(src, &ww); > > > cf586021642d80 Chris Wilson 2021-06-17 58 if (err) > > > cf586021642d80 Chris Wilson 2021-06-17 59 continue; > > > cf586021642d80 Chris Wilson 2021-06-17 60 > > > cf586021642d80 Chris Wilson 2021-06-17 61 err = > > > i915_gem_object_lock(dst, &ww); > > > cf586021642d80 Chris Wilson 2021-06-17 62 if (err) > > > cf586021642d80 Chris Wilson 2021-06-17 63 continue; > > > cf586021642d80 Chris Wilson 2021-06-17 64 > > > cf586021642d80 Chris Wilson 2021-06-17 65 vaddr = > > > i915_gem_object_pin_map(src, I915_MAP_WC); > > > cf586021642d80 Chris Wilson 2021-06-17 66 if > > > (IS_ERR(vaddr)) { > > > cf586021642d80 Chris Wilson 2021-06-17 67 err = > > > PTR_ERR(vaddr); > > > cf586021642d80 Chris Wilson 2021-06-17 68 continue; > > > cf586021642d80 Chris Wilson 2021-06-17 69 } > > > cf586021642d80 Chris Wilson 2021-06-17 70 > > > cf586021642d80 Chris Wilson 2021-06-17 71 for (i = 0; i < > > > sz / sizeof(u32); i++) > > > cf586021642d80 Chris Wilson 2021-06-17 72 vaddr[i] > > > = i; > > > cf586021642d80 Chris Wilson 2021-06-17 73 > > > i915_gem_object_flush_map(src); > > > cf586021642d80 Chris Wilson 2021-06-17 74 > > > cf586021642d80 Chris Wilson 2021-06-17 75 vaddr = > >
[PATCH 4/4] drm/msm: always wait for the exclusive fence
Drivers also need to sync to the exclusive fence when a shared one is present. Completely untested since the driver won't even compile on !ARM. Signed-off-by: Christian König --- drivers/gpu/drm/msm/msm_gem.c | 16 +++- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index a94a43de95ef..72a07e311de3 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -817,17 +817,15 @@ int msm_gem_sync_object(struct drm_gem_object *obj, struct dma_fence *fence; int i, ret; - fobj = dma_resv_shared_list(obj->resv); - if (!fobj || (fobj->shared_count == 0)) { - fence = dma_resv_excl_fence(obj->resv); - /* don't need to wait on our own fences, since ring is fifo */ - if (fence && (fence->context != fctx->context)) { - ret = dma_fence_wait(fence, true); - if (ret) - return ret; - } + fence = dma_resv_excl_fence(obj->resv); + /* don't need to wait on our own fences, since ring is fifo */ + if (fence && (fence->context != fctx->context)) { + ret = dma_fence_wait(fence, true); + if (ret) + return ret; } + fobj = dma_resv_shared_list(obj->resv); if (!exclusive || !fobj) return 0; -- 2.25.1
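[The behavioral change can be reduced to a predicate. A hedged sketch, not the msm code: fence contexts are plain ints here, with a negative value meaning "no exclusive fence".]

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the reordered msm_gem_sync_object() flow: the exclusive
 * fence is now consulted unconditionally, no longer only when the
 * shared-fence list is empty.  The only remaining skip is for fences
 * from our own context, since the ring is FIFO.
 */
static bool must_wait_exclusive(int excl_ctx, int own_ctx)
{
	return excl_ctx >= 0 && excl_ctx != own_ctx;
}
```

Under the old code this check was short-circuited whenever shared fences existed; after the patch it applies in every case.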
[PATCH 3/4] drm/nouveau: always wait for the exclusive fence
Drivers also need to sync to the exclusive fence when a shared one is present. Signed-off-by: Christian König --- drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c index 6b43918035df..05d0b3eb3690 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.c +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c @@ -358,7 +358,7 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool e fobj = dma_resv_shared_list(resv); fence = dma_resv_excl_fence(resv); - if (fence && (!exclusive || !fobj || !fobj->shared_count)) { + if (fence) { struct nouveau_channel *prev = NULL; bool must_wait = true; -- 2.25.1
[PATCH 2/4] dma-buf: fix dma_resv_test_signaled test_all handling v2
As the name implies, if testing all fences is requested we should indeed test all fences and not skip the exclusive one because we see shared ones. v2: fix logic once more Signed-off-by: Christian König --- drivers/dma-buf/dma-resv.c | 33 - 1 file changed, 12 insertions(+), 21 deletions(-) diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index 4ab02b6c387a..18dd5a6ca06c 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -618,25 +618,21 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence) */ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all) { - unsigned int seq, shared_count; + struct dma_fence *fence; + unsigned int seq; int ret; rcu_read_lock(); retry: ret = true; - shared_count = 0; seq = read_seqcount_begin(&obj->seq); if (test_all) { struct dma_resv_list *fobj = dma_resv_shared_list(obj); - unsigned int i; - - if (fobj) - shared_count = fobj->shared_count; + unsigned int i, shared_count; + shared_count = fobj ? fobj->shared_count : 0; for (i = 0; i < shared_count; ++i) { - struct dma_fence *fence; - fence = rcu_dereference(fobj->shared[i]); ret = dma_resv_test_signaled_single(fence); if (ret < 0) @@ -644,24 +640,19 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all) else if (!ret) break; } - - if (read_seqcount_retry(&obj->seq, seq)) - goto retry; } - if (!shared_count) { - struct dma_fence *fence_excl = dma_resv_excl_fence(obj); - - if (fence_excl) { - ret = dma_resv_test_signaled_single(fence_excl); - if (ret < 0) - goto retry; + fence = dma_resv_excl_fence(obj); + if (ret && fence) { + ret = dma_resv_test_signaled_single(fence); + if (ret < 0) + goto retry; - if (read_seqcount_retry(&obj->seq, seq)) - goto retry; - } } + if (read_seqcount_retry(&obj->seq, seq)) + goto retry; + rcu_read_unlock(); return ret; } -- 2.25.1
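[The fixed control flow, minus the RCU/seqcount retry machinery, can be modelled as follows. Hedged sketch only: fences are reduced to flags in a bitmask, and the exclusive fence to a tri-state int.]

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the fixed dma_resv_test_signaled(): with test_all, every
 * shared fence must have signaled AND the exclusive fence (if any)
 * must have signaled too -- the v2 fix is that the exclusive fence is
 * no longer skipped just because shared fences exist.
 * excl: -1 = no exclusive fence, 0 = pending, 1 = signaled.
 */
static bool test_signaled_model(unsigned int signaled_mask,
				unsigned int shared_count,
				int excl, bool test_all)
{
	bool ret = true;
	unsigned int i;

	if (test_all) {
		for (i = 0; i < shared_count; ++i) {
			ret = (signaled_mask >> i) & 1;
			if (!ret)
				break;	/* one pending shared fence is enough */
		}
	}

	/* the v2 fix: always consult the exclusive fence when present */
	if (ret && excl >= 0)
		ret = (excl == 1);

	return ret;
}
```

The key case the old code got wrong: all shared fences signaled but the exclusive fence still pending used to report "signaled"; the fixed flow reports "not signaled".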
[PATCH 1/4] dma-buf: add some more kerneldoc to dma_resv_add_shared_fence
Explicitly document that code can't assume that shared fences signal after the exclusive fence. Signed-off-by: Christian König --- drivers/dma-buf/dma-resv.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index f26c71747d43..4ab02b6c387a 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -235,7 +235,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max); * @fence: the shared fence to add * * Add a fence to a shared slot, obj->lock must be held, and - * dma_resv_reserve_shared() has been called. + * dma_resv_reserve_shared() has been called. The shared fences can signal in + * any order and there is especially no guarantee that shared fences signal + * after the exclusive one. Code relying on any signaling order is broken and + * needs to be fixed. */ void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence) { -- 2.25.1
Start fixing the shared to exclusive fence dependencies.
Hey Daniel, even when you are not 100% done with the driver audit I think we should push that patch set here to drm-misc-next now so that it can end up in 5.15. Not having any dependency between the exclusive and the shared fence signaling order is just way more defensive than the current model. As discussed I'm holding back any amdgpu and TTM workarounds which could be removed for now. Thoughts? Thanks, Christian.
Re: [Intel-gfx] [drm-intel:drm-intel-gt-next 8/14] drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: uninitialized symbol 'rq'.
On Fri, Jul 02, 2021 at 02:07:27PM +0300, Dan Carpenter wrote: > On Fri, Jul 02, 2021 at 11:32:45AM +0100, Matthew Auld wrote: > > On Fri, 2 Jul 2021 at 09:45, Dan Carpenter wrote: > > > cf586021642d80 Chris Wilson 2021-06-17 84 > > > cf586021642d80 Chris Wilson 2021-06-17 85 err = fn(migrate, > > > &ww, src, dst, &rq); > > > cf586021642d80 Chris Wilson 2021-06-17 86 if (!err) > > > cf586021642d80 Chris Wilson 2021-06-17 87 continue; > > > > > > Does fn() initialize "rq" on the success path? Anyway, Smatch would > > > complain either way because it thinks the list could be empty or that we > > > might hit an early continue for everything. > > > > The fn() will always first initialize the rq to NULL. If it returns > > success then rq will always be a valid rq. If it returns an err then > > the rq might be NULL, or a valid rq depending on how far the copy/fn > > got. > > > > And for_i915_gem_ww() will always run at least once, since ww->loop = > > true, so this looks like a false positive? > > You don't think i915_gem_object_lock(), i915_gem_object_pin_map() or > i915_gem_object_pin_map() can fail? Btw, I sincerely hope that we will re-enable GCC's uninitialized variable checks. Will GCC be able to verify that this is initialized? regards, dan carpenter
Re: [Intel-gfx] [drm-intel:drm-intel-gt-next 8/14] drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: uninitialized symbol 'rq'.
On Fri, Jul 02, 2021 at 11:32:45AM +0100, Matthew Auld wrote: > On Fri, 2 Jul 2021 at 09:45, Dan Carpenter wrote: > > > > tree: git://anongit.freedesktop.org/drm-intel drm-intel-gt-next > > head: 5cd57f676bb946a00275408f0dd0d75dbc466d25 > > commit: cf586021642d8017cde111b7dd1ba86224e9da51 [8/14] drm/i915/gt: > > Pipelined page migration > > config: x86_64-randconfig-m001-20210630 (attached as .config) > > compiler: gcc-9 (Debian 9.3.0-22) 9.3.0 > > > > If you fix the issue, kindly add following tag as appropriate > > Reported-by: kernel test robot > > Reported-by: Dan Carpenter > > > > New smatch warnings: > > drivers/gpu/drm/i915/gt/selftest_migrate.c:102 copy() error: uninitialized > > symbol 'rq'. > > drivers/gpu/drm/i915/gt/selftest_migrate.c:113 copy() error: uninitialized > > symbol 'vaddr'. > > > > Old smatch warnings: > > drivers/gpu/drm/i915/gem/i915_gem_object.h:182 __i915_gem_object_lock() > > error: we previously assumed 'ww' could be null (see line 171) > > > > vim +/rq +102 drivers/gpu/drm/i915/gt/selftest_migrate.c > > > > cf586021642d80 Chris Wilson 2021-06-17 32 static int copy(struct > > intel_migrate *migrate, > > cf586021642d80 Chris Wilson 2021-06-17 33 int (*fn)(struct > > intel_migrate *migrate, > > cf586021642d80 Chris Wilson 2021-06-17 34 struct > > i915_gem_ww_ctx *ww, > > cf586021642d80 Chris Wilson 2021-06-17 35 struct > > drm_i915_gem_object *src, > > cf586021642d80 Chris Wilson 2021-06-17 36 struct > > drm_i915_gem_object *dst, > > cf586021642d80 Chris Wilson 2021-06-17 37 struct > > i915_request **out), > > cf586021642d80 Chris Wilson 2021-06-17 38 u32 sz, struct > > rnd_state *prng) > > cf586021642d80 Chris Wilson 2021-06-17 39 { > > cf586021642d80 Chris Wilson 2021-06-17 40 struct drm_i915_private > > *i915 = migrate->context->engine->i915; > > cf586021642d80 Chris Wilson 2021-06-17 41 struct drm_i915_gem_object > > *src, *dst; > > cf586021642d80 Chris Wilson 2021-06-17 42 struct i915_request *rq; > > cf586021642d80 Chris Wilson 2021-06-17 43 struct i915_gem_ww_ctx ww; > > cf586021642d80 Chris Wilson 2021-06-17 44 u32 *vaddr; > > cf586021642d80 Chris Wilson 2021-06-17 45 int err = 0; > > > > One way to silence these warnings would be to initialize err = -EINVAL. > > Then Smatch would know that we goto err_out for an empty list. > > > > cf586021642d80 Chris Wilson 2021-06-17 46 int i; > > cf586021642d80 Chris Wilson 2021-06-17 47 > > cf586021642d80 Chris Wilson 2021-06-17 48 src = > > create_lmem_or_internal(i915, sz); > > cf586021642d80 Chris Wilson 2021-06-17 49 if (IS_ERR(src)) > > cf586021642d80 Chris Wilson 2021-06-17 50 return 0; > > cf586021642d80 Chris Wilson 2021-06-17 51 > > cf586021642d80 Chris Wilson 2021-06-17 52 dst = > > i915_gem_object_create_internal(i915, sz); > > cf586021642d80 Chris Wilson 2021-06-17 53 if (IS_ERR(dst)) > > cf586021642d80 Chris Wilson 2021-06-17 54 goto err_free_src; > > cf586021642d80 Chris Wilson 2021-06-17 55 > > cf586021642d80 Chris Wilson 2021-06-17 56 for_i915_gem_ww(&ww, err, > > true) { > > cf586021642d80 Chris Wilson 2021-06-17 57 err = > > i915_gem_object_lock(src, &ww); > > cf586021642d80 Chris Wilson 2021-06-17 58 if (err) > > cf586021642d80 Chris Wilson 2021-06-17 59 continue; > > cf586021642d80 Chris Wilson 2021-06-17 60 > > cf586021642d80 Chris Wilson 2021-06-17 61 err = > > i915_gem_object_lock(dst, &ww); > > cf586021642d80 Chris Wilson 2021-06-17 62 if (err) > > cf586021642d80 Chris Wilson 2021-06-17 63 continue; > > cf586021642d80 Chris Wilson 2021-06-17 64 > > cf586021642d80 Chris Wilson 2021-06-17 65 vaddr = > > i915_gem_object_pin_map(src, I915_MAP_WC); > > cf586021642d80 Chris Wilson 2021-06-17 66 if (IS_ERR(vaddr)) { > > cf586021642d80 Chris Wilson 2021-06-17 67 err = > > PTR_ERR(vaddr); > > cf586021642d80 Chris Wilson 2021-06-17 68 continue; > > cf586021642d80 Chris Wilson 2021-06-17 69 } > > cf586021642d80 Chris Wilson 2021-06-17 70 > > cf586021642d80 Chris Wilson 2021-06-17 71 for (i = 0; i < sz > > / sizeof(u32); i++) > > cf586021642d80 Chris Wilson 2021-06-17 72 vaddr[i] = > > i; > > cf586021642d80 Chris Wilson 2021-06-17 73 > > i915_gem_object_flush_map(src); > > cf586021642d80 Chris Wilson 2021-06-17 74 > > cf586021642d80 Chris Wilson 2021-06-17 75 vaddr = > > i915_gem_object_pin_map(dst, I915_MAP_WC); > > cf586021642d80 Chris Wilson 2021-06-17 76 if (IS_ERR(vaddr)) { > > cf586021642d80 Chris Wilson 2021-06-17 77 err = > > PTR_ERR(vaddr); > > cf586021642d80 Chris
Questions over DSI within DRM.
Hi All,

I'm trying to get DSI devices working reliably on the Raspberry Pi, but I'm hitting a number of places where the expected behaviour within DRM isn't clear.

Power-on state. Many devices want the DSI clock and/or data lanes in the LP-11 state when they are powered up. With the normal calling sequence of:
- panel/bridge pre_enable calls from connector towards the encoder,
- encoder enable, which also enables video,
- panel/bridge enable calls from encoder to connector,
there is no point at which the DSI TX is initialised but not yet transmitting video. What DSI states are expected to be adopted at each point?

On a similar theme, some devices want the clock lane in HS mode early so they can use it in place of an external oscillator, but with the data lanes still in LP-11. There appears to be no way for the display/bridge to signal this requirement, or for it to be achieved.

host_transfer calls can supposedly be made at any time; however, unless MIPI_DSI_MSG_USE_LPM is set in the message, we're meant to send it in high-speed mode. If this happens before a mode has been set, what defines the link frequency parameters at that point? Adopting a random default sounds like a good way to get undefined behaviour.

DSI burst mode needs to set the DSI link frequency independently of the display mode. How is that meant to be configured? I would have expected it to come from DT, since the link frequency is often chosen based on EMC restrictions, but I don't see such a thing in any binding.

As a follow-on, bridge devices can support burst mode (e.g. TI's SN65DSI83 that has just been merged), so a bridge needs to know the desired panel timings for its output side, but the DSI link timings to set up its PLL. What's the correct way of signalling that? drm_crtc_state->adjusted_mode vs drm_crtc_state->mode? Except mode is userspace's request, not what has been validated/updated by the panel/bridge.
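To make the power-on window concrete, here is a toy userspace model of the calling sequence described above (the function names are purely illustrative, not real DRM callbacks); it records the order of operations and shows why there is no step where the host is initialised with video still off:

```c
#include <string.h>

static char seq[128];

static void step(const char *s)
{
	strcat(seq, s);
	strcat(seq, ";");
}

/* Toy model of the ordering described above, not real DRM hooks. */
static void panel_pre_enable(void)
{
	step("panel_pre_enable"); /* panel is powered here and wants LP-11 */
}

static void encoder_enable(void)
{
	/* host init and video start happen in the same callback */
	step("dsi_host_init");
	step("video_on");
}

static void panel_enable(void)
{
	step("panel_enable"); /* video is already running by now */
}

const char *atomic_enable_order(void)
{
	seq[0] = '\0';
	panel_pre_enable();  /* connector -> encoder */
	encoder_enable();    /* DSI TX comes up and immediately sends video */
	panel_enable();      /* encoder -> connector */
	return seq;
}
```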
vc4 has the constraint that the DSI host interface is fed from an integer divider off a typically 3GHz clock, so the host interface needs to signal that burst mode is in use even if the panel/bridge doesn't need to run in burst mode. (This does mean that displays requiring a very precise link frequency cannot be supported.) It currently updates the adjusted_mode via drm_encoder_helper_funcs mode_fixup, but is that the correct thing to do, or is there a better solution? I'd have expected the DSI TX to be responsible for configuring burst mode parameters anyway, so the mechanism required would seem to be just the normal approach for adopting burst mode, if that is defined.

Some DSI host interfaces are implemented as bridges, others as encoders. What are the pros and cons of each? I suspect I'm just missing the history here.

When it comes to the MIPI_DSI_MODE_* flags, which ones are mutually exclusive, and which are assumed based on others? Does a burst-mode DSI sink set both MIPI_DSI_MODE_VIDEO and MIPI_DSI_MODE_VIDEO_BURST, or just the latter? Presumably !MIPI_DSI_MODE_VIDEO signals the use of command mode for conveying video. So panel-ilitek-ili9881c setting just MIPI_DSI_MODE_VIDEO_SYNC_PULSE would mean command-mode video with sync pulses? That sounds unlikely.

I have looked for any information that covers this, but failed to find any, hence calling on all your expertise.

Many thanks for your time,
Dave
Re: [PATCH v2] drm/dbi: Print errors for mipi_dbi_command()
Den 02.07.2021 12.04, skrev Linus Walleij: > The macro mipi_dbi_command() does not report errors unless you wrap it > in another macro to do the error reporting. > > Report a rate-limited error so we know what is going on. > > Drop the only user in DRM using mipi_dbi_command() and actually checking > the error explicitly, let it use mipi_dbi_command_buf() directly > instead. > > After this any code wishing to send command arrays can rely on > mipi_dbi_command() providing an appropriate error message if something > goes wrong. > > Suggested-by: Noralf Trønnes > Suggested-by: Douglas Anderson > Signed-off-by: Linus Walleij > --- > ChangeLog v1->v2: > - Fish out the struct device * from the DBI SPI client and use > that to print the errors associated with the SPI device. > --- > drivers/gpu/drm/drm_mipi_dbi.c | 2 +- > include/drm/drm_mipi_dbi.h | 6 +- > 2 files changed, 6 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/drm_mipi_dbi.c b/drivers/gpu/drm/drm_mipi_dbi.c > index 3854fb9798e9..c7c1b75df190 100644 > --- a/drivers/gpu/drm/drm_mipi_dbi.c > +++ b/drivers/gpu/drm/drm_mipi_dbi.c > @@ -645,7 +645,7 @@ static int mipi_dbi_poweron_reset_conditional(struct > mipi_dbi_dev *dbidev, bool > return 1; > > mipi_dbi_hw_reset(dbi); > - ret = mipi_dbi_command(dbi, MIPI_DCS_SOFT_RESET); > + ret = mipi_dbi_command_buf(dbi, MIPI_DCS_SOFT_RESET, NULL, 0); > if (ret) { > DRM_DEV_ERROR(dev, "Failed to send reset command (%d)\n", ret); > if (dbidev->regulator) > diff --git a/include/drm/drm_mipi_dbi.h b/include/drm/drm_mipi_dbi.h > index f543d6e3e822..f00cb9690cf2 100644 > --- a/include/drm/drm_mipi_dbi.h > +++ b/include/drm/drm_mipi_dbi.h > @@ -183,7 +183,11 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer > *fb, > #define mipi_dbi_command(dbi, cmd, seq...) 
\
> ({ \
> 	const u8 d[] = { seq }; \
> -	mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
> +	struct device *dev = &dbi->spi->dev; \
> +	int ret; \
> +	ret = mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
> +	if (ret) \
> +		dev_err_ratelimited(dev, "error %d when sending command\n", ret); \

Nit: printing the failing command would have been useful, like you did in the driver macro.

> })

I would have preferred it if mipi_dbi_command() could have returned the error code. This indicates that it should be possible: https://stackoverflow.com/questions/3532621/using-and-returning-output-in-c-macro

I can live with this, but if drivers want to start checking the error code we might have to rethink it. As things are now, this works:

Reviewed-by: Noralf Trønnes
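Noralf's point about the macro still being able to return the error relies on GCC statement expressions: the last expression in a `({ ... })` block becomes the value of the block. A minimal userspace sketch (the `fake_send`/`send_logged` names are made up here, standing in for `mipi_dbi_command_stackbuf` and the macro in the patch):

```c
#include <stdio.h>

/* Stand-in for the real transfer function; always fails with -5 (-EIO). */
static int fake_send(int cmd)
{
	(void)cmd;
	return -5;
}

/*
 * A GNU statement-expression macro: it logs the error, and because `_ret`
 * is the last expression in the ({ ... }) block, the macro evaluates to
 * the error code, so callers may still check it.
 */
#define send_logged(cmd) ({ \
	int _ret = fake_send(cmd); \
	if (_ret) \
		fprintf(stderr, "error %d when sending command %#x\n", _ret, (cmd)); \
	_ret; /* last expression = value of the whole macro */ \
})

int demo(void)
{
	return send_logged(0x01); /* the caller still sees the error code */
}
```

This is the same `({ ... })` construct the existing mipi_dbi_command() macro already uses, so returning `ret` as the final expression would not change the call sites.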
Re: [PATCH v2 4/7] drm/panfrost: Add the ability to create submit queues
Hi,

On Fri, 2 Jul 2021 at 11:55, Steven Price wrote:
> On 02/07/2021 11:43, Boris Brezillon wrote:
> > On Fri, 2 Jul 2021 10:56:29 +0100 Steven Price wrote:
> >> My Vulkan knowledge is limited so I'm not sure whether this is the right
> >> approach or not. In particular is it correct that an application can
> >> create a high priority queue which could affect other (normal priority)
> >> applications?
> >
> > That's what msm does (with no extra CAPS check AFAICT), and the
> > freedreno driver can already create high priority queues if
> > PIPE_CONTEXT_HIGH_PRIORITY is passed. Not saying that's okay to allow
> > userspace to tweak the priority, but if that's a problem, other drivers
> > are in trouble too ;-).
>
> Oh well I guess if others are doing the same ;) I have to admit kbase
> has always struggled with how to identify a "privileged" process - it's
> something that makes a bit of sense on Android but for other userspaces
> there really doesn't seem to be a good way of identifying what should or
> should not be allowed to create high priority queues.

Yeah, it's a platform-specific question. Some might want to say compositor-only, some might want to let foreground apps ramp, etc. Thankfully, Vulkan is pretty clear that it's just a hint and the results might be anything or nothing.

> >> Also does it really make sense to allow user space to create an
> >> unlimited number of queues? It feels like an ideal way for a malicious
> >> application to waste kernel memory.
> >
> > Same here, I see no limit on the number of queues the msm driver can
> > create. I can definitely pick an arbitrary limit of 2^16 (or 2^8 if
> > 2^16 is too high) if you prefer, but I feel like there's plenty of ways
> > to force kernel allocations already, like allocating a gazillion of 4k
> > GEM buffers (cgroup can probably limit the total amount of memory
> > allocated, but you'd still have all gem-buf meta data in kernel memory).
>
> I guess the real problem is picking a sensible limit ;) My main concern
> here is that there doesn't appear to be any memory accounted against the
> process. For GEM buffers at least there is some cost to the application
> - so an unbounded allocation isn't possible, even if the bounds are
> likely to be very high.
>
> With kbase we found that syzkaller was good at finding ways of using up
> all the memory on the platform - and if it wasn't accounted to the right
> process that meant the OOM-killer knocked out the wrong process and
> produced a bug report to investigate. Perhaps I'm just scarred by that
> history ;)

Yep, cgroup accounting and restriction is still very much unsolved. GEM buffers let you make an outsize impact on the whole system at little to no cost to yourself. You can also create a million syncobjs if you want. Oh well.

Cheers,
Daniel
Re: [PATCH v2 4/7] drm/panfrost: Add the ability to create submit queues
On Fri, 2 Jul 2021 11:58:34 +0100 Steven Price wrote:

> On 02/07/2021 11:52, Boris Brezillon wrote:
> > On Fri, 2 Jul 2021 11:08:58 +0100 Steven Price wrote:
> >
> >> On 01/07/2021 10:12, Boris Brezillon wrote:
> >>> Needed to keep VkQueues isolated from each other.
> >>
> >> One more comment I noticed when I tried this out:
> >>
> >> [...]
> >>> +struct panfrost_submitqueue *
> >>> +panfrost_submitqueue_create(struct panfrost_file_priv *ctx,
> >>> +			    enum panfrost_submitqueue_priority priority,
> >>> +			    u32 flags)
> >>> +{
> >>> +	struct panfrost_submitqueue *queue;
> >>> +	enum drm_sched_priority sched_prio;
> >>> +	int ret, i;
> >>> +
> >>> +	if (flags || priority >= PANFROST_SUBMITQUEUE_PRIORITY_COUNT)
> >>> +		return ERR_PTR(-EINVAL);
> >>> +
> >>> +	queue = kzalloc(sizeof(*queue), GFP_KERNEL);
> >>> +	if (!queue)
> >>> +		return ERR_PTR(-ENOMEM);
> >>> +
> >>> +	queue->pfdev = ctx->pfdev;
> >>> +	sched_prio = to_sched_prio(priority);
> >>> +	for (i = 0; i < NUM_JOB_SLOTS; i++) {
> >>> +		struct drm_gpu_scheduler *sched;
> >>> +
> >>> +		sched = panfrost_job_get_sched(ctx->pfdev, i);
> >>> +		ret = drm_sched_entity_init(&queue->sched_entity[i],
> >>> +					    sched_prio, &sched, 1, NULL);
> >>> +		if (ret)
> >>> +			break;
> >>> +	}
> >>> +
> >>> +	if (ret) {
> >>> +		for (i--; i >= 0; i--)
> >>> +			drm_sched_entity_destroy(&queue->sched_entity[i]);
> >>> +
> >>> +		return ERR_PTR(ret);
> >>> +	}
> >>> +
> >>> +	kref_init(&queue->refcount);
> >>> +	idr_lock(&ctx->queues);
> >>> +	ret = idr_alloc(&ctx->queues, queue, 0, INT_MAX, GFP_KERNEL);
> >>
> >> This makes lockdep complain. idr_lock() is a spinlock and GFP_KERNEL can
> >> sleep. So either we need to bring our own mutex here or not use GFP_KERNEL.
> >
> > Ouch! I wonder why I don't see that (I have lockdep enabled, and the
> > igt tests should have exercised this path).
>
> Actually I'm not sure it's technically lockdep - have you got
> CONFIG_DEBUG_ATOMIC_SLEEP set?

Nope, I was missing that one :-/.
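One conventional fix for the sleeping-allocation-under-spinlock problem discussed above is the idr_preload()/GFP_NOWAIT pattern from the kernel's IDR API, where the sleeping allocation happens before the lock is taken. A hedged sketch of how the quoted hunk might look (not from the patch series, untested against it):

```
	/* Preallocate outside the lock, then allocate atomically under it. */
	idr_preload(GFP_KERNEL);        /* may sleep: no locks held yet */
	idr_lock(&ctx->queues);         /* spinlock: no sleeping past here */
	ret = idr_alloc(&ctx->queues, queue, 0, INT_MAX, GFP_NOWAIT);
	idr_unlock(&ctx->queues);
	idr_preload_end();
```

The alternative Steven mentions - a driver-private mutex guarding the IDR instead of idr_lock() - would also allow GFP_KERNEL to stay.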
Re: [PATCH v2 4/7] drm/panfrost: Add the ability to create submit queues
On 02/07/2021 11:52, Boris Brezillon wrote:
> On Fri, 2 Jul 2021 11:08:58 +0100 Steven Price wrote:
>
>> On 01/07/2021 10:12, Boris Brezillon wrote:
>>> Needed to keep VkQueues isolated from each other.
>>
>> One more comment I noticed when I tried this out:
>>
>> [...]
>>> +struct panfrost_submitqueue *
>>> +panfrost_submitqueue_create(struct panfrost_file_priv *ctx,
>>> +			    enum panfrost_submitqueue_priority priority,
>>> +			    u32 flags)
>>> +{
>>> +	struct panfrost_submitqueue *queue;
>>> +	enum drm_sched_priority sched_prio;
>>> +	int ret, i;
>>> +
>>> +	if (flags || priority >= PANFROST_SUBMITQUEUE_PRIORITY_COUNT)
>>> +		return ERR_PTR(-EINVAL);
>>> +
>>> +	queue = kzalloc(sizeof(*queue), GFP_KERNEL);
>>> +	if (!queue)
>>> +		return ERR_PTR(-ENOMEM);
>>> +
>>> +	queue->pfdev = ctx->pfdev;
>>> +	sched_prio = to_sched_prio(priority);
>>> +	for (i = 0; i < NUM_JOB_SLOTS; i++) {
>>> +		struct drm_gpu_scheduler *sched;
>>> +
>>> +		sched = panfrost_job_get_sched(ctx->pfdev, i);
>>> +		ret = drm_sched_entity_init(&queue->sched_entity[i],
>>> +					    sched_prio, &sched, 1, NULL);
>>> +		if (ret)
>>> +			break;
>>> +	}
>>> +
>>> +	if (ret) {
>>> +		for (i--; i >= 0; i--)
>>> +			drm_sched_entity_destroy(&queue->sched_entity[i]);
>>> +
>>> +		return ERR_PTR(ret);
>>> +	}
>>> +
>>> +	kref_init(&queue->refcount);
>>> +	idr_lock(&ctx->queues);
>>> +	ret = idr_alloc(&ctx->queues, queue, 0, INT_MAX, GFP_KERNEL);
>>
>> This makes lockdep complain. idr_lock() is a spinlock and GFP_KERNEL can
>> sleep. So either we need to bring our own mutex here or not use GFP_KERNEL.
>
> Ouch! I wonder why I don't see that (I have lockdep enabled, and the
> igt tests should have exercised this path).

Actually I'm not sure it's technically lockdep - have you got CONFIG_DEBUG_ATOMIC_SLEEP set?

Steve
Re: [PATCH v2 4/7] drm/panfrost: Add the ability to create submit queues
On 02/07/2021 11:43, Boris Brezillon wrote:
> On Fri, 2 Jul 2021 10:56:29 +0100 Steven Price wrote:
>
>> On 01/07/2021 10:12, Boris Brezillon wrote:
>>> Needed to keep VkQueues isolated from each other.
>>>
>>> Signed-off-by: Boris Brezillon
>>
>> My Vulkan knowledge is limited so I'm not sure whether this is the right
>> approach or not. In particular is it correct that an application can
>> create a high priority queue which could affect other (normal priority)
>> applications?
>
> That's what msm does (with no extra CAPS check AFAICT), and the
> freedreno driver can already create high priority queues if
> PIPE_CONTEXT_HIGH_PRIORITY is passed. Not saying that's okay to allow
> userspace to tweak the priority, but if that's a problem, other drivers
> are in trouble too ;-).

Oh well, I guess if others are doing the same ;) I have to admit kbase has always struggled with how to identify a "privileged" process - it's something that makes a bit of sense on Android but for other userspaces there really doesn't seem to be a good way of identifying what should or should not be allowed to create high priority queues.

>> Also does it really make sense to allow user space to create an
>> unlimited number of queues? It feels like an ideal way for a malicious
>> application to waste kernel memory.
>
> Same here, I see no limit on the number of queues the msm driver can
> create. I can definitely pick an arbitrary limit of 2^16 (or 2^8 if
> 2^16 is too high) if you prefer, but I feel like there's plenty of ways
> to force kernel allocations already, like allocating a gazillion of 4k
> GEM buffers (cgroup can probably limit the total amount of memory
> allocated, but you'd still have all gem-buf meta data in kernel memory).

I guess the real problem is picking a sensible limit ;) My main concern here is that there doesn't appear to be any memory accounted against the process.
For GEM buffers at least there is some cost to the application - so an unbounded allocation isn't possible, even if the bounds are likely to be very high.

With kbase we found that syzkaller was good at finding ways of using up all the memory on the platform - and if it wasn't accounted to the right process that meant the OOM-killer knocked out the wrong process and produced a bug report to investigate. Perhaps I'm just scarred by that history ;)

Steve

>>
>> In terms of implementation it looks correct, but one comment below
>>
>>> ---
>>>  drivers/gpu/drm/panfrost/Makefile             |   3 +-
>>>  drivers/gpu/drm/panfrost/panfrost_device.h    |   2 +-
>>>  drivers/gpu/drm/panfrost/panfrost_drv.c       |  69 --
>>>  drivers/gpu/drm/panfrost/panfrost_job.c       |  47 ++-
>>>  drivers/gpu/drm/panfrost/panfrost_job.h       |   9 +-
>>>  .../gpu/drm/panfrost/panfrost_submitqueue.c   | 130 ++
>>>  .../gpu/drm/panfrost/panfrost_submitqueue.h   |  27
>>>  include/uapi/drm/panfrost_drm.h               |  17 +++
>>>  8 files changed, 258 insertions(+), 46 deletions(-)
>>>  create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.c
>>>  create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.h
>>>
>> [...]
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_submitqueue.c
>>> b/drivers/gpu/drm/panfrost/panfrost_submitqueue.c
>>> new file mode 100644
>>> index ..98050f7690df
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_submitqueue.c
>>> @@ -0,0 +1,130 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/* Copyright 2021 Collabora ltd.
*/
>>> +
>>> +#include
>>> +
>>> +#include "panfrost_device.h"
>>> +#include "panfrost_job.h"
>>> +#include "panfrost_submitqueue.h"
>>> +
>>> +static enum drm_sched_priority
>>> +to_sched_prio(enum panfrost_submitqueue_priority priority)
>>> +{
>>> +	switch (priority) {
>>> +	case PANFROST_SUBMITQUEUE_PRIORITY_LOW:
>>> +		return DRM_SCHED_PRIORITY_MIN;
>>> +	case PANFROST_SUBMITQUEUE_PRIORITY_MEDIUM:
>>> +		return DRM_SCHED_PRIORITY_NORMAL;
>>> +	case PANFROST_SUBMITQUEUE_PRIORITY_HIGH:
>>> +		return DRM_SCHED_PRIORITY_HIGH;
>>> +	default:
>>> +		break;
>>> +	}
>>> +
>>> +	return DRM_SCHED_PRIORITY_UNSET;
>>> +}
>>> +
>>> +static void
>>> +panfrost_submitqueue_cleanup(struct kref *ref)
>>> +{
>>> +	struct panfrost_submitqueue *queue;
>>> +	unsigned int i;
>>> +
>>> +	queue = container_of(ref, struct panfrost_submitqueue, refcount);
>>> +
>>> +	for (i = 0; i < NUM_JOB_SLOTS; i++)
>>> +		drm_sched_entity_destroy(&queue->sched_entity[i]);
>>> +
>>> +	/* Kill in-flight jobs */
>>> +	panfrost_job_kill_queue(queue);
>>> +
>>> +	kfree(queue);
>>> +}
>>> +
>>> +void panfrost_submitqueue_put(struct panfrost_submitqueue *queue)
>>> +{
>>> +	if (!IS_ERR_OR_NULL(queue))
>>> +		kref_put(&queue->refcount, panfrost_submitqueue_cleanup);
>>> +}
>>> +
>>> +struct panfrost_submitqueue *
>>> +panfrost_submitqueue_create(struct panfrost_file_priv *ctx,
>>> +