On Fri, Apr 30, 2021 at 1:58 PM Tvrtko Ursulin
wrote:
>
>
> On 30/04/2021 07:53, Daniel Vetter wrote:
> > On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand
> > wrote:
> >>
> >> On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter wrote:
> >>>
On Fri, Apr 30, 2021 at 2:44 PM Tvrtko Ursulin
wrote:
>
>
>
> On 30/04/2021 13:30, Daniel Vetter wrote:
> > On Fri, Apr 30, 2021 at 1:58 PM Tvrtko Ursulin
> > wrote:
> >> On 30/04/2021 07:53, Daniel Vetter wrote:
> >>> On Thu, Apr 29,
On Fri, Apr 30, 2021 at 3:32 PM Hans de Goede wrote:
>
> Hi,
>
> On 4/30/21 1:38 PM, Daniel Vetter wrote:
> > On Fri, Apr 30, 2021 at 1:28 PM Hans de Goede wrote:
> >>
> >> Hi,
> >>
> >> On 4/29/21 9:09 PM, Daniel Vetter wrote:
> >>
On Fri, Apr 30, 2021 at 3:27 PM Tvrtko Ursulin
wrote:
> On 30/04/2021 12:48, Daniel Vetter wrote:
> > On Thu, Apr 29, 2021 at 10:46:40AM +0100, Tvrtko Ursulin wrote:
> >> From: Tvrtko Ursulin
> >>
> >> When a non-persistent context exits we currently mark i
On Fri, Apr 30, 2021 at 6:27 PM Jason Ekstrand wrote:
>
> On Fri, Apr 30, 2021 at 1:53 AM Daniel Vetter wrote:
> >
> > On Thu, Apr 29, 2021 at 11:35 PM Jason Ekstrand
> > wrote:
> > >
> > > On Thu, Apr 29, 2021 at 2:07 PM Daniel Vetter wrote:
> >
On Fri, Apr 30, 2021 at 6:57 PM Jason Ekstrand wrote:
>
> On Fri, Apr 30, 2021 at 11:33 AM Daniel Vetter wrote:
> >
> > On Fri, Apr 30, 2021 at 6:27 PM Jason Ekstrand wrote:
> > >
> > > On Fri, Apr 30, 2021 at 1:53 AM Daniel Vetter wrote:
> > > >
> if (likely(!i915_sw_fence_done(signaler))) {
>         __add_wait_queue_entry_tail(&signaler->wait, wq);
>         pending = 1;
> } else {
>         i915_sw_fence_wake(wq, 0, signa
what to
do when e.g. switching between copying and zero-copy on the host side
(which might be needed in some cases) and how to handle all that.
Only when that all shows that we just can't hit 60fps consistently and
really need 3 buffers in flight should we look at deeper kms queues.
And then we really need to implement them properly and not with a
mismatch between drm_event and out-fence signalling. These quick hacks
are good for experiments, but there's a pile of other things we need
to do first. At least that's how I understand the problem here right
now.
Cheers, Daniel
>
>
> --
> Earthling Michel Dänzer | https://redhat.com
> Libre software enthusiast | Mesa and X developer
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
full-ppgtt platforms. Ditching it all seemed
like a better idea.
References: ccbc1b97948a ("drm/i915/gem: Don't allow changing the VM on running
contexts (v4)")
Signed-off-by: Daniel Vetter
Cc: Jon Bloomfield
Cc: Chris Wilson
Cc: Maarten Lankhorst
Cc: Joonas Lahtinen
Cc: D
s either a full ppgtt stored in gem->ctx, or the ggtt.
We'll make more use of this function later on.
Signed-off-by: Daniel Vetter
Cc: Jon Bloomfield
Cc: Chris Wilson
Cc: Maarten Lankhorst
Cc: Joonas Lahtinen
Cc: Daniel Vetter
Cc: "Thomas Hellström"
Cc: Matthew Auld
Cc: L
https://lore.kernel.org/dri-devel/20210802154806.3710472-1-daniel.vet...@ffwll.ch/
Cheers, Daniel
Daniel Vetter (9):
drm/i915: Drop code to handle set-vm races from execbuf
drm/i915: Rename i915_gem_context_get_vm_rcu to
i915_gem_context_get_eb_vm
drm/i915: Use i915_gem_context_get_eb_
close to anything
that's a hotpath where removing the single spinlock can be measured).
Signed-off-by: Daniel Vetter
Cc: Jon Bloomfield
Cc: Chris Wilson
Cc: Maarten Lankhorst
Cc: Joonas Lahtinen
Cc: Daniel Vetter
Cc: "Thomas Hellström"
Cc: Matthew Auld
Cc: Lionel La
xt->vm or gt->vm,
which is always set.
v2: 0day found a testcase that I missed.
Signed-off-by: Daniel Vetter
Cc: Jon Bloomfield
Cc: Chris Wilson
Cc: Maarten Lankhorst
Cc: Joonas Lahtinen
Cc: Daniel Vetter
Cc: "Thomas Hellström"
Cc: Matthew Auld
Cc: Lionel Landwerlin
Cc:
e also remove the rcu_barrier in ggtt_cleanup_hw added in
commit 60a4233a4952729089e4df152e730f8f4d0e82ce
Author: Chris Wilson
Date: Mon Jul 29 14:24:12 2019 +0100
drm/i915: Flush the i915_vm_release before ggtt shutdown
Signed-off-by: Daniel Vetter
Cc: Jon Bloomfield
Cc: Chris W
Consolidates the "which is the vm my execbuf runs in" code a bit. We
do some get/put which isn't really required, but all the other users
want the refcounting, and I figured doing a function just for this
getparam to avoid 2 atomics is a bit much.
Signed-off-by: Daniel Vetter
Cc:
ris Wilson
Date: Fri Aug 30 19:03:25 2019 +0100
drm/i915: Use RCU for unlocked vm_idr lookup
except we have the conversion from idr to xarray in between.
Signed-off-by: Daniel Vetter
Cc: Jon Bloomfield
Cc: Chris Wilson
Cc: Maarten Lankhorst
Cc: Joonas Lahtinen
Cc: Daniel Vetter
C
r an accident
where we run kernel stuff in userspace vm or the other way round.
Signed-off-by: Daniel Vetter
Cc: Jon Bloomfield
Cc: Chris Wilson
Cc: Maarten Lankhorst
Cc: Joonas Lahtinen
Cc: Daniel Vetter
Cc: "Thomas Hellström"
Cc: Matthew Auld
Cc: Lionel Landwerlin
Cc:
Date: Mon Jul 26 17:23:16 2021 -0700
drm/i915/guc: GuC virtual engines
address this all by also splitting the new intel_engine_create_virtual
into a _user variant that doesn't set ce->vm.
Cc: Matthew Brost
Cc: Daniele Ceraolo Spurio
Cc: John Harrison
Signed-off-by
> now needs to happen after we place the object. Or at least the
> >> existing callers(for kernel internal objects) might not have expected
> >> that behaviour. Not sure if we checked all the callers.
> >>
> >>> It seems like the fundamental problem here is that, when it's created,
> >>> the object isn't really in any memory region at all. While I don't
> >>> think obj->mm.region == NULL is allowed or a good idea, it does seem
> >>> closer to the ground truth.
> >> Yeah, seems reasonable, especially for create_user where we don't know
> >> the placement until we actually call get_pages(). I think for internal
> >> users like with create_lmem() setting the mm.region early still makes
> >> some sense?
> >>
> >>> Perhaps what we really want is for i915_gem_object_migrate to
> >>> get_pages before it does the migration to ensure that pages exist.
> >>> The only call to i915_gem_object_migrate in the code-base today is in
> >>> the display code and it's immediately followed by pin_pages(). For
> >>> that matter, maybe the call we actually want is
> >>> i915_object_migrate_and_pin that does the whole lot.
> >> I guess the only downside is that we might end up doing a real
> >> migration, with mempy or the blitter vs just changing the preferred
> >> placement for later? I think just go with whatever you feel is the
> >> simplest for now.
> > Another cheapo could be to drop the mr == mm.region noop, and just try
> > to place the object at mr anyway?
> >
> There are a number of things to consider here,
>
> First, as Jason found out, what's keeping things from working as intended
> is that we actually call into TTM get_pages() after migration, since the
> object isn't populated with pages yet. That's indeed a bug.
>
> We should probably have migrate be migrate_and_populate(): Whatever
> kernel code decides to migrate needs to hold the object lock over the
> operation where data needs to be migrated or in the worst case call
> pin() under the lock which currently needs to be the case for dma-buf
> and display.
>
> If we blindly just look at obj->mm.region in get_pages() then if an
> object with allowable placements in lmem and smem initially gets placed
> in lmem, and then evicted to smem it will never migrate back to lmem
> unless if there is an explicit i915_gem_object_migrate(), but again,
> that's perhaps what we want? I guess we need to more clearly define the
> migration policies; for example should we attempt to migrate evicted
> buffers back to lmem on each execbuf where they are referenced, even if
> they haven't lost their pages?
Looking at amdgpu things are indeed complicated:
- mmap adds some hints that cpu access is preferred (iirc at least) so
that the unmappable vram problems aren't too awful
- execbuf adds vram to the non-evict placement list whenever that
makes sense (i.e. preferred place and no inferred hint like mmap
access countering that)
- for eviction there's a ratelimit, to make sure we're not thrashing
terribly and spending all the gpu time moving buffers around with the
copy engine
Maybe another interim strategy would be to only evict non-busy
buffers, not sure ttm supports that already. We definitely don't want
to unconditionally force all buffers into lmem on every execbuf.
-Daniel
> On the region discrepancy between gem and TTM there is a short DOC: section
> in i915_gem_ttm.c
>
> /Thomas
>
>
> >>> Thoughts?
> >>>
> >>> --Jason
> >>>
> >>> P.S. I'm going to go ahead and send another version with your other
> >>> comments addressed. We can keep this discussion going here for now.
On Tue, Aug 03, 2021 at 10:49:39AM -0400, Alex Deucher wrote:
> On Tue, Aug 3, 2021 at 4:34 AM Michel Dänzer wrote:
> >
> > On 2021-08-02 4:51 p.m., Alex Deucher wrote:
> > > On Mon, Aug 2, 2021 at 4:31 AM Daniel Vetter wrote:
> > >>
> > >> On
ion it
> > in commit messages. Require the kernel patch to be a one-stop shop for
> > finding the various bits which were used to justify the new uAPI.
> >
> > Signed-off-by: Jason Ekstrand
> > Cc: Daniel Vetter
> > Cc: Dave Airlie
> > ---
> >
ix this by releasing the RPM reference from the EFI FB's destroy hook,
> called when the FB gets unregistered.
>
> Fixes: a6c0fd3d5a8b ("efifb: Ensure graphics device for efifb stays at PCI
> D0")
> Cc: Kai-Heng Feng
> Signed-off-by: Imre Deak
Patch looks good:
On Tue, Aug 03, 2021 at 10:47:10AM -0500, Jason Ekstrand wrote:
> Both are
>
> Reviewed-by: Jason Ekstrand
CI is happy, I guess you got all the igt changes indeed. Both pushed
thanks for reviewing.
-Daniel
>
> On Tue, Aug 3, 2021 at 7:49 AM Daniel Vetter wrote:
> >
>
On Tue, Aug 03, 2021 at 03:29:00PM -0700, Matthew Brost wrote:
> Rather than returning -EAGAIN to the user when no guc_ids are available,
> implement a fair sharing algorithm in the kernel which blocks submissions
> until guc_ids become available. Submissions are released one at a time,
> based on p
",
> atomic_read(&ce->guc_id_ref));
> + drm_printf(p, "\t\tNumber Requests Not Ready: %u\n",
> +atomic_read(&ce->guc_num_rq_not_ready));
> drm_printf(p, "\t\tSchedule State: 0x%x, 0x%x\n\n",
> ce->guc_state.sched_state,
> atomic_read(&ce->guc_sched_state_no_lock));
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
> index c7ef44fa0c36..17af5e123b09 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
> @@ -51,4 +51,6 @@ static inline bool intel_guc_submission_is_used(struct
> intel_guc *guc)
> return intel_guc_is_used(guc) && intel_guc_submission_is_wanted(guc);
> }
>
> +void intel_guc_decr_num_rq_not_ready(struct intel_context *ce);
> +
> #endif
> --
> 2.28.0
>
> + return 0;
> +}
> +
> +struct threaded_migrate {
> + struct intel_migrate *migrate;
> + struct task_struct *tsk;
> + struct rnd_state prng;
> +};
> +
> +static int threaded_migrate(struct intel_migrate *migrate,
> + int (*fn)(void *arg),
> + unsigned int flags)
> +{
> + const unsigned int n_cpus = num_online_cpus() + 1;
> + struct threaded_migrate *thread;
> + I915_RND_STATE(prng);
> + unsigned int i;
> + int err = 0;
> +
> + thread = kcalloc(n_cpus, sizeof(*thread), GFP_KERNEL);
> + if (!thread)
> + return 0;
> +
> + for (i = 0; i < n_cpus; ++i) {
> + struct task_struct *tsk;
> +
> + thread[i].migrate = migrate;
> + thread[i].prng =
> + I915_RND_STATE_INITIALIZER(prandom_u32_state(&prng));
> +
> + tsk = kthread_run(fn, &thread[i], "igt-%d", i);
> + if (IS_ERR(tsk)) {
> + err = PTR_ERR(tsk);
> + break;
> + }
> +
> + get_task_struct(tsk);
> + thread[i].tsk = tsk;
> + }
> +
> + msleep(10); /* start all threads before we kthread_stop() */
> +
> + for (i = 0; i < n_cpus; ++i) {
> + struct task_struct *tsk = thread[i].tsk;
> + int status;
> +
> + if (IS_ERR_OR_NULL(tsk))
> + continue;
> +
> + status = kthread_stop(tsk);
> + if (status && !err)
> + err = status;
> +
> + put_task_struct(tsk);
> + }
> +
> + kfree(thread);
> + return err;
> +}
> +
> +static int __thread_migrate_copy(void *arg)
> +{
> + struct threaded_migrate *tm = arg;
> +
> + return migrate_copy(tm->migrate, 2 * CHUNK_SZ, &tm->prng);
> +}
> +
> +static int thread_migrate_copy(void *arg)
> +{
> + return threaded_migrate(arg, __thread_migrate_copy, 0);
> +}
> +
> +static int __thread_global_copy(void *arg)
> +{
> + struct threaded_migrate *tm = arg;
> +
> + return global_copy(tm->migrate, 2 * CHUNK_SZ, &tm->prng);
> +}
> +
> +static int thread_global_copy(void *arg)
> +{
> + return threaded_migrate(arg, __thread_global_copy, 0);
> +}
> +
> +int intel_migrate_live_selftests(struct drm_i915_private *i915)
> +{
> + static const struct i915_subtest tests[] = {
> + SUBTEST(live_migrate_copy),
> + SUBTEST(thread_migrate_copy),
> + SUBTEST(thread_global_copy),
> + };
> + struct intel_migrate m;
> + int err;
> +
> + if (intel_migrate_init(&m, &i915->gt))
> + return 0;
> +
> + err = i915_subtests(tests, &m);
> + intel_migrate_fini(&m);
> +
> + return err;
> +}
> diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> index a92c0e9b7e6b..be5e0191eaea 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> +++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> @@ -26,6 +26,7 @@ selftest(gt_mocs, intel_mocs_live_selftests)
> selftest(gt_pm, intel_gt_pm_live_selftests)
> selftest(gt_heartbeat, intel_heartbeat_live_selftests)
> selftest(requests, i915_request_live_selftests)
> +selftest(migrate, intel_migrate_live_selftests)
> selftest(active, i915_active_live_selftests)
> selftest(objects, i915_gem_object_live_selftests)
> selftest(mman, i915_gem_mman_live_selftests)
> --
> 2.31.1
>
therefore holding either of
> + * them is safe and enough for the read side.
> + *
>* When dereferencing this pointer, either hold struct
>* &drm_device.master_mutex for the duration of the pointer's use, or
>* use drm_file_get_master() if struct &drm_device.master_mutex is not
> --
> 2.25.1
>
the first two patches
could land asap, but that means testing by some of the other drivers.
Etnaviv especially is pending some testing/reviewed-by.
In general please review and test.
Thanks, Daniel
Daniel Vetter (20):
drm/sched: Split drm_sched_job_init
drm/msm: Fix drm/sched point of
ich the next
patch will address.
Acked-by: Melissa Wen
Cc: Melissa Wen
Acked-by: Emma Anholt
Acked-by: Steven Price (v2)
Reviewed-by: Boris Brezillon (v5)
Signed-off-by: Daniel Vetter
Cc: Lucas Stach
Cc: Russell King
Cc: Christian Gmeiner
Cc: Qiang Yu
Cc: Rob Herring
Cc: Tomeu Vizoso
Cc: S
Boris Brezillon (v3)
Reviewed-by: Steven Price (v1)
Acked-by: Melissa Wen
Signed-off-by: Daniel Vetter
Cc: David Airlie
Cc: Daniel Vetter
Cc: Sumit Semwal
Cc: "Christian König"
Cc: Andrey Grodzovsky
Cc: Lee Jones
Cc: Nirmoy Das
Cc: Boris Brezillon
Cc: Luben Tuikov
Cc: Alex Deuche
-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Signed-off-by: Daniel Vetter
---
drivers/gpu/drm/msm/msm_gem_submit.c | 15 +++
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 6
e and load-acquire on ->last_scheduled would be enough,
but we actually require ordering between that and the queue state.
v2: Put smp_rmb() in the right place and fix up comment (Andrey)
Acked-by: Melissa Wen
Signed-off-by: Daniel Vetter
Cc: "Christian König"
Cc: Steven Price
g drm/sched
Acked-by: Emma Anholt
Acked-by: Melissa Wen
Reviewed-by: Steven Price (v1)
Reviewed-by: Boris Brezillon (v1)
Signed-off-by: Daniel Vetter
Cc: Lucas Stach
Cc: Russell King
Cc: Christian Gmeiner
Cc: Qiang Yu
Cc: Rob Herring
Cc: Tomeu Vizoso
Cc: Steven Price
Cc: Alyssa Rosenzwei
dependencies.
Signed-off-by: Daniel Vetter
Cc: Qiang Yu
Cc: Sumit Semwal
Cc: "Christian König"
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
drivers/gpu/drm/lima/lima_gem.c | 6 --
drivers/gpu/drm/lima/lima_sc
en
Reviewed-by: Boris Brezillon (v1)
Signed-off-by: Daniel Vetter
Cc: Lucas Stach
Cc: David Airlie
Cc: Daniel Vetter
Cc: Maarten Lankhorst
Cc: Maxime Ripard
Cc: Thomas Zimmermann
Cc: "Christian König"
Cc: Boris Brezillon
Cc: Steven Price
Cc: Emma Anholt
Cc: Lee Jon
set up job, now that job_init()
and job_arm() are apart (Emma).
v3: Rebased over renamed functions for adding dependencies
Acked-by: Emma Anholt
Reviewed-by: Steven Price (v3)
Signed-off-by: Daniel Vetter
Cc: Rob Herring
Cc: Tomeu Vizoso
Cc: Steven Price
Cc: Alyssa Rosenzweig
Cc: Sumit
under construction correctly (Emma)
v4: Rebase over perfmon patch
Reviewed-by: Melissa Wen (v3)
Acked-by: Emma Anholt
Cc: Melissa Wen
Signed-off-by: Daniel Vetter
Cc: Emma Anholt
---
drivers/gpu/drm/v3d/v3d_drv.h | 1 +
drivers/gpu/drm/v3d/v3d_gem.c | 86
o add dependencies.
Signed-off-by: Daniel Vetter
Cc: Lucas Stach
Cc: Russell King
Cc: Christian Gmeiner
Cc: Sumit Semwal
Cc: "Christian König"
Cc: etna...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
drivers/gpu/drm/etnaviv/etnaviv_gem.h|
Integrated into the scheduler now and all users converted over.
Signed-off-by: Daniel Vetter
Cc: Maarten Lankhorst
Cc: Maxime Ripard
Cc: Thomas Zimmermann
Cc: David Airlie
Cc: Daniel Vetter
Cc: Sumit Semwal
Cc: "Christian König"
Cc: linux-me...@vger.kernel.org
Cc:
drm_sched_job_init is already at the right place, so this boils down
to deleting code.
Signed-off-by: Daniel Vetter
Cc: Rob Clark
Cc: Sean Paul
Cc: Sumit Semwal
Cc: "Christian König"
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
Cc: linux-me...@vger.ker
: Rebase over renamed function names for adding dependencies.
Reviewed-by: Melissa Wen (v1)
Acked-by: Emma Anholt
Cc: Melissa Wen
Signed-off-by: Daniel Vetter
Cc: Emma Anholt
---
drivers/gpu/drm/v3d/v3d_drv.h | 5 -
drivers/gpu/drm/v3d/v3d_gem.c | 26 +-
dri
.
Another option is the fence import ioctl from Jason:
https://lore.kernel.org/dri-devel/20210610210925.642582-7-ja...@jlekstrand.net/
v2: Improve commit message per Lucas' suggestion.
Signed-off-by: Daniel Vetter
Cc: Lucas Stach
Cc: Russell King
Cc: Christian Gmeiner
Cc: etna...@lists.
t, so that drivers don't have to hack up their own
solution each on their own.
v2: Improve commit message per Lucas' suggestion.
Cc: Lucas Stach
Signed-off-by: Daniel Vetter
Cc: Rob Clark
Cc: Sean Paul
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
No longer used, the last user disappeared with
commit d07f0e59b2c762584478920cd2d11fba2980a94a
Author: Chris Wilson
Date: Fri Oct 28 13:58:44 2016 +0100
drm/i915: Move GEM activity tracking into a common struct reservation_object
Signed-off-by: Daniel Vetter
Cc: Maarten Lankhorst
Cc
You really need to hold the reservation here or all kinds of funny
things can happen between grabbing the dependencies and inserting the
new fences.
Acked-by: Melissa Wen
Signed-off-by: Daniel Vetter
Cc: "Christian König"
Cc: Daniel Vetter
Cc: Luben Tuikov
Cc: Andrey Grodzovsky
viewers that go across all
drivers won't miss it.
Reviewed-by: Lucas Stach
Acked-by: Melissa Wen
Signed-off-by: Daniel Vetter
Cc: "Christian König"
Cc: Daniel Vetter
Cc: Luben Tuikov
Cc: Andrey Grodzovsky
Cc: Alex Deucher
---
drivers/gpu/drm/scheduler/sched_main.c | 7 +++
1 f
.
Another option is the fence import ioctl from Jason:
https://lore.kernel.org/dri-devel/20210610210925.642582-7-ja...@jlekstrand.net/
v2: Improve commit message per Lucas' suggestion.
Cc: Lucas Stach
Signed-off-by: Daniel Vetter
Cc: Maarten Lankhorst
Cc: "Thomas Hellström"
Cc: Jas
use-after-free issues
around dma-buf sharing (Christian)
Reviewed-by: Christian König
Cc: Jason Ekstrand
Cc: Matthew Auld
Reviewed-by: Matthew Auld
Signed-off-by: Daniel Vetter
Cc: Sumit Semwal
Cc: "Christian König"
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
d work or not but I think the Guest compositor
> has to be told
> when it can start its repaint cycle and when it can assume the old FB is no
> longer in use.
> On bare-metal -- and also with VKMS as of today -- a pageflip completion
> indicates both.
> In other words, Vbla
hart
Reviewed-by: Daniel Vetter
And thanks for not going down the "let's add dummy functions and inflict
lots of error case handling onto a driver that will never be used" route
instead.
-Daniel
> ---
> drivers/gpu/drm/omapdrm/Kconfig | 2 +-
> 1 file changed, 1 insert
On Thu, Aug 5, 2021 at 3:18 PM Christian König wrote:
>
>
>
> Am 05.08.21 um 12:46 schrieb Daniel Vetter:
> > This is essentially part of drm_sched_dependency_optimized(), which
> > only amdgpu seems to make use of. Use it a bit more.
> >
> > This would
On Thu, Aug 5, 2021 at 3:19 PM Christian König wrote:
>
> Am 05.08.21 um 12:47 schrieb Daniel Vetter:
> > You really need to hold the reservation here or all kinds of funny
> > things can happen between grabbing the dependencies and inserting the
> > new fences.
>
On Thu, Aug 5, 2021 at 3:44 PM Christian König wrote:
> Am 05.08.21 um 12:46 schrieb Daniel Vetter:
> > This is a very confusingly named function, because not just does it
> > init an object, it arms it and provides a point of no return for
> > pushing a job into the schedu
On Thu, Aug 5, 2021 at 3:57 PM Christian König wrote:
> Am 05.08.21 um 15:25 schrieb Daniel Vetter:
> > On Thu, Aug 5, 2021 at 3:18 PM Christian König
> > wrote:
> >>
> >>
> >> Am 05.08.21 um 12:46 schrieb Daniel Vetter:
> >>> This is essen
On Thu, Aug 5, 2021 at 4:47 PM Christian König wrote:
>
> Am 05.08.21 um 16:07 schrieb Daniel Vetter:
> > On Thu, Aug 5, 2021 at 3:44 PM Christian König
> > wrote:
> >> Am 05.08.21 um 12:46 schrieb Daniel Vetter:
> >>> This is a very confusingly n
depends on DRM
> > + depends on DRM && OF
> > depends on ARCH_OMAP2PLUS || ARCH_MULTIPLATFORM
> > select OMAP2_DSS
> > select DRM_KMS_HELPER
>
> Would it make sense to select OF instead?
select is extremely harmful for any user-visible
On Fri, Aug 6, 2021 at 12:58 AM Rob Clark wrote:
>
> On Thu, Aug 5, 2021 at 3:47 AM Daniel Vetter wrote:
> >
> > Originally drm_sched_job_init was the point of no return, after which
> > drivers must submit a job. I've split that up, which allows us to fix
On Fri, Aug 6, 2021 at 7:15 PM Rob Clark wrote:
>
> On Fri, Aug 6, 2021 at 9:42 AM Daniel Vetter wrote:
> >
> > On Fri, Aug 6, 2021 at 12:58 AM Rob Clark wrote:
> > >
> > > On Thu, Aug 5, 2021 at 3:47 AM Daniel Vetter
> > > wrote:
> > > &g
nd ADL_S not
> > allowed\n");
> I would have said not supported rather than not allowed. Either way:
> Reviewed-by: John Harrison
Either is fine with me.
Acked-by: Daniel Vetter
>
> > + return -ENODEV;
> > + }
> > +
> > if (get_user(idx, &ext->virtual_index))
> > return -EFAULT;
> >
>
/590]
>[1002:67df] (rev e7) (prog-if 00 [VGA controller])
>
> Full oops in the attachment, but I think the above is all the really
> salient details.
>
>Linus
On Fri, Aug 6, 2021 at 8:57 PM Rob Clark wrote:
>
> On Fri, Aug 6, 2021 at 11:41 AM Daniel Vetter wrote:
> >
> > On Fri, Aug 6, 2021 at 7:15 PM Rob Clark wrote:
> > >
> > > On Fri, Aug 6, 2021 at 9:42 AM Daniel Vetter
> > > wrote:
> > &g
longer.
> >
> > And if ABI change is okay then commit message needs to talk about it
> > loudly and clearly.
> I don't think we have a choice. The current ABI is not and cannot ever
> be compatible with any scheduler external to i915. It cannot be
> implemented with a ha
t object actually invariant over its _entire_ lifetime.
Signed-off-by: Daniel Vetter
Fixes: 00dae4d3d35d ("drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4)")
Cc: Jason Ekstrand
Cc: Chris Wilson
Cc: Tvrtko Ursulin
Cc: Joonas Lahtinen
Cc: Matthew Brost
Cc: Matthew Auld
Cc: Maa
ock & unblocked (scheduling enable)
> while
> + * this reset was inflight. If a scheduling enable is already in
> + * flight do not clear the enable.
>*/
> - clr_context_enabled(ce);
> + spin_lock_irqsave(&ce->guc_state.lock, flags);
>
one of the IP markers).
Also I think the above should be replicated in condensed form instead of
the XXX comment.
With those: Acked-by: Daniel Vetter since I
definitely don't have enough clue here for a detailed review.
-Daniel
>
> Signed-off-by: Matthew Brost
> ---
>
struct intel_selftest_saved_policy *saved,
>u32 modify_type)
> diff --git a/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.h
> b/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.h
> index 35c098601ac0..ae60bb507f45 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.h
> +++ b/drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.h
> @@ -10,6 +10,7 @@
>
> struct i915_request;
> struct intel_engine_cs;
> +struct intel_gt;
>
> struct intel_selftest_saved_policy {
> u32 flags;
> @@ -23,6 +24,7 @@ enum selftest_scheduler_modify {
> SELFTEST_SCHEDULER_MODIFY_FAST_RESET,
> };
>
> +struct intel_engine_cs *intel_selftest_find_any_engine(struct intel_gt *gt);
> int intel_selftest_modify_policy(struct intel_engine_cs *engine,
>struct intel_selftest_saved_policy *saved,
>enum selftest_scheduler_modify modify_type);
> --
> 2.28.0
>
_revids")
> >
> > is missing a Signed-off-by from its committer.
> >
> > --
> > Cheers,
> > Stephen Rothwell
t but I think the Guest
> > > compositor has to be
> > told
> > > when it can start its repaint cycle and when it can assume the old FB is
> > > no longer in use.
> > > On bare-metal -- and also with VKMS as of today -- a pageflip completion
> > > indicates
> > both.
> > > In other words, Vblank event is the same as Flip done, which makes sense
> > > on bare-metal.
> > > But if we were to have two events at-least for VKMS: vblank to indicate
> > > to Guest to start
> > > repaint and flip_done to indicate to drop references on old FBs, I think
> > > this problem can
> > > be solved even without increasing the queue depth. Can this be acceptable?
> >
> > That's just another flavour of your "increase queue depth without
> > increasing the atomic queue depth" approach. I still think the underlying
> > fundamental issue is a timing confusion, and the fact that adjusting the
> > timings fixes things too kinda proves that. So we need to fix that in a
> > clean way, not by shuffling things around semi-randomly until the specific
config we test works.
> [Kasireddy, Vivek] This issue is not due to a timing or timestamp mismatch. We
> have carefully instrumented both the Host and Guest compositors and measured
> the latencies at each step. The relevant debug data only points to the
> scheduling
> policy -- of both Host and Guest compositors -- playing a role in Guest
> rendering
> at 30 FPS.
Hm but that essentially means that the events you're passing around have an
even more ad-hoc, implementation-specific meaning: Essentially it's the
kick-off for the guest's repaint loop? That sounds even worse for a kms
uapi extension.
> > Iow I think we need a solution here which both slows down the 90fps to
> > 60fps for the blit case, and the 30fps speed up to 60fps for the zerocopy
> > case. Because the host might need to switch transparently between blt and
> > zerocopy for various reasons.
> [Kasireddy, Vivek] As I mentioned above, the Host (Qemu) cannot switch UI
> backends at runtime. In other words, with GTK UI backend, it is always Blit
> whereas Wayland UI backend is always zero-copy.
Hm ok, that at least makes things somewhat simpler. Another thing that I
just realized: What happens when the host changes screen resolution and
especially refresh rate?
-Daniel
>
> Thanks,
> Vivek
>
> > -Daniel
> >
> > > Thanks,
> > > Vivek
On Sat, Aug 07, 2021 at 06:21:10PM +0300, Imre Deak wrote:
> On Thu, Aug 05, 2021 at 12:23:21AM +0200, Daniel Vetter wrote:
> > On Mon, Aug 02, 2021 at 04:35:51PM +0300, Imre Deak wrote:
> > > Atm the EFI FB driver gets a runtime PM reference for the associated GFX
> > &g
ce);
> +
> + for_each_engine_masked(engine, ce->engine->gt, mask, tmp)
> + intel_engine_pm_put(engine);
> }
>
> static void guc_virtual_context_enter(struct intel_context *ce)
> @@ -3040,7 +3070,7 @@ static const struct intel_context_ops
> virtual_guc_context_ops = {
her aside: How does the perf/OA patching work on GuC?
Anyway, patch looks legit:
Reviewed-by: Daniel Vetter
> + if (intel_engine_uses_guc(engine))
> + return true;
> +
> /* GPU is pointing to the void, as good as in the kernel context. */
> if (intel_gt_is
intel_context *ce,
> bool loop)
>
> desc = __get_lrc_desc(guc, ce->guc_lrcd_reg_idx);
> desc->engine_class = engine_class_to_guc_class(engine->class);
> - desc->engine_submit_mask = adjust_engine_mask(engine->class,
> - engine->mask);
> + desc->engine_submit_mask = engine->logical_mask;
> desc->hw_context_desc = ce->lrc.lrca;
> ce->guc_prio = map_i915_prio_to_guc_prio(prio);
> desc->priority = ce->guc_prio;
> @@ -3978,6 +3960,7 @@ guc_create_virtual(struct intel_engine_cs **siblings,
> unsigned int count)
> }
>
> ve->base.mask |= sibling->mask;
> + ve->base.logical_mask |= sibling->logical_mask;
>
> if (n != 0 && ve->base.class != sibling->class) {
> DRM_DEBUG("invalid mixing of engine class, sibling %d,
> already %d\n",
> --
> 2.28.0
>
e of engine */
> + __u16 logical_instance;
> +
> /** @rsvd1: Reserved fields. */
> - __u64 rsvd1[4];
> + __u16 rsvd1[3];
> + /** @rsvd2: Reserved fields. */
> + __u64 rsvd2[3];
> };
>
> /**
> --
> 2.28.0
>
ree doesn't go boom since you have links
both ways). It looks like parent holds a reference on the child, so how do
you make sure the child looking at the parent doesn't go boom?
-Daniel
> + union {
> + struct list_head guc_child_list;/* parent */
> + struct list_head guc_child_link;/* child */
> + };
> +
> + /* Pointer to parent */
> + struct intel_context *parent;
> +
> + /* Number of children if parent */
> + u8 guc_number_children;
> +
> /*
>* GuC priority management
>*/
> --
> 2.28.0
>
On Mon, Aug 09, 2021 at 04:37:55PM +0200, Daniel Vetter wrote:
> On Tue, Aug 03, 2021 at 03:29:12PM -0700, Matthew Brost wrote:
> > Introduce context parent-child relationship. Once this relationship is
> > created all pinning / unpinning operations are directed to the parent
&
}
> +
> + for_each_engine_masked(engine, ce->engine->gt,
> +ce->engine->mask, tmp)
> + intel_engine_pm_get(engine);
> + for_each_child(ce, child)
> + for_each_engine_masked(engine, child->engine->gt,
> +child->engine->mask, tmp)
> + intel_engine_pm_get(engine);
> +
> + return 0;
> +
> +unwind_pin:
> + for_each_child(ce, child) {
> + if (++j > i)
> + break;
> + __guc_context_unpin(child);
> + }
> +
> + return ret;
> +}
> +
> +/* Future patches will use this function */
> +__maybe_unused
> +static void guc_parent_context_unpin(struct intel_context *ce)
> +{
> + struct intel_context *child;
> + struct intel_engine_cs *engine;
> + intel_engine_mask_t tmp;
> +
> + GEM_BUG_ON(!intel_context_is_parent(ce));
> + GEM_BUG_ON(context_enabled(ce));
> +
> + unpin_guc_id(ce_to_guc(ce), ce, true);
> + for_each_child(ce, child)
> + __guc_context_unpin(child);
> + __guc_context_unpin(ce);
> +
> + for_each_engine_masked(engine, ce->engine->gt,
> +ce->engine->mask, tmp)
> + intel_engine_pm_put(engine);
> + for_each_child(ce, child)
> + for_each_engine_masked(engine, child->engine->gt,
> +child->engine->mask, tmp)
> + intel_engine_pm_put(engine);
> }
>
> static void __guc_context_sched_enable(struct intel_guc *guc,
> @@ -2993,18 +3139,17 @@ static int guc_request_alloc(struct i915_request *rq)
> }
>
> static int guc_virtual_context_pre_pin(struct intel_context *ce,
> -struct i915_gem_ww_ctx *ww,
> -void **vaddr)
> +struct i915_gem_ww_ctx *ww)
> {
> struct intel_engine_cs *engine = guc_virtual_get_sibling(ce->engine, 0);
>
> - return __guc_context_pre_pin(ce, engine, ww, vaddr);
> + return __guc_context_pre_pin(ce, engine, ww);
> }
>
> -static int guc_virtual_context_pin(struct intel_context *ce, void *vaddr)
> +static int guc_virtual_context_pin(struct intel_context *ce)
> {
> struct intel_engine_cs *engine = guc_virtual_get_sibling(ce->engine, 0);
> - int ret = __guc_context_pin(ce, engine, vaddr);
> + int ret = __guc_context_pin(ce, engine);
> intel_engine_mask_t tmp, mask = ce->engine->mask;
>
> if (likely(!ret))
> @@ -3024,7 +3169,7 @@ static void guc_virtual_context_unpin(struct
> intel_context *ce)
> GEM_BUG_ON(intel_context_is_barrier(ce));
>
> unpin_guc_id(guc, ce, true);
> - lrc_unpin(ce);
> + __guc_context_unpin(ce);
>
> for_each_engine_masked(engine, ce->engine->gt, mask, tmp)
> intel_engine_pm_put(engine);
> --
> 2.28.0
>
t_unlock;
> }
> @@ -1770,6 +1858,7 @@ static void unpin_guc_id(struct intel_guc *guc,
> unsigned long flags;
>
> GEM_BUG_ON(atomic_read(&ce->guc_id_ref) < 0);
> + GEM_BUG_ON(intel_context_is_child(ce));
>
> if (unlikely(context_guc_id_invalid(ce)))
> return;
> @@ -1781,7 +1870,8 @@ static void unpin_guc_id(struct intel_guc *guc,
>
> if (!context_guc_id_invalid(ce) && !context_guc_id_stolen(ce) &&
> !atomic_read(&ce->guc_id_ref)) {
> - struct list_head *head = get_guc_id_list(guc, unpinned);
> + struct list_head *head =
> + get_guc_id_list(guc, ce->guc_number_children, unpinned);
>
> list_add_tail(&ce->guc_id_link, head);
> }
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission_types.h
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission_types.h
> index 7069b7248f55..a5933e07bdd2 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission_types.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission_types.h
> @@ -22,6 +22,16 @@ struct guc_virtual_engine {
> /*
> * Object which encapsulates the globally operated on i915_sched_engine +
> * the GuC submission state machine described in intel_guc_submission.c.
> + *
> + * Currently we have two instances of these per GuC. One for single-lrc and
> one
> + * for multi-lrc submission. We split these into two submission engines as
> they
> + * can operate in parallel allowing a blocking condition on one not to affect
> + * the other. i.e. guc_ids are statically allocated between these two
> submission
> + * modes. One mode may have guc_ids exhausted which requires blocking while
> the
> other has plenty of guc_ids and can make forward progress.
> + *
> + * In the future if different submission use cases arise we can simply
> + * instantiate another of these objects and assign it to the context.
> */
> struct guc_submit_engine {
> struct i915_sched_engine sched_engine;
> --
> 2.28.0
>
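The static guc_id split described in the comment above can be illustrated with a small standalone sketch. This is not the i915 implementation: the pool sizes and the helper name are invented for the example. The point it shows is that exhausting one mode's pool blocks only that mode while the other keeps making forward progress.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative only: guc_ids are partitioned statically between the
 * single-lrc and multi-lrc submission modes, so a blocking condition
 * (id exhaustion) in one pool cannot stall the other. Pool sizes here
 * are made up for the example.
 */
#define SINGLE_LRC_IDS 4
#define MULTI_LRC_IDS  4

static bool used[SINGLE_LRC_IDS + MULTI_LRC_IDS];

/* Allocate an id from one mode's static range; -1 on exhaustion. */
static int alloc_guc_id(bool multi_lrc)
{
	int base = multi_lrc ? SINGLE_LRC_IDS : 0;
	int count = multi_lrc ? MULTI_LRC_IDS : SINGLE_LRC_IDS;

	for (int i = base; i < base + count; i++) {
		if (!used[i]) {
			used[i] = true;
			return i;
		}
	}
	return -1; /* only this pool blocks; the other is unaffected */
}
```

Exhausting the single-lrc range still leaves the multi-lrc range fully available, which is exactly the isolation the two guc_submit_engine instances are meant to provide.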
l_guc_submission_types.h
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission_types.h
> index a5933e07bdd2..eae2e9725ede 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission_types.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission_types.h
> @@ -6,6 +6,8 @@
> #ifndef _INTEL_
ntainer_of(kref, struct intel_context, ref));
> +}
> +
> static void guc_context_destroy(struct kref *kref)
> {
> struct intel_context *ce = container_of(kref, typeof(*ce), ref);
> --
> 2.28.0
>
line and the ordering
> + * rules for parallel requests are that they must be submitted in the
> + * order received from the execbuf IOCTL. So rather than using the
> + * timeline, we store a pointer to the last request submitted in the
> + * relationship in the gem context and insert a submission fence
> + * between that request and the request passed into this function;
> + * alternatively, we use a completion fence if the gem context has a
> + * single timeline and this is the first submission of an execbuf
> + * IOCTL.
> + if (likely(!is_parallel_rq(rq)))
> + prev = __i915_request_ensure_ordering(rq, timeline);
> + else
> + prev = __i915_request_ensure_parallel_ordering(rq, timeline);
> +
> /*
>* Make sure that no request gazumped us - if it was allocated after
>* our i915_request_alloc() and called __i915_request_add() before
> --
> 2.28.0
>
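The ordering rule described in the comment above can be sketched in standalone form. The struct and function names here are invented for the example (the real driver works with i915_request and the fence machinery): each new parallel request is chained behind the last request submitted in the whole relationship, rather than behind the per-timeline last request.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch only: parallel requests must reach submission in execbuf
 * order, so the submitter records the last request of the parallel
 * relationship and makes each new request depend on it ("submission
 * fence" in the comment above).
 */
struct request {
	int seqno;
	struct request *ordering_prev; /* fence dependency */
};

struct parallel_ctx {
	struct request *last_submitted;
};

static void submit(struct parallel_ctx *ctx, struct request *rq)
{
	/* Chain behind the relationship's last request, then advance it. */
	rq->ordering_prev = ctx->last_submitted;
	ctx->last_submitted = rq;
}
```

Following the ordering_prev links from the newest request walks the submissions in reverse execbuf order, which is the invariant the GuC backend relies on.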
g_context(p, ce);
> guc_log_context_priority(p, ce);
> +
> + if (intel_context_is_parent(ce)) {
> + struct guc_process_desc *desc = __get_process_desc(ce);
> + struct intel_context *child;
> +
> + drm_printf(p, "\t\
n each index of the
> + * virtual engines. e.g. CS[0] is bonded to CS[1], CS[2] is bonded to
> + * CS[3].
> + * VE[0] = CS[0], CS[2]
> + * VE[1] = CS[1], CS[3]
> + *
> + * Example 3 pseudo code:
> + * CS[X] = generic engine of same class, logical instance X
> + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
> + * set_engines(INVALID)
> + * set_parallel(engine_index=0, width=2, num_siblings=2,
> + *engines=CS[0],CS[1],CS[1],CS[3])
> + *
> + * Results in the following valid and invalid placements:
> + * CS[0], CS[1]
> + * CS[1], CS[3] - Not logically contiguous, return -EINVAL
> + */
> +struct i915_context_engines_parallel_submit {
> + /**
> + * @base: base user extension.
> + */
> + struct i915_user_extension base;
> +
> + /**
> + * @engine_index: slot for parallel engine
> + */
> + __u16 engine_index;
> +
> + /**
> + * @width: number of contexts per parallel engine
> + */
> + __u16 width;
> +
> + /**
> + * @num_siblings: number of siblings per context
> + */
> + __u16 num_siblings;
> +
> + /**
> + * @mbz16: reserved for future use; must be zero
> + */
> + __u16 mbz16;
> +
> + /**
> + * @flags: all undefined flags must be zero; currently no flags are defined
> + */
> + __u64 flags;
> +
> + /**
> + * @mbz64: reserved for future use; must be zero
> + */
> + __u64 mbz64[3];
> +
> + /**
> + * @engines: 2-d array of engine instances to configure parallel engine
> + *
> + * length = width (i) * num_siblings (j)
> + * index = j + i * num_siblings
> + */
> + struct i915_engine_class_instance engines[0];
> +
> +} __packed;
> +
> +#define I915_DEFINE_CONTEXT_ENGINES_PARALLEL_SUBMIT(name__, N__) struct { \
> + struct i915_user_extension base; \
> + __u16 engine_index; \
> + __u16 width; \
> + __u16 num_siblings; \
> + __u16 mbz16; \
> + __u64 flags; \
> + __u64 mbz64[3]; \
> + struct i915_engine_class_instance engines[N__]; \
> +} __attribute__((packed)) name__
> +
> /**
> * DOC: Context Engine Map uAPI
> *
> @@ -2105,6 +2232,7 @@ struct i915_context_param_engines {
> __u64 extensions; /* linked chain of extension blocks, 0 terminates */
> #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see
> i915_context_engines_load_balance */
> #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see i915_context_engines_bond */
> +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see
> i915_context_engines_parallel_submit */
> struct i915_engine_class_instance engines[0];
> } __attribute__((packed));
>
> --
> 2.28.0
>
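The flattened engines[] layout of i915_context_engines_parallel_submit is easiest to see with the documented formula worked through. This helper (the function name is an invention for the example, not part of the uAPI) mirrors index = j + i * num_siblings, where i is the width slot and j the sibling slot.

```c
#include <assert.h>

/*
 * engines[] is a flattened width x num_siblings matrix:
 *   length = width (i) * num_siblings (j)
 *   index  = j + i * num_siblings
 * Helper name is illustrative only.
 */
static int parallel_engine_index(int i /* width slot */,
				 int j /* sibling slot */,
				 int num_siblings)
{
	return j + i * num_siblings;
}
```

With width=2 and num_siblings=2, as in Example 3 above, the four slots land at indices 0..3, i.e. engines = { VE[0] sibling 0, VE[0] sibling 1, VE[1] sibling 0, VE[1] sibling 1 }.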
ine(engine) ||
> + intel_context_is_parallel(eb->context)) {
> engine = engine->gt->engine_class[COPY_ENGINE_CLASS][0];
> if (!engine)
> return ERR_PTR(-ENODEV);
> --
> 2.28.0
>
unsigned int flags)
> +int _i915_vma_move_to_active(struct i915_vma *vma,
> + struct i915_request *rq,
> + unsigned int flags,
> + struct dma_fence *shared_fence,
> + struct dma_fence *exc
On Mon, Aug 09, 2021 at 04:39:48PM +, Matthew Brost wrote:
> On Mon, Aug 09, 2021 at 06:32:42PM +0200, Daniel Vetter wrote:
> > On Tue, Aug 03, 2021 at 03:29:20PM -0700, Matthew Brost wrote:
> > > The GuC must receive requests in the order submitted for contexts in a
int __igt_gpu_reloc(struct i915_execbuffer *eb,
> if (IS_ERR(vma))
> return PTR_ERR(vma);
>
> - err = i915_gem_object_lock(obj, &eb->ww);
> + err = i915_gem_object_lock(obj, eb->ww);
> if (err)
> return err;
>
> - err = i915_v
On Mon, Aug 09, 2021 at 07:07:44PM +0200, Daniel Vetter wrote:
> On Tue, Aug 03, 2021 at 03:29:38PM -0700, Matthew Brost wrote:
> > Certain VMA functions in the execbuf IOCTL only need to be called on
> > first or last BB of a multi-BB submission. eb_relocate() on the first
>
&g
t.h
> +++ b/drivers/gpu/drm/i915/i915_selftest.h
> @@ -92,12 +92,14 @@ int __i915_subtests(const char *caller,
> T, ARRAY_SIZE(T), data)
> #define i915_live_subtests(T, data) ({ \
> typecheck(struct drm_i915_private *, data); \
> + (data)->gt.uc.guc.sched_disable_delay_ns = 0; \
> __i915_subtests(__func__, \
> __i915_live_setup, __i915_live_teardown, \
> T, ARRAY_SIZE(T), data); \
> })
> #define intel_gt_live_subtests(T, data) ({ \
> typecheck(struct intel_gt *, data); \
> + (data)->uc.guc.sched_disable_delay_ns = 0; \
> __i915_subtests(__func__, \
> __intel_gt_live_setup, __intel_gt_live_teardown, \
> T, ARRAY_SIZE(T), data); \
> diff --git a/drivers/gpu/drm/i915/i915_trace.h
> b/drivers/gpu/drm/i915/i915_trace.h
> index 806ad688274b..57ba7065d5ab 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -933,6 +933,11 @@ DEFINE_EVENT(intel_context, intel_context_reset,
>TP_ARGS(ce)
> );
>
> +DEFINE_EVENT(intel_context, intel_context_close,
> + TP_PROTO(struct intel_context *ce),
> + TP_ARGS(ce)
> +);
> +
> DEFINE_EVENT(intel_context, intel_context_ban,
>TP_PROTO(struct intel_context *ce),
>TP_ARGS(ce)
> @@ -1035,6 +1040,11 @@ trace_intel_context_reset(struct intel_context *ce)
> {
> }
>
> +static inline void
> +trace_intel_context_close(struct intel_context *ce)
> +{
> +}
> +
> static inline void
> trace_intel_context_ban(struct intel_context *ce)
> {
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> index f843a5040706..d54c280217fe 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> @@ -2112,5 +2112,5 @@ int i915_gem_gtt_live_selftests(struct drm_i915_private
> *i915)
>
> GEM_BUG_ON(offset_in_page(i915->ggtt.vm.total));
>
> - return i915_subtests(tests, i915);
> + return i915_live_subtests(tests, i915);
> }
> diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c
> b/drivers/gpu/drm/i915/selftests/i915_perf.c
> index 9e9a6cb1d9e5..86bad00cca95 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_perf.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_perf.c
> @@ -431,7 +431,7 @@ int i915_perf_live_selftests(struct drm_i915_private
> *i915)
> if (err)
> return err;
>
> - err = i915_subtests(tests, i915);
> + err = i915_live_subtests(tests, i915);
>
> destroy_empty_config(&i915->perf);
>
> diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c
> b/drivers/gpu/drm/i915/selftests/i915_request.c
> index d67710d10615..afbf88865a8b 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_request.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_request.c
> @@ -1693,7 +1693,7 @@ int i915_request_live_selftests(struct drm_i915_private
> *i915)
> if (intel_gt_is_wedged(&i915->gt))
> return 0;
>
> - return i915_subtests(tests, i915);
> + return i915_live_subtests(tests, i915);
> }
>
> static int switch_to_kernel_sync(struct intel_context *ce, int err)
> diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c
> b/drivers/gpu/drm/i915/selftests/i915_vma.c
> index dd0607254a95..f4b157451851 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_vma.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_vma.c
> @@ -1085,5 +1085,5 @@ int i915_vma_live_selftests(struct drm_i915_private
> *i915)
> SUBTEST(igt_vma_remapped_gtt),
> };
>
> - return i915_subtests(tests, i915);
> + return i915_live_subtests(tests, i915);
> }
> --
> 2.28.0
>
On Sun, Aug 8, 2021 at 2:56 AM Jason Ekstrand wrote:
>
> On August 6, 2021 15:18:59 Daniel Vetter wrote:
>
>> gem context refcounting is another exercise in least locking design it
>> seems, where most things get destroyed upon context closure (which can
>> race with
On Mon, Aug 09, 2021 at 09:19:39AM -0700, Matt Roper wrote:
> On Mon, Aug 09, 2021 at 04:05:59PM +0200, Daniel Vetter wrote:
> > On Fri, Aug 06, 2021 at 09:36:56AM +0300, Joonas Lahtinen wrote:
> > > Hi Matt,
> > >
> > > Always use the dim tooling when ap
On Mon, Aug 09, 2021 at 04:12:52PM -0700, John Harrison wrote:
> On 8/6/2021 12:46, Daniel Vetter wrote:
> > Seen this fly by and figured I'd drop a few thoughts in here. At the
> > likely cost of looking a bit out of whack :-)
> >
> > On Fri, Aug 6, 2021 at 8:01 P
intel_execlists_submission.c | 14 +
> .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 6 +-
> .../gpu/drm/i915/gt/uc/intel_guc_submission.h | 2 --
> 6 files changed, 26 insertions(+), 24 deletions(-)
>
> --
> 2.28.0
>
On Mon, Aug 09, 2021 at 06:11:37PM +, Matthew Brost wrote:
> On Mon, Aug 09, 2021 at 04:23:42PM +0200, Daniel Vetter wrote:
> > On Tue, Aug 03, 2021 at 03:29:07PM -0700, Matthew Brost wrote:
> > > Taking a PM reference to prevent intel_gt_wait_for_idle from short
> &
On Mon, Aug 09, 2021 at 06:20:51PM +, Matthew Brost wrote:
> On Mon, Aug 09, 2021 at 04:27:01PM +0200, Daniel Vetter wrote:
> > On Tue, Aug 03, 2021 at 03:29:08PM -0700, Matthew Brost wrote:
> > > Calling switch_to_kernel_context isn't needed if the engine PM reference
&
On Mon, Aug 09, 2021 at 06:28:58PM +, Matthew Brost wrote:
> On Mon, Aug 09, 2021 at 04:28:04PM +0200, Daniel Vetter wrote:
> > On Tue, Aug 03, 2021 at 03:29:10PM -0700, Matthew Brost wrote:
> > > Add logical engine mapping. This is required for split-frame, as
> >
On Mon, Aug 09, 2021 at 06:37:01PM +, Matthew Brost wrote:
> On Mon, Aug 09, 2021 at 04:30:06PM +0200, Daniel Vetter wrote:
> > On Tue, Aug 03, 2021 at 03:29:11PM -0700, Matthew Brost wrote:
> > > Expose logical engine instance to user via query engine info IOCTL. This
>
. Can this be
> > > > > acceptable?
> > > >
> > > > That's just another flavour of your "increase queue depth without
> > > > increasing the atomic queue depth" approach. I still think the
> > > > underlying
> > > > fundamental issue is a timing confusi
Christian König has
merged a patch set to lift this by reworking the shrinker interaction,
but it had to be reverted again because of some fallout I can't remember
offhand. dma_resv_lock vs shrinkers is very tricky.
So if you want resource limits then you really want cgroups here.
Cheers,
On Mon, Aug 09, 2021 at 06:44:16PM +, Matthew Brost wrote:
> On Mon, Aug 09, 2021 at 04:37:55PM +0200, Daniel Vetter wrote:
> > On Tue, Aug 03, 2021 at 03:29:12PM -0700, Matthew Brost wrote:
> > > Introduce context parent-child relationship. Once this relationship is
> &g
On Mon, Aug 09, 2021 at 06:58:23PM +, Matthew Brost wrote:
> On Mon, Aug 09, 2021 at 05:17:34PM +0200, Daniel Vetter wrote:
> > On Tue, Aug 03, 2021 at 03:29:13PM -0700, Matthew Brost wrote:
> > > Implement GuC parent-child context pin / unpin functions in which in any
>