Re: [RFC PATCH 00/18] TTM interface for managing VRAM oversubscription

2024-04-25 Thread Marek Olšák
The most extreme ping-ponging is mitigated by throttling buffer moves
in the kernel, but that only works without VM_ALWAYS_VALID, where you
can also set BO priorities in the BO list. A better approach that works
with VM_ALWAYS_VALID would be nice.
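
For reference, a rough sketch of the existing mechanism (uAPI structs from
amdgpu_drm.h; the BO handles and priority values are placeholders for
illustration only):

    #include <stdint.h>
    #include <drm/amdgpu_drm.h>

    /* Two entries of a CS bo list: a higher bo_priority hints that the BO
     * should stay in VRAM in preference to lower-priority BOs. */
    struct drm_amdgpu_bo_list_entry entries[] = {
        { .bo_handle = color_target_handle, .bo_priority = 8 }, /* keep resident */
        { .bo_handle = staging_handle,      .bo_priority = 0 }, /* evict first  */
    };

    struct drm_amdgpu_bo_list_in list_in = {
        .operation    = AMDGPU_BO_LIST_OP_CREATE,
        .bo_number    = 2,
        .bo_info_size = sizeof(entries[0]),
        .bo_info_ptr  = (uint64_t)(uintptr_t)entries,
    };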

Marek

On Wed, Apr 24, 2024 at 1:12 PM Friedrich Vock  wrote:
>
> Hi everyone,
>
> recently I've been looking into remedies for apps (in particular, newer
> games) that experience significant performance loss when they start to
> hit VRAM limits, especially on older or lower-end cards that struggle
> to fit both desktop apps and all the game data into VRAM at once.
>
> The root of the problem lies in the fact that from userspace's POV,
> buffer eviction is very opaque: Userspace applications/drivers cannot
> tell how oversubscribed VRAM is, nor do they have fine-grained control
> over which buffers get evicted.  At the same time, with GPU APIs becoming
> increasingly lower-level and GPU-driven, only the application itself
> can know which buffers are used within a particular submission, and
> how important each buffer is. For this, GPU APIs include interfaces
> to query oversubscription and specify memory priorities: In Vulkan,
> oversubscription can be queried through the VK_EXT_memory_budget
> extension. Different buffers can also be assigned priorities via the
> VK_EXT_pageable_device_local_memory extension. Modern games, especially
> D3D12 games via vkd3d-proton, rely on oversubscription being reported and
> priorities being respected in order to perform their memory management.
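>
> For illustration, this is roughly what consuming those two extensions looks
> like on the application/UMD side (a sketch only; physical_device, device and
> memory are assumed to exist, and error handling is omitted):
>
>     VkPhysicalDeviceMemoryBudgetPropertiesEXT budget = {
>         .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_BUDGET_PROPERTIES_EXT,
>     };
>     VkPhysicalDeviceMemoryProperties2 props = {
>         .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_PROPERTIES_2,
>         .pNext = &budget,
>     };
>     vkGetPhysicalDeviceMemoryProperties2(physical_device, &props);
>
>     for (uint32_t i = 0; i < props.memoryProperties.memoryHeapCount; i++) {
>         /* heapBudget[i]: how much of heap i this process may use right now.
>          * heapUsage[i]:  how much of it the process currently uses. */
>         if (budget.heapUsage[i] > budget.heapBudget[i]) {
>             /* Over budget: start demoting or freeing allocations. */
>         }
>     }
>
>     /* Mark an allocation as important so it is evicted last (1.0 = highest). */
>     vkSetDeviceMemoryPriorityEXT(device, memory, 1.0f);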
>
> However, relaying this information to the kernel via the current KMD uAPIs
> is not possible. On AMDGPU for example, all work submissions include a
> "bo list" that contains any buffer object that is accessed during the
> course of the submission. If VRAM is oversubscribed and a buffer in the
> list was evicted to system memory, that buffer is moved back to VRAM
> (potentially evicting other unused buffers).
>
> Since the usermode driver doesn't know what buffers are used by the
> application, its only choice is to submit a bo list that contains every
> buffer the application has allocated. In case of VRAM oversubscription,
> it is highly likely that some of the application's buffers were evicted,
> which almost guarantees that some buffers will get moved around. Since
> the bo list is only known at submit time, this also means the buffers
> will get moved right before submitting application work, which is the
> worst possible time to move buffers from a latency perspective. Another
> consequence of the large bo list is that nearly all memory from other
> applications will be evicted, too. When different applications (e.g. game
> and compositor) submit work one after the other, this causes a ping-pong
> effect where each app's submission evicts the other app's memory,
> resulting in a large amount of unnecessary moves.
>
> This overly aggressive eviction behavior led to RADV adopting a change
> that effectively allows all VRAM allocations to reside in system memory
> [1].  This worked around the ping-ponging/excessive buffer moving problem,
> but also meant that any memory evicted to system memory would forever
> stay there, regardless of how VRAM is used.
>
> My proposal aims at providing a middle ground between these extremes.
> The goals I want to meet are:
> - Userspace is accurately informed about VRAM oversubscription/how much
>   VRAM has been evicted
> - Buffer eviction respects priorities set by userspace
> - Wasteful ping-ponging is avoided to the extent possible
>
> I have been testing out some prototypes, and came up with this rough
> sketch of an API:
>
> - For each ttm_resource_manager, the amount of evicted memory is tracked
>   (similarly to how "usage" tracks the memory usage). When memory is
>   evicted via ttm_bo_evict, the size of the evicted memory is added, when
>   memory is un-evicted (see below), its size is subtracted. The amount of
>   evicted memory for e.g. VRAM can be queried by userspace via an ioctl.
>
> - Each ttm_resource_manager maintains a list of evicted buffer objects.
>
> - ttm_mem_unevict walks the list of evicted bos for a given
>   ttm_resource_manager and tries moving evicted resources back. When a
>   buffer is freed, this function is called to immediately restore some
>   evicted memory.
>
> - Each ttm_buffer_object independently tracks the mem_type it wants
>   to reside in.
>
> - ttm_bo_try_unevict is added as a helper function which attempts to
>   move the buffer to its preferred mem_type. If no space is available
>   there, it fails with -ENOSPC/-ENOMEM.
>
> - Similar to how ttm_bo_evict works, each driver can implement
>   uneviction_valuable/unevict_flags callbacks to control buffer
>   un-eviction.
>
> This is what patches 1-10 accomplish (together with an amdgpu
> implementation utilizing the new API).
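>
> As a purely illustrative sketch of the shape of this interface (not the
> actual patches), the pieces above amount to roughly:
>
>     /* Bytes currently evicted from this manager, exposed to userspace via
>      * an ioctl next to the existing "usage" counter. */
>     uint64_t ttm_resource_manager_evicted(struct ttm_resource_manager *man);
>
>     /* Walk the manager's list of evicted BOs and try to move them back,
>      * e.g. after a buffer was freed. */
>     void ttm_mem_unevict(struct ttm_device *bdev,
>                          struct ttm_resource_manager *man);
>
>     /* Try to move one BO back to its preferred mem_type; returns
>      * -ENOSPC/-ENOMEM if there is still no room. Drivers can filter via the
>      * uneviction_valuable()/unevict_flags() callbacks. */
>     int ttm_bo_try_unevict(struct ttm_buffer_object *bo,
>                            struct ttm_operation_ctx *ctx);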
>
> Userspace priorities could then be implemented as follows:
>
> - TTM already manages priorities for each buffer object. These 

Re: [PATCH 1/3] drm/amdgpu: Forward soft recovery errors to userspace

2024-03-09 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Fri, Mar 8, 2024 at 3:43 AM Christian König  wrote:
>
> Am 07.03.24 um 20:04 schrieb Joshua Ashton:
> > As we discussed before[1], soft recovery should be
> > forwarded to userspace, or we can get into a really
> > bad state where apps will keep submitting hanging
> > command buffers cascading us to a hard reset.
>
> Marek, you have been in favor of this forever. So I would like to request
> that you put your Reviewed-by on it, and I will just push it into our
> internal kernel branch.
>
> Regards,
> Christian.
>
> >
> > 1: 
> > https://lore.kernel.org/all/bf23d5ed-9a6b-43e7-84ee-8cbfd0d60...@froggi.es/
> > Signed-off-by: Joshua Ashton 
> >
> > Cc: Friedrich Vock 
> > Cc: Bas Nieuwenhuizen 
> > Cc: Christian König 
> > Cc: André Almeida 
> > Cc: sta...@vger.kernel.org
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 +--
> >   1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > index 4b3000c21ef2..aebf59855e9f 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > @@ -262,9 +262,8 @@ amdgpu_job_prepare_job(struct drm_sched_job *sched_job,
> >   struct dma_fence *fence = NULL;
> >   int r;
> >
> > - /* Ignore soft recovered fences here */
> >   r = drm_sched_entity_error(s_entity);
> > - if (r && r != -ENODATA)
> > + if (r)
> >   goto error;
> >
> >   if (!fence && job->gang_submit)
>


Re: [PATCH 2/2] drm/amdgpu: Mark ctx as guilty in ring_soft_recovery path

2024-01-15 Thread Marek Olšák
On Mon, Jan 15, 2024 at 3:06 PM Christian König
 wrote:
>
> Am 15.01.24 um 20:30 schrieb Joshua Ashton:
> > On 1/15/24 19:19, Christian König wrote:
> >> Am 15.01.24 um 20:13 schrieb Joshua Ashton:
> >>> On 1/15/24 18:53, Christian König wrote:
>  Am 15.01.24 um 19:35 schrieb Joshua Ashton:
> > On 1/15/24 18:30, Bas Nieuwenhuizen wrote:
> >> On Mon, Jan 15, 2024 at 7:14 PM Friedrich Vock
> >> mailto:friedrich.v...@gmx.de>> wrote:
> >>
> >> Re-sending as plaintext, sorry about that
> >>
> >> On 15.01.24 18:54, Michel Dänzer wrote:
> >>  > On 2024-01-15 18:26, Friedrich Vock wrote:
> >>  >> [snip]
> >>  >> The fundamental problem here is that not telling
> >> applications that
> >>  >> something went wrong when you just canceled their work
> >> midway is an
> >>  >> out-of-spec hack.
> >>  >> When there is a report of real-world apps breaking
> >> because of
> >> that hack,
> >>  >> reports of different apps working (even if it's
> >> convenient that they
> >>  >> work) doesn't justify keeping the broken code.
> >>  > If the breaking apps hit multiple soft resets in a row,
> >> I've laid
> >> out a pragmatic solution which covers both cases.
> >> Hitting soft reset every time is the lucky path. Once GPU
> >> work is
> >> interrupted out of nowhere, all bets are off and it might as
> >> well
> >> trigger a full system hang next time. No hang recovery should
> >> be able to
> >> cause that under any circumstance.
> >>
> >>
> >> I think the more insidious situation is no further hangs but
> >> wrong results because we skipped some work. That we skipped work
> >> may e.g. result in some texture not being uploaded or some GPGPU
> >> work not being done and causing further errors downstream (say if
> >> a game is doing AI/physics on the GPU not to say anything of
> >> actual GPGPU work one might be doing like AI)
> >
> > Even worse if this is compute on eg. OpenCL for something
> > science/math/whatever related, or training a model.
> >
> > You could randomly just get invalid/wrong results without even
> > knowing!
> 
>  Well on the kernel side we do provide an API to query the result of
>  a submission. That includes canceling submissions with a soft
>  recovery.
> 
>  What we just don't do is prevent further submissions from this
>  application, i.e. enforce that the application is punished for
>  bad behavior.
> >>>
> >>> You do prevent future submissions for regular resets though: Those
> >>> increase karma which sets ctx->guilty, and if ctx->guilty then
> >>> -ECANCELED is returned for a submission.
> >>>
> >>> ctx->guilty is never true for soft recovery though, as it doesn't
> >>> increase karma, which is the problem this patch is trying to solve.
> >>>
> >>> By the submission result query API, I assume you mean checking
> >>> the submission fence error somehow? That doesn't seem very ergonomic
> >>> for a Vulkan driver compared to the simple solution which is to just
> >>> mark it as guilty with what already exists...
> >>
> >> Well as I said the guilty handling is broken for quite a number of
> >> reasons.
> >>
> >> What we can do rather trivially is changing this code in
> >> amdgpu_job_prepare_job():
> >>
> >>  /* Ignore soft recovered fences here */
> >>  r = drm_sched_entity_error(s_entity);
> >>  if (r && r != -ENODATA)
> >>  goto error;
> >>
> >> This will bubble up errors from soft recoveries into the entity as
> >> well and makes sure that further submissions are rejected.
> >
> > That makes sense to do, but at least for GL_EXT_robustness, that will
> > not tell the app that it was guilty.
>
> No, it clearly gets that signaled. We should probably replace the guilty
> atomic with calls to drm_sched_entity_error().
>
> It's just that this isn't what Marek and I had in mind for this:
> basically, completely forget about AMDGPU_CTX_OP_QUERY_STATE and
> AMDGPU_CTX_OP_QUERY_STATE2.
>
> Instead just look at the return value of the CS or query fence result IOCTL.
>
> When you get an -ENODATA you have been guilty of causing a soft
> recovery, when you get an -ETIME you are guilty of causing a timeout
> which had to be hard recovered. When you get an -ECANCELED you are an
> innocent victim of a hard recovery somebody else caused.
>
> What we haven't defined yet is an error code for losing VRAM, but that
> should be trivial to do.
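>
> As a hedged sketch (the plumbing that retrieves the error is assumed, this is
> not actual Mesa code), a UMD could then map those errors straight to e.g. the
> GL robustness status:
>
>     switch (cs_or_fence_error) {
>     case 0:
>         return GL_NO_ERROR;
>     case -ENODATA:   /* this context caused a soft recovery */
>     case -ETIME:     /* this context caused a timeout -> hard reset */
>         return GL_GUILTY_CONTEXT_RESET_ARB;
>     case -ECANCELED: /* innocent victim of a reset someone else caused */
>         return GL_INNOCENT_CONTEXT_RESET_ARB;
>     default:
>         return GL_UNKNOWN_CONTEXT_RESET_ARB;
>     }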

So far we have implemented the GPU reset and soft reset, but we
haven't done anything to have a robust system recovery. Under the
current system, things can easily keep hanging indefinitely because
nothing prevents that.

The reset status query should stay. Robust apps will use it to tell
when they should recreate their context and resources even if they

Re: [PATCH 2/2] drm/amdgpu: Mark ctx as guilty in ring_soft_recovery path

2024-01-15 Thread Marek Olšák
On Mon, Jan 15, 2024 at 11:41 AM Michel Dänzer  wrote:
>
> On 2024-01-15 17:19, Friedrich Vock wrote:
> > On 15.01.24 16:43, Joshua Ashton wrote:
> >> On 1/15/24 15:25, Michel Dänzer wrote:
> >>> On 2024-01-15 14:17, Christian König wrote:
>  Am 15.01.24 um 12:37 schrieb Joshua Ashton:
> > On 1/15/24 09:40, Christian König wrote:
> >> Am 13.01.24 um 15:02 schrieb Joshua Ashton:
> >>
> >>> Without this feedback, the application may keep pushing through
> >>> the soft
> >>> recoveries, continually hanging the system with jobs that timeout.
> >>
> >> Well, that is intentional behavior. Marek is voting for making
> >> soft recovered errors fatal as well while Michel is voting for
> >> better ignoring them.
> >>
> >> I'm not really sure what to do. If you guys think that soft
> >> recovered hangs should be fatal as well then we can certainly do
> >> this.
> >>>
> >>> A possible compromise might be making soft resets fatal if they
> >>> happen repeatedly (within a certain period of time?).
> >>
> >> No, no and no. Aside from introducing issues by side effects not
> >> surfacing and all of the stuff I mentioned about descriptor buffers,
> >> bda, draw indirect and stuff just resulting in more faults and hangs...
> >>
> >> You are proposing we throw out every promise we made to an application
> >> on the API contract level because it "might work". That's just wrong!
> >>
> >> Let me put this in explicit terms: What you are proposing is in direct
> >> violation of the GL and Vulkan specification.
> >>
> >> You can't just choose to break these contracts because you think it
> >> 'might' be a better user experience.
> >
> > Is the original issue that motivated soft resets to be non-fatal even an
> > issue anymore?
> >
> > If I read that old thread correctly, the rationale for that was that
> > assigning guilt to a context was more broken than not doing it, because
> > the compositor/Xwayland process would also crash despite being unrelated
> > to the hang.
> > With Joshua's Mesa fixes, this is not the case anymore, so I don't think
> > keeping soft resets non-fatal provides any benefit to the user experience.
> > The potential detriments to user experience have been outlined multiple
> > times in this thread already.
> >
> > (I suppose if the compositor itself faults it might still bring down a
> > session, but I've literally never seen that, and it's not like a
> > compositor triggering segfaults on CPU stays alive either.)
>
> That's indeed what happened for me, multiple times. And each time the session 
> continued running fine for days after the soft reset.
>
> But apparently my experience isn't valid somehow, and I should have been 
> forced to log in again to please the GL gods...
>
>
> Conversely, I can't remember hitting a case where an app kept running into 
> soft resets. It's almost as if different people may have different 
> experiences! ;)
>
> Note that I'm not saying that case can't happen. Making soft resets fatal 
> only if they happen repeatedly could address both issues, rather than only 
> one or the other. Seems like a win-win.

This is exactly the comment that shouldn't have been sent, and you are
not the only one.

Nobody should ever care about subjective experiences. We can only do
this properly by looking at the whole system and its rules and trying to
find a solution that works for everything on paper first. DrawIndirect
is one case where the current system fails. "Works for me because I
don't use DrawIndirect" is a horrible way to do this.

Marek


Re: [PATCH] Revert "drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole"

2024-01-10 Thread Marek Olšák
It looks like this would cause failures even with regular 64-bit
allocations because the virtual address range allocator in libdrm asks
the kernel what ranges of addresses are free, and the kernel doesn't
exclude the KFD allocation from that.

Basically, no VM allocations can be done by the kernel outside the
ranges reserved for the kernel.
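
For context, a simplified sketch of how a UMD obtains GPU VA through libdrm
today (real libdrm amdgpu calls, but the buffer, size and flags are
placeholders and error handling is omitted):

    uint64_t va = 0;
    amdgpu_va_handle va_handle;

    /* libdrm hands out addresses from the range the kernel reports as
     * usable, so a kernel-internal mapping placed inside that range (like
     * the relocated TBA/TMA) can collide with it. */
    int r = amdgpu_va_range_alloc(dev, amdgpu_gpu_va_range_general,
                                  bo_size, 4096 /* alignment */,
                                  0 /* no required base */,
                                  &va, &va_handle, 0 /* flags */);
    if (r == 0)
        r = amdgpu_bo_va_op(bo, 0, bo_size, va, 0, AMDGPU_VA_OP_MAP);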

Marek

On Sat, Jan 6, 2024 at 1:48 AM Marek Olšák  wrote:
>
> The 32-bit address space means the high 32 bits are constant and 
> predetermined and it's definitely somewhere in the upper range of the address 
> space. If ROCm or KFD occupy that space, even accidentally, other UMDs that 
> use libdrm for VA allocation won't be able to start. The VA range allocator 
> is in libdrm.
>
> Marek
>
> On Fri, Jan 5, 2024, 15:20 Felix Kuehling  wrote:
>>
>> TBA/TMA were relocated to the upper half of the canonical address space.
>> I don't think that qualifies as 32-bit by definition. But maybe you're
>> using a different definition.
>>
>> That said, if Mesa manages its own virtual address space in user mode,
>> and KFD maps the TMA/TBA at an address that Mesa believes to be free, I
>> can see how that would lead to problems.
>>
>> That said, the fence refcount bug is another problem that may have been
>> exposed by the way that a crashing Mesa application shuts down.
>> Reverting Jay's patch certainly didn't fix that, but only hides the problem.
>>
>> Regards,
>>Felix
>>
>>
>> On 2024-01-04 13:29, Marek Olšák wrote:
>> > Hi,
>> >
>> > I have received information that the original commit makes all 32-bit
>> > userspace VA allocations fail, so UMDs like Mesa can't even initialize
>> > and they either crash or fail to load. If TBA/TMA was relocated to the
>> > 32-bit address range, it would explain why UMDs can't allocate
>> > anything in that range.
>> >
>> > Marek
>> >
>> > On Wed, Jan 3, 2024 at 2:50 PM Jay Cornwall  wrote:
>> >> On 1/3/2024 12:58, Felix Kuehling wrote:
>> >>
>> >>> A segfault in Mesa seems to be a different issue from what's mentioned
>> >>> in the commit message. I'd let Christian or Marek comment on
>> >>> compatibility with graphics UMDs. I'm not sure why this patch would
>> >>> affect them at all.
>> >> I was referencing this issue in OpenCL/OpenGL interop, which certainly 
>> >> looked related:
>> >>
>> >> [   91.769002] amdgpu :0a:00.0: amdgpu: bo 9bba4692 va 
>> >> 0x08-0x0801ff conflict with 0x08-0x080002
>> >> [   91.769141] ocltst[2781]: segfault at b2 ip 7f3fb90a7c39 sp 
>> >> 7ffd3c011ba0 error 4 in radeonsi_dri.so[7f3fb888e000+1196000] likely 
>> >> on CPU 15 (core 7, socket 0)
>> >>
>> >>> Looking at the logs in the tickets, it looks like a fence reference
>> >>> counting error. I don't see how Jay's patch could have caused that. I
>> >>> made another change in that code recently that could make a difference
>> >>> for this issue:
>> >>>
>> >>>  commit 8f08c5b24ced1be7eb49692e4816c1916233c79b
>> >>>  Author: Felix Kuehling 
>> >>>  Date:   Fri Oct 27 18:21:55 2023 -0400
>> >>>
>> >>>   drm/amdkfd: Run restore_workers on freezable WQs
>> >>>
>> >>>   Make restore workers freezable so we don't have to explicitly
>> >>>  flush them
>> >>>   in suspend and GPU reset code paths, and we don't accidentally
>> >>>  try to
>> >>>   restore BOs while the GPU is suspended. Not having to flush
>> >>>  restore_work
>> >>>   also helps avoid lock/fence dependencies in the GPU reset case
>> >>>  where we're
>> >>>   not allowed to wait for fences.
>> >>>
>> >>>   A side effect of this is, that we can now have multiple
>> >>>  concurrent threads
>> >>>   trying to signal the same eviction fence. Rework eviction fence
>> >>>  signaling
>> >>>   and replacement to account for that.
>> >>>
>> >>>   The GPU reset path can no longer rely on restore_process_worker
>> >>>  to resume
>> >>>   queues because evict/restore workers can run independently of
>> >>>  it. Instead
>> >>>   call a new restore_process_helper directly.
>> >>>
>> >>>   This is an RFC and request for testing.
>> >>>
>> >>>   v2:
>> >>>   - Reworked eviction fence signaling
>> >>>   - Introduced restore_process_helper
>> >>>
>> >>>   v3:
>> >>>   - Handle unsignaled eviction fences in restore_process_bos
>> >>>
>> >>>   Signed-off-by: Felix Kuehling 
>> >>>   Acked-by: Christian König 
>> >>>   Tested-by: Emily Deng 
>> >>>   Signed-off-by: Alex Deucher 
>> >>>
>> >>>
>> >>> FWIW, I built a plain 6.6 kernel, and was not able to reproduce the
>> >>> crash with some simple tests.
>> >>>
>> >>> Regards,
>> >>> Felix
>> >>>
>> >>>
>> >>>> So I agree, let's revert it.
>> >>>>
>> >>>> Reviewed-by: Jay Cornwall 


Re: 回复: Re: 回复: Re: [PATCH libdrm 1/2] amdgpu: fix parameter of amdgpu_cs_ctx_create2

2024-01-09 Thread Marek Olšák
int p = -1;
unsigned u = p;
int p2 = u;

p2 is -1.
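
Spelled out as a complete example with the LOW value from this thread (the
round trip through an unsigned 32-bit field preserves the bit pattern):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        int32_t  p  = -512;        /* AMDGPU_CTX_PRIORITY_LOW */
        uint32_t u  = (uint32_t)p; /* what a uint32_t uAPI field carries */
        int32_t  p2 = (int32_t)u;  /* what the kernel interprets */

        printf("%d\n", p2);        /* prints -512 */
        return 0;
    }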

Marek

On Tue, Jan 9, 2024, 03:26 Christian König  wrote:

> Am 09.01.24 um 09:09 schrieb 李真能:
>
> Thanks!
>
> What about the second patch?
>
> The second patch is "amdgpu: change priority value to be consistent with
> kernel".
>
> I want to pass AMDGPU_CTX_PRIORITY_LOW to the kernel module drm-scheduler;
> if these two patches are not applied,
>
> the LOW priority cannot be passed to the drm-scheduler.
>
> Do you have any other suggestion?
>
>
> Well what exactly is the problem? Just use AMD_PRIORITY=-512.
>
> As far as I can see that is how it is supposed to be used.
>
> Regards,
> Christian.
>
> *Subject:* Re: 回复: Re: [PATCH libdrm 1/2] amdgpu: fix parameter of
> amdgpu_cs_ctx_create2
> *Date:* 2024-01-09 15:15
> *From:* Christian König
> *To:* 李真能; Marek Olsak; Pierre-Eric Pelloux-Prayer; dri-devel; amd-gfx
>
> Am 09.01.24 um 02:50 schrieb 李真能:
>
> When the priority value is passed to the kernel, the kernel compares it
> with the following values:
>
> #define AMDGPU_CTX_PRIORITY_VERY_LOW  -1023
> #define AMDGPU_CTX_PRIORITY_LOW        -512
> #define AMDGPU_CTX_PRIORITY_NORMAL        0
> #define AMDGPU_CTX_PRIORITY_HIGH        512
> #define AMDGPU_CTX_PRIORITY_VERY_HIGH  1023
>
> If priority is uint32_t, we can't set the LOW and VERY_LOW values as the
> kernel context priority,
>
> Well that's nonsense.
>
> How the kernel handles the values and how userspace handles them are two
> separate things. You just need to make sure that it's always 32 bits.
>
> In other words, whether the data type is signed or unsigned in userspace
> is irrelevant to the kernel.
>
> You can refer to the kernel function amdgpu_ctx_priority_permit: if the
> priority is greater than 0 and the process has neither the CAP_SYS_NICE
> capability nor DRM_MASTER permission, the request is rejected.
>
> Correct, that's intentional.
>
> Regards,
> Christian.
>
>
> *Subject:* Re: [PATCH libdrm 1/2] amdgpu: fix parameter of
> amdgpu_cs_ctx_create2
> *Date:* 2024-01-09 00:28
> *From:* Christian König
> *To:* 李真能; Marek Olsak; Pierre-Eric Pelloux-Prayer; dri-devel; amd-gfx
>
> Am 08.01.24 um 10:40 schrieb Zhenneng Li:
> > In order to pass the correct priority parameter to the kernel,
> > we must change priority type from uint32_t to int32_t.
>
> Hui what? Why should it matter if the parameter is signed or not?
>
> That doesn't seem to make sense.
>
> Regards,
> Christian.
>
> >
> > Signed-off-by: Zhenneng Li
> > ---
> > amdgpu/amdgpu.h | 2 +-
> > amdgpu/amdgpu_cs.c | 2 +-
> > 2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
> > index 9bdbf366..f46753f3 100644
> > --- a/amdgpu/amdgpu.h
> > +++ b/amdgpu/amdgpu.h
> > @@ -896,7 +896,7 @@ int amdgpu_bo_list_update(amdgpu_bo_list_handle
> handle,
> > *
> > */
> > int amdgpu_cs_ctx_create2(amdgpu_device_handle dev,
> > - uint32_t priority,
> > + int32_t priority,
> > amdgpu_context_handle *context);
> > /**
> > * Create GPU execution Context
> > diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
> > index 49fc16c3..eb72c638 100644
> > --- a/amdgpu/amdgpu_cs.c
> > +++ b/amdgpu/amdgpu_cs.c
> > @@ -49,7 +49,7 @@ static int amdgpu_cs_reset_sem(amdgpu_semaphore_handle
> sem);
> > * \return 0 on success otherwise POSIX Error code
> > */
> > drm_public int amdgpu_cs_ctx_create2(amdgpu_device_handle dev,
> > - uint32_t priority,
> > + int32_t priority,
> > amdgpu_context_handle *context)
> > {
> > struct amdgpu_context *gpu_context;
>
>
>


Re: [PATCH] Revert "drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole"

2024-01-05 Thread Marek Olšák
The 32-bit address space means the high 32 bits are constant and
predetermined and it's definitely somewhere in the upper range of the
address space. If ROCm or KFD occupy that space, even accidentally, other
UMDs that use libdrm for VA allocation won't be able to start. The VA range
allocator is in libdrm.

Marek

On Fri, Jan 5, 2024, 15:20 Felix Kuehling  wrote:

> TBA/TMA were relocated to the upper half of the canonical address space.
> I don't think that qualifies as 32-bit by definition. But maybe you're
> using a different definition.
>
> That said, if Mesa manages its own virtual address space in user mode,
> and KFD maps the TMA/TBA at an address that Mesa believes to be free, I
> can see how that would lead to problems.
>
> That said, the fence refcount bug is another problem that may have been
> exposed by the way that a crashing Mesa application shuts down.
> Reverting Jay's patch certainly didn't fix that, but only hides the
> problem.
>
> Regards,
>    Felix
>
>
> On 2024-01-04 13:29, Marek Olšák wrote:
> > Hi,
> >
> > I have received information that the original commit makes all 32-bit
> > userspace VA allocations fail, so UMDs like Mesa can't even initialize
> > and they either crash or fail to load. If TBA/TMA was relocated to the
> > 32-bit address range, it would explain why UMDs can't allocate
> > anything in that range.
> >
> > Marek
> >
> > On Wed, Jan 3, 2024 at 2:50 PM Jay Cornwall 
> wrote:
> >> On 1/3/2024 12:58, Felix Kuehling wrote:
> >>
> >>> A segfault in Mesa seems to be a different issue from what's mentioned
> >>> in the commit message. I'd let Christian or Marek comment on
> >>> compatibility with graphics UMDs. I'm not sure why this patch would
> >>> affect them at all.
> >> I was referencing this issue in OpenCL/OpenGL interop, which certainly
> looked related:
> >>
> >> [   91.769002] amdgpu :0a:00.0: amdgpu: bo 9bba4692 va
> 0x08-0x0801ff conflict with 0x08-0x080002
> >> [   91.769141] ocltst[2781]: segfault at b2 ip 7f3fb90a7c39 sp
> 7ffd3c011ba0 error 4 in radeonsi_dri.so[7f3fb888e000+1196000] likely on
> CPU 15 (core 7, socket 0)
> >>
> >>> Looking at the logs in the tickets, it looks like a fence reference
> >>> counting error. I don't see how Jay's patch could have caused that. I
> >>> made another change in that code recently that could make a difference
> >>> for this issue:
> >>>
> >>>  commit 8f08c5b24ced1be7eb49692e4816c1916233c79b
> >>>  Author: Felix Kuehling 
> >>>  Date:   Fri Oct 27 18:21:55 2023 -0400
> >>>
> >>>   drm/amdkfd: Run restore_workers on freezable WQs
> >>>
> >>>   Make restore workers freezable so we don't have to explicitly
> >>>  flush them
> >>>   in suspend and GPU reset code paths, and we don't
> accidentally
> >>>  try to
> >>>   restore BOs while the GPU is suspended. Not having to flush
> >>>  restore_work
> >>>   also helps avoid lock/fence dependencies in the GPU reset
> case
> >>>  where we're
> >>>   not allowed to wait for fences.
> >>>
> >>>   A side effect of this is, that we can now have multiple
> >>>  concurrent threads
> >>>   trying to signal the same eviction fence. Rework eviction
> fence
> >>>  signaling
> >>>   and replacement to account for that.
> >>>
> >>>   The GPU reset path can no longer rely on
> restore_process_worker
> >>>  to resume
> >>>   queues because evict/restore workers can run independently of
> >>>  it. Instead
> >>>   call a new restore_process_helper directly.
> >>>
> >>>   This is an RFC and request for testing.
> >>>
> >>>   v2:
> >>>   - Reworked eviction fence signaling
> >>>   - Introduced restore_process_helper
> >>>
> >>>   v3:
> >>>   - Handle unsignaled eviction fences in restore_process_bos
> >>>
> >>>   Signed-off-by: Felix Kuehling 
> >>>   Acked-by: Christian König 
> >>>   Tested-by: Emily Deng 
> >>>   Signed-off-by: Alex Deucher 
> >>>
> >>>
> >>> FWIW, I built a plain 6.6 kernel, and was not able to reproduce the
> >>> crash with some simple tests.
> >>>
> >>> Regards,
> >>> Felix
> >>>
> >>>
> >>>> So I agree, let's revert it.
> >>>>
> >>>> Reviewed-by: Jay Cornwall 
>


Re: [PATCH] Revert "drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole"

2024-01-04 Thread Marek Olšák
Hi,

I have received information that the original commit makes all 32-bit
userspace VA allocations fail, so UMDs like Mesa can't even initialize
and they either crash or fail to load. If TBA/TMA was relocated to the
32-bit address range, it would explain why UMDs can't allocate
anything in that range.

Marek

On Wed, Jan 3, 2024 at 2:50 PM Jay Cornwall  wrote:
>
> On 1/3/2024 12:58, Felix Kuehling wrote:
>
> > A segfault in Mesa seems to be a different issue from what's mentioned
> > in the commit message. I'd let Christian or Marek comment on
> > compatibility with graphics UMDs. I'm not sure why this patch would
> > affect them at all.
>
> I was referencing this issue in OpenCL/OpenGL interop, which certainly looked 
> related:
>
> [   91.769002] amdgpu :0a:00.0: amdgpu: bo 9bba4692 va 
> 0x08-0x0801ff conflict with 0x08-0x080002
> [   91.769141] ocltst[2781]: segfault at b2 ip 7f3fb90a7c39 sp 
> 7ffd3c011ba0 error 4 in radeonsi_dri.so[7f3fb888e000+1196000] likely on 
> CPU 15 (core 7, socket 0)
>
> >
> > Looking at the logs in the tickets, it looks like a fence reference
> > counting error. I don't see how Jay's patch could have caused that. I
> > made another change in that code recently that could make a difference
> > for this issue:
> >
> > commit 8f08c5b24ced1be7eb49692e4816c1916233c79b
> > Author: Felix Kuehling 
> > Date:   Fri Oct 27 18:21:55 2023 -0400
> >
> >  drm/amdkfd: Run restore_workers on freezable WQs
> >
> >  Make restore workers freezable so we don't have to explicitly
> > flush them
> >  in suspend and GPU reset code paths, and we don't accidentally
> > try to
> >  restore BOs while the GPU is suspended. Not having to flush
> > restore_work
> >  also helps avoid lock/fence dependencies in the GPU reset case
> > where we're
> >  not allowed to wait for fences.
> >
> >  A side effect of this is, that we can now have multiple
> > concurrent threads
> >  trying to signal the same eviction fence. Rework eviction fence
> > signaling
> >  and replacement to account for that.
> >
> >  The GPU reset path can no longer rely on restore_process_worker
> > to resume
> >  queues because evict/restore workers can run independently of
> > it. Instead
> >  call a new restore_process_helper directly.
> >
> >  This is an RFC and request for testing.
> >
> >  v2:
> >  - Reworked eviction fence signaling
> >  - Introduced restore_process_helper
> >
> >  v3:
> >  - Handle unsignaled eviction fences in restore_process_bos
> >
> >  Signed-off-by: Felix Kuehling 
> >  Acked-by: Christian König 
> >  Tested-by: Emily Deng 
> >  Signed-off-by: Alex Deucher 
> >
> >
> > FWIW, I built a plain 6.6 kernel, and was not able to reproduce the
> > crash with some simple tests.
> >
> > Regards,
> >Felix
> >
> >
> >>
> >> So I agree, let's revert it.
> >>
> >> Reviewed-by: Jay Cornwall 
>


Re: [PATCH] drm/amdgpu: Enable tunneling on high-priority compute queues

2023-12-11 Thread Marek Olšák
On Fri, Dec 8, 2023 at 1:37 PM Alex Deucher  wrote:

> On Fri, Dec 8, 2023 at 12:27 PM Joshua Ashton  wrote:
> >
> > FWIW, we are shipping this right now in the SteamOS Preview channel
> > (probably going to Stable soon) and it seems to be working as expected,
> > fixing issues in instances where we need to composite and the compositor
> > work we are forced to do would take longer than the compositor redzone
> > to vblank.
> >
> > Previously in high gfx workloads like Cyberpunk using 100% of the GPU,
> > we would consistently miss the deadline as composition could take
> > anywhere from 2-6ms fairly randomly.
> >
> > Now it seems the time for the compositor's work to complete is pretty
> > consistent and well in-time in gpuvis for every frame.
>
> I was mostly just trying to look up the information to verify that it
> was set up correctly, but I guess Marek already did and provided you
> with that info, so it's probably fine as is.
>
> >
> > The only time we are not meeting the deadline now is when there is an
> > application that uses very little GPU and finishes incredibly quickly, and
> > the compositor is doing significantly more work (e.g. FSR from 800p -> 4K or
> > whatever), but that's a separate problem that can likely be solved by
> > inlining some of the composition work with the client's dmabuf work if
> > it has focus, to avoid those clock bubbles.
> >
> > I heard some musings about dmabuf deadline kernel work recently, but not
> > sure if any of that is applicable to AMD.
>
> I think something like a workload hint would be more useful.  We did a
> few patch sets to allow userspace to provide a hint to the kernel
> about the workload type so the kernel could adjust the power
> management heuristics accordingly, but there were concerns that the
> UMDs would have to maintain application lists to select which
> heuristic worked best for each application.  Maybe it would be better
> to provide a general classification?  E.g., if the GL or vulkan app
> uses these extensions, it's probably a compute type application vs
> something more graphics-y.  The usual trade-off between power and
> performance.  In general, just letting the firmware pick the clock
> based on perf counters generally seems to work the best.  Maybe a
> general workload hint set by the compositor based on the content type
> it's displaying would be a better option (video vs gaming vs desktop)?
>
> The deadline stuff doesn't really align well with what we can do with
> our firmware and seems ripe for abuse.  Apps can just ask for high
> clocks all the time which is great for performance, but not great for
> power.  Plus there is not much room for anything other than max clocks
> since you don't know how big the workload is or which clocks are the
> limiting factor.
>

Max clocks also decrease performance due to thermal and power limits.
You'll get more performance and less heat if you let the GPU turn off idle
blocks and boost clocks for busy blocks.

Marek


Re: [PATCH] drm/amdgpu: Enable tunneling on high-priority compute queues

2023-12-08 Thread Marek Olšák
On Fri, Dec 8, 2023 at 9:57 AM Christian König 
wrote:

> Am 08.12.23 um 12:43 schrieb Friedrich Vock:
> > On 08.12.23 10:51, Christian König wrote:
> >> Well longer story short Alex and I have been digging up the
> >> documentation for this and as far as we can tell this isn't correct.
> > Huh. I initially talked to Marek about this, adding him in Cc.
>
> Yeah, from the userspace side all you need to do is to set the bit as
> far as I can tell.
>
> >>
> >> You need to do quite a bit more before you can turn on this feature.
> >> What userspace side do you refer to?
> > I was referring to the Mesa merge request I made
> > (https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26462).
> > If/When you have more details about what else needs to be done, feel
> > free to let me know.
>
> For example, the hardware specification explicitly states that the
> kernel driver should make sure that only one app/queue is using this at
> the same time. That might work for now since we should only have a
> single compute priority queue, but we are not 100% sure yet.
>

This is incorrect. While the hw documentation says it's considered
"unexpected programming", it also says that the hardware algorithm handles
it correctly and it describes what happens in this case: Tunneled waves
from different queues are treated as equal.

Marek


Re: [PATCH] drm/amdgpu: Enable tunneling on high-priority compute queues

2023-12-08 Thread Marek Olšák
It's correct according to our documentation.

Reviewed-by: Marek Olšák 

Marek

On Fri, Dec 8, 2023 at 5:47 AM Christian König 
wrote:

> Well longer story short Alex and I have been digging up the
> documentation for this and as far as we can tell this isn't correct.
>
> You need to do quite a bit more before you can turn on this feature.
> What userspace side do you refer to?
>
> Regards,
> Christian.
>
> Am 08.12.23 um 09:19 schrieb Friedrich Vock:
> > Friendly ping on this one.
> > Userspace side got merged, so would be great to land this patch too :)
> >
> > On 02.12.23 01:17, Friedrich Vock wrote:
> >> This improves latency if the GPU is already busy with other work.
> >> This is useful for VR compositors that submit highly latency-sensitive
> >> compositing work on high-priority compute queues while the GPU is busy
> >> rendering the next frame.
> >>
> >> Userspace merge request:
> >> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26462
> >>
> >> Signed-off-by: Friedrich Vock 
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu.h  |  1 +
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 10 ++
> >>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   |  3 ++-
> >>   drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c   |  3 ++-
> >>   4 files changed, 11 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> index 9505dc8f9d69..4b923a156c4e 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> @@ -790,6 +790,7 @@ struct amdgpu_mqd_prop {
> >>   uint64_t eop_gpu_addr;
> >>   uint32_t hqd_pipe_priority;
> >>   uint32_t hqd_queue_priority;
> >> +bool allow_tunneling;
> >>   bool hqd_active;
> >>   };
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >> index 231d49132a56..4d98e8879be8 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >> @@ -620,6 +620,10 @@ static void amdgpu_ring_to_mqd_prop(struct
> >> amdgpu_ring *ring,
> >>   struct amdgpu_mqd_prop *prop)
> >>   {
> >>   struct amdgpu_device *adev = ring->adev;
> >> +bool is_high_prio_compute = ring->funcs->type ==
> >> AMDGPU_RING_TYPE_COMPUTE &&
> >> + amdgpu_gfx_is_high_priority_compute_queue(adev, ring);
> >> +bool is_high_prio_gfx = ring->funcs->type ==
> >> AMDGPU_RING_TYPE_GFX &&
> >> + amdgpu_gfx_is_high_priority_graphics_queue(adev, ring);
> >>
> >>   memset(prop, 0, sizeof(*prop));
> >>
> >> @@ -637,10 +641,8 @@ static void amdgpu_ring_to_mqd_prop(struct
> >> amdgpu_ring *ring,
> >>*/
> >>   prop->hqd_active = ring->funcs->type == AMDGPU_RING_TYPE_KIQ;
> >>
> >> -if ((ring->funcs->type == AMDGPU_RING_TYPE_COMPUTE &&
> >> - amdgpu_gfx_is_high_priority_compute_queue(adev, ring)) ||
> >> -(ring->funcs->type == AMDGPU_RING_TYPE_GFX &&
> >> - amdgpu_gfx_is_high_priority_graphics_queue(adev, ring))) {
> >> +prop->allow_tunneling = is_high_prio_compute;
> >> +if (is_high_prio_compute || is_high_prio_gfx) {
> >>   prop->hqd_pipe_priority = AMDGPU_GFX_PIPE_PRIO_HIGH;
> >>   prop->hqd_queue_priority = AMDGPU_GFX_QUEUE_PRIORITY_MAXIMUM;
> >>   }
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> >> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> >> index c8a3bf01743f..73f6d7e72c73 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> >> @@ -6593,7 +6593,8 @@ static int gfx_v10_0_compute_mqd_init(struct
> >> amdgpu_device *adev, void *m,
> >>   tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, ENDIAN_SWAP, 1);
> >>   #endif
> >>   tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, UNORD_DISPATCH, 0);
> >> -tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, TUNNEL_DISPATCH, 0);
> >> +tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, TUNNEL_DISPATCH,
> >> +prop->allow_tunneling);
> >>   tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, PRIV_STATE, 1);
> >>   tmp = REG_SET_FIELD(tmp, CP_HQD_PQ

Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-08-09 Thread Marek Olšák
On Wed, Aug 9, 2023 at 3:35 AM Michel Dänzer  wrote:
>
> On 8/8/23 19:03, Marek Olšák wrote:
> > It's the same situation as SIGSEGV. A process can catch the signal,
> > but if it doesn't, it gets killed. GL and Vulkan APIs give you a way
> > to catch the GPU error and prevent the process termination. If you
> > don't use the API, you'll get undefined behavior, which means anything
> > can happen, including process termination.
>
> Got a spec reference for that?
>
> I know the spec allows process termination in response to e.g. out of bounds 
> buffer access by the application (which corresponds to SIGSEGV). There are 
> other causes for GPU hangs though, e.g. driver bugs. The ARB_robustness spec 
> says:
>
> If the reset notification behavior is NO_RESET_NOTIFICATION_ARB,
> then the implementation will never deliver notification of reset
> events, and GetGraphicsResetStatusARB will always return
> NO_ERROR[fn1].
>[fn1: In this case it is recommended that implementations should
> not allow loss of context state no matter what events occur.
> However, this is only a recommendation, and cannot be relied
> upon by applications.]
>
> No mention of process termination, that rather sounds to me like the GL 
> implementation should do its best to keep the application running.

It basically says that we can do anything.

A frozen window or flipping between 2 random frames can't be described
as "keeping the application running". That's the worst user
experience. I will not accept it.

A window system can force-enable robustness for its non-robust apps
and control that. That's the best possible user experience and it's
achievable everywhere. Everything else doesn't matter.

Marek


Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-08-08 Thread Marek Olšák
It's the same situation as SIGSEGV. A process can catch the signal,
but if it doesn't, it gets killed. GL and Vulkan APIs give you a way
to catch the GPU error and prevent the process termination. If you
don't use the API, you'll get undefined behavior, which means anything
can happen, including process termination.
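
As a hedged sketch, "using the API" on the GL side means creating a robust
context and polling the reset status (GLX path shown; dpy and fbconfig are
assumed, the attribute list is abbreviated):

    static const int attribs[] = {
        GLX_CONTEXT_FLAGS_ARB, GLX_CONTEXT_ROBUST_ACCESS_BIT_ARB,
        GLX_CONTEXT_RESET_NOTIFICATION_STRATEGY_ARB,
        GLX_LOSE_CONTEXT_ON_RESET_ARB,
        None
    };
    GLXContext ctx = glXCreateContextAttribsARB(dpy, fbconfig, NULL, True, attribs);

    /* Later, e.g. once per frame: */
    if (glGetGraphicsResetStatusARB() != GL_NO_ERROR) {
        /* The context is lost: tear it down and recreate it instead of
         * being terminated. */
    }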



Marek

On Tue, Aug 8, 2023 at 8:14 AM Sebastian Wick  wrote:
>
> On Fri, Aug 4, 2023 at 3:03 PM Daniel Vetter  wrote:
> >
> > On Tue, Jun 27, 2023 at 10:23:23AM -0300, André Almeida wrote:
> > > Create a section that specifies how to deal with DRM device resets for
> > > kernel and userspace drivers.
> > >
> > > Acked-by: Pekka Paalanen 
> > > Signed-off-by: André Almeida 
> > > ---
> > >
> > > v4: 
> > > https://lore.kernel.org/lkml/20230626183347.55118-1-andrealm...@igalia.com/
> > >
> > > Changes:
> > >  - Grammar fixes (Randy)
> > >
> > >  Documentation/gpu/drm-uapi.rst | 68 ++
> > >  1 file changed, 68 insertions(+)
> > >
> > > diff --git a/Documentation/gpu/drm-uapi.rst 
> > > b/Documentation/gpu/drm-uapi.rst
> > > index 65fb3036a580..3cbffa25ed93 100644
> > > --- a/Documentation/gpu/drm-uapi.rst
> > > +++ b/Documentation/gpu/drm-uapi.rst
> > > @@ -285,6 +285,74 @@ for GPU1 and GPU2 from different vendors, and a 
> > > third handler for
> > >  mmapped regular files. Threads cause additional pain with signal
> > >  handling as well.
> > >
> > > +Device reset
> > > +
> > > +
> > > +The GPU stack is really complex and is prone to errors, from hardware 
> > > bugs,
> > > +faulty applications and everything in between the many layers. Some 
> > > errors
> > > +require resetting the device in order to make the device usable again. 
> > > This
> > > +sections describes the expectations for DRM and usermode drivers when a
> > > +device resets and how to propagate the reset status.
> > > +
> > > +Kernel Mode Driver
> > > +--
> > > +
> > > +The KMD is responsible for checking if the device needs a reset, and to 
> > > perform
> > > +it as needed. Usually a hang is detected when a job gets stuck 
> > > executing. KMD
> > > +should keep track of resets, because userspace can query any time about 
> > > the
> > > +reset stats for an specific context. This is needed to propagate to the 
> > > rest of
> > > +the stack that a reset has happened. Currently, this is implemented by 
> > > each
> > > +driver separately, with no common DRM interface.
> > > +
> > > +User Mode Driver
> > > +
> > > +
> > > +The UMD should check before submitting new commands to the KMD if the 
> > > device has
> > > +been reset, and this can be checked more often if the UMD requires it. 
> > > After
> > > +detecting a reset, UMD will then proceed to report it to the application 
> > > using
> > > +the appropriate API error code, as explained in the section below about
> > > +robustness.
> > > +
> > > +Robustness
> > > +--
> > > +
> > > +The only way to try to keep an application working after a reset is if it
> > > +complies with the robustness aspects of the graphical API that it is 
> > > using.
> > > +
> > > +Graphical APIs provide ways to applications to deal with device resets. 
> > > However,
> > > +there is no guarantee that the app will use such features correctly, and 
> > > the
> > > +UMD can implement policies to close the app if it is a repeating 
> > > offender,
> >
> > Not sure whether this one here is due to my input, but s/UMD/KMD. Repeat
> > offender killing is more a policy where the kernel enforces policy, and no
> > longer up to userspace to dtrt (because very clearly userspace is not
> > really doing the right thing anymore when it's just hanging the gpu in an
> > endless loop). Also maybe tune it down further to something like "the
> > kernel driver may implement ..."
> >
> > In my opinion the umd shouldn't implement these kind of magic guesses, the
> > entire point of robustness apis is to delegate responsibility for
> > correctly recovering to the application. And the kernel is left with
> > enforcing fair resource usage policies (which eventually might be a
> > cgroups limit on how much gpu time you're allowed to waste with gpu
> > resets).
>
> Killing apps that the kernel thinks are misbehaving really doesn't
> seem like a good idea to me. What if the process is a service getting
> restarted after getting killed? What if killing that process leaves
> the system in a bad state?
>
> Can't the kernel provide some information to user space so that e.g.
> systemd can handle those situations?
>
> > > +likely in a broken loop. This is done to ensure that it does not keep 
> > > blocking
> > > +the user interface from being correctly displayed. This should be done 
> > > even if
> > > +the app is correct but happens to trigger some bug in the 
> > > hardware/driver.
> > > +
> > > +OpenGL
> > > +~~
> > > +
> > > +Apps using OpenGL should use the available robust interfaces, like the
> > > +extension ``GL_ARB_robustness`` 

Re: Non-robust apps and resets (was Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations)

2023-08-02 Thread Marek Olšák
A screen that doesn't update isn't usable. Killing the window system
and returning to the login screen is one option. Killing the window
system manually from a terminal or over ssh and then returning to the
login screen is another option, but 99% of users either hard-reset the
machine or do sysrq+REISUB anyway because it's faster that way. Those
are all your options. If we don't do the kill, users might decide to
do a hard reset with an unsync'd file system, which can cause more
damage.

The precedent from the CPU land is pretty strong here. There is
SIGSEGV for invalid CPU memory access and SIGILL for invalid CPU
instructions, yet we do nothing for invalid GPU memory access and
invalid GPU instructions. Sending a terminating signal from the kernel
would be the most natural thing to do. Instead, we just keep a frozen
GUI to keep users helpless, or we continue command submission and then
the hanging app can cause an infinite cycle of GPU hangs and resets,
making the GPU unusable until somebody kills the app over ssh.

That's why GL/Vulkan robustness is required - either robust apps, or a
robust compositor that greys out lost windows and pops up a diagnostic
message with a list of actions to choose from. That's the direction we
should be taking. Non-robust apps under a non-robust compositor should
just be killed if they crash the GPU.
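
On the Vulkan side, the robust-app contract boils down to something like this
(hedged sketch; queue, submit_info and fence are assumed):

    VkResult res = vkQueueSubmit(queue, 1, &submit_info, fence);
    if (res == VK_ERROR_DEVICE_LOST) {
        /* The device was lost after a reset: destroy and recreate the
         * device, swapchain and resources. A non-robust/unaware app has no
         * such path and simply stops making progress. */
    }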


Marek

On Wed, Jul 26, 2023 at 4:07 AM Michel Dänzer
 wrote:
>
> On 7/25/23 15:02, André Almeida wrote:
> > Em 25/07/2023 05:03, Michel Dänzer escreveu:
> >> On 7/25/23 04:55, André Almeida wrote:
> >>> Hi everyone,
> >>>
> >>> It's not clear what we should do about non-robust OpenGL apps after GPU 
> >>> resets, so I'll try to summarize the topic, show some options and my 
> >>> proposal to move forward on that.
> >>>
> >>> Em 27/06/2023 10:23, André Almeida escreveu:
>  +Robustness
>  +--
>  +
>  +The only way to try to keep an application working after a reset is if 
>  it
>  +complies with the robustness aspects of the graphical API that it is 
>  using.
>  +
>  +Graphical APIs provide ways to applications to deal with device resets. 
>  However,
>  +there is no guarantee that the app will use such features correctly, 
>  and the
>  +UMD can implement policies to close the app if it is a repeating 
>  offender,
>  +likely in a broken loop. This is done to ensure that it does not keep 
>  blocking
>  +the user interface from being correctly displayed. This should be done 
>  even if
>  +the app is correct but happens to trigger some bug in the 
>  hardware/driver.
>  +
> >>> Depending on the OpenGL version, there are different robustness API 
> >>> available:
> >>>
> >>> - OpenGL ABR extension [0]
> >>> - OpenGL KHR extension [1]
> >>> - OpenGL ES extension  [2]
> >>>
> >>> Apps written in OpenGL should use whatever version is available for them 
> >>> to make the app robust for GPU resets. That usually means calling 
> >>> GetGraphicsResetStatusARB(), checking the status, and if it encounter 
> >>> something different from NO_ERROR, that means that a reset has happened, 
> >>> the context is considered lost and should be recreated. If an app follow 
> >>> this, it will likely succeed recovering a reset.
> >>>
> >>> What should non-robust apps do then? They certainly will not be 
> >>> notified if a reset happens, and thus can't recover if their context is 
> >>> lost. OpenGL specification does not explicitly define what should be done 
> >>> in such situations[3], and I believe that usually when the spec mandates 
> >>> to close the app, it would explicitly note it.
> >>>
> >>> However, in reality there are different types of device resets, causing 
> >>> different results. A reset can be precise enough to damage only the 
> >>> guilty context, and keep others alive.
> >>>
> >>> Given that, I believe drivers have the following options:
> >>>
> >>> a) Kill all non-robust apps after a reset. This may lead to losing work 
> >>> from innocent applications.
> >>>
> >>> b) Ignore all non-robust apps' OpenGL calls. That means that applications 
> >>> would still be alive, but the user interface would freeze. The user 
> >>> would need to close it manually anyway, but in some corner cases, the app 
> >>> could autosave some work or the user might be able to interact with it 
> >>> using some alternative method (command line?).
> >>>
> >>> c) Kill just the affected non-robust applications. To do that, the driver 
> >>> needs to be 100% sure of the impact of its resets.
> >>>
> >>> RadeonSI currently implements a), as can be seen at [4], while Iris 
> >>> implements what I think it's c)[5].
> >>>
> >>> From the user experience point of view, c) is clearly the best option, but 
> >>> it's the hardest to achieve. There's not much gain in having b) over a); 
> >>> perhaps it could be an optional env var for such corner-case applications.
> >>
> >> I disagree on these conclusions.
> >>
> >> 

Re: Non-robust apps and resets (was Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations)

2023-07-25 Thread Marek Olšák
On Tue, Jul 25, 2023 at 4:03 AM Michel Dänzer
 wrote:
>
> On 7/25/23 04:55, André Almeida wrote:
> > Hi everyone,
> >
> > It's not clear what we should do about non-robust OpenGL apps after GPU 
> > resets, so I'll try to summarize the topic, show some options and my 
> > proposal to move forward on that.
> >
> > Em 27/06/2023 10:23, André Almeida escreveu:
> >> +Robustness
> >> +--
> >> +
> >> +The only way to try to keep an application working after a reset is if it
> >> +complies with the robustness aspects of the graphical API that it is 
> >> using.
> >> +
> >> +Graphical APIs provide ways to applications to deal with device resets. 
> >> However,
> >> +there is no guarantee that the app will use such features correctly, and 
> >> the
> >> +UMD can implement policies to close the app if it is a repeating offender,
> >> +likely in a broken loop. This is done to ensure that it does not keep 
> >> blocking
> >> +the user interface from being correctly displayed. This should be done 
> >> even if
> >> +the app is correct but happens to trigger some bug in the hardware/driver.
> >> +
> > Depending on the OpenGL version, there are different robustness API 
> > available:
> >
> > - OpenGL ABR extension [0]
> > - OpenGL KHR extension [1]
> > - OpenGL ES extension  [2]
> >
> > Apps written in OpenGL should use whatever version is available for them to 
> > make the app robust for GPU resets. That usually means calling 
> > GetGraphicsResetStatusARB(), checking the status, and if it encounter 
> > something different from NO_ERROR, that means that a reset has happened, 
> > the context is considered lost and should be recreated. If an app follows 
> > this, it will likely succeed in recovering from a reset.
> >
> > What should non-robust apps do then? They certainly will not be 
> > notified if a reset happens, and thus can't recover if their context is 
> > lost. OpenGL specification does not explicitly define what should be done 
> > in such situations[3], and I believe that usually when the spec mandates to 
> > close the app, it would explicitly note it.
> >
> > However, in reality there are different types of device resets, causing 
> > different results. A reset can be precise enough to damage only the guilty 
> > context, and keep others alive.
> >
> > Given that, I believe drivers have the following options:
> >
> > a) Kill all non-robust apps after a reset. This may lead to losing work from 
> > innocent applications.
> >
> > b) Ignore all non-robust apps' OpenGL calls. That means that applications 
> > would still be alive, but the user interface would freeze. The user 
> > would need to close it manually anyway, but in some corner cases, the app 
> > could autosave some work or the user might be able to interact with it 
> > using some alternative method (command line?).
> >
> > c) Kill just the affected non-robust applications. To do that, the driver 
> > needs to be 100% sure of the impact of its resets.
> >
> > RadeonSI currently implements a), as can be seen at [4], while Iris 
> > implements what I think it's c)[5].
> >
> > From the user experience point of view, c) is clearly the best option, but 
> > it's the hardest to achieve. There's not much gain in having b) over a); 
> > perhaps it could be an optional env var for such corner-case applications.
>
> I disagree on these conclusions.
>
> c) is certainly better than a), but it's not "clearly the best" in all cases. 
> The OpenGL UMD is not a privileged/special component and is in no position to 
> decide whether or not the process as a whole (only some thread(s) of which 
> may use OpenGL at all) gets to continue running or not.

That's not true. I recommend that you enable b) with your driver and
then hang the GPU under different scenarios and see the result. Then
enable a) and do the same and compare.

Options a) and c) can be merged into one because they are not separate
options to choose from.

If Wayland wanted to grey out lost apps, they would appear as robust
contexts in gallium, but the reset status would be piped through the
Wayland protocol instead of the GL API.

Marek


Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-07-05 Thread Marek Olšák
On Wed, Jul 5, 2023 at 3:32 AM Michel Dänzer  wrote:
>
> On 7/5/23 08:30, Marek Olšák wrote:
> > On Tue, Jul 4, 2023, 03:55 Michel Dänzer  wrote:
> > On 7/4/23 04:34, Marek Olšák wrote:
> > > On Mon, Jul 3, 2023, 03:12 Michel Dänzer  wrote:
> > > On 6/30/23 22:32, Marek Olšák wrote:
> > > > On Fri, Jun 30, 2023 at 11:11 AM Michel Dänzer 
> >  wrote:
> > > >> On 6/30/23 16:59, Alex Deucher wrote:
> > > >>> On Fri, Jun 30, 2023 at 10:49 AM Sebastian Wick
> > > >>>  wrote:
> > > >>>> On Tue, Jun 27, 2023 at 3:23 PM André Almeida 
> >  wrote:
> > > >>>>>
> > > >>>>> +Robustness
> > > >>>>> +--
> > > >>>>> +
> > > >>>>> +The only way to try to keep an application working after a 
> > reset is if it
> > > >>>>> +complies with the robustness aspects of the graphical API 
> > that it is using.
> > > >>>>> +
> > > >>>>> +Graphical APIs provide ways to applications to deal with 
> > device resets. However,
> > > >>>>> +there is no guarantee that the app will use such features 
> > correctly, and the
> > > >>>>> +UMD can implement policies to close the app if it is a 
> > repeating offender,
> > > >>>>> +likely in a broken loop. This is done to ensure that it 
> > does not keep blocking
> > > >>>>> +the user interface from being correctly displayed. This 
> > should be done even if
> > > >>>>> +the app is correct but happens to trigger some bug in the 
> > hardware/driver.
> > > >>>>
> > > >>>> I still don't think it's good to let the kernel arbitrarily 
> > kill
> > > >>>> processes that it thinks are not well-behaved based on some 
> > heuristics
> > > >>>> and policy.
> > > >>>>
> > > >>>> Can't this be outsourced to user space? Expose the 
> > information about
> > > >>>> processes causing a device and let e.g. systemd deal with 
> > coming up
> > > >>>> with a policy and with killing stuff.
> > > >>>
> > > >>> I don't think it's the kernel doing the killing, it would be 
> > the UMD.
> > > >>> E.g., if the app is guilty and doesn't support robustness the 
> > UMD can
> > > >>> just call exit().
> > > >>
> > > >> It would be safer to just ignore API calls[0], similarly to 
> > what is done until the application destroys the context with robustness. 
> > Calling exit() likely results in losing any unsaved work, whereas at least 
> > some applications might otherwise allow saving the work by other means.
> > > >
> > > > That's a terrible idea. Ignoring API calls would be identical 
> > to a freeze. You might as well disable GPU recovery because the result 
> > would be the same.
> > >
> > > No GPU recovery would affect everything using the GPU, whereas 
> > this affects only non-robust applications.
> > >
> > > which is currently the majority.
> >
> > Not sure where you're going with this. Applications need to use 
> > robustness to be able to recover from a GPU hang, and the GPU needs to be 
> > reset for that. So disabling GPU reset is not the same as what we're 
> > discussing here.
> >
> >
> > > > - non-robust contexts: call exit(1) immediately, which is the 
> > best way to recover
> > >
> > > That's not the UMD's call to make.
> > >
> > > That's absolutely the UMD's call to make because that's mandated by 
> > the hw and API design
> >
> > Can you point us to a spec which mandates that the process must be 
> > killed in this case?
> >
> >
> > > and only driver devs know this, which this thread is a proof of. The 
> > default behavior is to skip all command submission if a non-robust context 
> > is lost, which looks like a freeze. That's required to prevent infinit

Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-07-05 Thread Marek Olšák
On Tue, Jul 4, 2023, 03:55 Michel Dänzer  wrote:

> On 7/4/23 04:34, Marek Olšák wrote:
> > On Mon, Jul 3, 2023, 03:12 Michel Dänzer wrote:
> > On 6/30/23 22:32, Marek Olšák wrote:
> > > On Fri, Jun 30, 2023 at 11:11 AM Michel Dänzer wrote:
> > >> On 6/30/23 16:59, Alex Deucher wrote:
> > >>> On Fri, Jun 30, 2023 at 10:49 AM Sebastian Wick wrote:
> > >>>> On Tue, Jun 27, 2023 at 3:23 PM André Almeida wrote:
> > >>>>>
> > >>>>> +Robustness
> > >>>>> +--
> > >>>>> +
> > >>>>> +The only way to try to keep an application working after a
> reset is if it
> > >>>>> +complies with the robustness aspects of the graphical API
> that it is using.
> > >>>>> +
> > >>>>> +Graphical APIs provide ways to applications to deal with
> device resets. However,
> > >>>>> +there is no guarantee that the app will use such features
> correctly, and the
> > >>>>> +UMD can implement policies to close the app if it is a
> repeating offender,
> > >>>>> +likely in a broken loop. This is done to ensure that it does
> not keep blocking
> > >>>>> +the user interface from being correctly displayed. This
> should be done even if
> > >>>>> +the app is correct but happens to trigger some bug in the
> hardware/driver.
> > >>>>
> > >>>> I still don't think it's good to let the kernel arbitrarily kill
> > >>>> processes that it thinks are not well-behaved based on some
> heuristics
> > >>>> and policy.
> > >>>>
> > >>>> Can't this be outsourced to user space? Expose the information
> about
> > >>>> processes causing a device and let e.g. systemd deal with
> coming up
> > >>>> with a policy and with killing stuff.
> > >>>
> > >>> I don't think it's the kernel doing the killing, it would be the
> UMD.
> > >>> E.g., if the app is guilty and doesn't support robustness the
> UMD can
> > >>> just call exit().
> > >>
> > >> It would be safer to just ignore API calls[0], similarly to what
> is done until the application destroys the context with robustness. Calling
> exit() likely results in losing any unsaved work, whereas at least some
> applications might otherwise allow saving the work by other means.
> > >
> > > That's a terrible idea. Ignoring API calls would be identical to a
> freeze. You might as well disable GPU recovery because the result would be
> the same.
> >
> > No GPU recovery would affect everything using the GPU, whereas this
> affects only non-robust applications.
> >
> > which is currently the majority.
>
> Not sure where you're going with this. Applications need to use robustness
> to be able to recover from a GPU hang, and the GPU needs to be reset for
> that. So disabling GPU reset is not the same as what we're discussing here.
>
>
> > > - non-robust contexts: call exit(1) immediately, which is the best
> way to recover
> >
> > That's not the UMD's call to make.
> >
> > That's absolutely the UMD's call to make because that's mandated by the
> hw and API design
>
> Can you point us to a spec which mandates that the process must be killed
> in this case?
>
>
> > and only driver devs know this, which this thread is a proof of. The
> default behavior is to skip all command submission if a non-robust context
> is lost, which looks like a freeze. That's required to prevent infinite
> hangs from the same context and can be caused by the side effects of the
> GPU reset itself, not by the cause of the previous hang. The only way out
> of that is killing the process.
>
> The UMD killing the process is not the only way out of that, and doing so
> is overreach on its part. The UMD is but one out of many components in a
> process, 

Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-07-03 Thread Marek Olšák
On Mon, Jul 3, 2023, 22:38 Randy Dunlap  wrote:

>
>
> On 7/3/23 19:34, Marek Olšák wrote:
> >
> >
> > On Mon, Jul 3, 2023, 03:12 Michel Dänzer wrote:
> >
>
> Marek,
> Please stop sending html emails to the mailing lists.
> The mailing list software drops them.
>
> Please set your email interface to use plain text mode instead.
> Thanks.
>

The mobile Gmail app doesn't support plain text, which I use frequently.

Marek


> --
> ~Randy
>


Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-07-03 Thread Marek Olšák
On Mon, Jul 3, 2023, 03:12 Michel Dänzer  wrote:

> On 6/30/23 22:32, Marek Olšák wrote:
> > On Fri, Jun 30, 2023 at 11:11 AM Michel Dänzer wrote:
> >> On 6/30/23 16:59, Alex Deucher wrote:
> >>> On Fri, Jun 30, 2023 at 10:49 AM Sebastian Wick wrote:
> >>>> On Tue, Jun 27, 2023 at 3:23 PM André Almeida wrote:
> >>>>>
> >>>>> +Robustness
> >>>>> +--
> >>>>> +
> >>>>> +The only way to try to keep an application working after a reset is
> if it
> >>>>> +complies with the robustness aspects of the graphical API that it
> is using.
> >>>>> +
> >>>>> +Graphical APIs provide ways to applications to deal with device
> resets. However,
> >>>>> +there is no guarantee that the app will use such features
> correctly, and the
> >>>>> +UMD can implement policies to close the app if it is a repeating
> offender,
> >>>>> +likely in a broken loop. This is done to ensure that it does not
> keep blocking
> >>>>> +the user interface from being correctly displayed. This should be
> done even if
> >>>>> +the app is correct but happens to trigger some bug in the
> hardware/driver.
> >>>>
> >>>> I still don't think it's good to let the kernel arbitrarily kill
> >>>> processes that it thinks are not well-behaved based on some heuristics
> >>>> and policy.
> >>>>
> >>>> Can't this be outsourced to user space? Expose the information about
> >>>> processes causing a device and let e.g. systemd deal with coming up
> >>>> with a policy and with killing stuff.
> >>>
> >>> I don't think it's the kernel doing the killing, it would be the UMD.
> >>> E.g., if the app is guilty and doesn't support robustness the UMD can
> >>> just call exit().
> >>
> >> It would be safer to just ignore API calls[0], similarly to what is
> done until the application destroys the context with robustness. Calling
> exit() likely results in losing any unsaved work, whereas at least some
> applications might otherwise allow saving the work by other means.
> >
> > That's a terrible idea. Ignoring API calls would be identical to a
> freeze. You might as well disable GPU recovery because the result would be
> the same.
>
> No GPU recovery would affect everything using the GPU, whereas this
> affects only non-robust applications.
>

which is currently the majority.


>
> > - non-robust contexts: call exit(1) immediately, which is the best way
> to recover
>
> That's not the UMD's call to make.
>

That's absolutely the UMD's call to make because that's mandated by the hw
and API design and only driver devs know this, which this thread is a proof
of. The default behavior is to skip all command submission if a non-robust
context is lost, which looks like a freeze. That's required to prevent
infinite hangs from the same context and can be caused by the side effects
of the GPU reset itself, not by the cause of the previous hang. The only
way out of that is killing the process.

Marek


>
> >> [0] Possibly accompanied by a one-time message to stderr along the
> lines of "GPU reset detected but robustness not enabled in context,
> ignoring OpenGL API calls".
>
>
> --
> Earthling Michel Dänzer|  https://redhat.com
> Libre software enthusiast  | Mesa and Xwayland developer
>
>


Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-06-30 Thread Marek Olšák
That's a terrible idea. Ignoring API calls would be identical to a freeze.
You might as well disable GPU recovery because the result would be the same.

There are 2 scenarios:
- robust contexts: report the GPU reset status and skip API calls; let the
app recreate the context to recover
- non-robust contexts: call exit(1) immediately, which is the best way to
recover
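
A rough sketch of those two scenarios from the UMD side (a hypothetical
handle_reset() helper and context struct, not Mesa's actual code; the enum
value is the one defined by GL_ARB_robustness / GL 4.5 core):

#include <stdbool.h>
#include <stdlib.h>

#define GL_GUILTY_CONTEXT_RESET 0x8253   /* from GL_ARB_robustness */

struct umd_context {
   bool robust_access;      /* context created with robust buffer access */
   unsigned reset_status;   /* what glGetGraphicsResetStatusARB() will return */
   bool skip_api_calls;
};

/* Hedged sketch: UMD policy once the kernel has reported a reset. */
static void handle_reset(struct umd_context *ctx)
{
   if (ctx->robust_access) {
      /* Robust context: report the status to the app and skip further API
       * calls until it destroys and recreates the context. */
      ctx->reset_status = GL_GUILTY_CONTEXT_RESET;
      ctx->skip_api_calls = true;
   } else {
      /* Non-robust context: the loss cannot be reported through the API,
       * so exit instead of submitting more work on a lost context. */
      exit(1);
   }
}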

Marek

On Fri, Jun 30, 2023 at 11:11 AM Michel Dänzer 
wrote:

> On 6/30/23 16:59, Alex Deucher wrote:
> > On Fri, Jun 30, 2023 at 10:49 AM Sebastian Wick
> >  wrote:
> >> On Tue, Jun 27, 2023 at 3:23 PM André Almeida 
> wrote:
> >>>
> >>> +Robustness
> >>> +--
> >>> +
> >>> +The only way to try to keep an application working after a reset is
> if it
> >>> +complies with the robustness aspects of the graphical API that it is
> using.
> >>> +
> >>> +Graphical APIs provide ways to applications to deal with device
> resets. However,
> >>> +there is no guarantee that the app will use such features correctly,
> and the
> >>> +UMD can implement policies to close the app if it is a repeating
> offender,
> >>> +likely in a broken loop. This is done to ensure that it does not keep
> blocking
> >>> +the user interface from being correctly displayed. This should be
> done even if
> >>> +the app is correct but happens to trigger some bug in the
> hardware/driver.
> >>
> >> I still don't think it's good to let the kernel arbitrarily kill
> >> processes that it thinks are not well-behaved based on some heuristics
> >> and policy.
> >>
> >> Can't this be outsourced to user space? Expose the information about
> >> processes causing a device and let e.g. systemd deal with coming up
> >> with a policy and with killing stuff.
> >
> > I don't think it's the kernel doing the killing, it would be the UMD.
> > E.g., if the app is guilty and doesn't support robustness the UMD can
> > just call exit().
>
> It would be safer to just ignore API calls[0], similarly to what is done
> until the application destroys the context with robustness. Calling exit()
> likely results in losing any unsaved work, whereas at least some
> applications might otherwise allow saving the work by other means.
>
>
> [0] Possibly accompanied by a one-time message to stderr along the lines
> of "GPU reset detected but robustness not enabled in context, ignoring
> OpenGL API calls".
>
> --
> Earthling Michel Dänzer|  https://redhat.com
> Libre software enthusiast  | Mesa and Xwayland developer
>
>


Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-06-27 Thread Marek Olšák
On Tue, Jun 27, 2023 at 5:31 PM André Almeida 
wrote:

> Hi Marek,
>
> Em 27/06/2023 15:57, Marek Olšák escreveu:
> > On Tue, Jun 27, 2023, 09:23 André Almeida wrote:
> >
> > +User Mode Driver
> > +
> > +
> > +The UMD should check before submitting new commands to the KMD if
> > the device has
> > +been reset, and this can be checked more often if the UMD requires
> > it. After
> > +detecting a reset, UMD will then proceed to report it to the
> > application using
> > +the appropriate API error code, as explained in the section below
> about
> > +robustness.
> >
> >
> > The UMD won't check the device status before every command submission
> > due to ioctl overhead. Instead, the KMD should skip command submission
> > and return an error that it was skipped.
>
> I wrote like this because when reading the source code for
> vk::check_status()[0] and Gallium's si_flush_gfx_cs()[1], I was under
> the impression that UMD checks the reset status before every
> submission/flush.
>

It only does that before every command submission when the context is
robust. When it's not robust, radeonsi doesn't do anything.
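
As a hedged sketch of that pattern (hypothetical umd_ctx struct and helpers,
not the actual si_flush_gfx_cs() code; the reset query would sit on top of
something like amdgpu's AMDGPU_CTX_OP_QUERY_STATE2):

#include <stdbool.h>

struct umd_ctx {
   bool is_robust;                               /* robustness requested by the app */
   bool device_lost;
   bool (*query_device_reset)(struct umd_ctx *); /* asks the kernel about resets */
   void (*submit_ib)(struct umd_ctx *);          /* flushes the command stream */
};

/* Only robust contexts pay for the reset-status query before each flush. */
static void flush_gfx_cs(struct umd_ctx *ctx)
{
   if (ctx->is_robust && ctx->query_device_reset(ctx)) {
      ctx->device_lost = true;   /* later reported via the robustness API */
      return;                    /* skip this and any further submission */
   }
   ctx->submit_ib(ctx);          /* non-robust contexts submit unconditionally */
}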


>
> Is your comment about of how things are currently implemented, or how
> they would ideally work? Either way I can apply your suggestion, I just
> want to make it clear.
>

Yes. Ideally, we would get the reply whether the context is lost from the
CS ioctl. This is not currently implemented.

Marek


>
> [0]
>
> https://elixir.bootlin.com/mesa/mesa-23.1.3/source/src/vulkan/runtime/vk_device.h#L142
> [1]
>
> https://elixir.bootlin.com/mesa/mesa-23.1.3/source/src/gallium/drivers/radeonsi/si_gfx_cs.c#L83
>
> >
> > The only case where that won't be applicable is user queues where
> > drivers don't call into the kernel to submit work, but they do call into
> > the kernel to create a dma_fence. In that case, the call to create a
> > dma_fence can fail with an error.
> >
> > Marek
>
>


Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

2023-06-27 Thread Marek Olšák
On Tue, Jun 27, 2023, 09:23 André Almeida  wrote:

> Create a section that specifies how to deal with DRM device resets for
> kernel and userspace drivers.
>
> Acked-by: Pekka Paalanen 
> Signed-off-by: André Almeida 
> ---
>
> v4:
> https://lore.kernel.org/lkml/20230626183347.55118-1-andrealm...@igalia.com/
>
> Changes:
>  - Grammar fixes (Randy)
>
>  Documentation/gpu/drm-uapi.rst | 68 ++
>  1 file changed, 68 insertions(+)
>
> diff --git a/Documentation/gpu/drm-uapi.rst
> b/Documentation/gpu/drm-uapi.rst
> index 65fb3036a580..3cbffa25ed93 100644
> --- a/Documentation/gpu/drm-uapi.rst
> +++ b/Documentation/gpu/drm-uapi.rst
> @@ -285,6 +285,74 @@ for GPU1 and GPU2 from different vendors, and a third
> handler for
>  mmapped regular files. Threads cause additional pain with signal
>  handling as well.
>
> +Device reset
> +
> +
> +The GPU stack is really complex and is prone to errors, from hardware
> bugs,
> +faulty applications and everything in between the many layers. Some errors
> +require resetting the device in order to make the device usable again.
> This
> +section describes the expectations for DRM and usermode drivers when a
> +device resets and how to propagate the reset status.
> +
> +Kernel Mode Driver
> +--
> +
> +The KMD is responsible for checking if the device needs a reset, and to
> perform
> +it as needed. Usually a hang is detected when a job gets stuck executing.
> KMD
> +should keep track of resets, because userspace can query any time about
> the
> +reset stats for a specific context. This is needed to propagate to the
> rest of
> +the stack that a reset has happened. Currently, this is implemented by
> each
> +driver separately, with no common DRM interface.
> +
> +User Mode Driver
> +
> +
> +The UMD should check before submitting new commands to the KMD if the
> device has
> +been reset, and this can be checked more often if the UMD requires it.
> After
> +detecting a reset, UMD will then proceed to report it to the application
> using
> +the appropriate API error code, as explained in the section below about
> +robustness.
>

The UMD won't check the device status before every command submission due
to ioctl overhead. Instead, the KMD should skip command submission and
return an error that it was skipped.

The only case where that won't be applicable is user queues where drivers
don't call into the kernel to submit work, but they do call into the kernel
to create a dma_fence. In that case, the call to create a dma_fence can
fail with an error.
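
A hedged sketch of what that looks like from the UMD's submit path, assuming
the kernel rejects submissions on a reset context with -ECANCELED (the error
code is an assumption here; the ioctl plumbing follows libdrm's amdgpu usage):

#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <xf86drm.h>
#include <amdgpu_drm.h>   /* from libdrm; exact include path depends on the build */

/* Returns 0 on success, or a negative errno when the KMD refused the
 * submission, e.g. (assumed) -ECANCELED for a reset context -- which is the
 * point where the UMD would report device loss to the application. */
static int submit_cs(int fd, uint32_t ctx_id, uint64_t chunks_ptr, uint32_t num_chunks)
{
   union drm_amdgpu_cs cs;

   memset(&cs, 0, sizeof(cs));
   cs.in.ctx_id = ctx_id;
   cs.in.chunks = chunks_ptr;       /* userspace pointer to chunk array, as u64 */
   cs.in.num_chunks = num_chunks;

   return drmCommandWriteRead(fd, DRM_AMDGPU_CS, &cs, sizeof(cs));
}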

Marek

+
> +Robustness
> +--
> +
> +The only way to try to keep an application working after a reset is if it
> +complies with the robustness aspects of the graphical API that it is
> using.
> +
> +Graphical APIs provide ways for applications to deal with device resets.
> However,
> +there is no guarantee that the app will use such features correctly, and
> the
> +UMD can implement policies to close the app if it is a repeating offender,
> +likely in a broken loop. This is done to ensure that it does not keep
> blocking
> +the user interface from being correctly displayed. This should be done
> even if
> +the app is correct but happens to trigger some bug in the hardware/driver.
> +
> +OpenGL
> +~~
> +
> +Apps using OpenGL should use the available robust interfaces, like the
> +extension ``GL_ARB_robustness`` (or ``GL_EXT_robustness`` for OpenGL ES).
> This
> +interface tells if a reset has happened, and if so, all the context state
> is
> +considered lost and the app proceeds by creating new ones. If it is
> possible to
> +determine that robustness is not in use, the UMD will terminate the app
> when a
> +reset is detected, given that the contexts are lost and the app won't be
> able
> +to figure this out and recreate the contexts.
> +
> +Vulkan
> +~~
> +
> +Apps using Vulkan should check for ``VK_ERROR_DEVICE_LOST`` for
> submissions.
> +This error code means, among other things, that a device reset has
> happened and
> +it needs to recreate the contexts to keep going.
> +
> +Reporting causes of resets
> +--
> +
> +Apart from propagating the reset through the stack so apps can recover,
> it's
> +really useful for driver developers to learn more about what caused the
> reset in
> +first place. DRM devices should make use of devcoredump to store relevant
> +information about the reset, so this information can be added to user bug
> +reports.
> +
>  .. _drm_driver_ioctl:
>
>  IOCTL Support on Device Nodes
> --
> 2.41.0
>
>


Re: [RFC PATCH 0/1] Add AMDGPU_INFO_GUILTY_APP ioctl

2023-05-03 Thread Marek Olšák
On Wed, May 3, 2023, 14:53 André Almeida  wrote:

> Em 03/05/2023 14:08, Marek Olšák escreveu:
> > GPU hangs are pretty common post-bringup. They are not common per user,
> > but if we gather all hangs from all users, we can have lots and lots of
> > them.
> >
> > GPU hangs are indeed not very debuggable. There are however some things
> > we can do:
> > - Identify the hanging IB by its VA (the kernel should know it)
>
> How can the kernel tell which VA range is being executed? I only found
> that information at mmCP_IB1_BASE_ regs, but as stated in this thread by
> Christian this is not reliable to be read.
>

The kernel receives the VA and the size via the CS ioctl. When user queues
are enabled, the kernel will no longer receive them.
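
For reference, this is roughly the chunk that carries them today (paraphrased
from include/uapi/drm/amdgpu_drm.h, comments abridged):

struct drm_amdgpu_cs_chunk_ib {
   __u32 _pad;
   __u32 flags;        /* AMDGPU_IB_FLAG_* */
   __u64 va_start;     /* virtual address where IB execution begins */
   __u32 ib_bytes;     /* size of the submission */
   __u32 ip_type;      /* HW IP to submit to */
   __u32 ip_instance;
   __u32 ring;
};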


> > - Read and parse the IB to detect memory corruption.
> > - Print active waves with shader disassembly if SQ isn't hung (often
> > it's not).
> >
> > Determining which packet the CP is stuck on is tricky. The CP has 2
> > engines (one frontend and one backend) that work on the same command
> > buffer. The frontend engine runs ahead, executes some packets and
> > forwards others to the backend engine. Only the frontend engine has the
> > command buffer VA somewhere. The backend engine only receives packets
> > from the frontend engine via a FIFO, so it might not be possible to tell
> > where it's stuck if it's stuck.
>
> Do they run at the same time asynchronously, or does the front end wait for
> the back end to execute?
>

They run asynchronously and should run asynchronously for performance, but
they can be synchronized using a special packet (PFP_SYNC_ME).
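
A minimal sketch of emitting that packet, using Mesa-style PM4 helpers
(radeon_emit()/PKT3()/PKT3_PFP_SYNC_ME as in radeonsi; treat the exact helper
names as illustrative):

/* Make the frontend (PFP) wait until the backend (ME) has caught up, e.g.
 * before the PFP prefetches data that the ME is still in the middle of
 * writing. */
static void emit_pfp_sync_me(struct radeon_cmdbuf *cs)
{
   radeon_emit(cs, PKT3(PKT3_PFP_SYNC_ME, 0, 0));
   radeon_emit(cs, 0);   /* single dummy payload dword required by the packet */
}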

Marek


> >
> > When the gfx pipeline hangs outside of shaders, making a scandump seems
> > to be the only way to have a chance at finding out what's going wrong,
> > and only AMD-internal versions of hw can be scanned.
> >
> > Marek
> >
> > On Wed, May 3, 2023 at 11:23 AM Christian König wrote:
> >
> > Am 03.05.23 um 17:08 schrieb Felix Kuehling:
> >  > Am 2023-05-03 um 03:59 schrieb Christian König:
> >  >> Am 02.05.23 um 20:41 schrieb Alex Deucher:
> >  >>> On Tue, May 2, 2023 at 11:22 AM Timur Kristóf wrote:
> >  >>>> [SNIP]
> >  >>>>>>>> In my opinion, the correct solution to those problems
> would be
> >  >>>>>>>> if
> >  >>>>>>>> the kernel could give userspace the necessary information
> > about
> >  >>>>>>>> a
> >  >>>>>>>> GPU hang before a GPU reset.
> >  >>>>>>>>
> >  >>>>>>>   The fundamental problem here is that the kernel doesn't
> have
> >  >>>>>>> that
> >  >>>>>>> information either. We know which IB timed out and can
> >  >>>>>>> potentially do
> >  >>>>>>> a devcoredump when that happens, but that's it.
> >  >>>>>>
> >  >>>>>> Is it really not possible to know such a fundamental thing
> > as what
> >  >>>>>> the
> >  >>>>>> GPU was doing when it hung? How are we supposed to do any
> > kind of
> >  >>>>>> debugging without knowing that?
> >  >>
> >  >> Yes, that's indeed something at least I try to figure out for
> years
> >  >> as well.
> >  >>
> >  >> Basically there are two major problems:
> >  >> 1. When the ASIC is hung you can't talk to the firmware engines
> any
> >  >> more and most state is not exposed directly, but just through
> some
> >  >> fw/hw interface.
> >  >> Just take a look at how umr reads the shader state from the
> SQ.
> >  >> When that block is hung you can't do that any more and basically
> > have
> >  >> no chance at all to figure out why it's hung.
> >  >>
> >  >> Same for other engines, I remember once spending a week
> > figuring
> >  >> out why the UVD block is hung during suspend. Turned out to be a
> >  >> debugging nightmare because any time you touch any register of
> that
> >  >> block the whole system would hang.
> >  >>
> &

Re: [RFC PATCH 0/1] Add AMDGPU_INFO_GUILTY_APP ioctl

2023-05-03 Thread Marek Olšák
WRITE_DATA with ENGINE=PFP will execute the packet on the frontend engine,
while ENGINE=ME will execute the packet on the backend engine.
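
For illustration, a hedged sketch of such a packet with Mesa-style PM4 helpers
(macro names as in radeonsi's sid.h, quoted from memory; the only point here
is the ENGINE_SEL field choosing PFP vs. ME):

/* Write one dword to memory from either the frontend or the backend engine. */
static void emit_write_data(struct radeon_cmdbuf *cs, uint64_t va,
                            uint32_t value, bool on_pfp)
{
   radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 0));
   radeon_emit(cs, S_370_DST_SEL(V_370_MEM) |
                   S_370_WR_CONFIRM(1) |
                   S_370_ENGINE_SEL(on_pfp ? V_370_PFP : V_370_ME));
   radeon_emit(cs, va);          /* destination address, low 32 bits */
   radeon_emit(cs, va >> 32);    /* destination address, high 32 bits */
   radeon_emit(cs, value);       /* the write is executed by the selected engine */
}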

Marek

On Wed, May 3, 2023 at 1:08 PM Marek Olšák  wrote:

> GPU hangs are pretty common post-bringup. They are not common per user,
> but if we gather all hangs from all users, we can have lots and lots of
> them.
>
> GPU hangs are indeed not very debuggable. There are however some things we
> can do:
> - Identify the hanging IB by its VA (the kernel should know it)
> - Read and parse the IB to detect memory corruption.
> - Print active waves with shader disassembly if SQ isn't hung (often it's
> not).
>
> Determining which packet the CP is stuck on is tricky. The CP has 2
> engines (one frontend and one backend) that work on the same command
> buffer. The frontend engine runs ahead, executes some packets and forwards
> others to the backend engine. Only the frontend engine has the command
> buffer VA somewhere. The backend engine only receives packets from the
> frontend engine via a FIFO, so it might not be possible to tell where it's
> stuck if it's stuck.
>
> When the gfx pipeline hangs outside of shaders, making a scandump seems to
> be the only way to have a chance at finding out what's going wrong, and
> only AMD-internal versions of hw can be scanned.
>
> Marek
>
> On Wed, May 3, 2023 at 11:23 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> Am 03.05.23 um 17:08 schrieb Felix Kuehling:
>> > Am 2023-05-03 um 03:59 schrieb Christian König:
>> >> Am 02.05.23 um 20:41 schrieb Alex Deucher:
>> >>> On Tue, May 2, 2023 at 11:22 AM Timur Kristóf
>> >>>  wrote:
>> >>>> [SNIP]
>> >>>>>>>> In my opinion, the correct solution to those problems would be
>> >>>>>>>> if
>> >>>>>>>> the kernel could give userspace the necessary information about
>> >>>>>>>> a
>> >>>>>>>> GPU hang before a GPU reset.
>> >>>>>>>>
>> >>>>>>>   The fundamental problem here is that the kernel doesn't have
>> >>>>>>> that
>> >>>>>>> information either. We know which IB timed out and can
>> >>>>>>> potentially do
>> >>>>>>> a devcoredump when that happens, but that's it.
>> >>>>>>
>> >>>>>> Is it really not possible to know such a fundamental thing as what
>> >>>>>> the
>> >>>>>> GPU was doing when it hung? How are we supposed to do any kind of
>> >>>>>> debugging without knowing that?
>> >>
>> >> Yes, that's indeed something at least I try to figure out for years
>> >> as well.
>> >>
>> >> Basically there are two major problems:
>> >> 1. When the ASIC is hung you can't talk to the firmware engines any
>> >> more and most state is not exposed directly, but just through some
>> >> fw/hw interface.
>> >> Just take a look at how umr reads the shader state from the SQ.
>> >> When that block is hung you can't do that any more and basically have
>> >> no chance at all to figure out why it's hung.
>> >>
>> >> Same for other engines, I remember once spending a week figuring
>> >> out why the UVD block is hung during suspend. Turned out to be a
>> >> debugging nightmare because any time you touch any register of that
>> >> block the whole system would hang.
>> >>
>> >> 2. There are tons of things going on in a pipeline fashion or even
>> >> completely in parallel. For example the CP is just the beginning of a
>> >> rather long pipeline which at the end produces a bunch of pixels.
>> >> In almost all cases I've seen you ran into a problem somewhere
>> >> deep in the pipeline and only very rarely at the beginning.
>> >>
>> >>>>>>
>> >>>>>> I wonder what AMD's Windows driver team is doing with this problem,
>> >>>>>> surely they must have better tools to deal with GPU hangs?
>> >>>>> For better or worse, most teams internally rely on scan dumps via
>> >>>>> JTAG
>> >>>>> which sort of limits the usefulness outside of AMD, but also gives
>> >>>>> you
>> >>>>> the exact state of the hardware when it's hung so the hardware teams
>> >>>

Re: [RFC PATCH 0/1] Add AMDGPU_INFO_GUILTY_APP ioctl

2023-05-03 Thread Marek Olšák
GPU hangs are pretty common post-bringup. They are not common per user, but
if we gather all hangs from all users, we can have lots and lots of them.

GPU hangs are indeed not very debuggable. There are however some things we
can do:
- Identify the hanging IB by its VA (the kernel should know it)
- Read and parse the IB to detect memory corruption.
- Print active waves with shader disassembly if SQ isn't hung (often it's
not).

Determining which packet the CP is stuck on is tricky. The CP has 2 engines
(one frontend and one backend) that work on the same command buffer. The
frontend engine runs ahead, executes some packets and forwards others to
the backend engine. Only the frontend engine has the command buffer VA
somewhere. The backend engine only receives packets from the frontend
engine via a FIFO, so it might not be possible to tell where it's stuck if
it's stuck.

When the gfx pipeline hangs outside of shaders, making a scandump seems to
be the only way to have a chance at finding out what's going wrong, and
only AMD-internal versions of hw can be scanned.

Marek

On Wed, May 3, 2023 at 11:23 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Am 03.05.23 um 17:08 schrieb Felix Kuehling:
> > Am 2023-05-03 um 03:59 schrieb Christian König:
> >> Am 02.05.23 um 20:41 schrieb Alex Deucher:
> >>> On Tue, May 2, 2023 at 11:22 AM Timur Kristóf
> >>>  wrote:
>  [SNIP]
>  In my opinion, the correct solution to those problems would be
>  if
>  the kernel could give userspace the necessary information about
>  a
>  GPU hang before a GPU reset.
> 
> >>>   The fundamental problem here is that the kernel doesn't have
> >>> that
> >>> information either. We know which IB timed out and can
> >>> potentially do
> >>> a devcoredump when that happens, but that's it.
> >>
> >> Is it really not possible to know such a fundamental thing as what
> >> the
> >> GPU was doing when it hung? How are we supposed to do any kind of
> >> debugging without knowing that?
> >>
> >> Yes, that's indeed something at least I try to figure out for years
> >> as well.
> >>
> >> Basically there are two major problems:
> >> 1. When the ASIC is hung you can't talk to the firmware engines any
> >> more and most state is not exposed directly, but just through some
> >> fw/hw interface.
> >> Just take a look at how umr reads the shader state from the SQ.
> >> When that block is hung you can't do that any more and basically have
> >> no chance at all to figure out why it's hung.
> >>
> >> Same for other engines, I remember once spending a week figuring
> >> out why the UVD block is hung during suspend. Turned out to be a
> >> debugging nightmare because any time you touch any register of that
> >> block the whole system would hang.
> >>
> >> 2. There are tons of things going on in a pipeline fashion or even
> >> completely in parallel. For example the CP is just the beginning of a
> >> rather long pipeline which at the end produces a bunch of pixels.
> >> In almost all cases I've seen you ran into a problem somewhere
> >> deep in the pipeline and only very rarely at the beginning.
> >>
> >>
> >> I wonder what AMD's Windows driver team is doing with this problem,
> >> surely they must have better tools to deal with GPU hangs?
> > For better or worse, most teams internally rely on scan dumps via
> > JTAG
> > which sort of limits the usefulness outside of AMD, but also gives
> > you
> > the exact state of the hardware when it's hung so the hardware teams
> > prefer it.
> >
>  How does this approach scale? It's not something we can ask users to
>  do, and even if all of us in the radv team had a JTAG device, we
>  wouldn't be able to play every game that users experience random hangs
>  with.
> >>> It doesn't scale or lend itself particularly well to external
> >>> development, but that's the current state of affairs.
> >>
> >> The usual approach seems to be to reproduce a problem in a lab and
> >> have a JTAG attached to give the hw guys a scan dump and they can
> >> then tell you why something didn't work as expected.
> >
> > That's the worst-case scenario where you're debugging HW or FW issues.
> > Those should be pretty rare post-bringup. But are there hangs caused
> > by user mode driver or application bugs that are easier to debug and
> > probably don't even require a GPU reset? For example most VM faults
> > can be handled without hanging the GPU. Similarly, a shader in an
> > endless loop should not require a full GPU reset. In the KFD compute
> > case, that's still preemptible and the offending process can be killed
> > with Ctrl-C or debugged with rocm-gdb.
>
> We also have infinite loop in shader abort for gfx and page faults are
> pretty rare with OpenGL (a bit more often with Vulkan) and can be
> handled gracefully on modern hw (they just spam the logs).
>
> The 

Re: drm/amd/display: disable display DCC with retiling due to worse power consumption

2023-05-01 Thread Marek Olšák
We're going to do this in Mesa instead:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22771

Marek

On Fri, Apr 28, 2023 at 6:36 PM Marek Olšák  wrote:

> On Fri, Apr 28, 2023, 16:14 Joshua Ashton  wrote:
>
>> I mean I would also like power and perf numbers for Vangogh given you
>> referenced 10.3.
>>
>> Generic "power consumption is better" isn't enough to convince me that
>> this is the right call.
>>
>
> Raphael and Mendocino have worse power consumption with retiled
> displayable DCC and modifiers, and that can also be due to how retiling is
> implemented for modifiers.
>
> Marek
>
>
>> - Joshie ✨
>>
>> On Friday, 28 April 2023, Marek Olšák  wrote:
>> > I thought the same thing initially, but then realized that's not how
>> modifiers were designed to work.
>> > Mesa should expose all modifiers it wants to allow for 3D and it
>> doesn't care which ones are displayable.
>> > The kernel should expose all modifiers it wants to allow for display.
>> > With that, Mesa can still use theoretically displayable DCC, but it
>> will only be used for anything that's not the display.
>> > We can, of course, disable it in Mesa instead to get the same effect.
>> > We would need perf numbers for dGPUs to be able to tell whether it's
>> beneficial with the cost of DCC retiling.
>> > Marek
>> >
>> > On Fri, Apr 28, 2023, 12:11 Joshua Ashton  wrote:
>> >>
> >> I really don't think the kernel is the right place to do this.
>> >> Is there any reason to not just disable it from the Mesa side?
>> >>
>> >> We can already disable displayable DCC there, so I don't see why you
>> are even touching the kernel.
>> >> It makes it infinitely harder for anyone to evaluate perf and power
>> tradeoffs if you disable it at this level.
>> >>
>> >> The whole power vs perf trade is also not a big deal on dGPUs compared
>> to APUs. Probably needs a better heuristic either way to avoid regressing
>> perf.
>> >>
>> >> - Joshie ✨
>> >>
>> >> On 28 April 2023 10:47:17 BST, "Marek Olšák"  wrote:
>> >>>
>> >>> Hi,
>> >>> It's attached for review.
>> >>>
>> >>> Thanks,
>> >>> Marek
>> >
>
>


Re: drm/amd/display: disable display DCC with retiling due to worse power consumption

2023-04-28 Thread Marek Olšák
On Fri, Apr 28, 2023, 16:14 Joshua Ashton  wrote:

> I mean I would also like power and perf numbers for Vangogh given you
> referenced 10.3.
>
> Generic "power consumption is better" isn't enough to convince me that
> this is the right call.
>

Raphael and Mendocino have worse power consumption with retiled displayable
DCC and modifiers, and that can also be due to how retiling is implemented
for modifiers.

Marek


> - Joshie ✨
>
> On Friday, 28 April 2023, Marek Olšák  wrote:
> > I thought the same thing initially, but then realized that's not how
> modifiers were designed to work.
> > Mesa should expose all modifiers it wants to allow for 3D and it doesn't
> care which ones are displayable.
> > The kernel should expose all modifiers it wants to allow for display.
> > With that, Mesa can still use theoretically displayable DCC, but it will
> only be used for anything that's not the display.
> > We can, of course, disable it in Mesa instead to get the same effect.
> > We would need perf numbers for dGPUs to be able to tell whether it's
> beneficial with the cost of DCC retiling.
> > Marek
> >
> > On Fri, Apr 28, 2023, 12:11 Joshua Ashton  wrote:
> >>
> >> I really don't think the kernel is the right place to do this.
> >> Is there any reason to not just disable it from the Mesa side?
> >>
> >> We can already disable displayable DCC there, so I don't see why you
> are even touching the kernel.
> >> It makes it infinitely harder for anyone to evaluate perf and power
> tradeoffs if you disable it at this level.
> >>
> >> The whole power vs perf trade is also not a big deal on dGPUs compared
> to APUs. Probably needs a better heuristic either way to avoid regressing
> perf.
> >>
> >> - Joshie ✨
> >>
> >> On 28 April 2023 10:47:17 BST, "Marek Olšák"  wrote:
> >>>
> >>> Hi,
> >>> It's attached for review.
> >>>
> >>> Thanks,
> >>> Marek
> >


Re: drm/amd/display: disable display DCC with retiling due to worse power consumption

2023-04-28 Thread Marek Olšák
I thought the same thing initially, but then realized that's not how
modifiers were designed to work.

Mesa should expose all modifiers it wants to allow for 3D and it doesn't
care which ones are displayable.

The kernel should expose all modifiers it wants to allow for display.

With that, Mesa can still use theoretically displayable DCC, but it will
only be used for anything that's not the display.
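
A hedged sketch of how that split plays out in userspace: whoever allocates a
scanout buffer intersects the modifiers Mesa advertises for rendering with the
modifiers the KMS plane advertises (IN_FORMATS), so a retiled-DCC modifier
dropped from the plane list simply stops being picked for scanout while staying
usable for offscreen surfaces (generic illustration, not code from Mesa or any
particular compositor):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool pick_scanout_modifier(const uint64_t *render_mods, size_t n_render,
                                  const uint64_t *plane_mods, size_t n_plane,
                                  uint64_t *out)
{
   for (size_t i = 0; i < n_render; i++) {
      for (size_t j = 0; j < n_plane; j++) {
         if (render_mods[i] == plane_mods[j]) {
            *out = render_mods[i];   /* first common modifier wins in this sketch */
            return true;
         }
      }
   }
   return false;                     /* no common modifier: fall back to linear, etc. */
}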

We can, of course, disable it in Mesa instead to get the same effect.

We would need perf numbers for dGPUs to be able to tell whether it's
beneficial with the cost of DCC retiling.

Marek

On Fri, Apr 28, 2023, 12:11 Joshua Ashton  wrote:

> I really don't think the kernel is the right place to do this.
> Is there any reason to not just disable it from the Mesa side?
>
> We can already disable displayable DCC there, so I don't see why you are
> even touching the kernel.
> It makes it infinitely harder for anyone to evaluate perf and power
> tradeoffs if you disable it at this level.
>
> The whole power vs perf trade is also not a big deal on dGPUs compared to
> APUs. Probably needs a better heuristic either way to avoid regressing perf.
>
> - Joshie ✨
>
> On 28 April 2023 10:47:17 BST, "Marek Olšák"  wrote:
>>
>> Hi,
>>
>> It's attached for review.
>>
>> Thanks,
>> Marek
>>
>


Re: drm/amd/display: disable display DCC with retiling due to worse power consumption

2023-04-28 Thread Marek Olšák
git send-email doesn't work for me since Gmail broke it some number of
years ago.

The code contains a detailed comment, so that the commit message doesn't
need it (it would be identical). It's better to put comments in the code
than the commit.

Marek

On Fri, Apr 28, 2023, 10:16 Hamza Mahfooz  wrote:

>
> Hey Marek,
>
> On 4/28/23 05:47, Marek Olšák wrote:
> > Hi,
> >
> > It's attached for review.
>
> Please send this to the mailing list using git-send-email(1). Also,
> please include a commit description and it would be helpful if you
> included "Link:"s to any relevant issues that you are tracking in
> association with this patch.
>
> >
> > Thanks,
> > Marek
> --
> Hamza
>
>


drm/amd/display: disable display DCC with retiling due to worse power consumption

2023-04-28 Thread Marek Olšák
Hi,

It's attached for review.

Thanks,
Marek
From 5c068e00a9f286657a1a536ba517d5a76bcf388e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= 
Date: Fri, 28 Apr 2023 05:41:52 -0400
Subject: [PATCH] drm/amd/display: disable display DCC with retiling due to
 worse power consumption
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Marek Olšák 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
index 322668973747..260607c81d7c 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -136,6 +136,14 @@ void amdgpu_dm_plane_fill_blending_from_plane_state(const struct drm_plane_state
 
 static void add_modifier(uint64_t **mods, uint64_t *size, uint64_t *cap, uint64_t mod)
 {
+	/* Displayable DCC with retiling is known to increase power consumption
+	 * on GFX 10.3.7. Disable it on all chips until we have evidence that
+	 * it doesn't regress power consumption. This effectively disables
+	 * displayable DCC on everything except Raven2.
+	 */
+	if (AMDGPU_FMT_MOD_GET(DCC_RETILE, mod))
+		return;
+
 	if (!*mods)
 		return;
 
-- 
2.25.1



Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-26 Thread Marek Olšák
Perhaps I should clarify this. There are GL and Vulkan features that if any
app uses them and its shaders are killed, the next IB will hang. One of
them is Draw Indirect - if a shader is killed before storing the vertex
count and instance count in memory, the next draw will hang with a high
probability. No such app can be allowed to continue executing after a reset.
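
To make the Draw Indirect case concrete (plain Vulkan, nothing driver-specific
assumed): the draw arguments live in a GPU buffer that is typically written by
an earlier shader, so if that shader is killed by a reset before storing them,
the next indirect draw consumes garbage counts.

#include <vulkan/vulkan.h>

/* 'args' holds a VkDrawIndirectCommand {vertexCount, instanceCount,
 * firstVertex, firstInstance} that was written on the GPU, e.g. by a culling
 * compute shader, rather than by the CPU. */
static void record_indirect_draw(VkCommandBuffer cmd, VkBuffer args)
{
   vkCmdDrawIndirect(cmd, args, /*offset*/ 0, /*drawCount*/ 1,
                     /*stride*/ sizeof(VkDrawIndirectCommand));
}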

Marek

On Wed, Apr 26, 2023 at 5:51 AM Michel Dänzer 
wrote:

> On 4/25/23 21:11, Marek Olšák wrote:
> > The last 3 comments in this thread contain arguments that are false and
> were specifically pointed out as false 6 comments ago: Soft resets are just
> as fatal as hard resets. There is nothing better about soft resets. If the
> VRAM is lost completely, that's a different story, and if the hard reset is
> 100% unreliable, that's also a different story, but other than those two
> outliers, there is no difference between the two from the user point view.
> Both can repeatedly hang if you don't prevent the app that caused the hang
> from using the GPU even if the app is not robust. The robustness context
> type doesn't matter here. By definition, no guilty app can continue after a
> reset, and no innocent apps affected by a reset can continue either because
> those can now hang too. That's how destructive all resets are. Personal
> anecdotes that the soft reset is better are just that, anecdotes.
>
> You're trying to frame the situation as black or white, but reality is
> shades of grey.
>
>
> There's a similar situation with kernel Oopsen: In principle it's not safe
> to continue executing the kernel after it hits an Oops, since it might be
> in an inconsistent state, which could result in any kind of misbehaviour.
> Still, the default behaviour is to continue executing, and in most cases it
> turns out fine. Users which cannot accept the residual risk can choose to
> make the kernel panic when it hits an Oops (either via CONFIG_PANIC_ON_OOPS
> at build time, or via oops=panic on the kernel command line). A kernel
> panic means that the machine basically freezes from a user PoV, which would
> be worse as the default behaviour for most users (because it would e.g.
> incur a higher risk of losing filesystem data).
>
>
> --
> Earthling Michel Dänzer|  https://redhat.com
> Libre software enthusiast  | Mesa and Xwayland developer
>
>


Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-25 Thread Marek Olšák
The last 3 comments in this thread contain arguments that are false and
were specifically pointed out as false 6 comments ago: Soft resets are just
as fatal as hard resets. There is nothing better about soft resets. If the
VRAM is lost completely, that's a different story, and if the hard reset is
100% unreliable, that's also a different story, but other than those two
outliers, there is no difference between the two from the user point view.
Both can repeatedly hang if you don't prevent the app that caused the hang
from using the GPU even if the app is not robust. The robustness context
type doesn't matter here. By definition, no guilty app can continue after a
reset, and no innocent apps affected by a reset can continue either because
those can now hang too. That's how destructive all resets are. Personal
anecdotes that the soft reset is better are just that, anecdotes.

Marek

On Tue, Apr 25, 2023, 08:44 Christian König 
wrote:

> Am 25.04.23 um 14:14 schrieb Michel Dänzer:
> > On 4/25/23 14:08, Christian König wrote:
> >> Well signaling that something happened is not the question. We do this
> for both soft as well as hard resets.
> >>
> >> The question is if errors result in blocking further submissions with
> the same context or not.
> >>
> >> In case of a hard reset and potential loss of state we have to kill the
> context, otherwise a follow up submission would just lockup the hardware
> once more.
> >>
> >> In case of a soft reset I think we can keep the context alive, this way
> even applications without robustness handling can keep work.
> >>
> >> You potentially still get some corruption, but at least not your
> compositor killed.
> > Right, and if there is corruption, the user can restart the session.
> >
> >
> > Maybe a possible compromise could be making soft resets fatal if user
> space enabled robustness for the context, and non-fatal if not.
>
> Well that should already be mostly the case. If an application has
> enabled robustness it should notice that something went wrong and act
> appropriately.
>
> The only thing we need to handle is for applications without robustness
> in case of a hard reset or otherwise it will trigger an reset over and
> over again.
>
> Christian.
>
> >
> >
> >> Am 25.04.23 um 13:07 schrieb Marek Olšák:
> >>> That supposedly depends on the compositor. There may be compositors
> for very specific cases (e.g. Steam Deck) that handle resets very well, and
> those would like to be properly notified of all resets because that's how
> they get the best outcome, e.g. no corruption. A soft reset that is
> unhandled by userspace may result in persistent corruption.
> >
>
>


Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-25 Thread Marek Olšák
That supposedly depends on the compositor. There may be compositors for
very specific cases (e.g. Steam Deck) that handle resets very well, and
those would like to be properly notified of all resets because that's how
they get the best outcome, e.g. no corruption. A soft reset that is
unhandled by userspace may result in persistent corruption.

Marek

On Tue, Apr 25, 2023 at 6:27 AM Michel Dänzer 
wrote:

> On 4/24/23 18:45, Marek Olšák wrote:
> > Soft resets are fatal just as hard resets, but no reset is "always
> fatal". There are cases when apps keep working depending on which features
> are being used. It's still unsafe.
>
> Agreed, in theory.
>
> In practice, from a user PoV, right now there's pretty much 0 chance of
> the user session surviving if the GPU context in certain critical processes
> (e.g. the Wayland compositor or Xwayland) hits a fatal reset. There's a > 0
> chance of it surviving after a soft reset. There's ongoing work towards
> making user-space components more robust against fatal resets, but it's
> taking time. Meanwhile, I suspect most users would take the > 0 chance.
>
>
> --
> Earthling Michel Dänzer|  https://redhat.com
> Libre software enthusiast  | Mesa and Xwayland developer
>
>


Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-24 Thread Marek Olšák
Soft resets are just as fatal as hard resets, but no reset is "always fatal".
There are cases when apps keep working depending on which features are
being used. It's still unsafe.

Marek

On Mon, Apr 24, 2023, 03:03 Christian König 
wrote:

> Am 24.04.23 um 03:43 schrieb André Almeida:
> > When a DRM job timeout, the GPU is probably hang and amdgpu have some
> > ways to deal with that, ranging from soft recoveries to full device
> > reset. Anyway, when userspace ask the kernel the state of the context
> > (via AMDGPU_CTX_OP_QUERY_STATE), the kernel reports that the device was
> > reset, regardless if a full reset happened or not.
> >
> > However, amdgpu only marks a context guilty in the ASIC reset path. This
> > makes the userspace report incomplete, given that on soft recovery path
> > the guilty context is not told that it's the guilty one.
> >
> > Fix this by marking the context guilty for every type of reset when a
> > job timeouts.
>
> The guilty handling is pretty much broken by design and only works
> because we go through multiple hops of validating the entity after the
> job has already been pushed to the hw.
>
> I think we should probably just remove that completely and use an
> approach where we check the in flight submissions in the query state
> IOCTL. See my other patch on the mailing list regarding that.
>
> Additional to that I currently didn't considered soft-recovered
> submissions as fatal and continue accepting submissions from that
> context, but already wanted to talk with Marek about that behavior.
>
> Regards,
> Christian.
>
> >
> > Signed-off-by: André Almeida 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 8 +++++++-
> >   2 files changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index ac78caa7cba8..ea169d1689e2 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -4771,9 +4771,6 @@ int amdgpu_device_pre_asic_reset(struct
> amdgpu_device *adev,
> >
> >   amdgpu_fence_driver_isr_toggle(adev, false);
> >
> > - if (job && job->vm)
> > - drm_sched_increase_karma(&job->base);
> > -
> >   r = amdgpu_reset_prepare_hwcontext(adev, reset_context);
> >   /* If reset handler not implemented, continue; otherwise return */
> >   if (r == -ENOSYS)
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > index c3d9d75143f4..097ed8f06865 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > @@ -51,6 +51,13 @@ static enum drm_gpu_sched_stat
> amdgpu_job_timedout(struct drm_sched_job *s_job)
> >   memset(&ti, 0, sizeof(struct amdgpu_task_info));
> >   adev->job_hang = true;
> >
> > + amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> > +
> > + if (job && job->vm) {
> > + DRM_INFO("marking %s context as guilty", ti.process_name);
> > + drm_sched_increase_karma(&job->base);
> > + }
> > +
> >   if (amdgpu_gpu_recovery &&
> >   amdgpu_ring_soft_recovery(ring, job->vmid,
> s_job->s_fence->parent)) {
> >   DRM_ERROR("ring %s timeout, but soft recovered\n",
> > @@ -58,7 +65,6 @@ static enum drm_gpu_sched_stat
> amdgpu_job_timedout(struct drm_sched_job *s_job)
> >   goto exit;
> >   }
> >
> > - amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> >   DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n",
> > job->base.sched->name,
> atomic_read(&ring->fence_drv.last_seq),
> > ring->fence_drv.sync_seq);
>
>


Re: [PATCH 03/13] drm/amdgpu/UAPI: add new CS chunk for GFX shadow buffers

2023-04-13 Thread Marek Olšák
I'm OK with that.

Marek

On Thu, Apr 13, 2023 at 12:56 PM Alex Deucher  wrote:

> On Thu, Apr 13, 2023 at 11:54 AM Christian König
>  wrote:
> >
> > Am 13.04.23 um 14:26 schrieb Alex Deucher:
> > > On Thu, Apr 13, 2023 at 7:32 AM Christian König
> > >  wrote:
> > >> Ok, then we have a problem.
> > >>
> > >> Alex what do you think?
> > > If you program it to 0, FW skips the GDS backup I think so UMD's can
> > > decide whether they want to use it or not, depending on whether they
> > > use GDS.
> >
> > Yeah, but when Mesa isn't using it we have a hard time justifying it to
> > upstream because it isn't tested at all.
>
> Well, the interface would still get used, it's just that mesa would
> likely only ever pass 0 for the virtual address.  It's just a
> passthrough to the packet.  If we discover we need it at some point,
> it would be nice to not have to add a new interface to add it.
>
> Alex
>
> >
> > Christian.
> >
> > >
> > > Alex
> > >
> > >
> > >> Christian.
> > >>
> > >> Am 13.04.23 um 11:21 schrieb Marek Olšák:
> > >>
> > >> That's not why it was removed. It was removed because userspace
> doesn't use GDS memory and gds_va is always going to be 0.
> > >>
> > >> Firmware shouldn't use it because using it would increase preemption
> latency.
> > >>
> > >> Marek
> > >>
> > >> On Sun, Apr 9, 2023, 11:21 Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
> > >>> We removed the GDS information because they were unnecessary. The
> GDS size was already part of the device info before we added the shadow
> info.
> > >>>
> > >>> But as far as I know the firmware needs valid VAs for all three
> buffers or won't work correctly.
> > >>>
> > >>> Christian.
> > >>>
> > >>> Am 06.04.23 um 17:01 schrieb Marek Olšák:
> > >>>
> > >>> There is no GDS shadowing info in the device info uapi, so userspace
> can't create any GDS buffer and thus can't have any GDS va. It's a uapi
> issue, not what firmware wants to do.
> > >>>
> > >>> Marek
> > >>>
> > >>> On Thu, Apr 6, 2023 at 6:31 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
> > >>>> That's what I thought as well, but Mitch/Hans insisted on that.
> > >>>>
> > >>>> We should probably double check internally.
> > >>>>
> > >>>> Christian.
> > >>>>
> > >>>> Am 06.04.23 um 11:43 schrieb Marek Olšák:
> > >>>>
> > >>>> GDS memory isn't used on gfx11. Only GDS OA is used.
> > >>>>
> > >>>> Marek
> > >>>>
> > >>>> On Thu, Apr 6, 2023 at 5:09 AM Christian König <
> christian.koe...@amd.com> wrote:
> > >>>>> Why that?
> > >>>>>
> > >>>>> This is the save buffer for GDS, not the old style GDS BOs.
> > >>>>>
> > >>>>> Christian.
> > >>>>>
> > >>>>> Am 06.04.23 um 09:36 schrieb Marek Olšák:
> > >>>>>
> > >>>>> gds_va is unnecessary.
> > >>>>>
> > >>>>> Marek
> > >>>>>
> > >>>>> On Thu, Mar 30, 2023 at 3:18 PM Alex Deucher <
> alexander.deuc...@amd.com> wrote:
> > >>>>>> For GFX11, the UMD needs to allocate some shadow buffers
> > >>>>>> to be used for preemption.  The UMD allocates the buffers
> > >>>>>> and passes the GPU virtual address to the kernel since the
> > >>>>>> kernel will program the packet that specified these
> > >>>>>> addresses as part of its IB submission frame.
> > >>>>>>
> > >>>>>> v2: UMD passes shadow init to tell kernel when to initialize
> > >>>>>>  the shadow
> > >>>>>>
> > >>>>>> Reviewed-by: Christian König 
> > >>>>>> Signed-off-by: Alex Deucher 
> > >>>>>> ---
> > >>>>>>   include/uapi/drm/amdgpu_drm.h | 10 ++++++++++
> > >>>>>>   1 file changed, 10 insertions(+)
> > >>>>>>
> > >>>>>> diff --git a/include/uapi/drm/amdgpu_drm.h
> b/include/uapi/drm/amdgpu_drm.h
> > >>>>>> index b6eb90df5d05..3d9474af6566 100644
> > >>>>>> --- a/include/uapi/drm/amdgpu_drm.h
> > >>>>>> +++ b/include/uapi/drm/amdgpu_drm.h
> > >>>>>> @@ -592,6 +592,7 @@ struct drm_amdgpu_gem_va {
> > >>>>>>   #define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES 0x07
> > >>>>>>   #define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT0x08
> > >>>>>>   #define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL  0x09
> > >>>>>> +#define AMDGPU_CHUNK_ID_CP_GFX_SHADOW   0x0a
> > >>>>>>
> > >>>>>>   struct drm_amdgpu_cs_chunk {
> > >>>>>>  __u32   chunk_id;
> > >>>>>> @@ -708,6 +709,15 @@ struct drm_amdgpu_cs_chunk_data {
> > >>>>>>  };
> > >>>>>>   };
> > >>>>>>
> > >>>>>> +#define AMDGPU_CS_CHUNK_CP_GFX_SHADOW_FLAGS_INIT_SHADOW
>  0x1
> > >>>>>> +
> > >>>>>> +struct drm_amdgpu_cs_chunk_cp_gfx_shadow {
> > >>>>>> +   __u64 shadow_va;
> > >>>>>> +   __u64 csa_va;
> > >>>>>> +   __u64 gds_va;
> > >>>>>> +   __u64 flags;
> > >>>>>> +};
> > >>>>>> +
> > >>>>>>   /*
> > >>>>>>*  Query h/w info: Flag that this is integrated (a.h.a.
> fusion) GPU
> > >>>>>>*
> > >>>>>> --
> > >>>>>> 2.39.2
> > >>>>>>
> >
>


Re: [PATCH 03/13] drm/amdgpu/UAPI: add new CS chunk for GFX shadow buffers

2023-04-13 Thread Marek Olšák
That's not why it was removed. It was removed because userspace doesn't use
GDS memory and gds_va is always going to be 0.

Firmware shouldn't use it because using it would increase preemption
latency.

Marek

On Sun, Apr 9, 2023, 11:21 Christian König 
wrote:

> We removed the GDS information because they were unnecessary. The GDS size
> was already part of the device info before we added the shadow info.
>
> But as far as I know the firmware needs valid VAs for all three buffers or
> won't work correctly.
>
> Christian.
>
> Am 06.04.23 um 17:01 schrieb Marek Olšák:
>
> There is no GDS shadowing info in the device info uapi, so userspace can't
> create any GDS buffer and thus can't have any GDS va. It's a uapi issue,
> not what firmware wants to do.
>
> Marek
>
> On Thu, Apr 6, 2023 at 6:31 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> That's what I thought as well, but Mitch/Hans insisted on that.
>>
>> We should probably double check internally.
>>
>> Christian.
>>
>> Am 06.04.23 um 11:43 schrieb Marek Olšák:
>>
>> GDS memory isn't used on gfx11. Only GDS OA is used.
>>
>> Marek
>>
>> On Thu, Apr 6, 2023 at 5:09 AM Christian König 
>> wrote:
>>
>>> Why that?
>>>
>>> This is the save buffer for GDS, not the old style GDS BOs.
>>>
>>> Christian.
>>>
>>> Am 06.04.23 um 09:36 schrieb Marek Olšák:
>>>
>>> gds_va is unnecessary.
>>>
>>> Marek
>>>
>>> On Thu, Mar 30, 2023 at 3:18 PM Alex Deucher 
>>> wrote:
>>>
>>>> For GFX11, the UMD needs to allocate some shadow buffers
>>>> to be used for preemption.  The UMD allocates the buffers
>>>> and passes the GPU virtual address to the kernel since the
>>>> kernel will program the packet that specified these
>>>> addresses as part of its IB submission frame.
>>>>
>>>> v2: UMD passes shadow init to tell kernel when to initialize
>>>> the shadow
>>>>
>>>> Reviewed-by: Christian König 
>>>> Signed-off-by: Alex Deucher 
>>>> ---
>>>>  include/uapi/drm/amdgpu_drm.h | 10 ++
>>>>  1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/include/uapi/drm/amdgpu_drm.h
>>>> b/include/uapi/drm/amdgpu_drm.h
>>>> index b6eb90df5d05..3d9474af6566 100644
>>>> --- a/include/uapi/drm/amdgpu_drm.h
>>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>>> @@ -592,6 +592,7 @@ struct drm_amdgpu_gem_va {
>>>>  #define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES 0x07
>>>>  #define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT0x08
>>>>  #define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL  0x09
>>>> +#define AMDGPU_CHUNK_ID_CP_GFX_SHADOW   0x0a
>>>>
>>>>  struct drm_amdgpu_cs_chunk {
>>>> __u32   chunk_id;
>>>> @@ -708,6 +709,15 @@ struct drm_amdgpu_cs_chunk_data {
>>>> };
>>>>  };
>>>>
>>>> +#define AMDGPU_CS_CHUNK_CP_GFX_SHADOW_FLAGS_INIT_SHADOW 0x1
>>>> +
>>>> +struct drm_amdgpu_cs_chunk_cp_gfx_shadow {
>>>> +   __u64 shadow_va;
>>>> +   __u64 csa_va;
>>>> +   __u64 gds_va;
>>>> +   __u64 flags;
>>>> +};
>>>> +
>>>>  /*
>>>>   *  Query h/w info: Flag that this is integrated (a.h.a. fusion) GPU
>>>>   *
>>>> --
>>>> 2.39.2
>>>>
>>>>
>>>
>>
>


Re: [PATCH 03/13] drm/amdgpu/UAPI: add new CS chunk for GFX shadow buffers

2023-04-06 Thread Marek Olšák
There is no GDS shadowing info in the device info uapi, so userspace can't
create any GDS buffer and thus can't have any GDS va. It's a uapi issue,
not what firmware wants to do.

Marek

On Thu, Apr 6, 2023 at 6:31 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> That's what I thought as well, but Mitch/Hans insisted on that.
>
> We should probably double check internally.
>
> Christian.
>
> Am 06.04.23 um 11:43 schrieb Marek Olšák:
>
> GDS memory isn't used on gfx11. Only GDS OA is used.
>
> Marek
>
> On Thu, Apr 6, 2023 at 5:09 AM Christian König 
> wrote:
>
>> Why that?
>>
>> This is the save buffer for GDS, not the old style GDS BOs.
>>
>> Christian.
>>
>> Am 06.04.23 um 09:36 schrieb Marek Olšák:
>>
>> gds_va is unnecessary.
>>
>> Marek
>>
>> On Thu, Mar 30, 2023 at 3:18 PM Alex Deucher 
>> wrote:
>>
>>> For GFX11, the UMD needs to allocate some shadow buffers
>>> to be used for preemption.  The UMD allocates the buffers
>>> and passes the GPU virtual address to the kernel since the
>>> kernel will program the packet that specified these
>>> addresses as part of its IB submission frame.
>>>
>>> v2: UMD passes shadow init to tell kernel when to initialize
>>> the shadow
>>>
>>> Reviewed-by: Christian König 
>>> Signed-off-by: Alex Deucher 
>>> ---
>>>  include/uapi/drm/amdgpu_drm.h | 10 ++
>>>  1 file changed, 10 insertions(+)
>>>
>>> diff --git a/include/uapi/drm/amdgpu_drm.h
>>> b/include/uapi/drm/amdgpu_drm.h
>>> index b6eb90df5d05..3d9474af6566 100644
>>> --- a/include/uapi/drm/amdgpu_drm.h
>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>> @@ -592,6 +592,7 @@ struct drm_amdgpu_gem_va {
>>>  #define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES 0x07
>>>  #define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT0x08
>>>  #define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL  0x09
>>> +#define AMDGPU_CHUNK_ID_CP_GFX_SHADOW   0x0a
>>>
>>>  struct drm_amdgpu_cs_chunk {
>>> __u32   chunk_id;
>>> @@ -708,6 +709,15 @@ struct drm_amdgpu_cs_chunk_data {
>>> };
>>>  };
>>>
>>> +#define AMDGPU_CS_CHUNK_CP_GFX_SHADOW_FLAGS_INIT_SHADOW 0x1
>>> +
>>> +struct drm_amdgpu_cs_chunk_cp_gfx_shadow {
>>> +   __u64 shadow_va;
>>> +   __u64 csa_va;
>>> +   __u64 gds_va;
>>> +   __u64 flags;
>>> +};
>>> +
>>>  /*
>>>   *  Query h/w info: Flag that this is integrated (a.h.a. fusion) GPU
>>>   *
>>> --
>>> 2.39.2
>>>
>>>
>>
>


Re: [PATCH 03/13] drm/amdgpu/UAPI: add new CS chunk for GFX shadow buffers

2023-04-06 Thread Marek Olšák
GDS memory isn't used on gfx11. Only GDS OA is used.

Marek

On Thu, Apr 6, 2023 at 5:09 AM Christian König 
wrote:

> Why that?
>
> This is the save buffer for GDS, not the old style GDS BOs.
>
> Christian.
>
> Am 06.04.23 um 09:36 schrieb Marek Olšák:
>
> gds_va is unnecessary.
>
> Marek
>
> On Thu, Mar 30, 2023 at 3:18 PM Alex Deucher 
> wrote:
>
>> For GFX11, the UMD needs to allocate some shadow buffers
>> to be used for preemption.  The UMD allocates the buffers
>> and passes the GPU virtual address to the kernel since the
>> kernel will program the packet that specified these
>> addresses as part of its IB submission frame.
>>
>> v2: UMD passes shadow init to tell kernel when to initialize
>> the shadow
>>
>> Reviewed-by: Christian König 
>> Signed-off-by: Alex Deucher 
>> ---
>>  include/uapi/drm/amdgpu_drm.h | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>> index b6eb90df5d05..3d9474af6566 100644
>> --- a/include/uapi/drm/amdgpu_drm.h
>> +++ b/include/uapi/drm/amdgpu_drm.h
>> @@ -592,6 +592,7 @@ struct drm_amdgpu_gem_va {
>>  #define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES 0x07
>>  #define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT0x08
>>  #define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL  0x09
>> +#define AMDGPU_CHUNK_ID_CP_GFX_SHADOW   0x0a
>>
>>  struct drm_amdgpu_cs_chunk {
>> __u32   chunk_id;
>> @@ -708,6 +709,15 @@ struct drm_amdgpu_cs_chunk_data {
>> };
>>  };
>>
>> +#define AMDGPU_CS_CHUNK_CP_GFX_SHADOW_FLAGS_INIT_SHADOW 0x1
>> +
>> +struct drm_amdgpu_cs_chunk_cp_gfx_shadow {
>> +   __u64 shadow_va;
>> +   __u64 csa_va;
>> +   __u64 gds_va;
>> +   __u64 flags;
>> +};
>> +
>>  /*
>>   *  Query h/w info: Flag that this is integrated (a.h.a. fusion) GPU
>>   *
>> --
>> 2.39.2
>>
>>
>


Re: [PATCH 03/13] drm/amdgpu/UAPI: add new CS chunk for GFX shadow buffers

2023-04-06 Thread Marek Olšák
gds_va is unnecessary.

Marek

On Thu, Mar 30, 2023 at 3:18 PM Alex Deucher 
wrote:

> For GFX11, the UMD needs to allocate some shadow buffers
> to be used for preemption.  The UMD allocates the buffers
> and passes the GPU virtual address to the kernel since the
> kernel will program the packet that specified these
> addresses as part of its IB submission frame.
>
> v2: UMD passes shadow init to tell kernel when to initialize
> the shadow
>
> Reviewed-by: Christian König 
> Signed-off-by: Alex Deucher 
> ---
>  include/uapi/drm/amdgpu_drm.h | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index b6eb90df5d05..3d9474af6566 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -592,6 +592,7 @@ struct drm_amdgpu_gem_va {
>  #define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES 0x07
>  #define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT0x08
>  #define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL  0x09
> +#define AMDGPU_CHUNK_ID_CP_GFX_SHADOW   0x0a
>
>  struct drm_amdgpu_cs_chunk {
> __u32   chunk_id;
> @@ -708,6 +709,15 @@ struct drm_amdgpu_cs_chunk_data {
> };
>  };
>
> +#define AMDGPU_CS_CHUNK_CP_GFX_SHADOW_FLAGS_INIT_SHADOW 0x1
> +
> +struct drm_amdgpu_cs_chunk_cp_gfx_shadow {
> +   __u64 shadow_va;
> +   __u64 csa_va;
> +   __u64 gds_va;
> +   __u64 flags;
> +};
> +
>  /*
>   *  Query h/w info: Flag that this is integrated (a.h.a. fusion) GPU
>   *
> --
> 2.39.2
>
>


Re: [PATCH 07/13] drm/amdgpu: add UAPI to query GFX shadow sizes

2023-03-27 Thread Marek Olšák
Reviewed-by: Marek Olšák 

On Thu, Mar 23, 2023 at 5:41 PM Alex Deucher 
wrote:

> Add UAPI to query the GFX shadow buffer requirements
> for preemption on GFX11.  UMDs need to specify the shadow
> areas for preemption.
>
> v2: move into existing asic info query
> drop GDS as its use is determined by the UMD (Marek)
>
> Signed-off-by: Alex Deucher 
> ---
>  include/uapi/drm/amdgpu_drm.h | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index 3d9474af6566..3563c69521b0 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -1136,6 +1136,14 @@ struct drm_amdgpu_info_device {
> __u64 mall_size;/* AKA infinity cache */
> /* high 32 bits of the rb pipes mask */
> __u32 enabled_rb_pipes_mask_hi;
> +   /* shadow area size for gfx11 */
> +   __u32 shadow_size;
> +   /* shadow area alignment for gfx11 */
> +   __u32 shadow_alignment;
> +   /* context save area size for gfx11 */
> +   __u32 csa_size;
> +   /* context save area alignment for gfx11 */
> +   __u32 csa_alignment;
>  };
>
>  struct drm_amdgpu_info_hw_ip {
> --
> 2.39.2
>
>
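
For completeness, a rough sketch of how a UMD might consume the new fields (field names from the quoted patch; the device info query itself and the actual buffer allocation are whatever the UMD already uses, and the helper names are only illustrative):

#include <stdint.h>
#include "amdgpu_drm.h"	/* uapi header with the fields added by the quoted patch */

static inline uint64_t align_u64(uint64_t value, uint64_t alignment)
{
	return (value + alignment - 1) & ~(alignment - 1);
}

/* Sketch: dev_info comes from the existing device info query. Returns the
 * amount of memory the UMD has to set aside for the preemption buffers,
 * or 0 when the kernel doesn't report the requirements. */
static uint64_t gfx_shadow_alloc_size(const struct drm_amdgpu_info_device *dev_info)
{
	if (!dev_info->shadow_alignment || !dev_info->csa_alignment)
		return 0;	/* kernel too old to report the requirements */

	return align_u64(dev_info->shadow_size, dev_info->shadow_alignment) +
	       align_u64(dev_info->csa_size, dev_info->csa_alignment);
}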


Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl

2023-03-22 Thread Marek Olšák
The uapi would make sense if somebody wrote and implemented a Vulkan
extension exposing the hints and if we had customers who require that
extension. Without that, userspace knows almost nothing. If anything, this
effort should be led by our customers especially in the case of Vulkan
(writing the extension spec, etc.)

This is not a stack issue as much as it is an interface designed around
Windows that doesn't fit Linux, and for that reason, putting it into uapi in
the current form doesn't seem to be a good idea.

Marek

On Wed, Mar 22, 2023 at 10:52 AM Alex Deucher  wrote:

> On Wed, Mar 22, 2023 at 10:37 AM Marek Olšák  wrote:
> >
> > It sounds like the kernel should set the hint based on which queues are
> used, so that every UMD doesn't have to duplicate the same logic.
>
> Userspace has a better idea of what they are doing than the kernel.
> That said, we already set the video hint in the kernel when we submit
> work to VCN/UVD/VCE and we already set hint COMPUTE when user queues
> are active in ROCm because user queues don't go through the kernel.  I
> guess we could just set 3D by default.  On Windows there is a separate
> API for fullscreen 3D games, so 3D is only enabled in that case.  I
> assumed UMDs would want to select a hint, but maybe we should just
> let the kernel set something.  I figured Vulkan or OpenGL would
> select 3D vs COMPUTE depending on what queues/extensions the app uses.
>
> Thinking about it more, if we do keep the hints, maybe it makes more
> sense to select the hint at context init.  Then we can set the hint to
> the hardware at context init time.  If multiple hints come in from
> different contexts we'll automatically select the most aggressive one.
> That would also be compatible with user mode queues.
>
> Alex
>
> >
> > Marek
> >
> > On Wed, Mar 22, 2023 at 10:29 AM Christian König <
> christian.koe...@amd.com> wrote:
> >>
> >> Well that sounds like being able to optionally set it after context
> creation is actually the right approach.
> >>
> >> VA-API could set it as soon as we know that this is a video codec
> application.
> >>
> >> Vulkan can set it depending on what features are used by the
> application.
> >>
> >> But yes, Shashank (or whoever requested that) should come up with some
> code for Mesa to actually use it. Otherwise we don't have the justification
> to push it into the kernel driver.
> >>
> >> Christian.
> >>
> >> Am 22.03.23 um 15:24 schrieb Marek Olšák:
> >>
> >> The hint is static per API (one of graphics, video, compute, unknown).
> In the case of Vulkan, which exposes all queues, the hint is unknown, so
> Vulkan won't use it. (or make it based on the queue being used and not the
> uapi context state) GL won't use it because the default hint is already 3D.
> That makes VAAPI the only user that only sets the hint once, and maybe it's
> not worth even adding this uapi just for VAAPI.
> >>
> >> Marek
> >>
> >> On Wed, Mar 22, 2023 at 10:08 AM Christian König <
> christian.koe...@amd.com> wrote:
> >>>
> >>> Well completely agree that we shouldn't have unused API. That's why I
> said we should remove the getting the hint from the UAPI.
> >>>
> >>> But what's wrong with setting it after creating the context? Don't you
> know enough about the use case? I need to understand the background a bit
> better here.
> >>>
> >>> Christian.
> >>>
> >>> Am 22.03.23 um 15:05 schrieb Marek Olšák:
> >>>
> >>> The option to change the hint after context creation and get the hint
> would be unused uapi, and AFAIK we are not supposed to add unused uapi.
> What I asked is to change it to a uapi that userspace will actually use.
> >>>
> >>> Marek
> >>>
> >>> On Tue, Mar 21, 2023 at 9:54 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
> >>>>
> >>>> Yes, I would like to avoid having multiple code paths for context
> creation.
> >>>>
> >>>> Setting it later on should be equivalent to specifying it on creation
> since we only need it during CS.
> >>>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>> Am 21.03.23 um 14:00 schrieb Sharma, Shashank:
> >>>>
> >>>> [AMD Official Use Only - General]
> >>>>
> >>>>
> >>>>
> >>>> When we started this patch series, the workload hint was a part of
> the ctx_flag only,
> >>>>
> >>

Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl

2023-03-22 Thread Marek Olšák
It sounds like the kernel should set the hint based on which queues are
used, so that every UMD doesn't have to duplicate the same logic.

Marek

On Wed, Mar 22, 2023 at 10:29 AM Christian König 
wrote:

> Well that sounds like being able to optionally set it after context
> creation is actually the right approach.
>
> VA-API could set it as soon as we know that this is a video codec
> application.
>
> Vulkan can set it depending on what features are used by the application.
>
> But yes, Shashank (or whoever requested that) should come up with some
> code for Mesa to actually use it. Otherwise we don't have the justification
> to push it into the kernel driver.
>
> Christian.
>
> Am 22.03.23 um 15:24 schrieb Marek Olšák:
>
> The hint is static per API (one of graphics, video, compute, unknown). In
> the case of Vulkan, which exposes all queues, the hint is unknown, so
> Vulkan won't use it. (or make it based on the queue being used and not the
> uapi context state) GL won't use it because the default hint is already 3D.
> That makes VAAPI the only user that only sets the hint once, and maybe it's
> not worth even adding this uapi just for VAAPI.
>
> Marek
>
> On Wed, Mar 22, 2023 at 10:08 AM Christian König 
> wrote:
>
>> Well completely agree that we shouldn't have unused API. That's why I
>> said we should remove the getting the hint from the UAPI.
>>
>> But what's wrong with setting it after creating the context? Don't you
>> know enough about the use case? I need to understand the background a bit
>> better here.
>>
>> Christian.
>>
>> Am 22.03.23 um 15:05 schrieb Marek Olšák:
>>
>> The option to change the hint after context creation and get the hint
>> would be unused uapi, and AFAIK we are not supposed to add unused uapi.
>> What I asked is to change it to a uapi that userspace will actually use.
>>
>> Marek
>>
>> On Tue, Mar 21, 2023 at 9:54 AM Christian König <
>> ckoenig.leichtzumer...@gmail.com> wrote:
>>
>>> Yes, I would like to avoid having multiple code paths for context
>>> creation.
>>>
>>> Setting it later on should be equivalent to specifying it on creation since
>>> we only need it during CS.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 21.03.23 um 14:00 schrieb Sharma, Shashank:
>>>
>>> [AMD Official Use Only - General]
>>>
>>>
>>>
>>> When we started this patch series, the workload hint was a part of the
>>> ctx_flag only,
>>>
>>> But we changed that after the design review, to make it more like how we
>>> are handling PSTATE.
>>>
>>>
>>>
>>> Details:
>>>
>>> https://patchwork.freedesktop.org/patch/496111/
>>>
>>>
>>>
>>> Regards
>>>
>>> Shashank
>>>
>>>
>>>
>>> *From:* Marek Olšák  
>>> *Sent:* 21 March 2023 04:05
>>> *To:* Sharma, Shashank 
>>> 
>>> *Cc:* amd-gfx@lists.freedesktop.org; Deucher, Alexander
>>>  ; Somalapuram,
>>> Amaranath 
>>> ; Koenig, Christian
>>>  
>>> *Subject:* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints
>>> to ctx ioctl
>>>
>>>
>>>
>>> I think we should do it differently because this interface will be
>>> mostly unused by open source userspace in its current form.
>>>
>>>
>>>
>>> Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that will
>>> be immutable for the lifetime of the context. No other interface is needed.
>>>
>>>
>>>
>>> Marek
>>>
>>>
>>>
>>> On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma 
>>> wrote:
>>>
>>> Allow the user to specify a workload hint to the kernel.
>>> We can use these to tweak the dpm heuristics to better match
>>> the workload for improved performance.
>>>
>>> V3: Create only set() workload UAPI (Christian)
>>>
>>> Signed-off-by: Alex Deucher 
>>> Signed-off-by: Shashank Sharma 
>>> ---
>>>  include/uapi/drm/amdgpu_drm.h | 17 +
>>>  1 file changed, 17 insertions(+)
>>>
>>> diff --git a/include/uapi/drm/amdgpu_drm.h
>>> b/include/uapi/drm/amdgpu_drm.h
>>> index c2c9c674a223..23d354242699 100644
>>> --- a/include/uapi/drm/amdgpu_drm.h
>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>> @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list {
>>>  #defi

Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl

2023-03-22 Thread Marek Olšák
The hint is static per API (one of graphics, video, compute, unknown). In
the case of Vulkan, which exposes all queues, the hint is unknown, so
Vulkan won't use it. (or make it based on the queue being used and not the
uapi context state) GL won't use it because the default hint is already 3D.
That makes VAAPI the only user that only sets the hint once, and maybe it's
not worth even adding this uapi just for VAAPI.

Marek

On Wed, Mar 22, 2023 at 10:08 AM Christian König 
wrote:

> Well completely agree that we shouldn't have unused API. That's why I said
> we should remove the getting the hint from the UAPI.
>
> But what's wrong with setting it after creating the context? Don't you
> know enough about the use case? I need to understand the background a bit
> better here.
>
> Christian.
>
> Am 22.03.23 um 15:05 schrieb Marek Olšák:
>
> The option to change the hint after context creation and get the hint
> would be unused uapi, and AFAIK we are not supposed to add unused uapi.
> What I asked is to change it to a uapi that userspace will actually use.
>
> Marek
>
> On Tue, Mar 21, 2023 at 9:54 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> Yes, I would like to avoid having multiple code paths for context
>> creation.
>>
>> Setting it later on should be equivalent to specifying it on creation since
>> we only need it during CS.
>>
>> Regards,
>> Christian.
>>
>> Am 21.03.23 um 14:00 schrieb Sharma, Shashank:
>>
>> [AMD Official Use Only - General]
>>
>>
>>
>> When we started this patch series, the workload hint was a part of the
>> ctx_flag only,
>>
>> But we changed that after the design review, to make it more like how we
>> are handling PSTATE.
>>
>>
>>
>> Details:
>>
>> https://patchwork.freedesktop.org/patch/496111/
>>
>>
>>
>> Regards
>>
>> Shashank
>>
>>
>>
>> *From:* Marek Olšák  
>> *Sent:* 21 March 2023 04:05
>> *To:* Sharma, Shashank 
>> 
>> *Cc:* amd-gfx@lists.freedesktop.org; Deucher, Alexander
>>  ; Somalapuram,
>> Amaranath  ;
>> Koenig, Christian  
>> *Subject:* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to
>> ctx ioctl
>>
>>
>>
>> I think we should do it differently because this interface will be mostly
>> unused by open source userspace in its current form.
>>
>>
>>
>> Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that will be
>> immutable for the lifetime of the context. No other interface is needed.
>>
>>
>>
>> Marek
>>
>>
>>
>> On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma 
>> wrote:
>>
>> Allow the user to specify a workload hint to the kernel.
>> We can use these to tweak the dpm heuristics to better match
>> the workload for improved performance.
>>
>> V3: Create only set() workload UAPI (Christian)
>>
>> Signed-off-by: Alex Deucher 
>> Signed-off-by: Shashank Sharma 
>> ---
>>  include/uapi/drm/amdgpu_drm.h | 17 +
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>> index c2c9c674a223..23d354242699 100644
>> --- a/include/uapi/drm/amdgpu_drm.h
>> +++ b/include/uapi/drm/amdgpu_drm.h
>> @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list {
>>  #define AMDGPU_CTX_OP_QUERY_STATE2 4
>>  #define AMDGPU_CTX_OP_GET_STABLE_PSTATE5
>>  #define AMDGPU_CTX_OP_SET_STABLE_PSTATE6
>> +#define AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE 7
>>
>>  /* GPU reset status */
>>  #define AMDGPU_CTX_NO_RESET0
>> @@ -252,6 +253,17 @@ union drm_amdgpu_bo_list {
>>  #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK  3
>>  #define AMDGPU_CTX_STABLE_PSTATE_PEAK  4
>>
>> +/* GPU workload hints, flag bits 8-15 */
>> +#define AMDGPU_CTX_WORKLOAD_HINT_SHIFT 8
>> +#define AMDGPU_CTX_WORKLOAD_HINT_MASK  (0xff <<
>> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>> +#define AMDGPU_CTX_WORKLOAD_HINT_NONE  (0 <<
>> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>> +#define AMDGPU_CTX_WORKLOAD_HINT_3D(1 <<
>> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>> +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO (2 <<
>> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>> +#define AMDGPU_CTX_WORKLOAD_HINT_VR(3 <<
>> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>> +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE   (4 <<
>> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>> +#define AMDGPU_CTX_WORKLOAD_HINT_MAX
>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE
>> +#define AMDGPU_CTX_WORKLOAD_INDEX(n)  (n >>
>> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>> +
>>  struct drm_amdgpu_ctx_in {
>> /** AMDGPU_CTX_OP_* */
>> __u32   op;
>> @@ -281,6 +293,11 @@ union drm_amdgpu_ctx_out {
>> __u32   flags;
>> __u32   _pad;
>> } pstate;
>> +
>> +   struct {
>> +   __u32   flags;
>> +   __u32   _pad;
>> +   } workload;
>>  };
>>
>>  union drm_amdgpu_ctx {
>> --
>> 2.34.1
>>
>>
>>
>


Re: [PATCH 07/11] drm/amdgpu: add UAPI to query GFX shadow sizes

2023-03-22 Thread Marek Olšák
On Tue, Mar 21, 2023 at 3:51 PM Alex Deucher  wrote:

> On Mon, Mar 20, 2023 at 8:30 PM Marek Olšák  wrote:
> >
> >
> > On Mon, Mar 20, 2023 at 1:38 PM Alex Deucher 
> wrote:
> >>
> >> Add UAPI to query the GFX shadow buffer requirements
> >> for preemption on GFX11.  UMDs need to specify the shadow
> >> areas for preemption.
> >>
> >> Signed-off-by: Alex Deucher 
> >> ---
> >>  include/uapi/drm/amdgpu_drm.h | 10 ++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/include/uapi/drm/amdgpu_drm.h
> b/include/uapi/drm/amdgpu_drm.h
> >> index 3d9474af6566..19a806145371 100644
> >> --- a/include/uapi/drm/amdgpu_drm.h
> >> +++ b/include/uapi/drm/amdgpu_drm.h
> >> @@ -886,6 +886,7 @@ struct drm_amdgpu_cs_chunk_cp_gfx_shadow {
> >> #define AMDGPU_INFO_VIDEO_CAPS_DECODE   0
> >> /* Subquery id: Encode */
> >> #define AMDGPU_INFO_VIDEO_CAPS_ENCODE   1
> >> +#define AMDGPU_INFO_CP_GFX_SHADOW_SIZE 0x22
> >>
> >>  #define AMDGPU_INFO_MMR_SE_INDEX_SHIFT 0
> >>  #define AMDGPU_INFO_MMR_SE_INDEX_MASK  0xff
> >> @@ -1203,6 +1204,15 @@ struct drm_amdgpu_info_video_caps {
> >> struct drm_amdgpu_info_video_codec_info
> codec_info[AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_COUNT];
> >>  };
> >>
> >> +struct drm_amdgpu_info_cp_gfx_shadow_size {
> >> +   __u32 shadow_size;
> >> +   __u32 shadow_alignment;
> >> +   __u32 csa_size;
> >> +   __u32 csa_alignment;
> >> +   __u32 gds_size;
> >> +   __u32 gds_alignment;
> >
> >
> > Can you document the fields? What is CSA? Also, why is GDS there when
> the hw deprecated it and replaced it with GDS registers?
>
> Will add documentation.  For reference:
> CSA (Context Save Area) - used as a scratch area for FW for saving
> various things
> Shadow - stores the pipeline state
> GDS backup - stores the GDS state used by the pipeline.  I'm not sure
> if this is registers or the old GDS memory.  Presumably the former.
>

1. The POR for gfx11 was not to use GDS memory. I don't know why it's
there, but it would be unused uapi.

2. Is it secure to give userspace write access to the CSA and shadow
buffers? In the case of CSA, it looks like userspace could break the
firmware.

Marek


Re: [PATCH 07/11] drm/amdgpu: add UAPI to query GFX shadow sizes

2023-03-22 Thread Marek Olšák
On Tue, Mar 21, 2023 at 3:54 PM Alex Deucher  wrote:

> On Mon, Mar 20, 2023 at 8:31 PM Marek Olšák  wrote:
> >
> > On Mon, Mar 20, 2023 at 1:38 PM Alex Deucher 
> wrote:
> >>
> >> Add UAPI to query the GFX shadow buffer requirements
> >> for preemption on GFX11.  UMDs need to specify the shadow
> >> areas for preemption.
> >>
> >> Signed-off-by: Alex Deucher 
> >> ---
> >>  include/uapi/drm/amdgpu_drm.h | 10 ++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/include/uapi/drm/amdgpu_drm.h
> b/include/uapi/drm/amdgpu_drm.h
> >> index 3d9474af6566..19a806145371 100644
> >> --- a/include/uapi/drm/amdgpu_drm.h
> >> +++ b/include/uapi/drm/amdgpu_drm.h
> >> @@ -886,6 +886,7 @@ struct drm_amdgpu_cs_chunk_cp_gfx_shadow {
> >> #define AMDGPU_INFO_VIDEO_CAPS_DECODE   0
> >> /* Subquery id: Encode */
> >> #define AMDGPU_INFO_VIDEO_CAPS_ENCODE   1
> >> +#define AMDGPU_INFO_CP_GFX_SHADOW_SIZE 0x22
> >
> >
> > Can you put this into the device structure instead? Let's minimize the
> number of kernel queries as much as possible.
>
> I guess, but one nice thing about this is that we can use the query as
> a way to determine if the kernel supports this functionality or not.
> If not, the query returns -ENOTSUP.
>

That should be another flag in the device info structure or the sizes
should be 0. There is never a reason to add a new single-value INFO query.

Marek
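
Concretely, detection without a new single-value query could look roughly like this (assuming the sizes end up in drm_amdgpu_info_device, as they did in a later revision of the series; the helper name is made up):

#include <stdbool.h>
#include "amdgpu_drm.h"	/* with the shadow fields in drm_amdgpu_info_device */

/* Sketch of that idea: no extra INFO query needed, the zeroed fields
 * returned by an older kernel double as the "not supported" signal. */
static bool kernel_reports_gfx_shadow(const struct drm_amdgpu_info_device *dev_info)
{
	return dev_info->shadow_size != 0 && dev_info->shadow_alignment != 0;
}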


Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl

2023-03-22 Thread Marek Olšák
The option to change the hint after context creation and get the hint would
be unused uapi, and AFAIK we are not supposed to add unused uapi. What I
asked is to change it to a uapi that userspace will actually use.

Marek

On Tue, Mar 21, 2023 at 9:54 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Yes, I would like to avoid having multiple code paths for context creation.
>
> Setting it later on should be equivalent to specifying it on creation since
> we only need it during CS.
>
> Regards,
> Christian.
>
> Am 21.03.23 um 14:00 schrieb Sharma, Shashank:
>
> [AMD Official Use Only - General]
>
>
>
> When we started this patch series, the workload hint was a part of the
> ctx_flag only,
>
> But we changed that after the design review, to make it more like how we
> are handling PSTATE.
>
>
>
> Details:
>
> https://patchwork.freedesktop.org/patch/496111/
>
>
>
> Regards
>
> Shashank
>
>
>
> *From:* Marek Olšák  
> *Sent:* 21 March 2023 04:05
> *To:* Sharma, Shashank  
> *Cc:* amd-gfx@lists.freedesktop.org; Deucher, Alexander
>  ; Somalapuram,
> Amaranath  ;
> Koenig, Christian  
> *Subject:* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to
> ctx ioctl
>
>
>
> I think we should do it differently because this interface will be mostly
> unused by open source userspace in its current form.
>
>
>
> Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that will be
> immutable for the lifetime of the context. No other interface is needed.
>
>
>
> Marek
>
>
>
> On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma 
> wrote:
>
> Allow the user to specify a workload hint to the kernel.
> We can use these to tweak the dpm heuristics to better match
> the workload for improved performance.
>
> V3: Create only set() workload UAPI (Christian)
>
> Signed-off-by: Alex Deucher 
> Signed-off-by: Shashank Sharma 
> ---
>  include/uapi/drm/amdgpu_drm.h | 17 +
>  1 file changed, 17 insertions(+)
>
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index c2c9c674a223..23d354242699 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list {
>  #define AMDGPU_CTX_OP_QUERY_STATE2 4
>  #define AMDGPU_CTX_OP_GET_STABLE_PSTATE5
>  #define AMDGPU_CTX_OP_SET_STABLE_PSTATE6
> +#define AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE 7
>
>  /* GPU reset status */
>  #define AMDGPU_CTX_NO_RESET0
> @@ -252,6 +253,17 @@ union drm_amdgpu_bo_list {
>  #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK  3
>  #define AMDGPU_CTX_STABLE_PSTATE_PEAK  4
>
> +/* GPU workload hints, flag bits 8-15 */
> +#define AMDGPU_CTX_WORKLOAD_HINT_SHIFT 8
> +#define AMDGPU_CTX_WORKLOAD_HINT_MASK  (0xff <<
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +#define AMDGPU_CTX_WORKLOAD_HINT_NONE  (0 <<
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +#define AMDGPU_CTX_WORKLOAD_HINT_3D(1 <<
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO (2 <<
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +#define AMDGPU_CTX_WORKLOAD_HINT_VR(3 <<
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE   (4 <<
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +#define AMDGPU_CTX_WORKLOAD_HINT_MAX  AMDGPU_CTX_WORKLOAD_HINT_COMPUTE
> +#define AMDGPU_CTX_WORKLOAD_INDEX(n)  (n >>
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +
>  struct drm_amdgpu_ctx_in {
> /** AMDGPU_CTX_OP_* */
> __u32   op;
> @@ -281,6 +293,11 @@ union drm_amdgpu_ctx_out {
> __u32   flags;
> __u32   _pad;
> } pstate;
> +
> +   struct {
> +   __u32   flags;
> +   __u32   _pad;
> +   } workload;
>  };
>
>  union drm_amdgpu_ctx {
> --
> 2.34.1
>
>
>


Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl

2023-03-20 Thread Marek Olšák
I think we should do it differently because this interface will be mostly
unused by open source userspace in its current form.

Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that will be
immutable for the lifetime of the context. No other interface is needed.

Marek
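
To make that concrete, this is roughly what the creation-time variant could look like from userspace, reusing the hint encoding from the quoted (unmerged) patch below; none of this is existing uapi and the helper name is only illustrative:

#include <stdint.h>
#include <string.h>
#include <xf86drm.h>
#include "amdgpu_drm.h"	/* assumed to carry the proposed workload-hint bits */

/* Sketch of the creation-time variant: the hint (e.g.
 * AMDGPU_CTX_WORKLOAD_HINT_VIDEO) is passed in the alloc flags and stays
 * immutable for the lifetime of the context. Error handling is minimal. */
static int ctx_create_with_hint(int fd, uint32_t workload_hint, uint32_t *ctx_id)
{
	union drm_amdgpu_ctx args;
	int r;

	memset(&args, 0, sizeof(args));
	args.in.op = AMDGPU_CTX_OP_ALLOC_CTX;
	args.in.flags = workload_hint;	/* proposed: hint lives in flag bits 8-15 */

	r = drmCommandWriteRead(fd, DRM_AMDGPU_CTX, &args, sizeof(args));
	if (r == 0)
		*ctx_id = args.out.alloc.ctx_id;
	return r;
}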

On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma 
wrote:

> Allow the user to specify a workload hint to the kernel.
> We can use these to tweak the dpm heuristics to better match
> the workload for improved performance.
>
> V3: Create only set() workload UAPI (Christian)
>
> Signed-off-by: Alex Deucher 
> Signed-off-by: Shashank Sharma 
> ---
>  include/uapi/drm/amdgpu_drm.h | 17 +
>  1 file changed, 17 insertions(+)
>
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index c2c9c674a223..23d354242699 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list {
>  #define AMDGPU_CTX_OP_QUERY_STATE2 4
>  #define AMDGPU_CTX_OP_GET_STABLE_PSTATE5
>  #define AMDGPU_CTX_OP_SET_STABLE_PSTATE6
> +#define AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE 7
>
>  /* GPU reset status */
>  #define AMDGPU_CTX_NO_RESET0
> @@ -252,6 +253,17 @@ union drm_amdgpu_bo_list {
>  #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK  3
>  #define AMDGPU_CTX_STABLE_PSTATE_PEAK  4
>
> +/* GPU workload hints, flag bits 8-15 */
> +#define AMDGPU_CTX_WORKLOAD_HINT_SHIFT 8
> +#define AMDGPU_CTX_WORKLOAD_HINT_MASK  (0xff <<
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +#define AMDGPU_CTX_WORKLOAD_HINT_NONE  (0 <<
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +#define AMDGPU_CTX_WORKLOAD_HINT_3D(1 <<
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO (2 <<
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +#define AMDGPU_CTX_WORKLOAD_HINT_VR(3 <<
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE   (4 <<
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +#define AMDGPU_CTX_WORKLOAD_HINT_MAX  AMDGPU_CTX_WORKLOAD_HINT_COMPUTE
> +#define AMDGPU_CTX_WORKLOAD_INDEX(n)  (n >>
> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
> +
>  struct drm_amdgpu_ctx_in {
> /** AMDGPU_CTX_OP_* */
> __u32   op;
> @@ -281,6 +293,11 @@ union drm_amdgpu_ctx_out {
> __u32   flags;
> __u32   _pad;
> } pstate;
> +
> +   struct {
> +   __u32   flags;
> +   __u32   _pad;
> +   } workload;
>  };
>
>  union drm_amdgpu_ctx {
> --
> 2.34.1
>
>


Re: [PATCH 07/11] drm/amdgpu: add UAPI to query GFX shadow sizes

2023-03-20 Thread Marek Olšák
On Mon, Mar 20, 2023 at 1:38 PM Alex Deucher 
wrote:

> Add UAPI to query the GFX shadow buffer requirements
> for preemption on GFX11.  UMDs need to specify the shadow
> areas for preemption.
>
> Signed-off-by: Alex Deucher 
> ---
>  include/uapi/drm/amdgpu_drm.h | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index 3d9474af6566..19a806145371 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -886,6 +886,7 @@ struct drm_amdgpu_cs_chunk_cp_gfx_shadow {
> #define AMDGPU_INFO_VIDEO_CAPS_DECODE   0
> /* Subquery id: Encode */
> #define AMDGPU_INFO_VIDEO_CAPS_ENCODE   1
> +#define AMDGPU_INFO_CP_GFX_SHADOW_SIZE 0x22
>

Can you put this into the device structure instead? Let's minimize the
number of kernel queries as much as possible.

Thanks,
Marek


Re: [PATCH 07/11] drm/amdgpu: add UAPI to query GFX shadow sizes

2023-03-20 Thread Marek Olšák
On Mon, Mar 20, 2023 at 1:38 PM Alex Deucher 
wrote:

> Add UAPI to query the GFX shadow buffer requirements
> for preemption on GFX11.  UMDs need to specify the shadow
> areas for preemption.
>
> Signed-off-by: Alex Deucher 
> ---
>  include/uapi/drm/amdgpu_drm.h | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index 3d9474af6566..19a806145371 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -886,6 +886,7 @@ struct drm_amdgpu_cs_chunk_cp_gfx_shadow {
> #define AMDGPU_INFO_VIDEO_CAPS_DECODE   0
> /* Subquery id: Encode */
> #define AMDGPU_INFO_VIDEO_CAPS_ENCODE   1
> +#define AMDGPU_INFO_CP_GFX_SHADOW_SIZE 0x22
>
>  #define AMDGPU_INFO_MMR_SE_INDEX_SHIFT 0
>  #define AMDGPU_INFO_MMR_SE_INDEX_MASK  0xff
> @@ -1203,6 +1204,15 @@ struct drm_amdgpu_info_video_caps {
> struct drm_amdgpu_info_video_codec_info
> codec_info[AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_COUNT];
>  };
>
> +struct drm_amdgpu_info_cp_gfx_shadow_size {
> +   __u32 shadow_size;
> +   __u32 shadow_alignment;
> +   __u32 csa_size;
> +   __u32 csa_alignment;
> +   __u32 gds_size;
> +   __u32 gds_alignment;
>

Can you document the fields? What is CSA? Also, why is GDS there when the
hw deprecated it and replaced it with GDS registers?

Thanks,
Marek


Re: [PATCH] drm/amdgpu: expose more memory stats in fdinfo

2023-03-09 Thread Marek Olšák
Ping

On Thu, Feb 23, 2023 at 1:46 PM Marek Olšák  wrote:

> Updated patch attached.
>
> Marek
>
> On Mon, Feb 6, 2023 at 4:05 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> Just two nit picks:
>>
>> +seq_printf(m, "drm-evicted-visible-vram:\t%llu KiB\n",
>> +   stats.evicted_visible_vram/1024UL);
>>
>> For the values not standardized for all DRM drivers we might want to use
>> amd as prefix here instead of drm.
>>
>> +uint64_t requested_gtt;/* how much userspace asked for */
>>
>> We used to have automated checkers complaining about comments after
>> members.
>>
>> Kerneldoc-compliant comments look like this:
>>
>>  /* @timestamp replaced by @rcu on dma_fence_release() */
>>      struct rcu_head rcu;
>>
>> Apart from that looks good to me.
>>
>> Regards,
>> Christian.
>>
>> Am 30.01.23 um 07:56 schrieb Marek Olšák:
>> > Hi,
>> >
>> > This will be used for performance investigations. The patch is attached.
>> >
>> > Thanks,
>> > Marek
>>
>>


Re: [PATCH] drm/amdgpu: expose more memory stats in fdinfo

2023-02-23 Thread Marek Olšák
Updated patch attached.

Marek

On Mon, Feb 6, 2023 at 4:05 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Just two nit picks:
>
> +seq_printf(m, "drm-evicted-visible-vram:\t%llu KiB\n",
> +   stats.evicted_visible_vram/1024UL);
>
> For the values not standardized for all DRM drivers we might want to use
> amd as prefix here instead of drm.
>
> +uint64_t requested_gtt;/* how much userspace asked for */
>
> We used to have automated checkers complaining about comments after
> members.
>
> Kerneldoc-compliant comments look like this:
>
>  /* @timestamp replaced by @rcu on dma_fence_release() */
>  struct rcu_head rcu;
>
> Apart from that looks good to me.
>
> Regards,
> Christian.
>
> Am 30.01.23 um 07:56 schrieb Marek Olšák:
> > Hi,
> >
> > This will be used for performance investigations. The patch is attached.
> >
> > Thanks,
> > Marek
>
>
From 3971ab629b17e15343ee428d32c3422f44c915bc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= 
Date: Mon, 30 Jan 2023 01:52:40 -0500
Subject: [PATCH] drm/amdgpu: expose more memory stats in fdinfo
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This will be used for performance investigations.

Signed-off-by: Marek Olšák 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 24 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 25 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 23 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  5 ++--
 5 files changed, 76 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index 99a7855ab1bc..c57252f004e8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -60,12 +60,13 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
 	struct amdgpu_fpriv *fpriv = file->driver_priv;
	struct amdgpu_vm *vm = &fpriv->vm;
 
-	uint64_t vram_mem = 0, gtt_mem = 0, cpu_mem = 0;
+	struct amdgpu_mem_stats stats;
 	ktime_t usage[AMDGPU_HW_IP_NUM];
 	uint32_t bus, dev, fn, domain;
 	unsigned int hw_ip;
 	int ret;
 
+	memset(&stats, 0, sizeof(stats));
 	bus = adev->pdev->bus->number;
 	domain = pci_domain_nr(adev->pdev->bus);
 	dev = PCI_SLOT(adev->pdev->devfn);
@@ -75,7 +76,7 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
 	if (ret)
 		return;
 
-	amdgpu_vm_get_memory(vm, &vram_mem, &gtt_mem, &cpu_mem);
+	amdgpu_vm_get_memory(vm, &stats);
 	amdgpu_bo_unreserve(vm->root.bo);
 
	amdgpu_ctx_mgr_usage(&fpriv->ctx_mgr, usage);
@@ -90,9 +91,22 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
 	seq_printf(m, "drm-driver:\t%s\n", file->minor->dev->driver->name);
 	seq_printf(m, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
 	seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
-	seq_printf(m, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
-	seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
-	seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
+	seq_printf(m, "drm-memory-vram:\t%llu KiB\n", stats.vram/1024UL);
+	seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", stats.gtt/1024UL);
+	seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", stats.cpu/1024UL);
+	seq_printf(m, "amd-memory-visible-vram:\t%llu KiB\n",
+		   stats.visible_vram/1024UL);
+	seq_printf(m, "amd-evicted-vram:\t%llu KiB\n",
+		   stats.evicted_vram/1024UL);
+	seq_printf(m, "amd-evicted-visible-vram:\t%llu KiB\n",
+		   stats.evicted_visible_vram/1024UL);
+	seq_printf(m, "amd-requested-vram:\t%llu KiB\n",
+		   stats.requested_vram/1024UL);
+	seq_printf(m, "amd-requested-visible-vram:\t%llu KiB\n",
+		   stats.requested_visible_vram/1024UL);
+	seq_printf(m, "amd-requested-gtt:\t%llu KiB\n",
+		   stats.requested_gtt/1024UL);
+
 	for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
 		if (!usage[hw_ip])
 			continue;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 1c3e647400bd..2681e3582f75 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1264,24 +1264,41 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
 	trace_amdgpu_bo_move(abo, new_mem->mem_type, old_mem->mem_type);
 }
 
-void amdgpu_bo_get_memory(struct amdgpu_bo *bo, uint64_t *vram_mem,
-uint64_t *gtt_mem, uint64_t *cpu_mem)
+void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
+			  struct amdgpu_mem_stats *stats)
 {
 	unsigned int domain;
+	uint
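
As a usage note, a tool that wants to consume these counters only has to scan the per-fd fdinfo file for the key it cares about. A rough sketch (key names exactly as printed by amdgpu_show_fdinfo() above; path/pid/fd discovery is simplified and the helper name is made up):

#include <stdio.h>
#include <string.h>
#include <inttypes.h>

/* Sketch: scan e.g. /proc/<pid>/fdinfo/<fd> for a "key:\t<value> KiB" line as
 * emitted above and return the value in KiB, or 0 if the key is missing
 * (for example on an older kernel). */
static uint64_t fdinfo_read_kib(const char *path, const char *key)
{
	char line[256];
	uint64_t val = 0;
	size_t keylen = strlen(key);
	FILE *f = fopen(path, "r");

	if (!f)
		return 0;
	while (fgets(line, sizeof(line), f)) {
		if (!strncmp(line, key, keylen) && line[keylen] == ':') {
			sscanf(line + keylen + 1, " %" SCNu64, &val);
			break;
		}
	}
	fclose(f);
	return val;
}

/* Usage sketch: fdinfo_read_kib("/proc/1234/fdinfo/5", "amd-evicted-vram"); */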

[PATCH] drm/amdgpu: expose more memory stats in fdinfo

2023-01-29 Thread Marek Olšák
Hi,

This will be used for performance investigations. The patch is attached.

Thanks,
Marek
From 144b478f4e5779314c1965dca43a8d713d5a0fbf Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= 
Date: Mon, 30 Jan 2023 01:52:40 -0500
Subject: [PATCH] drm/amdgpu: expose more memory stats in fdinfo
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This will be used for performance investigations.

Signed-off-by: Marek Olšák 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 24 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 16 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 23 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  5 ++--
 5 files changed, 67 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index 99a7855ab1bc..6bd7ccc3db65 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -60,12 +60,13 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
 	struct amdgpu_fpriv *fpriv = file->driver_priv;
	struct amdgpu_vm *vm = &fpriv->vm;
 
-	uint64_t vram_mem = 0, gtt_mem = 0, cpu_mem = 0;
+	struct amdgpu_mem_stats stats;
 	ktime_t usage[AMDGPU_HW_IP_NUM];
 	uint32_t bus, dev, fn, domain;
 	unsigned int hw_ip;
 	int ret;
 
+	memset(&stats, 0, sizeof(stats));
 	bus = adev->pdev->bus->number;
 	domain = pci_domain_nr(adev->pdev->bus);
 	dev = PCI_SLOT(adev->pdev->devfn);
@@ -75,7 +76,7 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
 	if (ret)
 		return;
 
-	amdgpu_vm_get_memory(vm, &vram_mem, &gtt_mem, &cpu_mem);
+	amdgpu_vm_get_memory(vm, &stats);
 	amdgpu_bo_unreserve(vm->root.bo);
 
	amdgpu_ctx_mgr_usage(&fpriv->ctx_mgr, usage);
@@ -90,9 +91,22 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
 	seq_printf(m, "drm-driver:\t%s\n", file->minor->dev->driver->name);
 	seq_printf(m, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
 	seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
-	seq_printf(m, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
-	seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
-	seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
+	seq_printf(m, "drm-memory-vram:\t%llu KiB\n", stats.vram/1024UL);
+	seq_printf(m, "drm-memory-visible-vram:\t%llu KiB\n",
+		   stats.visible_vram/1024UL);
+	seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", stats.gtt/1024UL);
+	seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", stats.cpu/1024UL);
+	seq_printf(m, "drm-evicted-vram:\t%llu KiB\n",
+		   stats.evicted_vram/1024UL);
+	seq_printf(m, "drm-evicted-visible-vram:\t%llu KiB\n",
+		   stats.evicted_visible_vram/1024UL);
+	seq_printf(m, "drm-requested-vram:\t%llu KiB\n",
+		   stats.requested_vram/1024UL);
+	seq_printf(m, "drm-requested-visible-vram:\t%llu KiB\n",
+		   stats.requested_visible_vram/1024UL);
+	seq_printf(m, "drm-requested-gtt:\t%llu KiB\n",
+		   stats.requested_gtt/1024UL);
+
 	for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
 		if (!usage[hw_ip])
 			continue;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 2d237f3d3a2e..a66827427887 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1264,24 +1264,41 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
 	trace_amdgpu_bo_move(abo, new_mem->mem_type, old_mem->mem_type);
 }
 
-void amdgpu_bo_get_memory(struct amdgpu_bo *bo, uint64_t *vram_mem,
-uint64_t *gtt_mem, uint64_t *cpu_mem)
+void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
+			  struct amdgpu_mem_stats *stats)
 {
 	unsigned int domain;
+	uint64_t size = amdgpu_bo_size(bo);
 
 	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
 	switch (domain) {
 	case AMDGPU_GEM_DOMAIN_VRAM:
-		*vram_mem += amdgpu_bo_size(bo);
+		stats->vram += size;
+		if (amdgpu_bo_in_cpu_visible_vram(bo))
+			stats->visible_vram += size;
 		break;
 	case AMDGPU_GEM_DOMAIN_GTT:
-		*gtt_mem += amdgpu_bo_size(bo);
+		stats->gtt += size;
 		break;
 	case AMDGPU_GEM_DOMAIN_CPU:
 	default:
-		*cpu_mem += amdgpu_bo_size(bo);
+		stats->cpu += size;
 		break;
 	}
+
+	if (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) {
+		stats->requested_vram += size;
+		if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
+			stats->requested_visible_vram += size;
+
+		if (domain != AMDGPU_GEM_DOMAIN_VRAM) {
+			stats->evicted_vram += size;
+			if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
+stats->evicted_visible_vram += size;
+		}
+	} else if (bo->preferred_domains & AM

[PATCH] drm/amdgpu: add more fields into device info, caches sizes, etc.

2023-01-29 Thread Marek Olšák
AMDGPU_IDS_FLAGS_CONFORMANT_TRUNC_COORD: important for conformance on gfx11
Other fields are exposed from IP discovery.
enabled_rb_pipes_mask_hi is added for future chips, currently 0.

The patch is attached.

Thanks,
Marek
From e0ff08483761cf2be924f696a156a5a66cc04133 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= 
Date: Sun, 29 Jan 2023 23:00:59 -0500
Subject: [PATCH] drm/amdgpu: add more fields into device info, caches sizes,
 etc.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

AMDGPU_IDS_FLAGS_CONFORMANT_TRUNC_COORD: important for conformance on gfx11
Other fields are exposed from IP discovery.
enabled_rb_pipes_mask_hi is added for future chips, currently 0.

Signed-off-by: Marek Olšák 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  5 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 11 +++
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c  |  5 +
 include/uapi/drm/amdgpu_drm.h   | 11 +++
 5 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 7edbaa90fac9..834c047a754d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -107,9 +107,12 @@
  * - 3.50.0 - Update AMDGPU_INFO_DEV_INFO IOCTL for minimum engine and memory clock
  *Update AMDGPU_INFO_SENSOR IOCTL for PEAK_PSTATE engine and memory clock
  *   3.51.0 - Return the PCIe gen and lanes from the INFO ioctl
+ *   3.52.0 - Add AMDGPU_IDS_FLAGS_CONFORMANT_TRUNC_COORD, add device_info fields:
+ *tcp_cache_size, num_sqc_per_wgp, sqc_data_cache_size, sqc_inst_cache_size,
+ *gl1c_cache_size, gl2c_cache_size, mall_size, enabled_rb_pipes_mask_hi
  */
 #define KMS_DRIVER_MAJOR	3
-#define KMS_DRIVER_MINOR	51
+#define KMS_DRIVER_MINOR	52
 #define KMS_DRIVER_PATCHLEVEL	0
 
 unsigned int amdgpu_vram_limit = UINT_MAX;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 86ec9d0d12c8..de9e7a00bb15 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -178,6 +178,8 @@ struct amdgpu_gfx_config {
 	uint32_t num_sc_per_sh;
 	uint32_t num_packer_per_sc;
 	uint32_t pa_sc_tile_steering_override;
+	/* Whether texture coordinate truncation is conformant. */
+	bool ta_cntl2_truncate_coord_mode;
 	uint64_t tcc_disabled_mask;
 	uint32_t gc_num_tcp_per_sa;
 	uint32_t gc_num_sdp_interface;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index fba306e0ef87..9e85eedb57d8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -807,6 +807,8 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 			dev_info->ids_flags |= AMDGPU_IDS_FLAGS_PREEMPTION;
 		if (amdgpu_is_tmz(adev))
 			dev_info->ids_flags |= AMDGPU_IDS_FLAGS_TMZ;
+		if (adev->gfx.config.ta_cntl2_truncate_coord_mode)
+			dev_info->ids_flags |= AMDGPU_IDS_FLAGS_CONFORMANT_TRUNC_COORD;
 
 		vm_size = adev->vm_manager.max_pfn * AMDGPU_GPU_PAGE_SIZE;
 		vm_size -= AMDGPU_VA_RESERVED_SIZE;
@@ -864,6 +866,15 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 			adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X4 ? 4 :
 			adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X2 ? 2 : 1;
 
+		dev_info->tcp_cache_size = adev->gfx.config.gc_tcp_l1_size;
+		dev_info->num_sqc_per_wgp = adev->gfx.config.gc_num_sqc_per_wgp;
+		dev_info->sqc_data_cache_size = adev->gfx.config.gc_l1_data_cache_size_per_sqc;
+		dev_info->sqc_inst_cache_size = adev->gfx.config.gc_l1_instruction_cache_size_per_sqc;
+		dev_info->gl1c_cache_size = adev->gfx.config.gc_gl1c_size_per_instance *
+	adev->gfx.config.gc_gl1c_per_sa;
+		dev_info->gl2c_cache_size = adev->gfx.config.gc_gl2c_per_gpu;
+		dev_info->mall_size = adev->gmc.mall_size;
+
 		ret = copy_to_user(out, dev_info,
    min((size_t)size, sizeof(*dev_info))) ? -EFAULT : 0;
 		kfree(dev_info);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 8ad8a0bffcac..fb8afc596f51 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -1633,6 +1633,11 @@ static void gfx_v11_0_constants_init(struct amdgpu_device *adev)
 	gfx_v11_0_get_tcc_info(adev);
 	adev->gfx.config.pa_sc_tile_steering_override = 0;
 
+	/* Set whether texture coordinate truncation is conformant. */
+	tmp = RREG32_SOC15(GC, 0, regTA_CNTL2);
+	adev->gfx.config.ta_cntl2_truncate_coord_mode =
+		REG_GET_FIELD(tmp, TA_CNTL2, TRUNCATE_COORD_MODE);
+
 	/* XXX SH_MEM regs */
 	/* where to put LDS, scratch, GPUVM in FSA64 space */
 	mutex_lock(>srbm_mutex);
diff --git a/include/uapi/drm/amdgpu_drm.h b/inclu
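
A short sketch of the consuming side, for context (field and flag names from this patch; on an older kernel the new fields simply read back as zero, assuming userspace zero-initializes the struct before the query; the struct and helper names here are only illustrative):

#include <stdbool.h>
#include <stdint.h>
#include "amdgpu_drm.h"	/* with the fields added by this patch */

struct gfx_caps {
	bool conformant_trunc_coord;	/* gfx11 texture coordinate truncation */
	uint32_t gl2c_cache_size;	/* per-GPU GL2C size reported by the kernel */
	uint64_t mall_size;		/* AKA infinity cache */
};

/* Sketch: dev_info is assumed to come from the existing device info query. */
static void gfx_caps_init(struct gfx_caps *caps,
			  const struct drm_amdgpu_info_device *dev_info)
{
	caps->conformant_trunc_coord =
		!!(dev_info->ids_flags & AMDGPU_IDS_FLAGS_CONFORMANT_TRUNC_COORD);
	caps->gl2c_cache_size = dev_info->gl2c_cache_size;
	caps->mall_size = dev_info->mall_size;
}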

Re: [PATCH 2/2] drm/amdgpu: add AMDGPU_INFO_VM_STAT to return GPU VM

2023-01-24 Thread Marek Olšák
A new Gallium HUD "value producer" could be added that reads fdinfo without
calling the driver. I still think there is merit in having this in
amdgpu_drm.h too.

Marek

On Tue, Jan 24, 2023 at 3:13 AM Marek Olšák  wrote:

> The table of exposed driver-specific counters:
>
> https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/drivers/radeonsi/si_query.c#L1751
>
> Counter enums. They use the same interface as e.g. occlusion queries,
> except that begin_query and end_query save the results in the driver/CPU.
>
> https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/drivers/radeonsi/si_query.h#L45
>
> Counters exposed by the winsys:
>
> https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/include/winsys/radeon_winsys.h#L126
>
> I just need to query the counters in the winsys and return them.
>
> Marek
>
> On Tue, Jan 24, 2023 at 2:58 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> How are the counters which the HUD consumes declared?
>>
>> See, what I want is to a) avoid nailing down the interface with the kernel
>> on specific values and b) make it possible to easily expose new values.
>>
>> In other words what we could do with fdinfo is to have something like
>> this:
>>
>> GALLIUM_FDINFO_HUD=drm-memory-vram,amd-evicted-vram,amd-mclk glxgears
>>
>> And the HUD just displays the values the kernel provides without the need
>> to re-compile mesa when we want to add some more values nor have the values
>> as part of the UAPI.
>>
>> Christian.
>>
>> Am 24.01.23 um 08:37 schrieb Marek Olšák:
>>
>> The Gallium HUD doesn't consume strings. It only consumes values that are
>> exposed as counters from the driver. In this case, we need the driver to
>> expose evicted stats as counters. Each counter can set whether the value is
>> absolute (e.g. memory usage) or monotonic (e.g. perf counter). Parsing
>> fdinfo to get the values is undesirable.
>>
>> Marek
>>
>> On Mon, Jan 23, 2023 at 4:31 AM Christian König <
>> ckoenig.leichtzumer...@gmail.com> wrote:
>>
>>> Let's do this as values in fdinfo.
>>>
>>> This way we can easily extend whatever the kernel wants to display as
>>> statistics in the userspace HUD.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 21.01.23 um 01:45 schrieb Marek Olšák:
>>>
>>> We badly need a way to query evicted memory usage. It's essential for
>>> investigating performance problems and it uncovered the buddy allocator
>>> disaster. Please either suggest an alternative, suggest changes, or review.
>>> We need it ASAP.
>>>
>>> Thanks,
>>> Marek
>>>
>>> On Tue, Jan 10, 2023 at 11:55 AM Marek Olšák  wrote:
>>>
>>>> On Tue, Jan 10, 2023 at 11:23 AM Christian König <
>>>> ckoenig.leichtzumer...@gmail.com> wrote:
>>>>
>>>>> Am 10.01.23 um 16:28 schrieb Marek Olšák:
>>>>>
>>>>> On Wed, Jan 4, 2023 at 9:51 AM Christian König <
>>>>> ckoenig.leichtzumer...@gmail.com> wrote:
>>>>>
>>>>>> Am 04.01.23 um 00:08 schrieb Marek Olšák:
>>>>>>
>>>>>> I see about the access now, but did you even look at the patch?
>>>>>>
>>>>>>
>>>>>> I did look at the patch, but I haven't fully understood yet what you
>>>>>> are trying to do here.
>>>>>>
>>>>>
>>>>> First and foremost, it returns the evicted size of VRAM and visible
>>>>> VRAM, and returns visible VRAM usage. It should be obvious which stat
>>>>> includes the size of another.
>>>>>
>>>>>
>>>>>> Because what the patch does isn't even exposed to common drm code,
>>>>>> such as the preferred domain and visible VRAM placement, so it can't be 
>>>>>> in
>>>>>> fdinfo right now.
>>>>>>
>>>>>> Or do you even know what fdinfo contains? Because it contains nothing
>>>>>> useful. It only has VRAM and GTT usage, which we already have in the INFO
>>>>>> ioctl, so it has nothing that we need. We mainly need the eviction
>>>>>> information and visible VRAM information now. Everything else is a bonus.
>>>>>>
>>>>>>
>>>>>> Well the main question is what are you trying to get from that
>>>>>> information? The eviction 

Re: [PATCH 2/2] drm/amdgpu: add AMDGPU_INFO_VM_STAT to return GPU VM

2023-01-24 Thread Marek Olšák
The table of exposed driver-specific counters:
https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/drivers/radeonsi/si_query.c#L1751

Counter enums. They use the same interface as e.g. occlusion queries,
except that begin_query and end_query save the results in the driver/CPU.
https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/drivers/radeonsi/si_query.h#L45

Counters exposed by the winsys:
https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/include/winsys/radeon_winsys.h#L126

I just need to query the counters in the winsys and return them.

Marek

On Tue, Jan 24, 2023 at 2:58 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> How are the counters which the HUD consumes declared?
>
> See, what I want is to a) avoid nailing down the interface with the kernel
> on specific values and b) make it possible to easily expose new values.
>
> In other words what we could do with fdinfo is to have something like this:
>
> GALLIUM_FDINFO_HUD=drm-memory-vram,amd-evicted-vram,amd-mclk glxgears
>
> And the HUD just displays the values the kernel provides without the need
> to re-compile mesa when we want to add some more values nor have the values
> as part of the UAPI.
>
> Christian.
>
> Am 24.01.23 um 08:37 schrieb Marek Olšák:
>
> The Gallium HUD doesn't consume strings. It only consumes values that are
> exposed as counters from the driver. In this case, we need the driver to
> expose evicted stats as counters. Each counter can set whether the value is
> absolute (e.g. memory usage) or monotonic (e.g. perf counter). Parsing
> fdinfo to get the values is undesirable.
>
> Marek
>
> On Mon, Jan 23, 2023 at 4:31 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> Let's do this as values in fdinfo.
>>
>> This way we can easily extend whatever the kernel wants to display as
>> statistics in the userspace HUD.
>>
>> Regards,
>> Christian.
>>
>> Am 21.01.23 um 01:45 schrieb Marek Olšák:
>>
>> We badly need a way to query evicted memory usage. It's essential for
>> investigating performance problems and it uncovered the buddy allocator
>> disaster. Please either suggest an alternative, suggest changes, or review.
>> We need it ASAP.
>>
>> Thanks,
>> Marek
>>
>> On Tue, Jan 10, 2023 at 11:55 AM Marek Olšák  wrote:
>>
>>> On Tue, Jan 10, 2023 at 11:23 AM Christian König <
>>> ckoenig.leichtzumer...@gmail.com> wrote:
>>>
>>>> Am 10.01.23 um 16:28 schrieb Marek Olšák:
>>>>
>>>> On Wed, Jan 4, 2023 at 9:51 AM Christian König <
>>>> ckoenig.leichtzumer...@gmail.com> wrote:
>>>>
>>>>> Am 04.01.23 um 00:08 schrieb Marek Olšák:
>>>>>
>>>>> I see about the access now, but did you even look at the patch?
>>>>>
>>>>>
>>>>> I did look at the patch, but I haven't fully understood yet what you
>>>>> are trying to do here.
>>>>>
>>>>
>>>> First and foremost, it returns the evicted size of VRAM and visible
>>>> VRAM, and returns visible VRAM usage. It should be obvious which stat
>>>> includes the size of another.
>>>>
>>>>
>>>>> Because what the patch does isn't even exposed to common drm code,
>>>>> such as the preferred domain and visible VRAM placement, so it can't be in
>>>>> fdinfo right now.
>>>>>
>>>>> Or do you even know what fdinfo contains? Because it contains nothing
>>>>> useful. It only has VRAM and GTT usage, which we already have in the INFO
>>>>> ioctl, so it has nothing that we need. We mainly need the eviction
>>>>> information and visible VRAM information now. Everything else is a bonus.
>>>>>
>>>>>
>>>>> Well the main question is what are you trying to get from that
>>>>> information? The eviction list for example is completely meaningless to
>>>>> userspace, that stuff is only temporary and will be cleared on the next CS
>>>>> again.
>>>>>
>>>>
>>>> I don't know what you mean. The returned eviction stats look correct
>>>> and are stable (they don't change much). You can suggest changes if you
>>>> think some numbers are not reported correctly.
>>>>
>>>>
>>>>>
>>>>> What we could expose is the VRAM over-commit value, e.g. how much BOs
>>>>> which where supposed to be in VRAM are in GTT now. I think that's what you
>

Re: [PATCH 2/2] drm/amdgpu: add AMDGPU_INFO_VM_STAT to return GPU VM

2023-01-23 Thread Marek Olšák
The Gallium HUD doesn't consume strings. It only consumes values that are
exposed as counters from the driver. In this case, we need the driver to
expose evicted stats as counters. Each counter can set whether the value is
absolute (e.g. memory usage) or monotonic (e.g. perf counter). Parsing
fdinfo to get the values is undesirable.

Marek

On Mon, Jan 23, 2023 at 4:31 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Let's do this as valid in fdinfo.
>
> This way we can easily extend whatever the kernel wants to display as
> statistics in the userspace HUD.
>
> Regards,
> Christian.
>
> Am 21.01.23 um 01:45 schrieb Marek Olšák:
>
> We badly need a way to query evicted memory usage. It's essential for
> investigating performance problems and it uncovered the buddy allocator
> disaster. Please either suggest an alternative, suggest changes, or review.
> We need it ASAP.
>
> Thanks,
> Marek
>
> On Tue, Jan 10, 2023 at 11:55 AM Marek Olšák  wrote:
>
>> On Tue, Jan 10, 2023 at 11:23 AM Christian König <
>> ckoenig.leichtzumer...@gmail.com> wrote:
>>
>>> Am 10.01.23 um 16:28 schrieb Marek Olšák:
>>>
>>> On Wed, Jan 4, 2023 at 9:51 AM Christian König <
>>> ckoenig.leichtzumer...@gmail.com> wrote:
>>>
>>>> Am 04.01.23 um 00:08 schrieb Marek Olšák:
>>>>
>>>> I see about the access now, but did you even look at the patch?
>>>>
>>>>
>>>> I did look at the patch, but I haven't fully understood yet what you
>>>> are trying to do here.
>>>>
>>>
>>> First and foremost, it returns the evicted size of VRAM and visible
>>> VRAM, and returns visible VRAM usage. It should be obvious which stat
>>> includes the size of another.
>>>
>>>
>>>> Because what the patch does isn't even exposed to common drm code, such
>>>> as the preferred domain and visible VRAM placement, so it can't be in
>>>> fdinfo right now.
>>>>
>>>> Or do you even know what fdinfo contains? Because it contains nothing
>>>> useful. It only has VRAM and GTT usage, which we already have in the INFO
>>>> ioctl, so it has nothing that we need. We mainly need the eviction
>>>> information and visible VRAM information now. Everything else is a bonus.
>>>>
>>>>
>>>> Well the main question is what are you trying to get from that
>>>> information? The eviction list for example is completely meaningless to
>>>> userspace, that stuff is only temporary and will be cleared on the next CS
>>>> again.
>>>>
>>>
>>> I don't know what you mean. The returned eviction stats look correct and
>>> are stable (they don't change much). You can suggest changes if you think
>>> some numbers are not reported correctly.
>>>
>>>
>>>>
>>>> What we could expose is the VRAM over-commit value, e.g. how much BOs
>>>> which where supposed to be in VRAM are in GTT now. I think that's what you
>>>> are looking for here, right?
>>>>
>>>
>>> The VRAM overcommit value is "evicted_vram".
>>>
>>>
>>>>
>>>> Also, it's undesirable to open and parse a text file if we can just
>>>> call an ioctl.
>>>>
>>>>
>>>> Well I see the reasoning for that, but I also see why other drivers do
>>>> a lot of the stuff we have as IOCTL as separate files in sysfs, fdinfo or
>>>> debugfs.
>>>>
>>>> Especially repeating all the static information which were already
>>>> available under sysfs in the INFO IOCTL was a design mistake as far as I
>>>> can see. Just compare what AMDGPU and the KFD code is doing to what for
>>>> example i915 is doing.
>>>>
>>>> Same for things like debug information about a process. The fdinfo
>>>> stuff can be queried from external tools (gdb, gputop, umr etc...) as well
>>>> which makes that interface more preferred.
>>>>
>>>
>>> Nothing uses fdinfo in Mesa. No driver uses sysfs in Mesa except drm
>>> shims, noop drivers, and Intel for perf metrics. sysfs itself is an
>>> unusable mess for the PCIe query and is missing information.
>>>
>>> I'm not against exposing more stuff through sysfs and fdinfo for tools,
>>> but I don't see any reason why drivers should use it (other than for
>>> slowing down queries and initialization).
>>>
>>>
>>> That's wh

Re: [PATCH 2/2] drm/amdgpu: add AMDGPU_INFO_VM_STAT to return GPU VM

2023-01-20 Thread Marek Olšák
We badly need a way to query evicted memory usage. It's essential for
investigating performance problems and it uncovered the buddy allocator
disaster. Please either suggest an alternative, suggest changes, or review.
We need it ASAP.

Thanks,
Marek

On Tue, Jan 10, 2023 at 11:55 AM Marek Olšák  wrote:

> On Tue, Jan 10, 2023 at 11:23 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> Am 10.01.23 um 16:28 schrieb Marek Olšák:
>>
>> On Wed, Jan 4, 2023 at 9:51 AM Christian König <
>> ckoenig.leichtzumer...@gmail.com> wrote:
>>
>>> Am 04.01.23 um 00:08 schrieb Marek Olšák:
>>>
>>> I see about the access now, but did you even look at the patch?
>>>
>>>
>>> I did look at the patch, but I haven't fully understood yet what you are
>>> trying to do here.
>>>
>>
>> First and foremost, it returns the evicted size of VRAM and visible VRAM,
>> and returns visible VRAM usage. It should be obvious which stat includes
>> the size of another.
>>
>>
>>> Because what the patch does isn't even exposed to common drm code, such
>>> as the preferred domain and visible VRAM placement, so it can't be in
>>> fdinfo right now.
>>>
>>> Or do you even know what fdinfo contains? Because it contains nothing
>>> useful. It only has VRAM and GTT usage, which we already have in the INFO
>>> ioctl, so it has nothing that we need. We mainly need the eviction
>>> information and visible VRAM information now. Everything else is a bonus.
>>>
>>>
>>> Well the main question is what are you trying to get from that
>>> information? The eviction list for example is completely meaningless to
>>> userspace, that stuff is only temporary and will be cleared on the next CS
>>> again.
>>>
>>
>> I don't know what you mean. The returned eviction stats look correct and
>> are stable (they don't change much). You can suggest changes if you think
>> some numbers are not reported correctly.
>>
>>
>>>
>>> What we could expose is the VRAM over-commit value, e.g. how much BOs
>>> which where supposed to be in VRAM are in GTT now. I think that's what you
>>> are looking for here, right?
>>>
>>
>> The VRAM overcommit value is "evicted_vram".
>>
>>
>>>
>>> Also, it's undesirable to open and parse a text file if we can just call
>>> an ioctl.
>>>
>>>
>>> Well I see the reasoning for that, but I also see why other drivers do a
>>> lot of the stuff we have as IOCTL as separate files in sysfs, fdinfo or
>>> debugfs.
>>>
>>> Especially repeating all the static information which were already
>>> available under sysfs in the INFO IOCTL was a design mistake as far as I
>>> can see. Just compare what AMDGPU and the KFD code is doing to what for
>>> example i915 is doing.
>>>
>>> Same for things like debug information about a process. The fdinfo stuff
>>> can be queried from external tools (gdb, gputop, umr etc...) as well which
>>> makes that interface more preferred.
>>>
>>
>> Nothing uses fdinfo in Mesa. No driver uses sysfs in Mesa except drm
>> shims, noop drivers, and Intel for perf metrics. sysfs itself is an
>> unusable mess for the PCIe query and is missing information.
>>
>> I'm not against exposing more stuff through sysfs and fdinfo for tools,
>> but I don't see any reason why drivers should use it (other than for
>> slowing down queries and initialization).
>>
>>
>> That's what I'm asking: Is this for some tool or to make some driver
>> decision based on it?
>>
>> If you just want the numbers for over displaying then I think it would be
>> better to put this into fdinfo together with the other existing stuff there.
>>
>
>> If you want to make allocation decisions based on this then we should
>> have that as IOCTL or even better as mmap() page between kernel and
>> userspace. But in this case I would also calculation the numbers completely
>> different as well.
>>
>> See we have at least the following things in the kernel:
>> 1. The eviction list in the VM.
>> Those are the BOs which are currently evicted and tried to moved back
>> in on the next CS.
>>
>> 2. The VRAM over commit value.
>> In other words how much more VRAM than available has the application
>> tried to allocate?
>>
>> 3. The visible VRAM usage by this application.
>>
>> The end goal is that the eviction list w

Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO

2023-01-13 Thread Marek Olšák
Valve would like this in kernel 6.2, but if we put it there, we also need to
backport the INFO ioctl changes for DRM driver version 3.50.0.

Marek

On Fri, Jan 13, 2023 at 6:33 PM Marek Olšák  wrote:

> There is no hole on 32-bit unfortunately. It looks like the hole on 64-bit
> is now ABI.
>
> I moved the field to replace _pad1. The patch is attached (with your Rb).
>
> Marek
>
> On Fri, Jan 13, 2023 at 4:20 PM Alex Deucher 
> wrote:
>
>> On Fri, Jan 13, 2023 at 4:02 PM Marek Olšák  wrote:
>> >
>> > i've added the comments and indeed pahole shows the hole as expected.
>>
>> What about on 32-bit?
>>
>> Alex
>>
>> >
>> > Marek
>> >
>> > On Thu, Jan 12, 2023 at 11:44 AM Alex Deucher 
>> wrote:
>> >>
>> >> On Thu, Jan 12, 2023 at 6:50 AM Christian König
>> >>  wrote:
>> >> >
>> >> > Am 11.01.23 um 21:48 schrieb Alex Deucher:
>> >> > > On Wed, Jan 4, 2023 at 3:17 PM Marek Olšák 
>> wrote:
>> >> > >> Yes, it's meant to be like a spec sheet. We are not interested in
>> the current bandwidth utilization.
>> >> > > After chatting with Marek on IRC and thinking about this more, I
>> think
>> >> > > this patch is fine.  It's not really meant for bandwidth per se,
>> but
>> >> > > rather as a limit to determine what the driver should do in certain
>> >> > > cases (i.e., when does it make sense to copy to vram vs not).  It's
>> >> > > not straightforward for userspace to parse the full topology to
>> >> > > determine what links may be slow.  I guess one potential pitfall
>> would
>> >> > > be that if you pass the device into a VM, the driver may report the
>> >> > > wrong values.  Generally in a VM the VM doesn't get the full view
>> up
>> >> > > to the root port.  I don't know if the hypervisors report properly
>> for
>> >> > > pcie_bandwidth_available() in a VM or if it just shows the info
>> about
>> >> > > the endpoint in the VM.
>> >> >
>> >> > So this basically doesn't return the gen and lanes of the device, but
>> >> > rather what was negotiated between the device and the upstream root
>> port?
>> >>
>> >> Correct. It exposes the max gen and lanes of the slowest link between
>> >> the device and the root port.
>> >>
>> >> >
>> >> > If I got that correctly then we should probably document that cause
>> >> > otherwise somebody will try to "fix" it at some time.
>> >>
>> >> Good point.
>> >>
>> >> Alex
>> >>
>> >> >
>> >> > Christian.
>> >> >
>> >> > >
>> >> > > Reviewed-by: Alex Deucher 
>> >> > >
>> >> > > Alex
>> >> > >
>> >> > >> Marek
>> >> > >>
>> >> > >> On Wed, Jan 4, 2023 at 10:33 AM Lazar, Lijo 
>> wrote:
>> >> > >>> [AMD Official Use Only - General]
>> >> > >>>
>> >> > >>>
>> >> > >>> To clarify, with DPM in place, the current bandwidth will be
>> changing based on the load.
>> >> > >>>
>> >> > >>> If apps/umd already has a way to know the current bandwidth
>> utilisation, then possible maximum also could be part of the same API.
>> Otherwise, this only looks like duplicate information. We have the same
>> information in sysfs DPM nodes.
>> >> > >>>
>> >> > >>> BTW, I don't know to what extent app/umd really makes use of
>> this. Take that memory frequency as an example (I'm reading it as 16GHz).
>> It only looks like a spec sheet.
>> >> > >>>
>> >> > >>> Thanks,
>> >> > >>> Lijo
>> >> > >>> 
>> >> > >>> From: Marek Olšák 
>> >> > >>> Sent: Wednesday, January 4, 2023 8:40:00 PM
>> >> > >>> To: Lazar, Lijo 
>> >> > >>> Cc: amd-gfx@lists.freedesktop.org > >
>> >> > >>> Subject: Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and
>> lanes from the INFO
>> >> > >>>
>> >> > >>> On Wed, Jan 4, 2023 at 9:19 AM

Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO

2023-01-13 Thread Marek Olšák
There is no hole on 32-bit unfortunately. It looks like the hole on 64-bit
is now ABI.

I moved the field to replace _pad1. The patch is attached (with your Rb).

Marek

On Fri, Jan 13, 2023 at 4:20 PM Alex Deucher  wrote:

> On Fri, Jan 13, 2023 at 4:02 PM Marek Olšák  wrote:
> >
> > i've added the comments and indeed pahole shows the hole as expected.
>
> What about on 32-bit?
>
> Alex
>
> >
> > Marek
> >
> > On Thu, Jan 12, 2023 at 11:44 AM Alex Deucher 
> wrote:
> >>
> >> On Thu, Jan 12, 2023 at 6:50 AM Christian König
> >>  wrote:
> >> >
> >> > Am 11.01.23 um 21:48 schrieb Alex Deucher:
> >> > > On Wed, Jan 4, 2023 at 3:17 PM Marek Olšák 
> wrote:
> >> > >> Yes, it's meant to be like a spec sheet. We are not interested in
> the current bandwidth utilization.
> >> > > After chatting with Marek on IRC and thinking about this more, I
> think
> >> > > this patch is fine.  It's not really meant for bandwidth per se, but
> >> > > rather as a limit to determine what the driver should do in certain
> >> > > cases (i.e., when does it make sense to copy to vram vs not).  It's
> >> > > not straightforward for userspace to parse the full topology to
> >> > > determine what links may be slow.  I guess one potential pitfall
> would
> >> > > be that if you pass the device into a VM, the driver may report the
> >> > > wrong values.  Generally in a VM the VM doesn't get the full view up
> >> > > to the root port.  I don't know if the hypervisors report properly
> for
> >> > > pcie_bandwidth_available() in a VM or if it just shows the info
> about
> >> > > the endpoint in the VM.
> >> >
> >> > So this basically doesn't return the gen and lanes of the device, but
> >> > rather what was negotiated between the device and the upstream root
> port?
> >>
> >> Correct. It exposes the max gen and lanes of the slowest link between
> >> the device and the root port.
> >>
> >> >
> >> > If I got that correctly then we should probably document that cause
> >> > otherwise somebody will try to "fix" it at some time.
> >>
> >> Good point.
> >>
> >> Alex
> >>
> >> >
> >> > Christian.
> >> >
> >> > >
> >> > > Reviewed-by: Alex Deucher 
> >> > >
> >> > > Alex
> >> > >
> >> > >> Marek
> >> > >>
> >> > >> On Wed, Jan 4, 2023 at 10:33 AM Lazar, Lijo 
> wrote:
> >> > >>> [AMD Official Use Only - General]
> >> > >>>
> >> > >>>
> >> > >>> To clarify, with DPM in place, the current bandwidth will be
> changing based on the load.
> >> > >>>
> >> > >>> If apps/umd already has a way to know the current bandwidth
> utilisation, then possible maximum also could be part of the same API.
> Otherwise, this only looks like duplicate information. We have the same
> information in sysfs DPM nodes.
> >> > >>>
> >> > >>> BTW, I don't know to what extent app/umd really makes use of
> this. Take that memory frequency as an example (I'm reading it as 16GHz).
> It only looks like a spec sheet.
> >> > >>>
> >> > >>> Thanks,
> >> > >>> Lijo
> >> > >>> 
> >> > >>> From: Marek Olšák 
> >> > >>> Sent: Wednesday, January 4, 2023 8:40:00 PM
> >> > >>> To: Lazar, Lijo 
> >> > >>> Cc: amd-gfx@lists.freedesktop.org 
> >> > >>> Subject: Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and
> lanes from the INFO
> >> > >>>
> >> > >>> On Wed, Jan 4, 2023 at 9:19 AM Lazar, Lijo 
> wrote:
> >> > >>>
> >> > >>>
> >> > >>>
> >> > >>> On 1/4/2023 7:43 PM, Marek Olšák wrote:
> >> > >>>> On Wed, Jan 4, 2023 at 6:50 AM Lazar, Lijo  >> > >>>> <mailto:lijo.la...@amd.com>> wrote:
> >> > >>>>
> >> > >>>>
> >> > >>>>
> >> > >>>>  On 1/4/2023 4:11 AM, Marek Olšák wrote:
> >> > >>>>   > I see. Well, those sysfs files are not usable

Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO

2023-01-13 Thread Marek Olšák
I've added the comments, and indeed pahole shows the hole as expected.

Marek

On Thu, Jan 12, 2023 at 11:44 AM Alex Deucher  wrote:

> On Thu, Jan 12, 2023 at 6:50 AM Christian König
>  wrote:
> >
> > Am 11.01.23 um 21:48 schrieb Alex Deucher:
> > > On Wed, Jan 4, 2023 at 3:17 PM Marek Olšák  wrote:
> > >> Yes, it's meant to be like a spec sheet. We are not interested in the
> current bandwidth utilization.
> > > After chatting with Marek on IRC and thinking about this more, I think
> > > this patch is fine.  It's not really meant for bandwidth per se, but
> > > rather as a limit to determine what the driver should do in certain
> > > cases (i.e., when does it make sense to copy to vram vs not).  It's
> > > not straightforward for userspace to parse the full topology to
> > > determine what links may be slow.  I guess one potential pitfall would
> > > be that if you pass the device into a VM, the driver may report the
> > > wrong values.  Generally in a VM the VM doesn't get the full view up
> > > to the root port.  I don't know if the hypervisors report properly for
> > > pcie_bandwidth_available() in a VM or if it just shows the info about
> > > the endpoint in the VM.
> >
> > So this basically doesn't return the gen and lanes of the device, but
> > rather what was negotiated between the device and the upstream root port?
>
> Correct. It exposes the max gen and lanes of the slowest link between
> the device and the root port.
>
> >
> > If I got that correctly then we should probably document that cause
> > otherwise somebody will try to "fix" it at some time.
>
> Good point.
>
> Alex
>
> >
> > Christian.
> >
> > >
> > > Reviewed-by: Alex Deucher 
> > >
> > > Alex
> > >
> > >> Marek
> > >>
> > >> On Wed, Jan 4, 2023 at 10:33 AM Lazar, Lijo 
> wrote:
> > >>> [AMD Official Use Only - General]
> > >>>
> > >>>
> > >>> To clarify, with DPM in place, the current bandwidth will be
> changing based on the load.
> > >>>
> > >>> If apps/umd already has a way to know the current bandwidth
> utilisation, then possible maximum also could be part of the same API.
> Otherwise, this only looks like duplicate information. We have the same
> information in sysfs DPM nodes.
> > >>>
> > >>> BTW, I don't know to what extent app/umd really makes use of this.
> Take that memory frequency as an example (I'm reading it as 16GHz). It only
> looks like a spec sheet.
> > >>>
> > >>> Thanks,
> > >>> Lijo
> > >>> 
> > >>> From: Marek Olšák 
> > >>> Sent: Wednesday, January 4, 2023 8:40:00 PM
> > >>> To: Lazar, Lijo 
> > >>> Cc: amd-gfx@lists.freedesktop.org 
> > >>> Subject: Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes
> from the INFO
> > >>>
> > >>> On Wed, Jan 4, 2023 at 9:19 AM Lazar, Lijo 
> wrote:
> > >>>
> > >>>
> > >>>
> > >>> On 1/4/2023 7:43 PM, Marek Olšák wrote:
> > >>>> On Wed, Jan 4, 2023 at 6:50 AM Lazar, Lijo  > >>>> <mailto:lijo.la...@amd.com>> wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>>  On 1/4/2023 4:11 AM, Marek Olšák wrote:
> > >>>>   > I see. Well, those sysfs files are not usable, and I don't
> think it
> > >>>>   > would be important even if they were usable, but for
> completeness:
> > >>>>   >
> > >>>>   > The ioctl returns:
> > >>>>   >  pcie_gen = 1
> > >>>>   >  pcie_num_lanes = 16
> > >>>>   >
> > >>>>   > Theoretical bandwidth from those values: 4.0 GB/s
> > >>>>   > My DMA test shows this write bandwidth: 3.5 GB/s
> > >>>>   > It matches the expectation.
> > >>>>   >
> > >>>>   > Let's see the devices (there is only 1 GPU Navi21 in the
> system):
> > >>>>   > $ lspci |egrep '(PCI|VGA).*Navi'
> > >>>>   > 0a:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI]
> Navi
> > >>>>  10 XL
> > >>>>   > Upstream Port of PCI Express Switch (rev c3)
> > >>>

Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO

2023-01-11 Thread Marek Olšák
On Wed, Jan 11, 2023, 15:50 Alex Deucher  wrote:

> On Wed, Jan 11, 2023 at 3:48 PM Alex Deucher 
> wrote:
> >
> > On Wed, Jan 4, 2023 at 3:17 PM Marek Olšák  wrote:
> > >
> > > Yes, it's meant to be like a spec sheet. We are not interested in the
> current bandwidth utilization.
> >
> > After chatting with Marek on IRC and thinking about this more, I think
> > this patch is fine.  It's not really meant for bandwidth per se, but
> > rather as a limit to determine what the driver should do in certain
> > cases (i.e., when does it make sense to copy to vram vs not).  It's
> > not straightforward for userspace to parse the full topology to
> > determine what links may be slow.  I guess one potential pitfall would
> > be that if you pass the device into a VM, the driver may report the
> > wrong values.  Generally in a VM the VM doesn't get the full view up
> > to the root port.  I don't know if the hypervisors report properly for
> > pcie_bandwidth_available() in a VM or if it just shows the info about
> > the endpoint in the VM.
> >
> > Reviewed-by: Alex Deucher 
>
> Actually:
>
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index fe7f871e3080..f7fc7325f17f 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -1053,7 +1053,7 @@ struct drm_amdgpu_info_device {
>  __u32 enabled_rb_pipes_mask;
>  __u32 num_rb_pipes;
>  __u32 num_hw_gfx_contexts;
> -__u32 _pad;
> +__u32 pcie_gen;
>  __u64 ids_flags;
>  /** Starting virtual address for UMDs. */
>  __u64 virtual_address_offset;
> @@ -1109,6 +1109,7 @@ struct drm_amdgpu_info_device {
>  __u64 high_va_max;
>  /* gfx10 pa_sc_tile_steering_override */
>  __u32 pa_sc_tile_steering_override;
> +__u32 pcie_num_lanes;
>  /* disabled TCCs */
>  __u64 tcc_disabled_mask;
>  __u64 min_engine_clock;
>
> Doesn't that last one need to be added to the end of the structure?
>

There was a hole because one u32 was surrounded by 2 u64s. (unless I missed
some packing #pragma)
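
A small sketch of why that hole appears (plain C with stdint types standing in
for __u32/__u64; the struct name is made up): a lone 32-bit member between two
64-bit members gets 4 bytes of trailing padding on 64-bit targets, while 32-bit
x86 aligns 64-bit integers to 4 bytes, so no hole appears there.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Same layout situation as the INFO struct: a u32 sandwiched between u64s. */
struct example_info {
   uint64_t a;
   uint32_t c;   /* followed by 4 bytes of padding on 64-bit targets */
   uint64_t b;
};

int main(void)
{
   printf("offsetof(b) = %zu, sizeof = %zu\n",
          offsetof(struct example_info, b), sizeof(struct example_info));
   /* Typical x86-64: offsetof(b) = 16, sizeof = 24 -> a 4-byte hole after c
    * that a new __u32 can fill without moving anything.
    * Typical 32-bit x86: u64 is 4-byte aligned, so offsetof(b) = 12,
    * sizeof = 20 -> no hole, and inserting a field there would shift the
    * rest of the struct (hence the later move to replace _pad1 instead). */
   return 0;
}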

Marek


> Alex
>
> >
> > Alex
> >
> > >
> > > Marek
> > >
> > > On Wed, Jan 4, 2023 at 10:33 AM Lazar, Lijo 
> wrote:
> > >>
> > >> [AMD Official Use Only - General]
> > >>
> > >>
> > >> To clarify, with DPM in place, the current bandwidth will be changing
> based on the load.
> > >>
> > >> If apps/umd already has a way to know the current bandwidth
> utilisation, then possible maximum also could be part of the same API.
> Otherwise, this only looks like duplicate information. We have the same
> information in sysfs DPM nodes.
> > >>
> > >> BTW, I don't know to what extent app/umd really makes use of this.
> Take that memory frequency as an example (I'm reading it as 16GHz). It only
> looks like a spec sheet.
> > >>
> > >> Thanks,
> > >> Lijo
> > >> 
> > >> From: Marek Olšák 
> > >> Sent: Wednesday, January 4, 2023 8:40:00 PM
> > >> To: Lazar, Lijo 
> > >> Cc: amd-gfx@lists.freedesktop.org 
> > >> Subject: Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes
> from the INFO
> > >>
> > >> On Wed, Jan 4, 2023 at 9:19 AM Lazar, Lijo 
> wrote:
> > >>
> > >>
> > >>
> > >> On 1/4/2023 7:43 PM, Marek Olšák wrote:
> > >> > On Wed, Jan 4, 2023 at 6:50 AM Lazar, Lijo  > >> > <mailto:lijo.la...@amd.com>> wrote:
> > >> >
> > >> >
> > >> >
> > >> > On 1/4/2023 4:11 AM, Marek Olšák wrote:
> > >> >  > I see. Well, those sysfs files are not usable, and I don't
> think it
> > >> >  > would be important even if they were usable, but for
> completeness:
> > >> >  >
> > >> >  > The ioctl returns:
> > >> >  >  pcie_gen = 1
> > >> >  >  pcie_num_lanes = 16
> > >> >  >
> > >> >  > Theoretical bandwidth from those values: 4.0 GB/s
> > >> >  > My DMA test shows this write bandwidth: 3.5 GB/s
> > >> >  > It matches the expectation.
> > >> >  >
> > >> >  > Let's see the devices (there is only 1 GPU Navi21 in the
> system):
> > >> >  > $ lspci |egrep '(PCI|VGA).*Navi'
> > >> >  > 0a:00.0 PCI 

Re: [PATCH 2/2] drm/amdgpu: add AMDGPU_INFO_VM_STAT to return GPU VM

2023-01-10 Thread Marek Olšák
On Tue, Jan 10, 2023 at 11:23 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Am 10.01.23 um 16:28 schrieb Marek Olšák:
>
> On Wed, Jan 4, 2023 at 9:51 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> Am 04.01.23 um 00:08 schrieb Marek Olšák:
>>
>> I see about the access now, but did you even look at the patch?
>>
>>
>> I did look at the patch, but I haven't fully understood yet what you are
>> trying to do here.
>>
>
> First and foremost, it returns the evicted size of VRAM and visible VRAM,
> and returns visible VRAM usage. It should be obvious which stat includes
> the size of another.
>
>
>> Because what the patch does isn't even exposed to common drm code, such
>> as the preferred domain and visible VRAM placement, so it can't be in
>> fdinfo right now.
>>
>> Or do you even know what fdinfo contains? Because it contains nothing
>> useful. It only has VRAM and GTT usage, which we already have in the INFO
>> ioctl, so it has nothing that we need. We mainly need the eviction
>> information and visible VRAM information now. Everything else is a bonus.
>>
>>
>> Well the main question is what are you trying to get from that
>> information? The eviction list for example is completely meaningless to
>> userspace, that stuff is only temporary and will be cleared on the next CS
>> again.
>>
>
> I don't know what you mean. The returned eviction stats look correct and
> are stable (they don't change much). You can suggest changes if you think
> some numbers are not reported correctly.
>
>
>>
>> What we could expose is the VRAM over-commit value, e.g. how much BOs
>> which where supposed to be in VRAM are in GTT now. I think that's what you
>> are looking for here, right?
>>
>
> The VRAM overcommit value is "evicted_vram".
>
>
>>
>> Also, it's undesirable to open and parse a text file if we can just call
>> an ioctl.
>>
>>
>> Well I see the reasoning for that, but I also see why other drivers do a
>> lot of the stuff we have as IOCTL as separate files in sysfs, fdinfo or
>> debugfs.
>>
>> Especially repeating all the static information which were already
>> available under sysfs in the INFO IOCTL was a design mistake as far as I
>> can see. Just compare what AMDGPU and the KFD code is doing to what for
>> example i915 is doing.
>>
>> Same for things like debug information about a process. The fdinfo stuff
>> can be queried from external tools (gdb, gputop, umr etc...) as well which
>> makes that interface more preferred.
>>
>
> Nothing uses fdinfo in Mesa. No driver uses sysfs in Mesa except drm
> shims, noop drivers, and Intel for perf metrics. sysfs itself is an
> unusable mess for the PCIe query and is missing information.
>
> I'm not against exposing more stuff through sysfs and fdinfo for tools,
> but I don't see any reason why drivers should use it (other than for
> slowing down queries and initialization).
>
>
> That's what I'm asking: Is this for some tool or to make some driver
> decision based on it?
>
> If you just want the numbers for over displaying then I think it would be
> better to put this into fdinfo together with the other existing stuff there.
>

> If you want to make allocation decisions based on this then we should have
> that as IOCTL or even better as mmap() page between kernel and userspace.
> But in this case I would also calculation the numbers completely different
> as well.
>
> See we have at least the following things in the kernel:
> 1. The eviction list in the VM.
> Those are the BOs which are currently evicted and tried to moved back
> in on the next CS.
>
> 2. The VRAM over commit value.
> In other words how much more VRAM than available has the application
> tried to allocate?
>
> 3. The visible VRAM usage by this application.
>
> The end goal is that the eviction list will go away, e.g. we will always
> have stable allocations based on allocations of other applications and not
> constantly swap things in and out.
>
> When you now expose the eviction list to userspace we will be stuck with
> this interface forever.
>

It's for the Gallium HUD.

The only missing thing is the size of all evicted VRAM allocations, and the
size of all evicted visible VRAM allocations.

1. No list is exposed. Only sums of buffer sizes are exposed. Also, the
eviction list has no meaning here. All lists are treated equally, and
mem_type is compared with preferred_domains to determine where buffers are
and where they should be (see the sketch after this list).

2. I'm not interested in the overcommit value. I'm only interested in
knowing the number of bytes of evicted VRAM right now. It can be as
variable as the CPU load, but in practice it shouldn't be because PCIe
doesn't have the bandwidth to move things quickly.

3. Yes, that's true.
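
To illustrate point 1, here is a condensed, self-contained sketch of that
classification (simplified from the attached kernel patch; the enum and struct
names are stand-ins, not the real amdgpu/TTM identifiers): a buffer counts as
evicted VRAM when it prefers VRAM but currently sits somewhere else.

#include <stdint.h>
#include <stdio.h>

/* Stand-ins for AMDGPU_GEM_DOMAIN_* and the TTM placements; not real names. */
enum domain { DOMAIN_VRAM = 1 << 0, DOMAIN_GTT = 1 << 1 };
enum placement { PL_VRAM, PL_TT, PL_SYSTEM };

struct bo_info {
   uint64_t size;
   uint32_t preferred_domains; /* where the buffer wants to live */
   enum placement mem_type;    /* where it currently lives */
   int cpu_access_required;    /* visible-VRAM request */
};

struct vm_stat {
   uint64_t requested_vram, evicted_vram, evicted_visible_vram, requested_gtt;
};

/* A buffer counts as evicted VRAM when it prefers VRAM but is not in VRAM. */
static void visit_bo(struct vm_stat *stat, const struct bo_info *bo)
{
   if (bo->preferred_domains & DOMAIN_VRAM) {
      stat->requested_vram += bo->size;
      if (bo->mem_type != PL_VRAM) {
         stat->evicted_vram += bo->size;
         if (bo->cpu_access_required)
            stat->evicted_visible_vram += bo->size;
      }
   } else if (bo->preferred_domains & DOMAIN_GTT) {
      stat->requested_gtt += bo->size;
   }
}

int main(void)
{
   struct vm_stat stat = {0};
   struct bo_info bo = { .size = 64u << 20, .preferred_domains = DOMAIN_VRAM,
                         .mem_type = PL_TT };
   visit_bo(&stat, &bo);
   printf("evicted_vram = %llu MiB\n",
          (unsigned long long)(stat.evicted_vram >> 20));
   return 0;
}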

Marek


Re: [PATCH 2/2] drm/amdgpu: add AMDGPU_INFO_VM_STAT to return GPU VM

2023-01-10 Thread Marek Olšák
On Wed, Jan 4, 2023 at 9:51 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Am 04.01.23 um 00:08 schrieb Marek Olšák:
>
> I see about the access now, but did you even look at the patch?
>
>
> I did look at the patch, but I haven't fully understood yet what you are
> trying to do here.
>

First and foremost, it returns the evicted size of VRAM and visible VRAM,
and returns visible VRAM usage. It should be obvious which stat includes
the size of another.


> Because what the patch does isn't even exposed to common drm code, such as
> the preferred domain and visible VRAM placement, so it can't be in fdinfo
> right now.
>
> Or do you even know what fdinfo contains? Because it contains nothing
> useful. It only has VRAM and GTT usage, which we already have in the INFO
> ioctl, so it has nothing that we need. We mainly need the eviction
> information and visible VRAM information now. Everything else is a bonus.
>
>
> Well the main question is what are you trying to get from that
> information? The eviction list for example is completely meaningless to
> userspace, that stuff is only temporary and will be cleared on the next CS
> again.
>

I don't know what you mean. The returned eviction stats look correct and
are stable (they don't change much). You can suggest changes if you think
some numbers are not reported correctly.


>
> What we could expose is the VRAM over-commit value, e.g. how much BOs
> which where supposed to be in VRAM are in GTT now. I think that's what you
> are looking for here, right?
>

The VRAM overcommit value is "evicted_vram".


>
> Also, it's undesirable to open and parse a text file if we can just call
> an ioctl.
>
>
> Well I see the reasoning for that, but I also see why other drivers do a
> lot of the stuff we have as IOCTL as separate files in sysfs, fdinfo or
> debugfs.
>
> Especially repeating all the static information which were already
> available under sysfs in the INFO IOCTL was a design mistake as far as I
> can see. Just compare what AMDGPU and the KFD code is doing to what for
> example i915 is doing.
>
> Same for things like debug information about a process. The fdinfo stuff
> can be queried from external tools (gdb, gputop, umr etc...) as well which
> makes that interface more preferred.
>

Nothing uses fdinfo in Mesa. No driver uses sysfs in Mesa except drm shims,
noop drivers, and Intel for perf metrics. sysfs itself is an unusable mess
for the PCIe query and is missing information.

I'm not against exposing more stuff through sysfs and fdinfo for tools, but
I don't see any reason why drivers should use it (other than for slowing
down queries and initialization).

Marek


Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO

2023-01-04 Thread Marek Olšák
Yes, it's meant to be like a spec sheet. We are not interested in the
current bandwidth utilization.

Marek

On Wed, Jan 4, 2023 at 10:33 AM Lazar, Lijo  wrote:

> [AMD Official Use Only - General]
>
> To clarify, with DPM in place, the current bandwidth will be changing
> based on the load.
>
> If apps/umd already has a way to know the current bandwidth utilisation,
> then possible maximum also could be part of the same API. Otherwise, this
> only looks like duplicate information. We have the same information in
> sysfs DPM nodes.
>
> BTW, I don't know to what extent app/umd really makes use of this. Take
> that memory frequency as an example (I'm reading it as 16GHz). It only
> looks like a spec sheet.
>
> Thanks,
> Lijo
> --
> *From:* Marek Olšák 
> *Sent:* Wednesday, January 4, 2023 8:40:00 PM
> *To:* Lazar, Lijo 
> *Cc:* amd-gfx@lists.freedesktop.org 
> *Subject:* Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from
> the INFO
>
> On Wed, Jan 4, 2023 at 9:19 AM Lazar, Lijo  wrote:
>
>
>
> On 1/4/2023 7:43 PM, Marek Olšák wrote:
> > On Wed, Jan 4, 2023 at 6:50 AM Lazar, Lijo  > <mailto:lijo.la...@amd.com>> wrote:
> >
> >
> >
> > On 1/4/2023 4:11 AM, Marek Olšák wrote:
> >  > I see. Well, those sysfs files are not usable, and I don't think
> it
> >  > would be important even if they were usable, but for completeness:
> >  >
> >  > The ioctl returns:
> >  >  pcie_gen = 1
> >  >  pcie_num_lanes = 16
> >  >
> >  > Theoretical bandwidth from those values: 4.0 GB/s
> >  > My DMA test shows this write bandwidth: 3.5 GB/s
> >  > It matches the expectation.
> >  >
> >  > Let's see the devices (there is only 1 GPU Navi21 in the system):
> >  > $ lspci |egrep '(PCI|VGA).*Navi'
> >  > 0a:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi
> > 10 XL
> >  > Upstream Port of PCI Express Switch (rev c3)
> >  > 0b:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi
> > 10 XL
> >  > Downstream Port of PCI Express Switch
> >  > 0c:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> >  > [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] (rev c3)
> >  >
> >  > Let's read sysfs:
> >  >
> >  > $ cat /sys/bus/pci/devices/:0a:00.0/current_link_width
> >  > 16
> >  > $ cat /sys/bus/pci/devices/:0b:00.0/current_link_width
> >  > 16
> >  > $ cat /sys/bus/pci/devices/:0c:00.0/current_link_width
> >  > 16
> >  > $ cat /sys/bus/pci/devices/:0a:00.0/current_link_speed
> >  > 2.5 GT/s PCIe
> >  > $ cat /sys/bus/pci/devices/:0b:00.0/current_link_speed
> >  > 16.0 GT/s PCIe
> >  > $ cat /sys/bus/pci/devices/:0c:00.0/current_link_speed
> >  > 16.0 GT/s PCIe
> >  >
> >  > Problem 1: None of the speed numbers match 4 GB/s.
> >
> > US bridge = 2.5GT/s means operating at PCIe Gen 1 speed. Total
> > theoretical bandwidth is then derived based on encoding and total
> > number
> > of lanes.
> >
> >  > Problem 2: Userspace doesn't know the bus index of the bridges,
> > and it's
> >  > not clear which bridge should be used.
> >
> > In general, modern ones have this arch= US->DS->EP. US is the one
> > connected to physical link.
> >
> >  > Problem 3: The PCIe gen number is missing.
> >
> > Current link speed is based on whether it's Gen1/2/3/4/5.
> >
> > BTW, your patch makes use of capabilities flags which gives the
> maximum
> > supported speed/width by the device. It may not necessarily reflect
> the
> > current speed/width negotiated. I guess in NV, this info is already
> > obtained from PMFW and made available through metrics table.
> >
> >
> > It computes the minimum of the device PCIe gen and the motherboard/slot
> > PCIe gen to get the final value. These 2 lines do that. The low 16 bits
> > of the mask contain the device PCIe gen mask. The high 16 bits of the
> > mask contain the slot PCIe gen mask.
> > + pcie_gen_mask = adev->pm.pcie_gen_mask & (adev->pm.pcie_gen_mask >>
> 16);
> > + dev_info->pcie_gen = fls(pcie_gen_mask);
> >
>
> With DPM in place on some ASICs, how much does this static info help for
> upper level apps?
>
>

Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO

2023-01-04 Thread Marek Olšák
On Wed, Jan 4, 2023 at 9:19 AM Lazar, Lijo  wrote:

>
>
> On 1/4/2023 7:43 PM, Marek Olšák wrote:
> > On Wed, Jan 4, 2023 at 6:50 AM Lazar, Lijo  > <mailto:lijo.la...@amd.com>> wrote:
> >
> >
> >
> > On 1/4/2023 4:11 AM, Marek Olšák wrote:
> >  > I see. Well, those sysfs files are not usable, and I don't think
> it
> >  > would be important even if they were usable, but for completeness:
> >  >
> >  > The ioctl returns:
> >  >  pcie_gen = 1
> >  >  pcie_num_lanes = 16
> >  >
> >  > Theoretical bandwidth from those values: 4.0 GB/s
> >  > My DMA test shows this write bandwidth: 3.5 GB/s
> >  > It matches the expectation.
> >  >
> >  > Let's see the devices (there is only 1 GPU Navi21 in the system):
> >  > $ lspci |egrep '(PCI|VGA).*Navi'
> >  > 0a:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi
> > 10 XL
> >  > Upstream Port of PCI Express Switch (rev c3)
> >  > 0b:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi
> > 10 XL
> >  > Downstream Port of PCI Express Switch
> >  > 0c:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> >  > [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] (rev c3)
> >  >
> >  > Let's read sysfs:
> >  >
> >  > $ cat /sys/bus/pci/devices/:0a:00.0/current_link_width
> >  > 16
> >  > $ cat /sys/bus/pci/devices/:0b:00.0/current_link_width
> >  > 16
> >  > $ cat /sys/bus/pci/devices/:0c:00.0/current_link_width
> >  > 16
> >  > $ cat /sys/bus/pci/devices/:0a:00.0/current_link_speed
> >  > 2.5 GT/s PCIe
> >  > $ cat /sys/bus/pci/devices/:0b:00.0/current_link_speed
> >  > 16.0 GT/s PCIe
> >  > $ cat /sys/bus/pci/devices/:0c:00.0/current_link_speed
> >  > 16.0 GT/s PCIe
> >  >
> >  > Problem 1: None of the speed numbers match 4 GB/s.
> >
> > US bridge = 2.5GT/s means operating at PCIe Gen 1 speed. Total
> > theoretical bandwidth is then derived based on encoding and total
> > number
> > of lanes.
> >
> >  > Problem 2: Userspace doesn't know the bus index of the bridges,
> > and it's
> >  > not clear which bridge should be used.
> >
> > In general, modern ones have this arch= US->DS->EP. US is the one
> > connected to physical link.
> >
> >  > Problem 3: The PCIe gen number is missing.
> >
> > Current link speed is based on whether it's Gen1/2/3/4/5.
> >
> > BTW, your patch makes use of capabilities flags which gives the
> maximum
> > supported speed/width by the device. It may not necessarily reflect
> the
> > current speed/width negotiated. I guess in NV, this info is already
> > obtained from PMFW and made available through metrics table.
> >
> >
> > It computes the minimum of the device PCIe gen and the motherboard/slot
> > PCIe gen to get the final value. These 2 lines do that. The low 16 bits
> > of the mask contain the device PCIe gen mask. The high 16 bits of the
> > mask contain the slot PCIe gen mask.
> > + pcie_gen_mask = adev->pm.pcie_gen_mask & (adev->pm.pcie_gen_mask >>
> 16);
> > + dev_info->pcie_gen = fls(pcie_gen_mask);
> >
>
> With DPM in place on some ASICs, how much does this static info help for
> upper level apps?
>

It helps UMDs make better decisions if they know the maximum achievable
bandwidth. UMDs also compute the maximum memory bandwidth and compute
performance (FLOPS). Right now it's printed by Mesa to give users detailed
information about their GPU. For example:

$ AMD_DEBUG=info glxgears
Device info:
name = NAVI21
marketing_name = AMD Radeon RX 6800
num_se = 3
num_rb = 12
num_cu = 60
max_gpu_freq = 2475 MHz
max_gflops = 19008 GFLOPS
l0_cache_size = 16 KB
l1_cache_size = 128 KB
l2_cache_size = 4096 KB
l3_cache_size = 128 MB
memory_channels = 16 (TCC blocks)
memory_size = 16 GB (16384 MB)
memory_freq = 16 GHz
memory_bus_width = 256 bits
memory_bandwidth = 512 GB/s
pcie_gen = 1
pcie_num_lanes = 16
pcie_bandwidth = 4.0 GB/s
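
The derived numbers in that output follow directly from the basic ones; a small
sketch of the arithmetic, assuming the usual 64 FP32 lanes per CU and 2 flops
per FMA (the formulas here are illustrative, not the exact Mesa code):

#include <stdio.h>

int main(void)
{
   /* Basic values reported above for this Navi21. */
   const double num_cu = 60, gpu_ghz = 2.475;
   const double mem_ghz = 16.0, bus_width_bits = 256;
   const double pcie_lanes = 16;

   /* 60 CUs x 64 FP32 lanes x 2 flops per FMA x clock. */
   double gflops = num_cu * 64 * 2 * gpu_ghz;
   /* Effective memory data rate x bus width in bytes. */
   double mem_bw = mem_ghz * bus_width_bits / 8;
   /* PCIe gen 1: 2.5 GT/s per lane with 8b/10b encoding -> 0.25 GB/s/lane. */
   double pcie_bw = 2.5 * 8.0 / 10.0 / 8.0 * pcie_lanes;

   printf("%.0f GFLOPS, %.0f GB/s VRAM, %.1f GB/s PCIe gen1 x16\n",
          gflops, mem_bw, pcie_bw);
   /* Prints: 19008 GFLOPS, 512 GB/s VRAM, 4.0 GB/s PCIe gen1 x16 */
   return 0;
}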

Marek


Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO

2023-01-04 Thread Marek Olšák
On Wed, Jan 4, 2023 at 6:50 AM Lazar, Lijo  wrote:

>
>
> On 1/4/2023 4:11 AM, Marek Olšák wrote:
> > I see. Well, those sysfs files are not usable, and I don't think it
> > would be important even if they were usable, but for completeness:
> >
> > The ioctl returns:
> >  pcie_gen = 1
> >  pcie_num_lanes = 16
> >
> > Theoretical bandwidth from those values: 4.0 GB/s
> > My DMA test shows this write bandwidth: 3.5 GB/s
> > It matches the expectation.
> >
> > Let's see the devices (there is only 1 GPU Navi21 in the system):
> > $ lspci |egrep '(PCI|VGA).*Navi'
> > 0a:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL
> > Upstream Port of PCI Express Switch (rev c3)
> > 0b:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL
> > Downstream Port of PCI Express Switch
> > 0c:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> > [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] (rev c3)
> >
> > Let's read sysfs:
> >
> > $ cat /sys/bus/pci/devices/:0a:00.0/current_link_width
> > 16
> > $ cat /sys/bus/pci/devices/:0b:00.0/current_link_width
> > 16
> > $ cat /sys/bus/pci/devices/:0c:00.0/current_link_width
> > 16
> > $ cat /sys/bus/pci/devices/:0a:00.0/current_link_speed
> > 2.5 GT/s PCIe
> > $ cat /sys/bus/pci/devices/:0b:00.0/current_link_speed
> > 16.0 GT/s PCIe
> > $ cat /sys/bus/pci/devices/:0c:00.0/current_link_speed
> > 16.0 GT/s PCIe
> >
> > Problem 1: None of the speed numbers match 4 GB/s.
>
> US bridge = 2.5GT/s means operating at PCIe Gen 1 speed. Total
> theoretical bandwidth is then derived based on encoding and total number
> of lanes.
>
> > Problem 2: Userspace doesn't know the bus index of the bridges, and it's
> > not clear which bridge should be used.
>
> In general, modern ones have this arch= US->DS->EP. US is the one
> connected to physical link.
>
> > Problem 3: The PCIe gen number is missing.
>
> Current link speed is based on whether it's Gen1/2/3/4/5.
>
> BTW, your patch makes use of capabilities flags which gives the maximum
> supported speed/width by the device. It may not necessarily reflect the
> current speed/width negotiated. I guess in NV, this info is already
> obtained from PMFW and made available through metrics table.
>

It computes the minimum of the device PCIe gen and the motherboard/slot
PCIe gen to get the final value. These 2 lines do that. The low 16 bits of
the mask contain the device PCIe gen mask. The high 16 bits of the mask
contain the slot PCIe gen mask.
+ pcie_gen_mask = adev->pm.pcie_gen_mask & (adev->pm.pcie_gen_mask >> 16);
+ dev_info->pcie_gen = fls(pcie_gen_mask);
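
For illustration, a user-space analogue of that computation with hypothetical
mask values, assuming gen n maps to bit n-1 (which is how fls() is being used
here); the real code uses the kernel's fls() and the CAIL_* mask definitions:

#include <stdint.h>
#include <stdio.h>

/* User-space stand-in for the kernel's fls(): index of the highest set bit,
 * counting from 1, or 0 if no bit is set. */
static int fls_u32(uint32_t x)
{
   int n = 0;
   while (x) {
      n++;
      x >>= 1;
   }
   return n;
}

int main(void)
{
   /* Hypothetical masks:
    * low 16 bits  = chip supports gens 1-4      (0x000F)
    * high 16 bits = platform supports gens 1-3  (0x0007) */
   uint32_t pcie_gen_mask = (0x0007u << 16) | 0x000Fu;

   uint32_t combined = pcie_gen_mask & (pcie_gen_mask >> 16);
   printf("negotiable gens mask = 0x%x, reported pcie_gen = %d\n",
          combined, fls_u32(combined));   /* 0x7, gen 3 */
   return 0;
}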

Marek


Re: [PATCH 2/2] drm/amdgpu: add AMDGPU_INFO_VM_STAT to return GPU VM

2023-01-03 Thread Marek Olšák
I see about the access now, but did you even look at the patch? Because
what the patch does isn't even exposed to common drm code, such as the
preferred domain and visible VRAM placement, so it can't be in fdinfo right
now.

Or do you even know what fdinfo contains? Because it contains nothing
useful. It only has VRAM and GTT usage, which we already have in the INFO
ioctl, so it has nothing that we need. We mainly need the eviction
information and visible VRAM information now. Everything else is a bonus.

Also, it's undesirable to open and parse a text file if we can just call an
ioctl.
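
For comparison, getting a single number out of fdinfo means opening and
string-parsing a per-process text file; a rough sketch, assuming the standard
"drm-memory-<region>: <N> KiB" key format (the exact keys amdgpu prints may
differ by kernel version, so treat this as illustrative only):

#include <stdio.h>
#include <string.h>

/* Rough sketch: read /proc/self/fdinfo/<fd> and pull out drm-memory-vram. */
static long long read_fdinfo_vram_kib(int drm_fd)
{
   char path[64], line[256];
   long long kib = -1;
   FILE *f;

   snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", drm_fd);
   f = fopen(path, "r");
   if (!f)
      return -1;

   while (fgets(line, sizeof(line), f)) {
      if (sscanf(line, "drm-memory-vram: %lld", &kib) == 1)
         break;
   }
   fclose(f);
   return kib; /* in KiB, or -1 if the key was not found */
}

int main(void)
{
   /* Hypothetical: 3 would be an already-open render-node fd. */
   printf("vram: %lld KiB\n", read_fdinfo_vram_kib(3));
   return 0;
}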

So do you want me to move it into amdgpu_vm.c? Because you could have just
said: Let's move it into amdgpu_vm.c. :)

Thanks,
Marek

On Tue, Jan 3, 2023 at 3:33 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Take a look at /proc/self/fdinfo/$fd.
>
> The Intel guys made that vendor agnostic and are using it within their IGT
> gpu top tool.
>
> Christian.
>
> Am 02.01.23 um 18:57 schrieb Marek Olšák:
>
> What are you talking about? Is fdinfo in sysfs? Userspace drivers can't
> access sysfs.
>
> Marek
>
> On Mon, Jan 2, 2023, 10:56 Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> Well first of all don't mess with the VM internals outside of the VM code.
>>
>> Then why would we want to expose this through the IOCTL interface? We
>> already have this in the fdinfo.
>>
>> Christian.
>>
>> Am 30.12.22 um 23:07 schrieb Marek Olšák:
>>
>> To give userspace a detailed view about its GPU memory usage and
>> evictions.
>> This will help performance investigations.
>>
>> Signed-off-by: Marek Olšák 
>>
>> The patch is attached.
>>
>> Marek
>>
>>
>>
>


Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO

2023-01-03 Thread Marek Olšák
I see. Well, those sysfs files are not usable, and I don't think it would
be important even if they were usable, but for completeness:

The ioctl returns:
pcie_gen = 1
pcie_num_lanes = 16

Theoretical bandwidth from those values: 4.0 GB/s
My DMA test shows this write bandwidth: 3.5 GB/s
It matches the expectation.

Let's see the devices (there is only 1 GPU Navi21 in the system):
$ lspci |egrep '(PCI|VGA).*Navi'
0a:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL
Upstream Port of PCI Express Switch (rev c3)
0b:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL
Downstream Port of PCI Express Switch
0c:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] (rev c3)

Let's read sysfs:

$ cat /sys/bus/pci/devices/:0a:00.0/current_link_width
16
$ cat /sys/bus/pci/devices/:0b:00.0/current_link_width
16
$ cat /sys/bus/pci/devices/:0c:00.0/current_link_width
16
$ cat /sys/bus/pci/devices/:0a:00.0/current_link_speed
2.5 GT/s PCIe
$ cat /sys/bus/pci/devices/:0b:00.0/current_link_speed
16.0 GT/s PCIe
$ cat /sys/bus/pci/devices/:0c:00.0/current_link_speed
16.0 GT/s PCIe

Problem 1: None of the speed numbers match 4 GB/s.
Problem 2: Userspace doesn't know the bus index of the bridges, and it's
not clear which bridge should be used.
Problem 3: The PCIe gen number is missing.

That's all irrelevant because all information should be queryable via the
INFO ioctl. It doesn't matter what sysfs contains because UMDs shouldn't
have to open and parse extra files just to read a couple of integers.
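
For contrast, the ioctl path is a single call. A sketch using libdrm_amdgpu,
assuming the usual amdgpu_query_info() entry point and the pcie_gen /
pcie_num_lanes fields added by patch 1 (error handling trimmed; the render
node path is a guess):

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <amdgpu.h>
#include <amdgpu_drm.h>

/* Sketch only: query the INFO ioctl through libdrm_amdgpu and derive the
 * theoretical PCIe bandwidth. Gen 1/2 use 8b/10b encoding, so gen 1 gives
 * 0.25 GB/s per lane at 2.5 GT/s; each later gen roughly doubles that. */
static void print_pcie_bandwidth(amdgpu_device_handle dev)
{
   struct drm_amdgpu_info_device info = {0};
   double per_lane_gbs;

   if (amdgpu_query_info(dev, AMDGPU_INFO_DEV_INFO, sizeof(info), &info) ||
       info.pcie_gen == 0)
      return;

   per_lane_gbs = 0.25 * (double)(1u << (info.pcie_gen - 1)); /* approximation */
   printf("pcie_gen = %u, pcie_num_lanes = %u, ~%.1f GB/s\n",
          info.pcie_gen, info.pcie_num_lanes,
          per_lane_gbs * info.pcie_num_lanes);
}

int main(void)
{
   uint32_t major, minor;
   amdgpu_device_handle dev;
   int fd = open("/dev/dri/renderD128", O_RDWR); /* first render node, assumed */

   if (fd < 0 || amdgpu_device_initialize(fd, &major, &minor, &dev))
      return 1;
   print_pcie_bandwidth(dev);
   amdgpu_device_deinitialize(dev);
   close(fd);
   return 0;
}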

Marek


On Tue, Jan 3, 2023 at 3:31 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Sure they can, those files are accessible to everyone.
>
> The massive advantage is that this is standard for all PCIe devices, so it
> should work vendor independent.
>
> Christian.
>
> Am 02.01.23 um 18:55 schrieb Marek Olšák:
>
> Userspace drivers can't access sysfs.
>
> Marek
>
> On Mon, Jan 2, 2023, 10:54 Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> That stuff is already available as current_link_speed and
>> current_link_width in sysfs.
>>
>> I'm a bit reluctant duplicating this information in the IOCTL interface.
>>
>> Christian.
>>
>> Am 30.12.22 um 23:07 schrieb Marek Olšák:
>>
>> For computing PCIe bandwidth in userspace and troubleshooting PCIe
>> bandwidth issues.
>>
>> For example, my Navi21 has been limited to PCIe gen 1 and this is
>> the first time I noticed it after 2 years.
>>
>> Note that this intentionally fills a hole and padding
>> in drm_amdgpu_info_device.
>>
>> Signed-off-by: Marek Olšák 
>>
>> The patch is attached.
>>
>> Marek
>>
>>
>>
>


Re: [PATCH 2/2] drm/amdgpu: add AMDGPU_INFO_VM_STAT to return GPU VM

2023-01-02 Thread Marek Olšák
What are you talking about? Is fdinfo in sysfs? Userspace drivers can't
access sysfs.

Marek

On Mon, Jan 2, 2023, 10:56 Christian König 
wrote:

> Well first of all don't mess with the VM internals outside of the VM code.
>
> Then why would we want to expose this through the IOCTL interface? We
> already have this in the fdinfo.
>
> Christian.
>
> Am 30.12.22 um 23:07 schrieb Marek Olšák:
>
> To give userspace a detailed view about its GPU memory usage and evictions.
> This will help performance investigations.
>
> Signed-off-by: Marek Olšák 
>
> The patch is attached.
>
> Marek
>
>
>


Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO

2023-01-02 Thread Marek Olšák
Userspace drivers can't access sysfs.

Marek

On Mon, Jan 2, 2023, 10:54 Christian König 
wrote:

> That stuff is already available as current_link_speed and
> current_link_width in sysfs.
>
> I'm a bit reluctant duplicating this information in the IOCTL interface.
>
> Christian.
>
> Am 30.12.22 um 23:07 schrieb Marek Olšák:
>
> For computing PCIe bandwidth in userspace and troubleshooting PCIe
> bandwidth issues.
>
> For example, my Navi21 has been limited to PCIe gen 1 and this is
> the first time I noticed it after 2 years.
>
> Note that this intentionally fills a hole and padding
> in drm_amdgpu_info_device.
>
> Signed-off-by: Marek Olšák 
>
> The patch is attached.
>
> Marek
>
>
>


Re: [PATCH 2/2] drm/amdgpu: add AMDGPU_INFO_VM_STAT to return GPU VM

2022-12-30 Thread Marek Olšák
FYI, I've fixed the mixed tabs/spaces in amdgpu_drm.h locally.

Marek

On Fri, Dec 30, 2022 at 5:07 PM Marek Olšák  wrote:

> To give userspace a detailed view about its GPU memory usage and evictions.
> This will help performance investigations.
>
> Signed-off-by: Marek Olšák 
>
> The patch is attached.
>
> Marek
>


[PATCH 2/2] drm/amdgpu: add AMDGPU_INFO_VM_STAT to return GPU VM

2022-12-30 Thread Marek Olšák
To give userspace a detailed view about its GPU memory usage and evictions.
This will help performance investigations.

Signed-off-by: Marek Olšák 

The patch is attached.

Marek
From 01f41d5b49920b11494ca07f6dde24ea3098fa9f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= 
Date: Sat, 24 Dec 2022 17:41:51 -0500
Subject: [PATCH 2/2] drm/amdgpu: add AMDGPU_INFO_VM_STAT to return GPU VM
 stats about the process
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

To give userspace a detailed view about its GPU memory usage and evictions.
This will help performance investigations.

Signed-off-by: Marek Olšák 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 101 
 include/uapi/drm/amdgpu_drm.h   |  29 +++
 3 files changed, 132 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 155f905b00c9..ee1532959032 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -108,9 +108,10 @@
  * - 3.50.0 - Update AMDGPU_INFO_DEV_INFO IOCTL for minimum engine and memory clock
  *Update AMDGPU_INFO_SENSOR IOCTL for PEAK_PSTATE engine and memory clock
  *   3.51.0 - Return the PCIe gen and lanes from the INFO ioctl
+ *   3.52.0 - Add AMDGPU_INFO_VM_STAT
  */
 #define KMS_DRIVER_MAJOR	3
-#define KMS_DRIVER_MINOR	51
+#define KMS_DRIVER_MINOR	52
 #define KMS_DRIVER_PATCHLEVEL	0
 
 unsigned int amdgpu_vram_limit = UINT_MAX;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index fba306e0ef87..619c3a633ee6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -515,6 +515,67 @@ static int amdgpu_hw_ip_info(struct amdgpu_device *adev,
 	return 0;
 }
 
+static void amdgpu_vm_stat_visit_bo(struct drm_amdgpu_info_vm_stat *stat,
+				    struct amdgpu_bo_va *bo_va)
+{
+	struct amdgpu_bo *bo = bo_va->base.bo;
+	uint64_t size;
+
+	if (!bo)
+		return;
+
+	size = amdgpu_bo_size(bo);
+
+	switch (bo->tbo.resource->mem_type) {
+	case TTM_PL_VRAM:
+		if (bo->tbo.deleted) {
+			stat->unreclaimed_vram += size;
+			stat->unreclaimed_vram_bo_count++;
+		} else {
+			stat->vram += size;
+			stat->vram_bo_count++;
+
+			if (amdgpu_bo_in_cpu_visible_vram(bo)) {
+				stat->visible_vram += size;
+				stat->visible_vram_bo_count++;
+			}
+		}
+		break;
+	case TTM_PL_TT:
+		if (bo->tbo.deleted) {
+			stat->unreclaimed_gtt += size;
+			stat->unreclaimed_gtt_bo_count++;
+		} else {
+			stat->gtt += size;
+			stat->gtt_bo_count++;
+		}
+		break;
+	case TTM_PL_SYSTEM:
+		stat->sysmem += size;
+		stat->sysmem_bo_count++;
+		break;
+	/* Ignore GDS, GWS, and OA - those are not important. */
+	}
+
+	if (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) {
+		stat->requested_vram += size;
+		stat->requested_vram_bo_count++;
+
+		if (bo->tbo.resource->mem_type != TTM_PL_VRAM) {
+			stat->evicted_vram += size;
+			stat->evicted_vram_bo_count++;
+
+			if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
+				stat->evicted_visible_vram += size;
+				stat->evicted_visible_vram_bo_count++;
+			}
+		}
+	} else if (bo->preferred_domains & AMDGPU_GEM_DOMAIN_GTT) {
+		stat->requested_gtt += size;
+		stat->requested_gtt_bo_count++;
+	}
+}
+
 /*
  * Userspace get information ioctl
  */
@@ -1128,6 +1189,46 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 		kfree(caps);
 		return r;
 	}
+	case AMDGPU_INFO_VM_STAT: {
+		struct drm_amdgpu_info_vm_stat stat = {};
+		struct amdgpu_fpriv *fpriv = filp->driver_priv;
+		struct amdgpu_vm *vm = &fpriv->vm;
+		struct amdgpu_bo_va *bo_va, *tmp;
+		int r;
+
+		r = amdgpu_bo_reserve(vm->root.bo, true);
+		if (r)
+			return r;
+
+		spin_lock(&vm->status_lock);
+
+		list_for_each_entry_safe(bo_va, tmp, &vm->idle, base.vm_status) {
+			amdgpu_vm_stat_visit_bo(&stat, bo_va);
+		}
+		list_for_each_entry_safe(bo_va, tmp, &vm->evicted, base.vm_status) {
+			amdgpu_vm_stat_visit_bo(&stat, bo_va);
+		}
+		list_for_each_entry_safe(bo_va, tmp, &vm->relocated, base.vm_status) {
+			amdgpu_vm_stat_visit_bo(&stat, bo_va);
+		}
+		list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status) {
+			amdgpu_vm_stat_visit_bo(&stat, bo_va);
+		}
+		list_for_each_entry_safe(bo_va, tmp, &vm->invalidated, base.vm_status) {
+			amdgpu_vm_stat_visit_bo(&stat, bo_va);
+		}
+		list_for_each_entry_safe(bo_va, tmp, &vm->done, base.vm_status) {
+			amdgpu_vm_stat_visit_bo(&stat, bo_va);
+		}
+		list_for_each_entry_safe(bo_va, tmp, &vm->freed, base.vm_status) {
+			amdgpu_vm_stat_visit_bo(&stat, bo_va);
+		}
+
+		spin_unlock(&vm->status_lock);
+		amdgpu_bo_unreserve(vm->root.bo);
+		return copy_to_user(out, &stat,
+				    min((size_t)size, sizeof(stat))) ? -EFAULT : 0;
+	}
 	default:
 		DRM_DEBUG_KMS("Inv

[PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO

2022-12-30 Thread Marek Olšák
For computing PCIe bandwidth in userspace and troubleshooting PCIe
bandwidth issues.

For example, my Navi21 has been limited to PCIe gen 1 and this is
the first time I noticed it after 2 years.

Note that this intentionally fills a hole and padding
in drm_amdgpu_info_device.

Signed-off-by: Marek Olšák 

The patch is attached.

Marek
From 5c5f5b707327b030a22824f0f9c9d24b0787 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= 
Date: Sat, 24 Dec 2022 17:44:26 -0500
Subject: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO
 ioctl
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

For computing PCIe bandwidth in userspace and troubleshooting PCIe
bandwidth issues.

For example, my Navi21 has been limited to PCIe gen 1 and this is
the first time I noticed it after 2 years.

Note that this intentionally fills a hole and padding
in drm_amdgpu_info_device.

Signed-off-by: Marek Olšák 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 14 +-
 include/uapi/drm/amdgpu_drm.h   |  3 ++-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index b8cfa48fb296..155f905b00c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -107,9 +107,10 @@
  * - 3.49.0 - Add gang submit into CS IOCTL
  * - 3.50.0 - Update AMDGPU_INFO_DEV_INFO IOCTL for minimum engine and memory clock
  *Update AMDGPU_INFO_SENSOR IOCTL for PEAK_PSTATE engine and memory clock
+ *   3.51.0 - Return the PCIe gen and lanes from the INFO ioctl
  */
 #define KMS_DRIVER_MAJOR	3
-#define KMS_DRIVER_MINOR	50
+#define KMS_DRIVER_MINOR	51
 #define KMS_DRIVER_PATCHLEVEL	0
 
 unsigned int amdgpu_vram_limit = UINT_MAX;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 903e8770e275..fba306e0ef87 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -42,6 +42,7 @@
 #include "amdgpu_gem.h"
 #include "amdgpu_display.h"
 #include "amdgpu_ras.h"
+#include "amd_pcie.h"
 
 void amdgpu_unregister_gpu_instance(struct amdgpu_device *adev)
 {
@@ -766,6 +767,7 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 	case AMDGPU_INFO_DEV_INFO: {
 		struct drm_amdgpu_info_device *dev_info;
 		uint64_t vm_size;
+		uint32_t pcie_gen_mask;
 		int ret;
 
 		dev_info = kzalloc(sizeof(*dev_info), GFP_KERNEL);
@@ -798,7 +800,6 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 		dev_info->num_rb_pipes = adev->gfx.config.max_backends_per_se *
 			adev->gfx.config.max_shader_engines;
 		dev_info->num_hw_gfx_contexts = adev->gfx.config.max_hw_contexts;
-		dev_info->_pad = 0;
 		dev_info->ids_flags = 0;
 		if (adev->flags & AMD_IS_APU)
 			dev_info->ids_flags |= AMDGPU_IDS_FLAGS_FUSION;
@@ -852,6 +853,17 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 
 		dev_info->tcc_disabled_mask = adev->gfx.config.tcc_disabled_mask;
 
+		/* Combine the chip gen mask with the platform (CPU/mobo) mask. */
+		pcie_gen_mask = adev->pm.pcie_gen_mask & (adev->pm.pcie_gen_mask >> 16);
+		dev_info->pcie_gen = fls(pcie_gen_mask);
+		dev_info->pcie_num_lanes =
+			adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X32 ? 32 :
+			adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X16 ? 16 :
+			adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X12 ? 12 :
+			adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X8 ? 8 :
+			adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X4 ? 4 :
+			adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X2 ? 2 : 1;
+
 		ret = copy_to_user(out, dev_info,
    min((size_t)size, sizeof(*dev_info))) ? -EFAULT : 0;
 		kfree(dev_info);
diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index fe7f871e3080..f7fc7325f17f 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -1053,7 +1053,7 @@ struct drm_amdgpu_info_device {
 	__u32 enabled_rb_pipes_mask;
 	__u32 num_rb_pipes;
 	__u32 num_hw_gfx_contexts;
-	__u32 _pad;
+	__u32 pcie_gen;
 	__u64 ids_flags;
 	/** Starting virtual address for UMDs. */
 	__u64 virtual_address_offset;
@@ -1109,6 +1109,7 @@ struct drm_amdgpu_info_device {
 	__u64 high_va_max;
 	/* gfx10 pa_sc_tile_steering_override */
 	__u32 pa_sc_tile_steering_override;
+	__u32 pcie_num_lanes;
 	/* disabled TCCs */
 	__u64 tcc_disabled_mask;
 	__u64 min_engine_clock;
-- 
2.25.1



Re: [PATCH] drm/amdgpu: Always align dumb buffer at PAGE_SIZE

2022-09-23 Thread Marek Olšák
The kernel could report the true alignment from the ioctl instead of 0.

Marek

On Fri, Sep 23, 2022 at 1:31 AM Christian König 
wrote:

> Am 23.09.22 um 07:28 schrieb lepton:
> > On Thu, Sep 22, 2022 at 10:14 PM Christian König
> >  wrote:
> >> Am 23.09.22 um 01:04 schrieb Lepton Wu:
> >>> Since size has been aligned to PAGE_SIZE already, just align it
> >>> to PAGE_SIZE so later the buffer can be used as a texture in mesa
> >>> after
> https://cgit.freedesktop.org/mesa/mesa/commit/?id=f7a4051b8
> >>> Otherwise, si_texture_create_object will fail at line
> >>> "buf->alignment < tex->surface.alignment"
> >> I don't think that those Mesa checks are a good idea in the first place.
> >>
> >> The alignment value is often specified as zero when it doesn't matter
> >> because the minimum alignment can never be less than the page size.
> > Are you suggesting to change those mesa checks?
>
> Yes, the minimum alignment of allocations is always 4096 because that's
> the page size of the GPU.
>
> > While that can be done, I still think a kernel-side "fix" is still
> > useful, since it doesn't hurt and can fix issues for some versions of
> > mesa.
>
> No, we have tons of places where we don't specify an alignment for
> buffers because it never mattered. I certainly don't want to fix all of
> those.
>
> Regards,
> Christian.
>
> >> Christian.
> >>
> >>> Signed-off-by: Lepton Wu 
> >>> ---
> >>>drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 +-
> >>>1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> >>> index 8ef31d687ef3b..8dca0c920d3ce 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> >>> @@ -928,7 +928,7 @@ int amdgpu_mode_dumb_create(struct drm_file
> *file_priv,
> >>>args->size = ALIGN(args->size, PAGE_SIZE);
> >>>domain = amdgpu_bo_get_preferred_domain(adev,
> >>>amdgpu_display_supported_domains(adev,
> flags));
> >>> - r = amdgpu_gem_object_create(adev, args->size, 0, domain, flags,
> >>> + r = amdgpu_gem_object_create(adev, args->size, PAGE_SIZE,
> domain, flags,
> >>> ttm_bo_type_device, NULL, );
> >>>if (r)
> >>>return -ENOMEM;
>
>


Re: [PATCH 1/2] drm/amdgpu: add the IP discovery IP versions for HW INFO data

2022-07-19 Thread Marek Olšák
Reviewed-by: Marek Olšák 

for the series.

Marek

On Tue, Jul 19, 2022 at 3:53 PM Alex Deucher  wrote:

> Ping on this series.
>
> Alex
>
> On Fri, Jul 8, 2022 at 6:56 PM Alex Deucher 
> wrote:
> >
> > Use the former pad element to store the IP versions from the
> > IP discovery table.  This allows userspace to get the IP
> > version from the kernel to better align with hardware IP
> > versions.
> >
> > Proposed mesa patch:
> >
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17411/diffs?commit_id=c8a63590dfd0d64e6e6a634dcfed993f135dd075
> >
> > Signed-off-by: Alex Deucher 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 24 
> >  include/uapi/drm/amdgpu_drm.h   |  3 ++-
> >  2 files changed, 26 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> > index 4b44a4bc2fb3..7e03f3719d11 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> > @@ -473,6 +473,30 @@ static int amdgpu_hw_ip_info(struct amdgpu_device
> *adev,
> >
> > result->hw_ip_version_major = adev->ip_blocks[i].version->major;
> > result->hw_ip_version_minor = adev->ip_blocks[i].version->minor;
> > +
> > +   if (adev->asic_type >= CHIP_VEGA10) {
> > +   switch (type) {
> > +   case AMD_IP_BLOCK_TYPE_GFX:
> > +   result->ip_discovery_version =
> adev->ip_versions[GC_HWIP][0];
> > +   break;
> > +   case AMD_IP_BLOCK_TYPE_SDMA:
> > +   result->ip_discovery_version =
> adev->ip_versions[SDMA0_HWIP][0];
> > +   break;
> > +   case AMD_IP_BLOCK_TYPE_UVD:
> > +   case AMD_IP_BLOCK_TYPE_VCN:
> > +   case AMD_IP_BLOCK_TYPE_JPEG:
> > +   result->ip_discovery_version =
> adev->ip_versions[UVD_HWIP][0];
> > +   break;
> > +   case AMD_IP_BLOCK_TYPE_VCE:
> > +   result->ip_discovery_version =
> adev->ip_versions[VCE_HWIP][0];
> > +   break;
> > +   default:
> > +   result->ip_discovery_version = 0;
> > +   break;
> > +   }
> > +   } else {
> > +   result->ip_discovery_version = 0;
> > +   }
> > result->capabilities_flags = 0;
> > result->available_rings = (1 << num_rings) - 1;
> > result->ib_start_alignment = ib_start_alignment;
> > diff --git a/include/uapi/drm/amdgpu_drm.h
> b/include/uapi/drm/amdgpu_drm.h
> > index 18d3246d636e..3a2674b4a2d9 100644
> > --- a/include/uapi/drm/amdgpu_drm.h
> > +++ b/include/uapi/drm/amdgpu_drm.h
> > @@ -1093,7 +1093,8 @@ struct drm_amdgpu_info_hw_ip {
> > __u32  ib_size_alignment;
> > /** Bitmask of available rings. Bit 0 means ring 0, etc. */
> > __u32  available_rings;
> > -   __u32  _pad;
> > +   /** version info: bits 23:16 major, 15:8 minor, 7:0 revision */
> > +   __u32  ip_discovery_version;
> >  };
> >
> >  struct drm_amdgpu_info_num_handles {
> > --
> > 2.35.3
> >
>
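
For completeness, a sketch of how userspace might unpack the new field,
following the bit layout documented in the uAPI comment above (the helper
name is illustrative, not from the patch):

	#include <stdint.h>

	/* ip_discovery_version layout: bits 23:16 major, 15:8 minor,
	 * 7:0 revision. A value of 0 means the kernel did not report an
	 * IP discovery version for this block.
	 */
	static void decode_ip_discovery_version(uint32_t v, unsigned int *major,
						unsigned int *minor,
						unsigned int *rev)
	{
		*major = (v >> 16) & 0xff;
		*minor = (v >> 8) & 0xff;
		*rev = v & 0xff;
	}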


Re: [PATCH 1/3] drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE

2022-07-08 Thread Marek Olšák
Christian, should we set this flag for GDS too? Will it help with GDS OOM
failures?

Marek
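
For context, the flag under discussion is requested at buffer allocation
time. A minimal userspace sketch, assuming the uAPI as proposed in this
series and libdrm's generic ioctl helper (include paths and error handling
simplified):

	#include <stdint.h>
	#include <string.h>
	#include <xf86drm.h>
	#include <amdgpu_drm.h>	/* from libdrm, e.g. -I/usr/include/libdrm */

	/* Allocate a VRAM BO whose contents the kernel may discard on
	 * eviction instead of preserving them (sketch, not necessarily the
	 * final uAPI).
	 */
	static int alloc_discardable_vram(int fd, uint64_t size, uint32_t *handle)
	{
		union drm_amdgpu_gem_create args;
		int r;

		memset(&args, 0, sizeof(args));
		args.in.bo_size = size;
		args.in.alignment = 4096;
		args.in.domains = AMDGPU_GEM_DOMAIN_VRAM;
		args.in.domain_flags = AMDGPU_GEM_CREATE_DISCARDABLE;

		r = drmCommandWriteRead(fd, DRM_AMDGPU_GEM_CREATE,
					&args, sizeof(args));
		if (!r)
			*handle = args.out.handle;
		return r;
	}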

On Fri., May 13, 2022, 07:26 Christian König, <
ckoenig.leichtzumer...@gmail.com> wrote:

> Exactly that's what we can't do.
>
> See the kernel must always be able to move things to GTT or discard. So
> when you want to guarantee that something is in VRAM you must at the
> same time say you can discard it if it can't.
>
> Christian.
>
> Am 13.05.22 um 10:43 schrieb Pierre-Eric Pelloux-Prayer:
> > Hi Marek, Christian,
> >
> > If the main feature for Mesa of AMDGPU_GEM_CREATE_DISCARDABLE is
> > getting the best placement, maybe we should have 2 separate flags:
> >   * AMDGPU_GEM_CREATE_DISCARDABLE: indicates to the kernel that it can
> >> discard the content on eviction instead of preserving it
> >   * AMDGPU_GEM_CREATE_FORCE_BEST_PLACEMENT (or
> > AMDGPU_GEM_CREATE_NO_GTT_FALLBACK ? or AMDGPU_CREATE_GEM_AVOID_GTT?):
> > tells the kernel that this bo really needs to be in VRAM
> >
> >
> > Pierre-Eric
> >
> > On 13/05/2022 00:17, Marek Olšák wrote:
> >> Would it be better to set the VM_ALWAYS_VALID flag to have a greater
> >> guarantee that the best placement will be chosen?
> >>
> >> See, the main feature is getting the best placement, not being
> >> discardable. The best placement is a hw design requirement due to
> >> using memory for uses that are expected to have performance similar
> >> to onchip SRAMs. We need to make sure the best placement is
> >> guaranteed if it's VRAM.
> >>
> >> Marek
> >>
> >> On Thu., May 12, 2022, 03:26 Christian König,
> >>  >> <mailto:ckoenig.leichtzumer...@gmail.com>> wrote:
> >>
> >> Am 12.05.22 um 00:06 schrieb Marek Olšák:
> >>> 3rd question: Is it worth using this on APUs?
> >>
> >> It makes memory management somewhat easier when we are really OOM.
> >>
> >> E.g. it should also work for GTT allocations and when the core
> >> kernel says "Hey please free something up or I will start the
> >> OOM-killer" it's something we can easily throw away.
> >>
> >> Not sure how many of those buffers we have, but marking
> >> everything which is temporary with that flag is probably a good idea.
> >>
> >>>
> >>> Thanks,
> >>> Marek
> >>>
> >>> On Wed, May 11, 2022 at 5:58 PM Marek Olšák  >>> <mailto:mar...@gmail.com>> wrote:
> >>>
> >>> Will the kernel keep all discardable buffers in VRAM if VRAM
> >>> is not overcommitted by discardable buffers, or will other buffers
> >>> also affect the placement of discardable buffers?
> >>>
> >>
> >> Regarding the eviction pressure the buffers will be handled like
> >> any other buffer, but instead of preserving the content it is just
> >> discarded on eviction.
> >>
> >>>
> >>>     Do evictions deallocate the buffer, or do they keep an
> >>> allocation in GTT and only the copy is skipped?
> >>>
> >>
> >> It really deallocates the backing store of the buffer, just keeps
> >> a dummy page array around where all entries are NULL.
> >>
> >> There is a patch set on the mailing list to make this a little
> >> bit more efficient, but even using the dummy page array should only
> >> have a few bytes overhead.
> >>
> >> Regards,
> >> Christian.
> >>
> >>>
> >>> Thanks,
> >>> Marek
> >>>
> >>> On Wed, May 11, 2022 at 3:08 AM Marek Olšák
> >>> mailto:mar...@gmail.com>> wrote:
> >>>
> >>> OK that sounds good.
> >>>
> >>> Marek
> >>>
> >>> On Wed, May 11, 2022 at 2:04 AM Christian König
> >>>  >>> <mailto:ckoenig.leichtzumer...@gmail.com>> wrote:
> >>>
> >>> Hi Marek,
> >>>
> >>> Am 10.05.22 um 22:43 schrieb Marek Olšák:
> >>>> A better flag name would be:
> >>>> AMDGPU_GEM_CREATE_BEST_PLACEMENT_OR_DISCARD
> >>>
> >>> A bit long for my taste and I think the best
> >>> placement is just a side effect.
> >>>
> >>>>
> >>>>   

Re: [PATCH] drm/amd/display: expose additional modifier for DCN32/321

2022-06-28 Thread Marek Olšák
This needs to be a loop inserting all 64K_R_X and all 256K_R_X modifiers.

If num_pipes > 16, insert 256K_R_X first, else insert 64K_R_X first. Insert
the other one after that. For example:

  for (unsigned i = 0; i < 2; i++) {
     unsigned swizzle_r_x;

     /* Insert the best one first. */
     if (num_pipes > 16)
        swizzle_r_x = !i ? AMD_FMT_MOD_TILE_GFX11_256K_R_X :
                           AMD_FMT_MOD_TILE_GFX9_64K_R_X;
     else
        swizzle_r_x = !i ? AMD_FMT_MOD_TILE_GFX9_64K_R_X :
                           AMD_FMT_MOD_TILE_GFX11_256K_R_X;

     uint64_t modifier_r_x = ...

     add_modifier(...);
     add_modifier(...);
     add_modifier(...);
     add_modifier(...);
     add_modifier(...);
  }
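
A slightly more filled-in version of the same idea, reusing the modifier
construction and the add_modifier() helper visible in the patch below; the
parts that replace the elided arguments above are assumptions for
illustration, not the exact code being requested:

	for (unsigned int i = 0; i < 2; i++) {
		unsigned int swizzle_r_x;
		uint64_t modifier_r_x;

		/* Insert the most efficient swizzle mode first. */
		if (num_pipes > 16)
			swizzle_r_x = !i ? AMD_FMT_MOD_TILE_GFX11_256K_R_X :
					   AMD_FMT_MOD_TILE_GFX9_64K_R_X;
		else
			swizzle_r_x = !i ? AMD_FMT_MOD_TILE_GFX9_64K_R_X :
					   AMD_FMT_MOD_TILE_GFX11_256K_R_X;

		modifier_r_x = AMD_FMT_MOD |
			AMD_FMT_MOD_SET(TILE_VERSION, AMD_FMT_MOD_TILE_VER_GFX11) |
			AMD_FMT_MOD_SET(TILE, swizzle_r_x) |
			AMD_FMT_MOD_SET(PIPE_XOR_BITS, pipe_xor_bits) |
			AMD_FMT_MOD_SET(PACKERS, pkrs);

		/* The DCC variants would be derived from modifier_r_x here,
		 * the same way the existing add_gfx11_modifiers() code does,
		 * followed by the plain R_X modifier itself.
		 */
		add_modifier(mods, size, capacity, modifier_r_x);
	}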


Marek

On Mon, Jun 27, 2022 at 10:32 AM Aurabindo Pillai 
wrote:

> [Why]
> Some userspace expect a backwards compatible modifier on DCN32/321. For
> hardware with num_pipes more than 16, we expose the most efficient
> modifier first. As a fall back method, we need to expose slightly
> inefficient
> modifier AMD_FMT_MOD_TILE_GFX9_64K_R_X after the best option.
>
> Also set the number of packers to fixed value as required per hardware
> documentation. This value is cached during hardware initialization and
> can be read through the base driver.
>
> Fixes: 0a2c19562ffe ('Revert "drm/amd/display: ignore modifiers when
> checking for format support"')
> Signed-off-by: Aurabindo Pillai 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 3 +--
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 +++-
>  2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> index 1a512d78673a..0f5bfe5df627 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> @@ -743,8 +743,7 @@ static int convert_tiling_flags_to_modifier(struct
> amdgpu_framebuffer *afb)
> switch (version) {
> case AMD_FMT_MOD_TILE_VER_GFX11:
> pipe_xor_bits = min(block_size_bits - 8,
> pipes);
> -   packers = min(block_size_bits - 8 -
> pipe_xor_bits,
> -
>  ilog2(adev->gfx.config.gb_addr_config_fields.num_pkrs));
> +   packers =
> ilog2(adev->gfx.config.gb_addr_config_fields.num_pkrs);
> break;
> case AMD_FMT_MOD_TILE_VER_GFX10_RBPLUS:
> pipe_xor_bits = min(block_size_bits - 8,
> pipes);
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index c9145864ed2b..bea9cee37f65 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -5203,6 +5203,7 @@ add_gfx11_modifiers(struct amdgpu_device *adev,
> int pkrs = 0;
> u32 gb_addr_config;
> unsigned swizzle_r_x;
> +   uint64_t modifier_r_x_best;
> uint64_t modifier_r_x;
> uint64_t modifier_dcc_best;
> uint64_t modifier_dcc_4k;
> @@ -5223,10 +5224,12 @@ add_gfx11_modifiers(struct amdgpu_device *adev,
>
> modifier_r_x = AMD_FMT_MOD |
> AMD_FMT_MOD_SET(TILE_VERSION, AMD_FMT_MOD_TILE_VER_GFX11) |
> -   AMD_FMT_MOD_SET(TILE, swizzle_r_x) |
> AMD_FMT_MOD_SET(PIPE_XOR_BITS, pipe_xor_bits) |
> AMD_FMT_MOD_SET(PACKERS, pkrs);
>
> +   modifier_r_x_best = modifier_r_x | AMD_FMT_MOD_SET(TILE,
> AMD_FMT_MOD_TILE_GFX11_256K_R_X);
> +   modifier_r_x = modifier_r_x | AMD_FMT_MOD_SET(TILE,
> AMD_FMT_MOD_TILE_GFX9_64K_R_X);
> +
> /* DCC_CONSTANT_ENCODE is not set because it can't vary with gfx11
> (it's implied to be 1). */
> modifier_dcc_best = modifier_r_x |
> AMD_FMT_MOD_SET(DCC, 1) |
> @@ -5247,6 +5250,9 @@ add_gfx11_modifiers(struct amdgpu_device *adev,
> add_modifier(mods, size, capacity, modifier_dcc_best |
> AMD_FMT_MOD_SET(DCC_RETILE, 1));
> add_modifier(mods, size, capacity, modifier_dcc_4k |
> AMD_FMT_MOD_SET(DCC_RETILE, 1));
>
> +   if (num_pipes > 16)
> +   add_modifier(mods, size, capacity, modifier_r_x_best);
> +
> add_modifier(mods, size, capacity, modifier_r_x);
>
> add_modifier(mods, size, capacity, AMD_FMT_MOD |
> --
> 2.36.1
>
>


Re: [PATCH] drm/amd/display: ignore modifiers when checking for format support

2022-06-14 Thread Marek Olšák
We can reject invalid modifiers elsewhere, but it can't be done here, because
this hook is also the non-modifier path.

We expose 256KB_R_X or 64KB_R_X modifiers depending on chip-specific
settings, but not both. Only the optimal option is exposed. This is OK for
modifiers, but not OK with AMD-specific BO metadata where the UMD
determines the swizzle mode.

Marek

On Tue, Jun 14, 2022 at 8:38 PM Bas Nieuwenhuizen 
wrote:

> On Mon, Jun 13, 2022 at 1:47 PM Marek Olšák  wrote:
> >
> > Bas, the code was literally rejecting swizzle modes that were not in the
> modifier list, which was incorrect. That's because the modifier list is a
> subset of all supported swizzle modes.
>
> That was WAI. The kernel is now in charge of rejecting stuff that is
> not capable of being displayed.
>
> Allowing all in format_mod_supported has several implications on
> exposed & accepted modifiers as well, that should be avoided even if
> we should do a behavior change for non-modifiers: We now expose (i.e.
> list) modifiers for formats which they don't support and we removed
> the check that the modifier is in the list for commits with modifiers
> too. Hence this logic would need a serious rework instead of the patch
> that was sent.
>
> What combinations were failing, and can't we just add modifiers for them?
>
>
>
>
> >
> > Marek
> >
> > On Sun, Jun 12, 2022 at 7:54 PM Bas Nieuwenhuizen <
> b...@basnieuwenhuizen.nl> wrote:
> >>
> >> On Thu, Jun 9, 2022 at 4:27 PM Aurabindo Pillai
> >>  wrote:
> >> >
> >> > [Why]
> >> > There are cases where swizzle modes are set but modifiers arent. For
> >> > such a userspace, we need not check modifiers while checking
> >> > compatibilty in the drm hook for checking plane format.
> >> >
> >> > Ignore checking modifiers but check the DCN generation for the
> >> > supported swizzle mode.
> >> >
> >> > Signed-off-by: Aurabindo Pillai 
> >> > ---
> >> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 51
> +--
> >> >  1 file changed, 46 insertions(+), 5 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >> > index 2023baf41b7e..1322df491736 100644
> >> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >> > @@ -4938,6 +4938,7 @@ static bool
> dm_plane_format_mod_supported(struct drm_plane *plane,
> >> >  {
> >> > struct amdgpu_device *adev = drm_to_adev(plane->dev);
> >> > const struct drm_format_info *info = drm_format_info(format);
> >> > +   struct hw_asic_id asic_id = adev->dm.dc->ctx->asic_id;
> >> > int i;
> >> >
> >> > enum dm_micro_swizzle microtile =
> modifier_gfx9_swizzle_mode(modifier) & 3;
> >> > @@ -4955,13 +4956,53 @@ static bool
> dm_plane_format_mod_supported(struct drm_plane *plane,
> >> > return true;
> >> > }
> >> >
> >> > -   /* Check that the modifier is on the list of the plane's
> supported modifiers. */
> >> > -   for (i = 0; i < plane->modifier_count; i++) {
> >> > -   if (modifier == plane->modifiers[i])
> >> > +   /* check if swizzle mode is supported by this version of DCN
> */
> >> > +   switch (asic_id.chip_family) {
> >> > +   case FAMILY_SI:
> >> > +   case FAMILY_CI:
> >> > +   case FAMILY_KV:
> >> > +   case FAMILY_CZ:
> >> > +   case FAMILY_VI:
> >> > +   /* AI and earlier asics does not have
> modifier support */
> >> > +   return false;
> >> > +   break;
> >> > +   case FAMILY_AI:
> >> > +   case FAMILY_RV:
> >> > +   case FAMILY_NV:
> >> > +   case FAMILY_VGH:
> >> > +   case FAMILY_YELLOW_CARP:
> >> > +   case AMDGPU_FAMILY_GC_10_3_6:
> >> > +   case AMDGPU_FAMILY_GC_10_3_7:
> >> > +   switch (AMD_FMT_MOD_GET(TILE, modifier)) {
> >> > +   case AMD_FMT_MOD_TILE_GFX9_64K_R_X:
> >> > +   case AMD_FMT_MOD_TILE_GFX9_64K_D_

Re: [PATCH] drm/amd/display: ignore modifiers when checking for format support

2022-06-13 Thread Marek Olšák
Bas, the code was literally rejecting swizzle modes that were not in the
modifier list, which was incorrect. That's because the modifier list is a
subset of all supported swizzle modes.

Marek

On Sun, Jun 12, 2022 at 7:54 PM Bas Nieuwenhuizen 
wrote:

> On Thu, Jun 9, 2022 at 4:27 PM Aurabindo Pillai
>  wrote:
> >
> > [Why]
> > There are cases where swizzle modes are set but modifiers arent. For
> > such a userspace, we need not check modifiers while checking
> > compatibilty in the drm hook for checking plane format.
> >
> > Ignore checking modifiers but check the DCN generation for the
> > supported swizzle mode.
> >
> > Signed-off-by: Aurabindo Pillai 
> > ---
> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 51 +--
> >  1 file changed, 46 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 2023baf41b7e..1322df491736 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -4938,6 +4938,7 @@ static bool dm_plane_format_mod_supported(struct
> drm_plane *plane,
> >  {
> > struct amdgpu_device *adev = drm_to_adev(plane->dev);
> > const struct drm_format_info *info = drm_format_info(format);
> > +   struct hw_asic_id asic_id = adev->dm.dc->ctx->asic_id;
> > int i;
> >
> > enum dm_micro_swizzle microtile =
> modifier_gfx9_swizzle_mode(modifier) & 3;
> > @@ -4955,13 +4956,53 @@ static bool dm_plane_format_mod_supported(struct
> drm_plane *plane,
> > return true;
> > }
> >
> > -   /* Check that the modifier is on the list of the plane's
> supported modifiers. */
> > -   for (i = 0; i < plane->modifier_count; i++) {
> > -   if (modifier == plane->modifiers[i])
> > +   /* check if swizzle mode is supported by this version of DCN */
> > +   switch (asic_id.chip_family) {
> > +   case FAMILY_SI:
> > +   case FAMILY_CI:
> > +   case FAMILY_KV:
> > +   case FAMILY_CZ:
> > +   case FAMILY_VI:
> > +   /* AI and earlier asics does not have modifier
> support */
> > +   return false;
> > +   break;
> > +   case FAMILY_AI:
> > +   case FAMILY_RV:
> > +   case FAMILY_NV:
> > +   case FAMILY_VGH:
> > +   case FAMILY_YELLOW_CARP:
> > +   case AMDGPU_FAMILY_GC_10_3_6:
> > +   case AMDGPU_FAMILY_GC_10_3_7:
> > +   switch (AMD_FMT_MOD_GET(TILE, modifier)) {
> > +   case AMD_FMT_MOD_TILE_GFX9_64K_R_X:
> > +   case AMD_FMT_MOD_TILE_GFX9_64K_D_X:
> > +   case AMD_FMT_MOD_TILE_GFX9_64K_S_X:
> > +   case AMD_FMT_MOD_TILE_GFX9_64K_D:
> > +   return true;
> > +   break;
> > +   default:
> > +   return false;
> > +   break;
> > +   }
> > +   break;
> > +   case AMDGPU_FAMILY_GC_11_0_0:
> > +   switch (AMD_FMT_MOD_GET(TILE, modifier)) {
> > +   case AMD_FMT_MOD_TILE_GFX11_256K_R_X:
> > +   case AMD_FMT_MOD_TILE_GFX9_64K_R_X:
> > +   case AMD_FMT_MOD_TILE_GFX9_64K_D_X:
> > +   case AMD_FMT_MOD_TILE_GFX9_64K_S_X:
> > +   case AMD_FMT_MOD_TILE_GFX9_64K_D:
> > +   return true;
> > +   break;
> > +   default:
> > +   return false;
> > +   break;
> > +   }
> > +   break;
> > +   default:
> > +   ASSERT(0); /* Unknown asic */
> > break;
> > }
>
> This seems broken to me. AFAICT we always return in the switch so the
> code after this that checks for valid DCC usage isn't executed.
> Checking the list of modifiers is also essential to make sure other
> stuff in the modifier is set properly.
>
> If you have userspace that is not using modifiers that is giving you
> issues, a better place to look might be
> convert_tiling_flags_to_modifier in amdgpu_display.c
>
> > -   if (i == plane->modifier_count)
> > -   return false;
> >
> > /*
> >  * For D swizzle the canonical modifier depends on the bpp, so
> check
> > --
> > 2.36.1
> >
>


Re: [PATCH] drm/amd/display: ignore modifiers when checking for format support

2022-06-09 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Jun 9, 2022 at 10:27 AM Aurabindo Pillai 
wrote:

> [Why]
> There are cases where swizzle modes are set but modifiers arent. For
> such a userspace, we need not check modifiers while checking
> compatibilty in the drm hook for checking plane format.
>
> Ignore checking modifiers but check the DCN generation for the
> supported swizzle mode.
>
> Signed-off-by: Aurabindo Pillai 
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 51 +--
>  1 file changed, 46 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 2023baf41b7e..1322df491736 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -4938,6 +4938,7 @@ static bool dm_plane_format_mod_supported(struct
> drm_plane *plane,
>  {
> struct amdgpu_device *adev = drm_to_adev(plane->dev);
> const struct drm_format_info *info = drm_format_info(format);
> +   struct hw_asic_id asic_id = adev->dm.dc->ctx->asic_id;
> int i;
>
> enum dm_micro_swizzle microtile =
> modifier_gfx9_swizzle_mode(modifier) & 3;
> @@ -4955,13 +4956,53 @@ static bool dm_plane_format_mod_supported(struct
> drm_plane *plane,
> return true;
> }
>
> -   /* Check that the modifier is on the list of the plane's supported
> modifiers. */
> -   for (i = 0; i < plane->modifier_count; i++) {
> -   if (modifier == plane->modifiers[i])
> +   /* check if swizzle mode is supported by this version of DCN */
> +   switch (asic_id.chip_family) {
> +   case FAMILY_SI:
> +   case FAMILY_CI:
> +   case FAMILY_KV:
> +   case FAMILY_CZ:
> +   case FAMILY_VI:
> +   /* AI and earlier asics does not have modifier
> support */
> +   return false;
> +   break;
> +   case FAMILY_AI:
> +   case FAMILY_RV:
> +   case FAMILY_NV:
> +   case FAMILY_VGH:
> +   case FAMILY_YELLOW_CARP:
> +   case AMDGPU_FAMILY_GC_10_3_6:
> +   case AMDGPU_FAMILY_GC_10_3_7:
> +   switch (AMD_FMT_MOD_GET(TILE, modifier)) {
> +   case AMD_FMT_MOD_TILE_GFX9_64K_R_X:
> +   case AMD_FMT_MOD_TILE_GFX9_64K_D_X:
> +   case AMD_FMT_MOD_TILE_GFX9_64K_S_X:
> +   case AMD_FMT_MOD_TILE_GFX9_64K_D:
> +   return true;
> +   break;
> +   default:
> +   return false;
> +   break;
> +   }
> +   break;
> +   case AMDGPU_FAMILY_GC_11_0_0:
> +   switch (AMD_FMT_MOD_GET(TILE, modifier)) {
> +   case AMD_FMT_MOD_TILE_GFX11_256K_R_X:
> +   case AMD_FMT_MOD_TILE_GFX9_64K_R_X:
> +   case AMD_FMT_MOD_TILE_GFX9_64K_D_X:
> +   case AMD_FMT_MOD_TILE_GFX9_64K_S_X:
> +   case AMD_FMT_MOD_TILE_GFX9_64K_D:
> +   return true;
> +   break;
> +   default:
> +   return false;
> +   break;
> +   }
> +   break;
> +   default:
> +   ASSERT(0); /* Unknown asic */
> break;
> }
> -   if (i == plane->modifier_count)
> -   return false;
>
> /*
>  * For D swizzle the canonical modifier depends on the bpp, so
> check
> --
> 2.36.1
>
>


Re: Explicit VM updates

2022-06-01 Thread Marek Olšák
Can you please summarize what this is about?

Thanks,
Marek

On Wed, Jun 1, 2022 at 8:40 AM Christian König 
wrote:

> Hey guys,
>
> so today Bas came up with a new requirement regarding the explicit
> synchronization to VM updates and a bunch of prototype patches.
>
> I've been thinking about that stuff for quite some time before, but to
> be honest it's one of the most trickiest parts of the driver.
>
> So my current thinking is that we could potentially handle those
> requirements like this:
>
> 1. We add some new EXPLICIT flag to context (or CS?) and VM IOCTL. This
> way we either get the new behavior for the whole CS+VM or the old one,
> but never both mixed.
>
> 2. When memory is unmapped we keep around the last unmap operation
> inside the bo_va.
>
> 3. When memory is freed we add all the CS fences which could access that
> memory + the last unmap operation as BOOKKEEP fences to the BO and as
> mandatory sync fence to the VM.
>
> Memory freed either because of an eviction or because of userspace
> closing the handle will be seen as a combination of unmap+free.
>
>
> The result is the following semantic for userspace to avoid implicit
> synchronization as much as possible:
>
> 1. When you allocate and map memory it is mandatory to either wait for
> the mapping operation to complete or to add it as dependency for your CS.
>  If this isn't followed the application will run into CS faults
> (that's what we pretty much already implemented).
>
> 2. When memory is freed you must unmap that memory first and then wait
> for this unmap operation to complete before freeing the memory.
>  If this isn't followed the kernel will add a forcefully wait to the
> next CS to block until the unmap is completed.
>
> 3. All VM operations requested by userspace will still be executed in
> order, e.g. we can't run unmap + map in parallel or something like this.
>
> Is that something you guys can live with? As far as I can see it should
> give you the maximum freedom possible, but is still doable.
>
> Regards,
> Christian.
>
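
For anyone skimming the thread, the userspace contract Christian describes
boils down to roughly the following flow. This is pseudo-code: the EXPLICIT
flag and the exact fence plumbing are proposals in this thread, not existing
uAPI, and the helper names are placeholders.

	/* 1. Allocate and map, then either wait for the map to complete or
	 *    make it an explicit dependency of the first CS that uses it,
	 *    otherwise the submission can run into CS faults.
	 */
	bo = gem_create(...);
	map_fence = vm_map(vm, bo, va);
	cs_submit(ctx, ibs, /* in-fence: */ map_fence);

	/* 2. Before freeing, unmap and wait for the unmap to complete;
	 *    if userspace skips this, the kernel forces the next CS to
	 *    block until the unmap has finished.
	 */
	unmap_fence = vm_unmap(vm, bo, va);
	fence_wait(unmap_fence);
	gem_close(bo);

	/* 3. VM operations requested by userspace still execute in order;
	 *    unmap and map of the same range never run in parallel.
	 */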


Re: [PATCH 14/43] drm/amd: Add GFX11 modifiers support to AMDGPU

2022-05-25 Thread Marek Olšák
On Wed, May 25, 2022 at 12:20 PM Alex Deucher 
wrote:

> From: Aurabindo Pillai 
>
> GFX11 IP introduces new tiling mode. Various combinations of DCC
> settings are possible and the most preferred settings must be exposed
> for optimal use of the hardware.
>
> add_gfx11_modifiers() is based on recommendation from Marek for the
> preferred tiling modifier that are most efficient for the hardware.
>
> Signed-off-by: Aurabindo Pillai 
> Acked-by: Alex Deucher 
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 40 --
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 74 ++-
>  include/uapi/drm/drm_fourcc.h |  2 +
>  3 files changed, 108 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> index ec395fe427f2..a54081a89282 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> @@ -30,6 +30,9 @@
>  #include "atom.h"
>  #include "amdgpu_connectors.h"
>  #include "amdgpu_display.h"
> +#include "soc15_common.h"
> +#include "gc/gc_11_0_0_offset.h"
> +#include "gc/gc_11_0_0_sh_mask.h"
>  #include 
>
>  #include 
> @@ -662,6 +665,11 @@ static int convert_tiling_flags_to_modifier(struct
> amdgpu_framebuffer *afb)
>  {
> struct amdgpu_device *adev = drm_to_adev(afb->base.dev);
> uint64_t modifier = 0;
> +   int num_pipes = 0;
> +   int num_pkrs = 0;
> +
> +   num_pkrs = adev->gfx.config.gb_addr_config_fields.num_pkrs;
> +   num_pipes = adev->gfx.config.gb_addr_config_fields.num_pipes;
>
> if (!afb->tiling_flags || !AMDGPU_TILING_GET(afb->tiling_flags,
> SWIZZLE_MODE)) {
> modifier = DRM_FORMAT_MOD_LINEAR;
> @@ -674,7 +682,7 @@ static int convert_tiling_flags_to_modifier(struct
> amdgpu_framebuffer *afb)
> int bank_xor_bits = 0;
> int packers = 0;
> int rb = 0;
> -   int pipes =
> ilog2(adev->gfx.config.gb_addr_config_fields.num_pipes);
> +   int pipes = ilog2(num_pipes);
> uint32_t dcc_offset = AMDGPU_TILING_GET(afb->tiling_flags,
> DCC_OFFSET_256B);
>
> switch (swizzle >> 2) {
> @@ -690,12 +698,17 @@ static int convert_tiling_flags_to_modifier(struct
> amdgpu_framebuffer *afb)
> case 6: /* 64 KiB _X */
> block_size_bits = 16;
> break;
> +   case 7: /* 256 KiB */
> +   block_size_bits = 18;
> +   break;
> default:
> /* RESERVED or VAR */
> return -EINVAL;
> }
>
> -   if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 3, 0))
> +   if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(11, 0, 0))
> +   version = AMD_FMT_MOD_TILE_VER_GFX11;
> +   else if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10,
> 3, 0))
> version = AMD_FMT_MOD_TILE_VER_GFX10_RBPLUS;
> else if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10,
> 0, 0))
> version = AMD_FMT_MOD_TILE_VER_GFX10;
> @@ -718,7 +731,17 @@ static int convert_tiling_flags_to_modifier(struct
> amdgpu_framebuffer *afb)
> }
>

The switch statement right above this hunk (not shown in this patch), which
changes "version", should be skipped on >= gfx11. Under no circumstances
should the version be changed on gfx11.

Marek


>
> if (has_xor) {
> +   if (num_pipes == num_pkrs && num_pkrs == 0) {
> +   DRM_ERROR("invalid number of pipes and
> packers\n");
> +   return -EINVAL;
> +   }
> +
> switch (version) {
> +   case AMD_FMT_MOD_TILE_VER_GFX11:
> +   pipe_xor_bits = min(block_size_bits - 8,
> pipes);
> +   packers = min(block_size_bits - 8 -
> pipe_xor_bits,
> +
>  ilog2(adev->gfx.config.gb_addr_config_fields.num_pkrs));
> +   break;
> case AMD_FMT_MOD_TILE_VER_GFX10_RBPLUS:
> pipe_xor_bits = min(block_size_bits - 8,
> pipes);
> packers = min(block_size_bits - 8 -
> pipe_xor_bits,
> @@ -752,9 +775,10 @@ static int convert_tiling_flags_to_modifier(struct
> amdgpu_framebuffer *afb)
> u64 render_dcc_offset;
>
> /* Enable constant encode on RAVEN2 and later. */
> -   bool dcc_constant_encode = adev->asic_type >
> CHIP_RAVEN ||
> +   bool dcc_constant_encode = (adev->asic_type >
> CHIP_RAVEN ||
>(adev->asic_type ==
> CHIP_RAVEN &&
> -  

Re: [PATCH] drm/amdgpu: Adjust logic around GTT size (v3)

2022-05-25 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Fri, May 20, 2022 at 11:09 AM Alex Deucher 
wrote:

> Certain GL unit tests for large textures can cause problems
> with the OOM killer since there is no way to link this memory
> to a process.  This was originally mitigated (but not necessarily
> eliminated) by limiting the GTT size.  The problem is this limit
> is often too low for many modern games so just make the limit 1/2
> of system memory. The OOM accounting needs to be addressed, but
> we shouldn't prevent common 3D applications from being usable
> just to potentially mitigate that corner case.
>
> Set default GTT size to max(3G, 1/2 of system ram) by default.
>
> v2: drop previous logic and default to 3/4 of ram
> v3: default to half of ram to align with ttm
>
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 ++--
>  1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index d2b5cccb45c3..7195ed77c85a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1798,18 +1798,26 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
> DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
>  (unsigned) (adev->gmc.real_vram_size / (1024 * 1024)));
>
> -   /* Compute GTT size, either bsaed on 3/4th the size of RAM size
> +   /* Compute GTT size, either bsaed on 1/2 the size of RAM size
>  * or whatever the user passed on module init */
> if (amdgpu_gtt_size == -1) {
> struct sysinfo si;
>
> si_meminfo(&si);
> -   gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> -  adev->gmc.mc_vram_size),
> -  ((uint64_t)si.totalram * si.mem_unit *
> 3/4));
> -   }
> -   else
> +   /* Certain GL unit tests for large textures can cause
> problems
> +* with the OOM killer since there is no way to link this
> memory
> +* to a process.  This was originally mitigated (but not
> necessarily
> +* eliminated) by limiting the GTT size.  The problem is
> this limit
> +* is often too low for many modern games so just make the
> limit 1/2
> +* of system memory which aligns with TTM. The OOM
> accounting needs
> +* to be addressed, but we shouldn't prevent common 3D
> applications
> +* from being usable just to potentially mitigate that
> corner case.
> +*/
> +   gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> +  (u64)si.totalram * si.mem_unit / 2);
> +   } else {
> gtt_size = (uint64_t)amdgpu_gtt_size << 20;
> +   }
>
> /* Initialize GTT memory pool */
> r = amdgpu_gtt_mgr_init(adev, gtt_size);
> --
> 2.35.3
>
>


Re: [PATCH] drm/amdgpu: Adjust logic around GTT size

2022-05-20 Thread Marek Olšák
We don't have to care about case 2 here. Broken apps will be handled by app
profiles. The problem is that, on the most powerful consumer APU we've ever
made (Rembrandt), the current limit breaks precisely the games that the APU
was made for, and instead of increasing the limit, we keep arguing about TTM
details that don't help anything right now.

Marek

On Fri., May 20, 2022, 14:25 Christian König, <
ckoenig.leichtzumer...@gmail.com> wrote:

> Am 20.05.22 um 19:41 schrieb Bas Nieuwenhuizen:
> > On Fri, May 20, 2022 at 11:42 AM Christian König
> >  wrote:
> >> In theory we should allow much more than that. The problem is just that
> we can't.
> >>
> >> We have the following issues:
> >> 1. For swapping out stuff we need to make sure that we can allocate
> temporary pages.
> >>  Because of this TTM has a fixed 50% limit where it starts to unmap
> memory from GPUs.
> >>  So currently even with a higher GTT limit you can't actually use
> this.
> >>
> >> 2. Apart from the test case of allocating textures with increasing
> power of two until it fails we also have a bunch of extremely stupid
> applications.
> >>  E.g. stuff like looking at the amount of memory available and
> trying preallocate everything.
> > I hear you but we also have an increasing number of games that don't
> > even start with 3 GiB. At some point (which I'd speculate has already
> > been reached, I've seen a number of complaints of games that ran on
> > deck but not on desktop linux because on deck we set amdgpu.gttsize)
> > we have more games broken due to the low limit than there would be
> > apps broken due to a high limit.
>
> That's a really good argument, but the issue is that it is fixable. It's
> just that nobody had time to look into all the issues.
>
> I started efforts to fix this years ago, but there was always something
> more important going on.
>
> So we are left with the choice of breaking old applications or new
> applications or getting somebody working on fixing this.
>
> Christian.
>
> >
> >> I'm working on this for years, but there aren't easy solutions to those
> issues. Felix has opted out for adding a separate domain for KFD
> allocations, but sooner or later we need to find a solution which works for
> everybody.
> >>
> >> Christian.
> >>
> >> Am 20.05.22 um 11:14 schrieb Marek Olšák:
> >>
> >> Ignore the silly tests. We only need to make sure games work. The
> current minimum requirement for running modern games is 8GB of GPU memory.
> Soon it will be 12GB. APUs will need to support that.
> >>
> >> Marek
> >>
> >> On Fri., May 20, 2022, 03:52 Christian König, <
> ckoenig.leichtzumer...@gmail.com> wrote:
> >>> Am 19.05.22 um 16:34 schrieb Alex Deucher:
> >>>> The current somewhat strange logic is in place because certain
> >>>> GL unit tests for large textures can cause problems with the
> >>>> OOM killer since there is no way to link this memory to a
> >>>> process.  The problem is this limit is often too low for many
> >>>> modern games on systems with more memory so limit the logic to
> >>>> systems with less than 8GB of main memory.  For systems with 8
> >>>> or more GB of system memory, set the GTT size to 3/4 of system
> >>>> memory.
> >>> It's unfortunately not only the unit tests, but some games as well.
> >>>
> >>> 3/4 of total system memory sounds reasonable to me, but I'm 100% sure
> >>> that this will break some tests.
> >>>
> >>> Christian.
> >>>
> >>>> Signed-off-by: Alex Deucher 
> >>>> ---
> >>>>drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25
> -
> >>>>1 file changed, 20 insertions(+), 5 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>> index 4b9ee6e27f74..daa0babcf869 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>> @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device
> *adev)
> >>>>/* Compute GTT size, either bsaed on 3/4th the size of RAM size
> >>>> * or whatever the user passed on module init */
> >>>>if (amdgpu_gtt_size == -1) {
> >>>> + const u64 eight_GB = 8192ULL * 1024 * 1024;

Re: [PATCH] drm/amdgpu: Adjust logic around GTT size

2022-05-20 Thread Marek Olšák
1. So make gtt = ram/2. There's your 50%.

2. Not our problem.

Marek

On Fri., May 20, 2022, 05:42 Christian König, <
ckoenig.leichtzumer...@gmail.com> wrote:

> In theory we should allow much more than that. The problem is just that we
> can't.
>
> We have the following issues:
> 1. For swapping out stuff we need to make sure that we can allocate
> temporary pages.
> Because of this TTM has a fixed 50% limit where it starts to unmap
> memory from GPUs.
> So currently even with a higher GTT limit you can't actually use this.
>
> 2. Apart from the test case of allocating textures with increasing power
> of two until it fails we also have a bunch of extremely stupid applications.
> E.g. stuff like looking at the amount of memory available and trying
> preallocate everything.
>
> I'm working on this for years, but there aren't easy solutions to those
> issues. Felix has opted out for adding a separate domain for KFD
> allocations, but sooner or later we need to find a solution which works for
> everybody.
>
> Christian.
>
> Am 20.05.22 um 11:14 schrieb Marek Olšák:
>
> Ignore the silly tests. We only need to make sure games work. The current
> minimum requirement for running modern games is 8GB of GPU memory. Soon it
> will be 12GB. APUs will need to support that.
>
> Marek
>
> On Fri., May 20, 2022, 03:52 Christian König, <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> Am 19.05.22 um 16:34 schrieb Alex Deucher:
>> > The current somewhat strange logic is in place because certain
>> > GL unit tests for large textures can cause problems with the
>> > OOM killer since there is no way to link this memory to a
>> > process.  The problem is this limit is often too low for many
>> > modern games on systems with more memory so limit the logic to
>> > systems with less than 8GB of main memory.  For systems with 8
>> > or more GB of system memory, set the GTT size to 3/4 of system
>> > memory.
>>
>> It's unfortunately not only the unit tests, but some games as well.
>>
>> 3/4 of total system memory sounds reasonable to me, but I'm 100% sure
>> that this will break some tests.
>>
>> Christian.
>>
>> >
>> > Signed-off-by: Alex Deucher 
>> > ---
>> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25 -
>> >   1 file changed, 20 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> > index 4b9ee6e27f74..daa0babcf869 100644
>> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> > @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>> >   /* Compute GTT size, either bsaed on 3/4th the size of RAM size
>> >* or whatever the user passed on module init */
>> >   if (amdgpu_gtt_size == -1) {
>> > + const u64 eight_GB = 8192ULL * 1024 * 1024;
>> >   struct sysinfo si;
>> > + u64 total_memory, default_gtt_size;
>> >
>> >   si_meminfo(&si);
>> > - gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>> > -adev->gmc.mc_vram_size),
>> > -((uint64_t)si.totalram * si.mem_unit *
>> 3/4));
>> > - }
>> > - else
>> > + total_memory = (u64)si.totalram * si.mem_unit;
>> > + default_gtt_size = total_memory * 3 / 4;
>> > + /* This somewhat strange logic is in place because
>> certain GL unit
>> > +  * tests for large textures can cause problems with the
>> OOM killer
>> > +  * since there is no way to link this memory to a process.
>> > +  * The problem is this limit is often too low for many
>> modern games
>> > +  * on systems with more memory so limit the logic to
>> systems with
>> > +  * less than 8GB of main memory.
>> > +  */
>> > + if (total_memory < eight_GB) {
>> > + gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB <<
>> 20),
>> > +adev->gmc.mc_vram_size),
>> > +default_gtt_size);
>> > + } else {
>> > + gtt_size = default_gtt_size;
>> > + }
>> > + } else {
>> >   gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>> > + }
>> >
>> >   /* Initialize GTT memory pool */
>> >   r = amdgpu_gtt_mgr_init(adev, gtt_size);
>>
>>
>


Re: [PATCH] drm/amdgpu: Adjust logic around GTT size

2022-05-20 Thread Marek Olšák
Ignore the silly tests. We only need to make sure games work. The current
minimum requirement for running modern games is 8GB of GPU memory. Soon it
will be 12GB. APUs will need to support that.

Marek

On Fri., May 20, 2022, 03:52 Christian König, <
ckoenig.leichtzumer...@gmail.com> wrote:

> Am 19.05.22 um 16:34 schrieb Alex Deucher:
> > The current somewhat strange logic is in place because certain
> > GL unit tests for large textures can cause problems with the
> > OOM killer since there is no way to link this memory to a
> > process.  The problem is this limit is often too low for many
> > modern games on systems with more memory so limit the logic to
> > systems with less than 8GB of main memory.  For systems with 8
> > or more GB of system memory, set the GTT size to 3/4 of system
> > memory.
>
> It's unfortunately not only the unit tests, but some games as well.
>
> 3/4 of total system memory sounds reasonable to me, but I'm 100% sure
> that this will break some tests.
>
> Christian.
>
> >
> > Signed-off-by: Alex Deucher 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25 -
> >   1 file changed, 20 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index 4b9ee6e27f74..daa0babcf869 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
> >   /* Compute GTT size, either bsaed on 3/4th the size of RAM size
> >* or whatever the user passed on module init */
> >   if (amdgpu_gtt_size == -1) {
> > + const u64 eight_GB = 8192ULL * 1024 * 1024;
> >   struct sysinfo si;
> > + u64 total_memory, default_gtt_size;
> >
> >   si_meminfo(&si);
> > - gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> > -adev->gmc.mc_vram_size),
> > -((uint64_t)si.totalram * si.mem_unit *
> 3/4));
> > - }
> > - else
> > + total_memory = (u64)si.totalram * si.mem_unit;
> > + default_gtt_size = total_memory * 3 / 4;
> > + /* This somewhat strange logic is in place because certain
> GL unit
> > +  * tests for large textures can cause problems with the
> OOM killer
> > +  * since there is no way to link this memory to a process.
> > +  * The problem is this limit is often too low for many
> modern games
> > +  * on systems with more memory so limit the logic to
> systems with
> > +  * less than 8GB of main memory.
> > +  */
> > + if (total_memory < eight_GB) {
> > + gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB <<
> 20),
> > +adev->gmc.mc_vram_size),
> > +default_gtt_size);
> > + } else {
> > + gtt_size = default_gtt_size;
> > + }
> > + } else {
> >   gtt_size = (uint64_t)amdgpu_gtt_size << 20;
> > + }
> >
> >   /* Initialize GTT memory pool */
> >   r = amdgpu_gtt_mgr_init(adev, gtt_size);
>
>


Re: [PATCH 2/3] drm/amdgpu: add AMDGPU_VM_NOALLOC

2022-05-16 Thread Marek Olšák
Dmesg doesn't contain anything. There is no backtrace because it's not a
crash. The VA map ioctl just fails with the new flag. It looks like the
flag is considered invalid.

Marek
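
For context, the failing operation is roughly the following GEM VA mapping.
This is a sketch assuming the AMDGPU_VM_NOALLOC value from the updated
amdgpu_drm.h in this series; the other arguments and the function name are
placeholders.

	#include <stdint.h>
	#include <string.h>
	#include <xf86drm.h>
	#include <amdgpu_drm.h>	/* from libdrm, e.g. -I/usr/include/libdrm */

	static int map_bo_noalloc(int fd, uint32_t bo_handle, uint64_t gpu_va,
				  uint64_t bo_size)
	{
		struct drm_amdgpu_gem_va va;

		memset(&va, 0, sizeof(va));
		va.handle = bo_handle;
		va.operation = AMDGPU_VA_OP_MAP;
		va.flags = AMDGPU_VM_PAGE_READABLE | AMDGPU_VM_PAGE_WRITEABLE |
			   AMDGPU_VM_PAGE_EXECUTABLE | AMDGPU_VM_NOALLOC;
		va.va_address = gpu_va;
		va.offset_in_bo = 0;
		va.map_size = bo_size;

		/* With AMDGPU_VM_NOALLOC set, this ioctl currently returns an
		 * error, as if the flag were rejected as invalid.
		 */
		return drmCommandWriteRead(fd, DRM_AMDGPU_GEM_VA, &va, sizeof(va));
	}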

On Mon., May 16, 2022, 12:13 Christian König, <
ckoenig.leichtzumer...@gmail.com> wrote:

> I don't have access to any gfx10 hardware.
>
> Can you give me a dmesg and/or backtrace, etc..?
>
> I can't push this unless it's working properly.
>
> Christian.
>
> Am 16.05.22 um 14:56 schrieb Marek Olšák:
>
> Reproduction steps:
> - use mesa/main on gfx10.3 (not sure what other GPUs do)
> - run: radeonsi_mall_noalloc=true glxgears
>
> Marek
>
> On Mon, May 16, 2022 at 7:53 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> Crap, do you have a link to the failure?
>>
>> Am 16.05.22 um 13:10 schrieb Marek Olšák:
>>
>> I forgot to say: The NOALLOC flag causes an allocation failure, so there
>> is a kernel bug somewhere.
>>
>> Marek
>>
>> On Mon, May 16, 2022 at 7:06 AM Marek Olšák  wrote:
>>
>>> FYI, I think it's time to merge this because the Mesa commits are going
>>> to be merged in ~30 minutes if Gitlab CI is green, and that includes
>>> updated amdgpu_drm.h.
>>>
>>> Marek
>>>
>>> On Wed, May 11, 2022 at 2:55 PM Marek Olšák  wrote:
>>>
>>>> Ok sounds good.
>>>>
>>>> Marek
>>>>
>>>> On Wed., May 11, 2022, 03:43 Christian König, <
>>>> ckoenig.leichtzumer...@gmail.com> wrote:
>>>>
>>>>> It really *is* a NOALLOC feature. In other words there is no latency
>>>>> improvement on reads because the cache is always checked, even with the
>>>>> noalloc flag set.
>>>>>
>>>>> The only thing it affects is that misses not enter the cache and so
>>>>> don't cause any additional pressure on evicting cache lines.
>>>>>
>>>>> You might want to double check with the hardware guys, but I'm
>>>>> something like 95% sure that it works this way.
>>>>>
>>>>> Christian.
>>>>>
>>>>> Am 11.05.22 um 09:22 schrieb Marek Olšák:
>>>>>
>>>>> Bypass means that the contents of the cache are ignored, which
>>>>> decreases latency at the cost of no coherency between bypassed and normal
>>>>> memory requests. NOA (noalloc) means that the cache is checked and can 
>>>>> give
>>>>> you cache hits, but misses are not cached and the overall latency is
>>>>> higher. I don't know what the hw does, but I hope it was misnamed and it
>>>>> really means bypass because there is no point in doing cache lookups on
>>>>> every memory request if the driver wants to disable caching to *decrease*
>>>>> latency in the situations when the cache isn't helping.
>>>>>
>>>>> Marek
>>>>>
>>>>> On Wed, May 11, 2022 at 2:15 AM Lazar, Lijo 
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On 5/11/2022 11:36 AM, Christian König wrote:
>>>>>> > Mhm, it doesn't really bypass MALL. It just doesn't allocate any
>>>>>> MALL
>>>>>> > entries on write.
>>>>>> >
>>>>>> > How about AMDGPU_VM_PAGE_NO_MALL ?
>>>>>>
>>>>>> One more - AMDGPU_VM_PAGE_LLC_* [ LLC = last level cache, * = some
>>>>>> sort
>>>>>> of attribute which decides LLC behaviour]
>>>>>>
>>>>>> Thanks,
>>>>>> Lijo
>>>>>>
>>>>>> >
>>>>>> > Christian.
>>>>>> >
>>>>>> > Am 10.05.22 um 23:21 schrieb Marek Olšák:
>>>>>> >> A better name would be:
>>>>>> >> AMDGPU_VM_PAGE_BYPASS_MALL
>>>>>> >>
>>>>>> >> Marek
>>>>>> >>
>>>>>> >> On Fri, May 6, 2022 at 7:23 AM Christian König
>>>>>> >>  wrote:
>>>>>> >>
>>>>>> >> Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL
>>>>>> >> allocation.
>>>>>> >>
>>>>>> >> Only compile tested!
>>>>>> >>
>>>>>> >> Signed-off-by: Christian König 
>>&g

Re: [PATCH 2/3] drm/amdgpu: add AMDGPU_VM_NOALLOC

2022-05-16 Thread Marek Olšák
Reproduction steps:
- use mesa/main on gfx10.3 (not sure what other GPUs do)
- run: radeonsi_mall_noalloc=true glxgears

Marek

On Mon, May 16, 2022 at 7:53 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Crap, do you have a link to the failure?
>
> Am 16.05.22 um 13:10 schrieb Marek Olšák:
>
> I forgot to say: The NOALLOC flag causes an allocation failure, so there
> is a kernel bug somewhere.
>
> Marek
>
> On Mon, May 16, 2022 at 7:06 AM Marek Olšák  wrote:
>
>> FYI, I think it's time to merge this because the Mesa commits are going
>> to be merged in ~30 minutes if Gitlab CI is green, and that includes
>> updated amdgpu_drm.h.
>>
>> Marek
>>
>> On Wed, May 11, 2022 at 2:55 PM Marek Olšák  wrote:
>>
>>> Ok sounds good.
>>>
>>> Marek
>>>
>>> On Wed., May 11, 2022, 03:43 Christian König, <
>>> ckoenig.leichtzumer...@gmail.com> wrote:
>>>
>>>> It really *is* a NOALLOC feature. In other words there is no latency
>>>> improvement on reads because the cache is always checked, even with the
>>>> noalloc flag set.
>>>>
>>>> The only thing it affects is that misses not enter the cache and so
>>>> don't cause any additional pressure on evicting cache lines.
>>>>
>>>> You might want to double check with the hardware guys, but I'm
>>>> something like 95% sure that it works this way.
>>>>
>>>> Christian.
>>>>
>>>> Am 11.05.22 um 09:22 schrieb Marek Olšák:
>>>>
>>>> Bypass means that the contents of the cache are ignored, which
>>>> decreases latency at the cost of no coherency between bypassed and normal
>>>> memory requests. NOA (noalloc) means that the cache is checked and can give
>>>> you cache hits, but misses are not cached and the overall latency is
>>>> higher. I don't know what the hw does, but I hope it was misnamed and it
>>>> really means bypass because there is no point in doing cache lookups on
>>>> every memory request if the driver wants to disable caching to *decrease*
>>>> latency in the situations when the cache isn't helping.
>>>>
>>>> Marek
>>>>
>>>> On Wed, May 11, 2022 at 2:15 AM Lazar, Lijo  wrote:
>>>>
>>>>>
>>>>>
>>>>> On 5/11/2022 11:36 AM, Christian König wrote:
>>>>> > Mhm, it doesn't really bypass MALL. It just doesn't allocate any
>>>>> MALL
>>>>> > entries on write.
>>>>> >
>>>>> > How about AMDGPU_VM_PAGE_NO_MALL ?
>>>>>
>>>>> One more - AMDGPU_VM_PAGE_LLC_* [ LLC = last level cache, * = some
>>>>> sort
>>>>> of attribute which decides LLC behaviour]
>>>>>
>>>>> Thanks,
>>>>> Lijo
>>>>>
>>>>> >
>>>>> > Christian.
>>>>> >
>>>>> > Am 10.05.22 um 23:21 schrieb Marek Olšák:
>>>>> >> A better name would be:
>>>>> >> AMDGPU_VM_PAGE_BYPASS_MALL
>>>>> >>
>>>>> >> Marek
>>>>> >>
>>>>> >> On Fri, May 6, 2022 at 7:23 AM Christian König
>>>>> >>  wrote:
>>>>> >>
>>>>> >> Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL
>>>>> >> allocation.
>>>>> >>
>>>>> >> Only compile tested!
>>>>> >>
>>>>> >> Signed-off-by: Christian König 
>>>>> >> ---
>>>>> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 ++
>>>>> >>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 3 +++
>>>>> >>  drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c  | 3 +++
>>>>> >>  include/uapi/drm/amdgpu_drm.h   | 2 ++
>>>>> >>  4 files changed, 10 insertions(+)
>>>>> >>
>>>>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>>> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>>> >> index bf97d8f07f57..d8129626581f 100644
>>>>> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>>> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>>> >> @@ -650,6 +650,8 @@ uint64_t amdgpu_gem_va_map_flags(struct
&g

Re: [PATCH 2/3] drm/amdgpu: add AMDGPU_VM_NOALLOC

2022-05-16 Thread Marek Olšák
I forgot to say: The NOALLOC flag causes an allocation failure, so there is
a kernel bug somewhere.

Marek

On Mon, May 16, 2022 at 7:06 AM Marek Olšák  wrote:

> FYI, I think it's time to merge this because the Mesa commits are going to
> be merged in ~30 minutes if Gitlab CI is green, and that includes updated
> amdgpu_drm.h.
>
> Marek
>
> On Wed, May 11, 2022 at 2:55 PM Marek Olšák  wrote:
>
>> Ok sounds good.
>>
>> Marek
>>
>> On Wed., May 11, 2022, 03:43 Christian König, <
>> ckoenig.leichtzumer...@gmail.com> wrote:
>>
>>> It really *is* a NOALLOC feature. In other words there is no latency
>>> improvement on reads because the cache is always checked, even with the
>>> noalloc flag set.
>>>
>>> The only thing it affects is that misses not enter the cache and so
>>> don't cause any additional pressure on evicting cache lines.
>>>
>>> You might want to double check with the hardware guys, but I'm something
>>> like 95% sure that it works this way.
>>>
>>> Christian.
>>>
>>> Am 11.05.22 um 09:22 schrieb Marek Olšák:
>>>
>>> Bypass means that the contents of the cache are ignored, which decreases
>>> latency at the cost of no coherency between bypassed and normal memory
>>> requests. NOA (noalloc) means that the cache is checked and can give you
>>> cache hits, but misses are not cached and the overall latency is higher. I
>>> don't know what the hw does, but I hope it was misnamed and it really means
>>> bypass because there is no point in doing cache lookups on every memory
>>> request if the driver wants to disable caching to *decrease* latency in the
>>> situations when the cache isn't helping.
>>>
>>> Marek
>>>
>>> On Wed, May 11, 2022 at 2:15 AM Lazar, Lijo  wrote:
>>>
>>>>
>>>>
>>>> On 5/11/2022 11:36 AM, Christian König wrote:
>>>> > Mhm, it doesn't really bypass MALL. It just doesn't allocate any MALL
>>>> > entries on write.
>>>> >
>>>> > How about AMDGPU_VM_PAGE_NO_MALL ?
>>>>
>>>> One more - AMDGPU_VM_PAGE_LLC_* [ LLC = last level cache, * = some sort
>>>> of attribute which decides LLC behaviour]
>>>>
>>>> Thanks,
>>>> Lijo
>>>>
>>>> >
>>>> > Christian.
>>>> >
>>>> > Am 10.05.22 um 23:21 schrieb Marek Olšák:
>>>> >> A better name would be:
>>>> >> AMDGPU_VM_PAGE_BYPASS_MALL
>>>> >>
>>>> >> Marek
>>>> >>
>>>> >> On Fri, May 6, 2022 at 7:23 AM Christian König
>>>> >>  wrote:
>>>> >>
>>>> >> Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL
>>>> >> allocation.
>>>> >>
>>>> >> Only compile tested!
>>>> >>
>>>> >> Signed-off-by: Christian König 
>>>> >> ---
>>>> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 ++
>>>> >>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 3 +++
>>>> >>  drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c  | 3 +++
>>>> >>  include/uapi/drm/amdgpu_drm.h   | 2 ++
>>>> >>  4 files changed, 10 insertions(+)
>>>> >>
>>>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>> >> index bf97d8f07f57..d8129626581f 100644
>>>> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>> >> @@ -650,6 +650,8 @@ uint64_t amdgpu_gem_va_map_flags(struct
>>>> >> amdgpu_device *adev, uint32_t flags)
>>>> >> pte_flag |= AMDGPU_PTE_WRITEABLE;
>>>> >> if (flags & AMDGPU_VM_PAGE_PRT)
>>>> >> pte_flag |= AMDGPU_PTE_PRT;
>>>> >> +   if (flags & AMDGPU_VM_PAGE_NOALLOC)
>>>> >> +   pte_flag |= AMDGPU_PTE_NOALLOC;
>>>> >>
>>>> >> if (adev->gmc.gmc_funcs->map_mtype)
>>>> >> pte_flag |= amdgpu_gmc_map_mtype(adev,
>>>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>>> >> 

Re: [PATCH 2/3] drm/amdgpu: add AMDGPU_VM_NOALLOC

2022-05-16 Thread Marek Olšák
FYI, I think it's time to merge this because the Mesa commits are going to
be merged in ~30 minutes if Gitlab CI is green, and that includes updated
amdgpu_drm.h.

Marek

On Wed, May 11, 2022 at 2:55 PM Marek Olšák  wrote:

> Ok sounds good.
>
> Marek
>
> On Wed., May 11, 2022, 03:43 Christian König, <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> It really *is* a NOALLOC feature. In other words there is no latency
>> improvement on reads because the cache is always checked, even with the
>> noalloc flag set.
>>
>> The only thing it affects is that misses do not enter the cache and so
>> don't cause any additional pressure on evicting cache lines.
>>
>> You might want to double check with the hardware guys, but I'm something
>> like 95% sure that it works this way.
>>
>> Christian.
>>
>> Am 11.05.22 um 09:22 schrieb Marek Olšák:
>>
>> Bypass means that the contents of the cache are ignored, which decreases
>> latency at the cost of no coherency between bypassed and normal memory
>> requests. NOA (noalloc) means that the cache is checked and can give you
>> cache hits, but misses are not cached and the overall latency is higher. I
>> don't know what the hw does, but I hope it was misnamed and it really means
>> bypass because there is no point in doing cache lookups on every memory
>> request if the driver wants to disable caching to *decrease* latency in the
>> situations when the cache isn't helping.
>>
>> Marek
>>
>> On Wed, May 11, 2022 at 2:15 AM Lazar, Lijo  wrote:
>>
>>>
>>>
>>> On 5/11/2022 11:36 AM, Christian König wrote:
>>> > Mhm, it doesn't really bypass MALL. It just doesn't allocate any MALL
>>> > entries on write.
>>> >
>>> > How about AMDGPU_VM_PAGE_NO_MALL ?
>>>
>>> One more - AMDGPU_VM_PAGE_LLC_* [ LLC = last level cache, * = some sort
>>> of attribute which decides LLC behaviour]
>>>
>>> Thanks,
>>> Lijo
>>>
>>> >
>>> > Christian.
>>> >
>>> > Am 10.05.22 um 23:21 schrieb Marek Olšák:
>>> >> A better name would be:
>>> >> AMDGPU_VM_PAGE_BYPASS_MALL
>>> >>
>>> >> Marek
>>> >>
>>> >> On Fri, May 6, 2022 at 7:23 AM Christian König
>>> >>  wrote:
>>> >>
>>> >> Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL
>>> >> allocation.
>>> >>
>>> >> Only compile tested!
>>> >>
>>> >> Signed-off-by: Christian König 
>>> >> ---
>>> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 ++
>>> >>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 3 +++
>>> >>  drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c  | 3 +++
>>> >>  include/uapi/drm/amdgpu_drm.h   | 2 ++
>>> >>  4 files changed, 10 insertions(+)
>>> >>
>>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> >> index bf97d8f07f57..d8129626581f 100644
>>> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> >> @@ -650,6 +650,8 @@ uint64_t amdgpu_gem_va_map_flags(struct
>>> >> amdgpu_device *adev, uint32_t flags)
>>> >> pte_flag |= AMDGPU_PTE_WRITEABLE;
>>> >> if (flags & AMDGPU_VM_PAGE_PRT)
>>> >> pte_flag |= AMDGPU_PTE_PRT;
>>> >> +   if (flags & AMDGPU_VM_PAGE_NOALLOC)
>>> >> +   pte_flag |= AMDGPU_PTE_NOALLOC;
>>> >>
>>> >> if (adev->gmc.gmc_funcs->map_mtype)
>>> >> pte_flag |= amdgpu_gmc_map_mtype(adev,
>>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>> >> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>> >> index b8c79789e1e4..9077dfccaf3c 100644
>>> >> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>> >> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>> >> @@ -613,6 +613,9 @@ static void gmc_v10_0_get_vm_pte(struct
>>> >> amdgpu_device *adev,
>>> >> *flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK;
>>> >> *flags |= (mapping->flags & AMDGPU_PTE_MTYPE_NV10_

Re: [PATCH 1/3] drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE

2022-05-12 Thread Marek Olšák
Would it be better to set the VM_ALWAYS_VALID flag to have a greater
guarantee that the best placement will be chosen?

See, the main feature is getting the best placement, not being discardable.
The best placement is a hw design requirement, because the memory is used for
things that are expected to perform like on-chip SRAMs. We need to make sure
the best placement is guaranteed if it's VRAM.

Marek

On Thu., May 12, 2022, 03:26 Christian König, <
ckoenig.leichtzumer...@gmail.com> wrote:

> Am 12.05.22 um 00:06 schrieb Marek Olšák:
>
> 3rd question: Is it worth using this on APUs?
>
>
> It makes memory management somewhat easier when we are really OOM.
>
> E.g. it should also work for GTT allocations and when the core kernel says
> "Hey please free something up or I will start the OOM-killer" it's
> something we can easily throw away.
>
> Not sure how many of those buffers we have, but marking everything which
> is temporary with that flag is probably a good idea.
>
>
> Thanks,
> Marek
>
> On Wed, May 11, 2022 at 5:58 PM Marek Olšák  wrote:
>
>> Will the kernel keep all discardable buffers in VRAM if VRAM is not
>> overcommitted by discardable buffers, or will other buffers also affect the
>> placement of discardable buffers?
>>
>
> Regarding eviction pressure, the buffers will be handled like any other
> buffer, but instead of preserving the content, it is just discarded on
> eviction.
>
>
>> Do evictions deallocate the buffer, or do they keep an allocation in GTT
>> and only the copy is skipped?
>>
>
> It really deallocates the backing store of the buffer, just keeps a dummy
> page array around where all entries are NULL.
>
> There is a patch set on the mailing list to make this a little bit more
> efficient, but even using the dummy page array should only have a few bytes
> overhead.
>
> Regards,
> Christian.
>
>
>> Thanks,
>> Marek
>>
>> On Wed, May 11, 2022 at 3:08 AM Marek Olšák  wrote:
>>
>>> OK that sounds good.
>>>
>>> Marek
>>>
>>> On Wed, May 11, 2022 at 2:04 AM Christian König <
>>> ckoenig.leichtzumer...@gmail.com> wrote:
>>>
>>>> Hi Marek,
>>>>
>>>> Am 10.05.22 um 22:43 schrieb Marek Olšák:
>>>>
>>>> A better flag name would be:
>>>> AMDGPU_GEM_CREATE_BEST_PLACEMENT_OR_DISCARD
>>>>
>>>>
>>>> A bit long for my taste and I think the best placement is just a side
>>>> effect.
>>>>
>>>>
>>>> Marek
>>>>
>>>> On Tue, May 10, 2022 at 4:13 PM Marek Olšák  wrote:
>>>>
>>>>> Does this really guarantee VRAM placement? The code doesn't say
>>>>> anything about that.
>>>>>
>>>>
>>>> Yes, see the code here:
>>>>
>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>> index 8b7ee1142d9a..1944ef37a61e 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>> @@ -567,6 +567,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
>>>>>> bp->domain;
>>>>>> bo->allowed_domains = bo->preferred_domains;
>>>>>> if (bp->type != ttm_bo_type_kernel &&
>>>>>> +   !(bp->flags & AMDGPU_GEM_CREATE_DISCARDABLE) &&
>>>>>> bo->allowed_domains == AMDGPU_GEM_DOMAIN_VRAM)
>>>>>> bo->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT;
>>>>>>
>>>>>
>>>> The only case where this could be circumvented is when you try to
>>>> allocate more than physically available on an APU.
>>>>
>>>> E.g. if you only have something like 32 MiB VRAM and request 64 MiB, then
>>>> the GEM code will catch the error and fall back to GTT (IIRC).
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>
>


Re: [PATCH 3/3] drm/amdgpu: bump minor version number

2022-05-11 Thread Marek Olšák
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16466

Marek

On Fri, May 6, 2022 at 9:35 AM Alex Deucher  wrote:

> On Fri, May 6, 2022 at 7:23 AM Christian König
>  wrote:
> >
> > Increase the minor version number to indicate that the new flags are
> > avaiable.
>
> typo: available.  Other than that the series is:
> Reviewed-by: Alex Deucher 
> Once we get the Mesa patches.
>
> Alex
>
>
> >
> > Signed-off-by: Christian König 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index 16871baee784..3dbf406b4194 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -99,10 +99,11 @@
> >   * - 3.43.0 - Add device hot plug/unplug support
> >   * - 3.44.0 - DCN3 supports DCC independent block settings: !64B &&
> 128B, 64B && 128B
> >   * - 3.45.0 - Add context ioctl stable pstate interface
> > - * * 3.46.0 - To enable hot plug amdgpu tests in libdrm
> > + * - 3.46.0 - To enable hot plug amdgpu tests in libdrm
> > + * * 3.47.0 - Add AMDGPU_GEM_CREATE_DISCARDABLE and AMDGPU_VM_NOALLOC
> flags
> >   */
> >  #define KMS_DRIVER_MAJOR   3
> > -#define KMS_DRIVER_MINOR   46
> > +#define KMS_DRIVER_MINOR   47
> >  #define KMS_DRIVER_PATCHLEVEL  0
> >
> >  int amdgpu_vram_limit;
> > --
> > 2.25.1
> >
>
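
For completeness, a minimal sketch (not part of this series) of how userspace
could gate use of the new flags on this version bump; it assumes libdrm's
xf86drm.h and an already-open amdgpu render-node fd:

#include <stdbool.h>
#include <xf86drm.h>

/* Returns true if the running kernel advertises the KMS version that adds
 * AMDGPU_GEM_CREATE_DISCARDABLE and AMDGPU_VM_NOALLOC (3.47 per this patch). */
static bool kernel_has_discardable_and_noalloc(int fd)
{
        drmVersionPtr ver = drmGetVersion(fd);
        bool ok;

        if (!ver)
                return false;
        ok = ver->version_major > 3 ||
             (ver->version_major == 3 && ver->version_minor >= 47);
        drmFreeVersion(ver);
        return ok;
}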


Re: [PATCH 1/3] drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE

2022-05-11 Thread Marek Olšák
3rd question: Is it worth using this on APUs?

Thanks,
Marek

On Wed, May 11, 2022 at 5:58 PM Marek Olšák  wrote:

> Will the kernel keep all discardable buffers in VRAM if VRAM is not
> overcommitted by discardable buffers, or will other buffers also affect the
> placement of discardable buffers?
>
> Do evictions deallocate the buffer, or do they keep an allocation in GTT
> and only the copy is skipped?
>
> Thanks,
> Marek
>
> On Wed, May 11, 2022 at 3:08 AM Marek Olšák  wrote:
>
>> OK that sounds good.
>>
>> Marek
>>
>> On Wed, May 11, 2022 at 2:04 AM Christian König <
>> ckoenig.leichtzumer...@gmail.com> wrote:
>>
>>> Hi Marek,
>>>
>>> Am 10.05.22 um 22:43 schrieb Marek Olšák:
>>>
>>> A better flag name would be:
>>> AMDGPU_GEM_CREATE_BEST_PLACEMENT_OR_DISCARD
>>>
>>>
>>> A bit long for my taste and I think the best placement is just a side
>>> effect.
>>>
>>>
>>> Marek
>>>
>>> On Tue, May 10, 2022 at 4:13 PM Marek Olšák  wrote:
>>>
>>>> Does this really guarantee VRAM placement? The code doesn't say
>>>> anything about that.
>>>>
>>>
>>> Yes, see the code here:
>>>
>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>> index 8b7ee1142d9a..1944ef37a61e 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>> @@ -567,6 +567,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
>>>>> bp->domain;
>>>>> bo->allowed_domains = bo->preferred_domains;
>>>>> if (bp->type != ttm_bo_type_kernel &&
>>>>> +   !(bp->flags & AMDGPU_GEM_CREATE_DISCARDABLE) &&
>>>>> bo->allowed_domains == AMDGPU_GEM_DOMAIN_VRAM)
>>>>> bo->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT;
>>>>>
>>>>
>>> The only case where this could be circumvented is when you try to
>>> allocate more than physically available on an APU.
>>>
>>> E.g. if you only have something like 32 MiB VRAM and request 64 MiB, then
>>> the GEM code will catch the error and fall back to GTT (IIRC).
>>>
>>> Regards,
>>> Christian.
>>>
>>


Re: [PATCH 1/3] drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE

2022-05-11 Thread Marek Olšák
Will the kernel keep all discardable buffers in VRAM if VRAM is not
overcommitted by discardable buffers, or will other buffers also affect the
placement of discardable buffers?

Do evictions deallocate the buffer, or do they keep an allocation in GTT
and only the copy is skipped?

Thanks,
Marek

On Wed, May 11, 2022 at 3:08 AM Marek Olšák  wrote:

> OK that sounds good.
>
> Marek
>
> On Wed, May 11, 2022 at 2:04 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> Hi Marek,
>>
>> Am 10.05.22 um 22:43 schrieb Marek Olšák:
>>
>> A better flag name would be:
>> AMDGPU_GEM_CREATE_BEST_PLACEMENT_OR_DISCARD
>>
>>
>> A bit long for my taste and I think the best placement is just a side
>> effect.
>>
>>
>> Marek
>>
>> On Tue, May 10, 2022 at 4:13 PM Marek Olšák  wrote:
>>
>>> Does this really guarantee VRAM placement? The code doesn't say anything
>>> about that.
>>>
>>
>> Yes, see the code here:
>>
>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> index 8b7ee1142d9a..1944ef37a61e 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> @@ -567,6 +567,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
>>>> bp->domain;
>>>> bo->allowed_domains = bo->preferred_domains;
>>>> if (bp->type != ttm_bo_type_kernel &&
>>>> +   !(bp->flags & AMDGPU_GEM_CREATE_DISCARDABLE) &&
>>>> bo->allowed_domains == AMDGPU_GEM_DOMAIN_VRAM)
>>>> bo->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT;
>>>>
>>>
>> The only case where this could be circumvented is when you try to
>> allocate more than physically available on an APU.
>>
>> E.g. if you only have something like 32 MiB VRAM and request 64 MiB, then
>> the GEM code will catch the error and fall back to GTT (IIRC).
>>
>> Regards,
>> Christian.
>>
>


Re: [PATCH 2/3] drm/amdgpu: add AMDGPU_VM_NOALLOC

2022-05-11 Thread Marek Olšák
Ok sounds good.

Marek

On Wed., May 11, 2022, 03:43 Christian König, <
ckoenig.leichtzumer...@gmail.com> wrote:

> It really *is* a NOALLOC feature. In other words there is no latency
> improvement on reads because the cache is always checked, even with the
> noalloc flag set.
>
> The only thing it affects is that misses do not enter the cache and so
> don't cause any additional pressure on evicting cache lines.
>
> You might want to double check with the hardware guys, but I'm something
> like 95% sure that it works this way.
>
> Christian.
>
> Am 11.05.22 um 09:22 schrieb Marek Olšák:
>
> Bypass means that the contents of the cache are ignored, which decreases
> latency at the cost of no coherency between bypassed and normal memory
> requests. NOA (noalloc) means that the cache is checked and can give you
> cache hits, but misses are not cached and the overall latency is higher. I
> don't know what the hw does, but I hope it was misnamed and it really means
> bypass because there is no point in doing cache lookups on every memory
> request if the driver wants to disable caching to *decrease* latency in the
> situations when the cache isn't helping.
>
> Marek
>
> On Wed, May 11, 2022 at 2:15 AM Lazar, Lijo  wrote:
>
>>
>>
>> On 5/11/2022 11:36 AM, Christian König wrote:
>> > Mhm, it doesn't really bypass MALL. It just doesn't allocate any MALL
>> > entries on write.
>> >
>> > How about AMDGPU_VM_PAGE_NO_MALL ?
>>
>> One more - AMDGPU_VM_PAGE_LLC_* [ LLC = last level cache, * = some sort
>> of attribute which decides LLC behaviour]
>>
>> Thanks,
>> Lijo
>>
>> >
>> > Christian.
>> >
>> > Am 10.05.22 um 23:21 schrieb Marek Olšák:
>> >> A better name would be:
>> >> AMDGPU_VM_PAGE_BYPASS_MALL
>> >>
>> >> Marek
>> >>
>> >> On Fri, May 6, 2022 at 7:23 AM Christian König
>> >>  wrote:
>> >>
>> >> Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL
>> >> allocation.
>> >>
>> >> Only compile tested!
>> >>
>> >> Signed-off-by: Christian König 
>> >> ---
>> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 ++
>> >>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 3 +++
>> >>  drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c  | 3 +++
>> >>  include/uapi/drm/amdgpu_drm.h   | 2 ++
>> >>  4 files changed, 10 insertions(+)
>> >>
>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> >> index bf97d8f07f57..d8129626581f 100644
>> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> >> @@ -650,6 +650,8 @@ uint64_t amdgpu_gem_va_map_flags(struct
>> >> amdgpu_device *adev, uint32_t flags)
>> >> pte_flag |= AMDGPU_PTE_WRITEABLE;
>> >> if (flags & AMDGPU_VM_PAGE_PRT)
>> >> pte_flag |= AMDGPU_PTE_PRT;
>> >> +   if (flags & AMDGPU_VM_PAGE_NOALLOC)
>> >> +   pte_flag |= AMDGPU_PTE_NOALLOC;
>> >>
>> >> if (adev->gmc.gmc_funcs->map_mtype)
>> >> pte_flag |= amdgpu_gmc_map_mtype(adev,
>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>> >> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>> >> index b8c79789e1e4..9077dfccaf3c 100644
>> >> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>> >> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>> >> @@ -613,6 +613,9 @@ static void gmc_v10_0_get_vm_pte(struct
>> >> amdgpu_device *adev,
>> >> *flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK;
>> >> *flags |= (mapping->flags & AMDGPU_PTE_MTYPE_NV10_MASK);
>> >>
>> >> +   *flags &= ~AMDGPU_PTE_NOALLOC;
>> >> +   *flags |= (mapping->flags & AMDGPU_PTE_NOALLOC);
>> >> +
>> >> if (mapping->flags & AMDGPU_PTE_PRT) {
>> >> *flags |= AMDGPU_PTE_PRT;
>> >> *flags |= AMDGPU_PTE_SNOOPED;
>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>> >> b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>> >> index 8d733eeac556..32ee56adb602 100644
>> >

Re: [PATCH 2/3] drm/amdgpu: add AMDGPU_VM_NOALLOC

2022-05-11 Thread Marek Olšák
Bypass means that the contents of the cache are ignored, which decreases
latency at the cost of no coherency between bypassed and normal memory
requests. NOA (noalloc) means that the cache is checked and can give you
cache hits, but misses are not cached and the overall latency is higher. I
don't know what the hw does, but I hope it was misnamed and it really means
bypass because there is no point in doing cache lookups on every memory
request if the driver wants to disable caching to *decrease* latency in the
situations when the cache isn't helping.

Marek

On Wed, May 11, 2022 at 2:15 AM Lazar, Lijo  wrote:

>
>
> On 5/11/2022 11:36 AM, Christian König wrote:
> > Mhm, it doesn't really bypass MALL. It just doesn't allocate any MALL
> > entries on write.
> >
> > How about AMDGPU_VM_PAGE_NO_MALL ?
>
> One more - AMDGPU_VM_PAGE_LLC_* [ LLC = last level cache, * = some sort
> of attribute which decides LLC behaviour]
>
> Thanks,
> Lijo
>
> >
> > Christian.
> >
> > Am 10.05.22 um 23:21 schrieb Marek Olšák:
> >> A better name would be:
> >> AMDGPU_VM_PAGE_BYPASS_MALL
> >>
> >> Marek
> >>
> >> On Fri, May 6, 2022 at 7:23 AM Christian König
> >>  wrote:
> >>
> >> Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL
> >> allocation.
> >>
> >> Only compile tested!
> >>
> >> Signed-off-by: Christian König 
> >> ---
> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 ++
> >>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 3 +++
> >>  drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c  | 3 +++
> >>  include/uapi/drm/amdgpu_drm.h   | 2 ++
> >>  4 files changed, 10 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> >> index bf97d8f07f57..d8129626581f 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> >> @@ -650,6 +650,8 @@ uint64_t amdgpu_gem_va_map_flags(struct
> >> amdgpu_device *adev, uint32_t flags)
> >> pte_flag |= AMDGPU_PTE_WRITEABLE;
> >> if (flags & AMDGPU_VM_PAGE_PRT)
> >> pte_flag |= AMDGPU_PTE_PRT;
> >> +   if (flags & AMDGPU_VM_PAGE_NOALLOC)
> >> +   pte_flag |= AMDGPU_PTE_NOALLOC;
> >>
> >> if (adev->gmc.gmc_funcs->map_mtype)
> >> pte_flag |= amdgpu_gmc_map_mtype(adev,
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> >> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> >> index b8c79789e1e4..9077dfccaf3c 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> >> @@ -613,6 +613,9 @@ static void gmc_v10_0_get_vm_pte(struct
> >> amdgpu_device *adev,
> >> *flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK;
> >> *flags |= (mapping->flags & AMDGPU_PTE_MTYPE_NV10_MASK);
> >>
> >> +   *flags &= ~AMDGPU_PTE_NOALLOC;
> >> +   *flags |= (mapping->flags & AMDGPU_PTE_NOALLOC);
> >> +
> >> if (mapping->flags & AMDGPU_PTE_PRT) {
> >> *flags |= AMDGPU_PTE_PRT;
> >> *flags |= AMDGPU_PTE_SNOOPED;
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >> b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >> index 8d733eeac556..32ee56adb602 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> >> @@ -508,6 +508,9 @@ static void gmc_v11_0_get_vm_pte(struct
> >> amdgpu_device *adev,
> >> *flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK;
> >> *flags |= (mapping->flags & AMDGPU_PTE_MTYPE_NV10_MASK);
> >>
> >> +   *flags &= ~AMDGPU_PTE_NOALLOC;
> >> +   *flags |= (mapping->flags & AMDGPU_PTE_NOALLOC);
> >> +
> >> if (mapping->flags & AMDGPU_PTE_PRT) {
> >> *flags |= AMDGPU_PTE_PRT;
> >> *flags |= AMDGPU_PTE_SNOOPED;
> >> diff --git a/include/uapi/drm/amdgpu_drm.h
> >> b/include/uapi/drm/amdgpu_drm.h
> >> index 57b9d8f0133a..9d71d6330687 100644
> >> --- a/include/uapi/drm/amdgpu_drm.h
> >> +++ b/include/uapi/drm/amdgpu_drm.h
> >> @@ -533,6 +533,8 @@ struct drm_amdgpu_gem_op {
> >>  #define AMDGPU_VM_MTYPE_UC (4 << 5)
> >>  /* Use Read Write MTYPE instead of default MTYPE */
> >>  #define AMDGPU_VM_MTYPE_RW (5 << 5)
> >> +/* don't allocate MALL */
> >> +#define AMDGPU_VM_PAGE_NOALLOC (1 << 9)
> >>
> >>  struct drm_amdgpu_gem_va {
> >> /** GEM object handle */
> >> --
> >> 2.25.1
> >>
> >
>
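
To make the distinction concrete, a small toy model (purely illustrative C,
not hardware or driver behaviour; all names are invented) with a one-entry
cache, showing where the policies differ: BYPASS skips the lookup entirely,
while NOALLOC still looks up and can hit, but never fills on a miss:

#include <stdint.h>
#include <stdio.h>

enum mall_policy { MALL_ALLOC, MALL_NOALLOC, MALL_BYPASS };

static uint64_t cached_line = UINT64_MAX;       /* toy single-entry cache */

static void read_line(enum mall_policy policy, uint64_t line)
{
        if (policy == MALL_BYPASS) {
                /* no lookup at all, hence no coherency with cached requests */
                printf("line %llu: bypass\n", (unsigned long long)line);
                return;
        }
        if (cached_line == line) {
                /* both ALLOC and NOALLOC pay the lookup and can hit */
                printf("line %llu: hit\n", (unsigned long long)line);
                return;
        }
        printf("line %llu: miss\n", (unsigned long long)line);
        if (policy == MALL_ALLOC)
                cached_line = line;             /* NOALLOC skips the fill */
}

int main(void)
{
        read_line(MALL_ALLOC, 1);       /* miss, fills the cache */
        read_line(MALL_NOALLOC, 1);     /* hit: NOALLOC still checks the cache */
        read_line(MALL_NOALLOC, 2);     /* miss, but line 1 is not evicted */
        read_line(MALL_ALLOC, 1);       /* still a hit */
        read_line(MALL_BYPASS, 1);      /* never even looked up */
        return 0;
}

Which of the two behaviours the hardware actually implements is the open
question in this sub-thread.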


Re: [PATCH 1/3] drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE

2022-05-11 Thread Marek Olšák
OK that sounds good.

Marek

On Wed, May 11, 2022 at 2:04 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Hi Marek,
>
> Am 10.05.22 um 22:43 schrieb Marek Olšák:
>
> A better flag name would be:
> AMDGPU_GEM_CREATE_BEST_PLACEMENT_OR_DISCARD
>
>
> A bit long for my taste and I think the best placement is just a side
> effect.
>
>
> Marek
>
> On Tue, May 10, 2022 at 4:13 PM Marek Olšák  wrote:
>
>> Does this really guarantee VRAM placement? The code doesn't say anything
>> about that.
>>
>
> Yes, see the code here:
>
>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index 8b7ee1142d9a..1944ef37a61e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -567,6 +567,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
>>> bp->domain;
>>> bo->allowed_domains = bo->preferred_domains;
>>> if (bp->type != ttm_bo_type_kernel &&
>>> +   !(bp->flags & AMDGPU_GEM_CREATE_DISCARDABLE) &&
>>> bo->allowed_domains == AMDGPU_GEM_DOMAIN_VRAM)
>>> bo->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT;
>>>
>>
> The only case where this could be circumvented is when you try to allocate
> more than physically available on an APU.
>
> E.g. if you only have something like 32 MiB VRAM and request 64 MiB, then the
> GEM code will catch the error and fall back to GTT (IIRC).
>
> Regards,
> Christian.
>


Re: [PATCH 2/3] drm/amdgpu: add AMDGPU_VM_NOALLOC

2022-05-10 Thread Marek Olšák
A better name would be:
AMDGPU_VM_PAGE_BYPASS_MALL

Marek

On Fri, May 6, 2022 at 7:23 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL allocation.
>
> Only compile tested!
>
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 ++
>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 3 +++
>  drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c  | 3 +++
>  include/uapi/drm/amdgpu_drm.h   | 2 ++
>  4 files changed, 10 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index bf97d8f07f57..d8129626581f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -650,6 +650,8 @@ uint64_t amdgpu_gem_va_map_flags(struct amdgpu_device
> *adev, uint32_t flags)
> pte_flag |= AMDGPU_PTE_WRITEABLE;
> if (flags & AMDGPU_VM_PAGE_PRT)
> pte_flag |= AMDGPU_PTE_PRT;
> +   if (flags & AMDGPU_VM_PAGE_NOALLOC)
> +   pte_flag |= AMDGPU_PTE_NOALLOC;
>
> if (adev->gmc.gmc_funcs->map_mtype)
> pte_flag |= amdgpu_gmc_map_mtype(adev,
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> index b8c79789e1e4..9077dfccaf3c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> @@ -613,6 +613,9 @@ static void gmc_v10_0_get_vm_pte(struct amdgpu_device
> *adev,
> *flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK;
> *flags |= (mapping->flags & AMDGPU_PTE_MTYPE_NV10_MASK);
>
> +   *flags &= ~AMDGPU_PTE_NOALLOC;
> +   *flags |= (mapping->flags & AMDGPU_PTE_NOALLOC);
> +
> if (mapping->flags & AMDGPU_PTE_PRT) {
> *flags |= AMDGPU_PTE_PRT;
> *flags |= AMDGPU_PTE_SNOOPED;
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> index 8d733eeac556..32ee56adb602 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> @@ -508,6 +508,9 @@ static void gmc_v11_0_get_vm_pte(struct amdgpu_device
> *adev,
> *flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK;
> *flags |= (mapping->flags & AMDGPU_PTE_MTYPE_NV10_MASK);
>
> +   *flags &= ~AMDGPU_PTE_NOALLOC;
> +   *flags |= (mapping->flags & AMDGPU_PTE_NOALLOC);
> +
> if (mapping->flags & AMDGPU_PTE_PRT) {
> *flags |= AMDGPU_PTE_PRT;
> *flags |= AMDGPU_PTE_SNOOPED;
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index 57b9d8f0133a..9d71d6330687 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -533,6 +533,8 @@ struct drm_amdgpu_gem_op {
>  #define AMDGPU_VM_MTYPE_UC (4 << 5)
>  /* Use Read Write MTYPE instead of default MTYPE */
>  #define AMDGPU_VM_MTYPE_RW (5 << 5)
> +/* don't allocate MALL */
> +#define AMDGPU_VM_PAGE_NOALLOC (1 << 9)
>
>  struct drm_amdgpu_gem_va {
> /** GEM object handle */
> --
> 2.25.1
>
>
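
As a usage illustration of the uapi hunk above, a minimal sketch (not part of
this series) of mapping an existing BO with the new page flag through
DRM_IOCTL_AMDGPU_GEM_VA. The fd, bo_handle, gpu_va and bo_size are assumed to
be set up elsewhere, and the running kernel must already accept
AMDGPU_VM_PAGE_NOALLOC:

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/amdgpu_drm.h>

static int map_bo_noalloc(int fd, uint32_t bo_handle,
                          uint64_t gpu_va, uint64_t bo_size)
{
        struct drm_amdgpu_gem_va va;

        memset(&va, 0, sizeof(va));
        va.handle = bo_handle;
        va.operation = AMDGPU_VA_OP_MAP;
        /* NOALLOC only changes the MALL allocation policy; lookups still
         * happen, so the mapping stays coherent with normal mappings. */
        va.flags = AMDGPU_VM_PAGE_READABLE | AMDGPU_VM_PAGE_WRITEABLE |
                   AMDGPU_VM_PAGE_NOALLOC;
        va.va_address = gpu_va;
        va.offset_in_bo = 0;
        va.map_size = bo_size;

        return ioctl(fd, DRM_IOCTL_AMDGPU_GEM_VA, &va);
}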


Re: [PATCH 1/3] drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE

2022-05-10 Thread Marek Olšák
A better flag name would be:
AMDGPU_GEM_CREATE_BEST_PLACEMENT_OR_DISCARD

Marek

On Tue, May 10, 2022 at 4:13 PM Marek Olšák  wrote:

> Does this really guarantee VRAM placement? The code doesn't say anything
> about that.
>
> Marek
>
>
> On Fri, May 6, 2022 at 7:23 AM Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> Add an AMDGPU_GEM_CREATE_DISCARDABLE flag to note that the content of a BO
>> doesn't need to be preserved during eviction.
>>
>> KFD was already using a similar functionality for SVM BOs so replace the
>> internal flag with the new UAPI.
>>
>> Only compile tested!
>>
>> Signed-off-by: Christian König 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 4 ++--
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 1 +
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 -
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 2 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c   | 2 +-
>>  include/uapi/drm/amdgpu_drm.h  | 4 
>>  6 files changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> index 2e16484bf606..bf97d8f07f57 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> @@ -302,8 +302,8 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev,
>> void *data,
>>   AMDGPU_GEM_CREATE_VRAM_CLEARED |
>>   AMDGPU_GEM_CREATE_VM_ALWAYS_VALID |
>>   AMDGPU_GEM_CREATE_EXPLICIT_SYNC |
>> - AMDGPU_GEM_CREATE_ENCRYPTED))
>> -
>> + AMDGPU_GEM_CREATE_ENCRYPTED |
>> + AMDGPU_GEM_CREATE_DISCARDABLE))
>> return -EINVAL;
>>
>> /* reject invalid gem domains */
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index 8b7ee1142d9a..1944ef37a61e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -567,6 +567,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
>> bp->domain;
>> bo->allowed_domains = bo->preferred_domains;
>> if (bp->type != ttm_bo_type_kernel &&
>> +   !(bp->flags & AMDGPU_GEM_CREATE_DISCARDABLE) &&
>> bo->allowed_domains == AMDGPU_GEM_DOMAIN_VRAM)
>> bo->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT;
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> index 4c9cbdc66995..147b79c10cbb 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> @@ -41,7 +41,6 @@
>>
>>  /* BO flag to indicate a KFD userptr BO */
>>  #define AMDGPU_AMDKFD_CREATE_USERPTR_BO(1ULL << 63)
>> -#define AMDGPU_AMDKFD_CREATE_SVM_BO(1ULL << 62)
>>
>>  #define to_amdgpu_bo_user(abo) container_of((abo), struct
>> amdgpu_bo_user, bo)
>>  #define to_amdgpu_bo_vm(abo) container_of((abo), struct amdgpu_bo_vm, bo)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> index 41d6f604813d..ba3221a25e75 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> @@ -117,7 +117,7 @@ static void amdgpu_evict_flags(struct
>> ttm_buffer_object *bo,
>> }
>>
>> abo = ttm_to_amdgpu_bo(bo);
>> -   if (abo->flags & AMDGPU_AMDKFD_CREATE_SVM_BO) {
>> +   if (abo->flags & AMDGPU_GEM_CREATE_DISCARDABLE) {
>> placement->num_placement = 0;
>> placement->num_busy_placement = 0;
>> return;
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> index 5ed8d9b549a4..835b5187f0b8 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> @@ -531,7 +531,7 @@ svm_range_vram_node_new(struct amdgpu_device *adev,
>> struct svm_range *prange,
>> bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
>> bp.flags = AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
>> bp.flags |= clear ? AMDGPU_GEM_CREATE_VRAM_CLEARED : 0;
>> -   bp.flags |= AMDGPU_AMDKFD_CREATE_SVM_BO;
>> +   bp.flags |= AMDGPU_GEM_CREATE_DISCARDABLE;
>> bp.type = ttm_bo_type_device;
>> bp.resv = NULL;
>>
>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>> index 9a1d210d135d..57b9d8f0133a 100644
>> --- a/include/uapi/drm/amdgpu_drm.h
>> +++ b/include/uapi/drm/amdgpu_drm.h
>> @@ -140,6 +140,10 @@ extern "C" {
>>   * not require GTT memory accounting
>>   */
>>  #define AMDGPU_GEM_CREATE_PREEMPTIBLE  (1 << 11)
>> +/* Flag that BO can be discarded under memory pressure without keeping
>> the
>> + * content.
>> + */
>> +#define AMDGPU_GEM_CREATE_DISCARDABLE  (1 << 12)
>>
>>  struct drm_amdgpu_gem_create_in  {
>> /** the requested memory size */
>> --
>> 2.25.1
>>
>>


Re: [PATCH 1/3] drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE

2022-05-10 Thread Marek Olšák
Does this really guarantee VRAM placement? The code doesn't say anything
about that.

Marek


On Fri, May 6, 2022 at 7:23 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Add an AMDGPU_GEM_CREATE_DISCARDABLE flag to note that the content of a BO
> doesn't need to be preserved during eviction.
>
> KFD was already using a similar functionality for SVM BOs so replace the
> internal flag with the new UAPI.
>
> Only compile tested!
>
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 4 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c   | 2 +-
>  include/uapi/drm/amdgpu_drm.h  | 4 
>  6 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 2e16484bf606..bf97d8f07f57 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -302,8 +302,8 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev,
> void *data,
>   AMDGPU_GEM_CREATE_VRAM_CLEARED |
>   AMDGPU_GEM_CREATE_VM_ALWAYS_VALID |
>   AMDGPU_GEM_CREATE_EXPLICIT_SYNC |
> - AMDGPU_GEM_CREATE_ENCRYPTED))
> -
> + AMDGPU_GEM_CREATE_ENCRYPTED |
> + AMDGPU_GEM_CREATE_DISCARDABLE))
> return -EINVAL;
>
> /* reject invalid gem domains */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 8b7ee1142d9a..1944ef37a61e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -567,6 +567,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
> bp->domain;
> bo->allowed_domains = bo->preferred_domains;
> if (bp->type != ttm_bo_type_kernel &&
> +   !(bp->flags & AMDGPU_GEM_CREATE_DISCARDABLE) &&
> bo->allowed_domains == AMDGPU_GEM_DOMAIN_VRAM)
> bo->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 4c9cbdc66995..147b79c10cbb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -41,7 +41,6 @@
>
>  /* BO flag to indicate a KFD userptr BO */
>  #define AMDGPU_AMDKFD_CREATE_USERPTR_BO(1ULL << 63)
> -#define AMDGPU_AMDKFD_CREATE_SVM_BO(1ULL << 62)
>
>  #define to_amdgpu_bo_user(abo) container_of((abo), struct amdgpu_bo_user,
> bo)
>  #define to_amdgpu_bo_vm(abo) container_of((abo), struct amdgpu_bo_vm, bo)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 41d6f604813d..ba3221a25e75 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -117,7 +117,7 @@ static void amdgpu_evict_flags(struct
> ttm_buffer_object *bo,
> }
>
> abo = ttm_to_amdgpu_bo(bo);
> -   if (abo->flags & AMDGPU_AMDKFD_CREATE_SVM_BO) {
> +   if (abo->flags & AMDGPU_GEM_CREATE_DISCARDABLE) {
> placement->num_placement = 0;
> placement->num_busy_placement = 0;
> return;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index 5ed8d9b549a4..835b5187f0b8 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -531,7 +531,7 @@ svm_range_vram_node_new(struct amdgpu_device *adev,
> struct svm_range *prange,
> bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
> bp.flags = AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
> bp.flags |= clear ? AMDGPU_GEM_CREATE_VRAM_CLEARED : 0;
> -   bp.flags |= AMDGPU_AMDKFD_CREATE_SVM_BO;
> +   bp.flags |= AMDGPU_GEM_CREATE_DISCARDABLE;
> bp.type = ttm_bo_type_device;
> bp.resv = NULL;
>
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index 9a1d210d135d..57b9d8f0133a 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -140,6 +140,10 @@ extern "C" {
>   * not require GTT memory accounting
>   */
>  #define AMDGPU_GEM_CREATE_PREEMPTIBLE  (1 << 11)
> +/* Flag that BO can be discarded under memory pressure without keeping the
> + * content.
> + */
> +#define AMDGPU_GEM_CREATE_DISCARDABLE  (1 << 12)
>
>  struct drm_amdgpu_gem_create_in  {
> /** the requested memory size */
> --
> 2.25.1
>
>
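
As a usage illustration of the patch above, a minimal sketch (not part of this
series) of allocating a VRAM-only scratch BO whose contents may be thrown away
on eviction; it assumes an already-open render-node fd and a kernel that
accepts the new flag:

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/amdgpu_drm.h>

static int create_discardable_vram_bo(int fd, uint64_t size, uint32_t *handle)
{
        union drm_amdgpu_gem_create args;
        int ret;

        memset(&args, 0, sizeof(args));
        args.in.bo_size = size;
        args.in.alignment = 4096;
        args.in.domains = AMDGPU_GEM_DOMAIN_VRAM;
        /* DISCARDABLE keeps allowed_domains at VRAM only (see the
         * amdgpu_bo_create hunk above), so the BO gets the best placement
         * and is simply dropped instead of copied to GTT on eviction. */
        args.in.domain_flags = AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
                               AMDGPU_GEM_CREATE_DISCARDABLE;

        ret = ioctl(fd, DRM_IOCTL_AMDGPU_GEM_CREATE, &args);
        if (ret == 0)
                *handle = args.out.handle;
        return ret;
}

The caller has to be prepared to regenerate the contents at any point, since
an evicted discardable BO comes back empty.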

