On 4/1/20 5:02 PM, Luben Tuikov wrote:
On 2020-03-31 08:46, Nirmoy wrote:
On 3/31/20 3:01 AM, Luben Tuikov wrote:
This patch seems to be using DOS line-endings.

Strange, I don't see that in my local patch file.

Not sure why "git am" complained about DOS endings
the first time I downloaded it. Second time was fine.

[snip]

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 29f0a410091b..27abbdc603dd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -721,6 +721,11 @@ struct amd_powerplay {
        const struct amd_pm_funcs *pp_funcs;
   };

+struct amdgpu_sched {
+       uint32_t                        num_scheds;
+       struct drm_gpu_scheduler        *sched[HWIP_MAX_INSTANCE];
+};
+
   #define AMDGPU_RESET_MAGIC_NUM 64
   #define AMDGPU_MAX_DF_PERFMONS 4
   struct amdgpu_device {
@@ -858,6 +863,8 @@ struct amdgpu_device {
        struct amdgpu_ring              *rings[AMDGPU_MAX_RINGS];
        bool                            ib_pool_ready;
        struct amdgpu_sa_manager        ring_tmp_bo[AMDGPU_IB_POOL_MAX];
+       /* drm scheduler list */
+       struct amdgpu_sched             gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX];
That's a 2-dimensional array of "struct amdgpu_sched".
I think that the comment should be removed, or at least
not say "drm scheduler list". (I can see the structure
definition above.)

Yes I should remove it.


If this is the path you want to go, consider removing
"num_scheds" and creating a three dimensional array,
which would really essentialize the direction you want
to go:

struct drm_gpu_scheduler *gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX][HWIP_MAX_INSTANCE];

Now that this architecture is stripped down to its essentials,
perhaps we can see some optimizations...?
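For illustration only, not from the patch: with such a flattened table the
unused slots can simply stay NULL, so a NULL entry marks the end of the
valid part of a row, roughly

/* in struct amdgpu_device; assuming zero-initialized rows, unused slots stay NULL */
struct drm_gpu_scheduler *gpu_sched[AMDGPU_HW_IP_NUM]
                                   [AMDGPU_RING_PRIO_MAX]
                                   [HWIP_MAX_INSTANCE];

and consumers then walk a row until the first NULL instead of reading a
stored count.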

If you mean whether we should see any performance improvement, then IMO we
may not see much difference, as we use pretty much the same number of memory
accesses, and we now have the extra cost of array_index_nospec() on top.

Also, this is not a hot code path: we do only one
amdgpu_ctx_init_entity() per HW IP per context.
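(For reference, the array_index_nospec() cost mentioned here is the usual
<linux/nospec.h> clamp on the user-supplied index before the table lookup,
on the order of

        hw_ip = array_index_nospec(hw_ip, AMDGPU_HW_IP_NUM);

per lookup.)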
No, this has nothing to do with "performance".
It's all about architecture and design.

You seem to have an array of arrays of structs, each holding an array and an
int, and it seems much cleaner to just have an array of arrays of arrays.
I don't think you need to break the chain with a struct of an int and an
array, as I described in my comment below, which you snipped without
addressing it.

If you're not going to address a comment, don't delete it; leave it
for others to see that it hasn't been addressed. Only
snip previously addressed and resolved comments, and previously
seen patch info like the diffstat, etc.


I wanted to understand "running pointer" before I could comment on it.



Also consider that since you're creating an array of pointers,
you don't necessarily need to know their count. Your hot-path
algorithms should not need to know it. If you need to print
their count, say in sysfs, then you can count them up on
behalf of the user-space process cat-ing the sysfs entry.
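A rough sketch of that on-demand counting against the flattened,
zero-initialized table; the helper name is made up and this is not from the
patch:

static unsigned int row_count(struct drm_gpu_scheduler **scheds, unsigned int max)
{
        unsigned int n = 0;

        /* valid entries are packed at the front; stop at the first NULL slot */
        while (n < max && scheds[n])
                n++;
        return n;
}

A sysfs show() callback could then call
row_count(adev->gpu_sched[hw_ip][hw_prio], HWIP_MAX_INSTANCE) instead of
reading a stored num_scheds.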

[snip]

+
+       /* set to default prio if sched_list is NULL */
+       if (!adev->gpu_sched[hw_ip][hw_prio].num_scheds)
+               hw_prio = AMDGPU_RING_PRIO_DEFAULT;
That comment is a bit confusing--it talks about a list
not being NULL, while the conditional implicitly checks
against 0.

Yes, this is wrong, will remove it.

<snip>

I wish you hadn't snipped my comment here, but address it
one way or the other. It is:

I'd much rather that an integer comparison be performed against
integers, as opposed to using the logical NOT operator (which
implicitly compares against 0), i.e.,

if (adev->gpu_sched[hw_ip][hw_prio].num_scheds == 0)

Also, architecturally, there seems to be informational
redundancy, in keeping an integer count and list of
objects at the same time, as the above if-conditional
exposes: the comment talks about a list and NULL but
the if-conditional implicitly checks for 0.


The number of valid drm schedulers in adev->gpu_sched[hw_ip][hw_prio].sched
will vary depending on the priority and hw_ip.

We need to pass that scheduler array and num_scheds to
drm_sched_entity_init(). I think an array and an integer count are often
used together when the number of valid items in the array is dynamic.
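For context, the consumer side with the array-plus-count scheme looks
roughly like this (the local names are made up, but the parameters match the
drm_sched_entity_init() signature at the time of this thread, i.e. entity,
priority, sched_list, num_sched_list, guilty):

struct amdgpu_sched *gs = &adev->gpu_sched[hw_ip][hw_prio];

r = drm_sched_entity_init(&entity->entity, priority,
                          gs->sched, gs->num_scheds, &ctx->guilty);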



Perhaps we don't need "num_scheds", and you can just
check whether the entry is NULL and assign PRIO_DEFAULT.
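With the flattened table, that earlier check would become roughly

if (!adev->gpu_sched[hw_ip][hw_prio][0])
        hw_prio = AMDGPU_RING_PRIO_DEFAULT;

i.e. an empty row falls back to the default priority.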

@@ -258,6 +272,12 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
        ring->priority = DRM_SCHED_PRIORITY_NORMAL;
        mutex_init(&ring->priority_mutex);

+       if (ring->funcs->type != AMDGPU_RING_TYPE_KIQ) {
+               hw_ip = amdgpu_ring_type_to_drm_hw_ip[ring->funcs->type];
+               num_sched = &adev->gpu_sched[hw_ip][hw_prio].num_scheds;
+               adev->gpu_sched[hw_ip][hw_prio].sched[(*num_sched)++] = &ring->sched;
+       }
This seems unnecessarily complicated. Perhaps we can remove
"num_scheds", as recommended above, and keep a running pointer
while initialization is being done...?

What do you mean by a running pointer?
A "running pointer" is a local pointer you're using temporarily
to traverse memory. If you remove the "num_scheds", as noted in my
previous comment, then you can use a running pointer to contain
the next entry you'd initialize.
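One way to picture that, as a rough sketch against the flattened,
zero-initialized table (not from the patch):

struct drm_gpu_scheduler **slot = adev->gpu_sched[hw_ip][hw_prio];
struct drm_gpu_scheduler **end = slot + HWIP_MAX_INSTANCE;

/* skip slots already claimed by earlier rings, then take the next one */
while (slot < end && *slot)
        slot++;
if (slot < end)
        *slot = &ring->sched;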


Okay, I understand now. As I said, we need to know the number of valid
schedulers in a sched array so that we can pass that information to
drm_sched_entity_init().


Regards,

Nirmoy



Regards,
Luben
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
