[AMD Official Use Only - General]

Hi Michel,

It is true that we don't see an obvious performance improvement with these 
patches.
The original motivation for using MCBP is that when a very long low-priority 
IB with many draw commands keeps the GPU busy, we give high-priority IBs a 
chance to be executed by the GPU.
Overall performance may drop, since MCBP drains the pipe and the low-priority 
IBs have to be resubmitted afterwards.

This patch set mainly implements priority queues using software rings. 
We may use a method other than MCBP to improve it later.

Thanks,
Jiadong

-----Original Message-----
From: Alex Deucher <[email protected]>
Sent: Friday, November 11, 2022 1:54 AM
To: Michel Dänzer <[email protected]>
Cc: Zhu, Jiadong <[email protected]>; Tuikov, Luben <[email protected]>; 
Huang, Ray <[email protected]>; Koenig, Christian <[email protected]>; 
[email protected]
Subject: Re: [PATCH 4/5] drm/amdgpu: MCBP based on DRM scheduler (v8)

On Thu, Nov 10, 2022 at 12:00 PM Michel Dänzer <[email protected]> wrote:
>
> On 2022-11-08 09:01, Zhu, Jiadong wrote:
> > From: Michel Dänzer <[email protected]>
> >
> >>>> The bad news is that this series still makes some things very slow. The 
> >>>> most extreme examples so far are glxgears (runs at ~400 fps now, ~7000 
> >>>> fps before, i.e. almost 20x slowdown) and hexchat (scrolling one page 
> >>>> now takes ~1 second, I can see it drawing line by line; before it was 
> >>>> almost instantaneous). I suspect this series makes the overhead of 
> >>>> running a single GPU job much bigger. On the bright side, I'm not 
> >>>> noticing any significant intermittent freezes anymore.
> >>>
> >>> Hi Michel,
> >>>
> >>> Thanks for trying.
> >>> Are there high priority jobs running while glxgears is executing?
> >>
> >> Yes, mutter is submitting high priority jobs. However, I don't think that 
> >> can explain the problem by itself:
> >>
> >> mutter only draws once per display refresh cycle. Let's assume mutter's 
> >> GPU work takes ~6-7ms (conservative example, should be less than that 
> >> usually). That leaves ~10ms per display refresh cycle (at 60 Hz refresh 
> >> rate) where GPU work from glxgears & Xwayland can run without getting 
> >> preempted. Since glxgears runs at ~7000 fps without this series, it should 
> >> be able to draw at least ~70 frames in 10ms[0], which corresponds to over 
> >> 4000 fps. Yet it manages only 1/10 of that.
> >>
> >> [0] Worst case consideration, ignoring the fact that without this series, 
> >> glxgears runs at ~7000 fps while mutter sustains 60 fps.
> >
> > I reproduced the glxgears 400 fps scenario locally. The issue is caused by 
> > patch 5, "drm/amdgpu: Improve the software rings priority scheduler", 
> > which slows down the low priority scheduler thread while a high priority 
> > IB is executing. I'll drop this patch, as we cannot tell whether the 
> > workload is GPU-bound from the unsignaled fence, etc.
>
> Okay, I'm testing with patches 1-4 only now.
>
> So far I haven't noticed any negative effects, no slowdowns or intermittent 
> freezes.
>
> The only issue is that there's hardly any positive effect either. While 
> constantly moving the window of a GPU-limited GpuTest benchmark in circles, 
> most of the time it looks exactly the same as without these patches. Only 
> occasionally, at most every few seconds, I notice that the window movement 
> becomes smoother for an instant.
>

I think it will largely depend on the workload.  The gfx pipe can only be 
preempted on draw boundaries so if most operations are a single draw, you 
probably won't see much difference.
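
As a rough illustration of the draw-boundary point, here is a toy model. It 
is only a sketch and not amdgpu code: the function name 
preemption_wait_ms and the millisecond figures are made up for the example, 
and it assumes the pipe can switch to the high priority job only when the 
draw currently in flight has finished.

```python
def preemption_wait_ms(draw_durations_ms, request_time_ms):
    """Toy model: how long a high priority job waits before it can start.

    draw_durations_ms: durations of the consecutive draws in one low
        priority IB (hypothetical numbers, not measurements).
    request_time_ms: when the high priority job arrives, measured from
        the start of the low priority IB.

    Assumption: preemption is only possible between draws, so the wait
    is the time left until the next draw boundary.
    """
    elapsed = 0.0
    for duration in draw_durations_ms:
        elapsed += duration
        if elapsed >= request_time_ms:
            # First draw boundary at or after the request.
            return elapsed - request_time_ms
    # The low priority IB had already finished; nothing to wait for.
    return 0.0


# One IB made of a single 8 ms draw: no boundary until the draw ends.
single_draw_wait = preemption_wait_ms([8.0], 1.0)

# Same 8 ms of work split into eight 1 ms draws: a boundary every 1 ms.
many_draws_wait = preemption_wait_ms([1.0] * 8, 1.5)

print(single_draw_wait, many_draws_wait)
```

In this model the single-draw IB makes the high priority job wait for the 
rest of the draw, exactly as it would without preemption, while the 
multi-draw IB lets it in at the next boundary.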

Alex
