Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-25 Thread Michel Dänzer
On 25/05/17 08:24 PM, Marek Olšák wrote:
> On Thu, May 25, 2017 at 5:31 AM, Michel Dänzer  wrote:
>> On 24/05/17 08:27 PM, Christian König wrote:
>>> On 24.05.2017 at 13:03, Marek Olšák wrote:
>>>>>
>>>> I think the final solution (done in fault_reserve_notify) should be:
>>>> if (bo->num_cpu_page_faults++ > 20)
>>>>    bo->preferred_domain = GTT_WC;
>>
>> I agree something like that will probably be part of the solution, but I
>> doubt it's quite that simple or that it's the only thing that can be
>> improved.
>>
>>
>>> I more or less agree on that, but setting preferred_domain permanently
>>> to GTT_WC is what worries me a bit.
>>>
>>> E.g. imagine you alt+tab from a game to your browser and back and the
>>> game runs way slower now because BOs are never moved back to VRAM.
>>
>> Right, permanently moving a BO to GTT might itself cause performance to
>> drop down a cliff in some cases. It's possible that this is irrelevant
>> compared to excessive buffer migration for CPU access though.
>>
>>
>>> What we need is a global limit on the number of bytes transferred per second
>>> for swap operations or something like that.
>>>
>>> Or maybe a timeout which says when a BO was moved (either by swapping it
>>> out or by a CPU page fault) only move it back after +n jiffies or
>>> something like that.
>>
>> I also feel like something like this will be more useful than the number
>> of CPU page faults per se. But I'm curious what Marek comes up with. :)
> 
> I don't have any better idea at the moment. It looks like John Brooks
> has already solved this issue based on his IRC comments.

I don't think there's "the issue" with a single solution. None of John's
patches that I've tried so far help for the scenario described in the
cover letter of this series.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-25 Thread Marek Olšák
On Thu, May 25, 2017 at 5:31 AM, Michel Dänzer  wrote:
> On 24/05/17 08:27 PM, Christian König wrote:
>> On 24.05.2017 at 13:03, Marek Olšák wrote:
>>>>
>>> I think the final solution (done in fault_reserve_notify) should be:
>>> if (bo->num_cpu_page_faults++ > 20)
>>> bo->preferred_domain = GTT_WC;
>
> I agree something like that will probably be part of the solution, but I
> doubt it's quite that simple or that it's the only thing that can be
> improved.
>
>
>> I more or less agree on that, but setting preferred_domain permanently
>> to GTT_WC is what worries me a bit.
>>
>> E.g. imagine you alt+tab from a game to your browser and back and the
>> game runs way slower now because BOs are never moved back to VRAM.
>
> Right, permanently moving a BO to GTT might itself cause performance to
> drop down a cliff in some cases. It's possible that this is irrelevant
> compared to excessive buffer migration for CPU access though.
>
>
>> What we need is a global limit on the number of bytes transferred per second
>> for swap operations or something like that.
>>
>> Or maybe a timeout which says when a BO was moved (either by swapping it
>> out or by a CPU page fault) only move it back after +n jiffies or
>> something like that.
>
> I also feel like something like this will be more useful than the number
> of CPU page faults per se. But I'm curious what Marek comes up with. :)

I don't have any better idea at the moment. It looks like John Brooks
has already solved this issue based on his IRC comments.

Marek


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-24 Thread Michel Dänzer
On 24/05/17 08:27 PM, Christian König wrote:
> On 24.05.2017 at 13:03, Marek Olšák wrote:
>>>
>> I think the final solution (done in fault_reserve_notify) should be:
>> if (bo->num_cpu_page_faults++ > 20)
>> bo->preferred_domain = GTT_WC;

I agree something like that will probably be part of the solution, but I
doubt it's quite that simple or that it's the only thing that can be
improved.


> I more or less agree on that, but setting preferred_domain permanently
> to GTT_WC is what worries me a bit.
> 
> E.g. imagine you alt+tab from a game to your browser and back and the
> game runs way slower now because BOs are never moved back to VRAM.

Right, permanently moving a BO to GTT might itself cause performance to
drop down a cliff in some cases. It's possible that this is irrelevant
compared to excessive buffer migration for CPU access though.


> What we need is a global limit on the number of bytes transferred per second
> for swap operations or something like that.
> 
> Or maybe a timeout which says when a BO was moved (either by swapping it
> out or by a CPU page fault) only move it back after +n jiffies or
> something like that.

I also feel like something like this will be more useful than the number
of CPU page faults per se. But I'm curious what Marek comes up with. :)


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-24 Thread Christian König

On 24.05.2017 at 13:03, Marek Olšák wrote:
> On Wed, May 24, 2017 at 9:56 AM, Michel Dänzer  wrote:
>> On 23/05/17 07:38 PM, Marek Olšák wrote:
>>> On Tue, May 23, 2017 at 2:45 AM, Michel Dänzer  wrote:
>>>> On 22/05/17 07:09 PM, Marek Olšák wrote:
>>>>> On Mon, May 22, 2017 at 12:00 PM, Michel Dänzer  wrote:
>>>>>> On 20/05/17 06:26 PM, Marek Olšák wrote:
>>>>>>> On May 20, 2017 3:26 AM, "Michel Dänzer" wrote:
>>>>>>>
>>>>>>> On 20/05/17 01:14 AM, Marek Olšák wrote:
>>>>>>> > Hi Michel,
>>>>>>> >
>>>>>>> > I've applied your series
>>>>>>>
>>>>>>> Thanks for testing it.
>>>>>>>
>>>>>>> > and it doesn't help with low Dirt Rally performance on Fiji. I see TTM
>>>>>>> > buffer moves at 800MB/s and many VRAM page faults.
>>>>>>>
>>>>>>> Did you see this:
>>>>>>>
>>>>>>> >> Note that there's only little if any improvement of the average framerate
>>>>>>> >> reported, but the minimum framerate as seen on the HUD goes from ~10 fps
>>>>>>> >> to ~17.
>>>>>>>
>>>>>>> I.e. it mostly affects the minimum framerate and smoothness for me as well.
>>>>>>>
>>>>>>>
>>>>>>> Without the series, I get 70 average fps. With the series, I get 30
>>>>>>> average fps. That might just be random bad luck. I don't know.
>>>>>>
>>>>>> Hmm, yeah, maybe that was just one of the random slowdowns you've been
>>>>>> talking about in other threads and on IRC?
>>>>>>
>>>>>> I can't reproduce any slowdown with these patches, even leaving visible
>>>>>> VRAM size at 256 MB.
>>>>>
>>>>> The random slowdowns with Dirt Rally are only caused by the pressure
>>>>> on visible VRAM. This whole thread is about those random slowdowns.
>>>>
>>>> No, this thread is about the scenario described in the cover letter of
>>>> this patch series.
>>>>
>>>>> If you're saying "maybe it was just one of the random slowdowns", you're
>>>>> saying "maybe it was just the visible VRAM pressure". It's only
>>>>> random with Dirt Rally, which makes it difficult to believe statements
>>>>> such as "I can't reproduce any slowdown".
>>>>
>>>> I could say the same thing about you seeing random slowdowns... I've
>>>> never seen that, I had to artificially limit the size of visible VRAM to
>>>> 64 MB to make it significantly affect the benchmark result.
>>>>
>>>> How many times do you need to run the benchmark on average to hit a
>>>> random slowdown? Which desktop environment and other X clients are
>>>> running during the benchmark? Which tab is active in the Steam window
>>>> while the benchmark runs?
>>>>
>>>> In my case, it's only xfwm4, xterm and steam on the Dirt Rally page in
>>>> the library.
>>>
>>> Ubuntu Unity, Steam small mode (there are no tabs), Ultra settings in
>>> Dirt Rally.
>>>
>>> Every single time I run the game with this series, I get 700-1000MB/s
>>> of TTM BO moves. There doesn't seem to be any randomness.
>>>
>>> It was better without this series. (meaning it was sometimes OK, sometimes bad)
>>
>> Thanks for the additional details. I presume that in the bad case there
>> are some BOs lying around in visible VRAM (e.g. from Unity), which
>> causes some of Dirt Rally's BOs to go back and forth between GTT on CPU
>> page faults and VRAM on GPU usage.
>>
>> This means at least patch 2 goes out the window. I'll see if I can
>> salvage something out of patch 3.
>
> I think the final solution (done in fault_reserve_notify) should be:
> if (bo->num_cpu_page_faults++ > 20)
>    bo->preferred_domain = GTT_WC;


I more or less agree on that, but setting preferred_domain permanently 
to GTT_WC is what worries me a bit.


E.g. imagine you alt+tab from a game to your browser and back and the 
game runs way slower now because BOs are never moved back to VRAM.


What we need is a global limit on the number of bytes transferred per second
for swap operations or something like that.
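
A global transfer limit of this kind could be sketched as a per-second byte
budget. The following is a standalone userspace C model, not kernel code;
struct move_budget and move_budget_try_charge() are hypothetical names used
only for illustration and are not part of amdgpu or TTM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical global move budget; not an existing amdgpu/TTM structure. */
struct move_budget {
    uint64_t bytes_per_sec;   /* allowed migration volume per one-second window */
    uint64_t window_start_ms; /* start of the current accounting window */
    uint64_t bytes_used;      /* bytes already charged in this window */
};

/*
 * Returns true and charges the budget if moving `size` bytes at time
 * `now_ms` stays within the per-second limit. Returns false otherwise,
 * in which case the caller would leave the BO where it is.
 */
static inline bool move_budget_try_charge(struct move_budget *b,
                                          uint64_t now_ms, uint64_t size)
{
    if (now_ms - b->window_start_ms >= 1000) {
        /* A new one-second window begins: reset the accounting. */
        b->window_start_ms = now_ms;
        b->bytes_used = 0;
    }
    /* bytes_used never exceeds bytes_per_sec, so this cannot underflow. */
    if (size > b->bytes_per_sec - b->bytes_used)
        return false;
    b->bytes_used += size;
    return true;
}
```

A fixed window is the simplest choice here; it allows a burst at each window
boundary, which a sliding window or token bucket would smooth out.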


Or maybe a timeout which says when a BO was moved (either by swapping it 
out or by a CPU page fault) only move it back after +n jiffies or 
something like that.
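
The timeout variant could look roughly like the sketch below, again as a
standalone userspace C model under stated assumptions: struct bo_move_state,
bo_note_move() and bo_may_move_back() are hypothetical names, and the tick
counter is passed in explicitly instead of reading jiffies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-BO bookkeeping; not the real struct amdgpu_bo. */
struct bo_move_state {
    uint64_t last_move; /* tick count (jiffies in the kernel) of the last move */
    bool ever_moved;    /* false until the first move has been recorded */
};

/* Record that the BO was just moved, by eviction or by a CPU page fault. */
static inline void bo_note_move(struct bo_move_state *s, uint64_t now)
{
    s->last_move = now;
    s->ever_moved = true;
}

/*
 * Allow moving the BO back only once `cooldown` ticks have passed since
 * the last recorded move. A BO that was never moved is not restricted.
 */
static inline bool bo_may_move_back(const struct bo_move_state *s,
                                    uint64_t now, uint64_t cooldown)
{
    if (!s->ever_moved)
        return true;
    return now - s->last_move >= cooldown;
}
```

In kernel code the comparison would normally use the time_after() family of
helpers on jiffies rather than a raw subtraction.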


Christian.



> Otherwise I think we'll be just going in circles and not get anywhere.
>
> Marek


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-24 Thread Marek Olšák
On Wed, May 24, 2017 at 9:56 AM, Michel Dänzer  wrote:
> On 23/05/17 07:38 PM, Marek Olšák wrote:
>> On Tue, May 23, 2017 at 2:45 AM, Michel Dänzer  wrote:
>>> On 22/05/17 07:09 PM, Marek Olšák wrote:
>>>> On Mon, May 22, 2017 at 12:00 PM, Michel Dänzer  wrote:
>>>>> On 20/05/17 06:26 PM, Marek Olšák wrote:
>>>>>> On May 20, 2017 3:26 AM, "Michel Dänzer" wrote:
>>>>>>
>>>>>> On 20/05/17 01:14 AM, Marek Olšák wrote:
>>>>>> > Hi Michel,
>>>>>> >
>>>>>> > I've applied your series
>>>>>>
>>>>>> Thanks for testing it.
>>>>>>
>>>>>> > and it doesn't help with low Dirt Rally performance on Fiji. I see TTM
>>>>>> > buffer moves at 800MB/s and many VRAM page faults.
>>>>>>
>>>>>> Did you see this:
>>>>>>
>>>>>> >> Note that there's only little if any improvement of the average framerate
>>>>>> >> reported, but the minimum framerate as seen on the HUD goes from ~10 fps
>>>>>> >> to ~17.
>>>>>>
>>>>>> I.e. it mostly affects the minimum framerate and smoothness for me as well.
>>>>>>
>>>>>>
>>>>>> Without the series, I get 70 average fps. With the series, I get 30
>>>>>> average fps. That might just be random bad luck. I don't know.
>>>>>
>>>>> Hmm, yeah, maybe that was just one of the random slowdowns you've been
>>>>> talking about in other threads and on IRC?
>>>>>
>>>>> I can't reproduce any slowdown with these patches, even leaving visible
>>>>> VRAM size at 256 MB.
>>>>
>>>> The random slowdowns with Dirt Rally are only caused by the pressure
>>>> on visible VRAM. This whole thread is about those random slowdowns.
>>>
>>> No, this thread is about the scenario described in the cover letter of
>>> this patch series.
>>>
>>>
>>>> If you're saying "maybe it was just one of the random slowdowns", you're
>>>> saying "maybe it was just the visible VRAM pressure". It's only
>>>> random with Dirt Rally, which makes it difficult to believe statements
>>>> such as "I can't reproduce any slowdown".
>>>
>>> I could say the same thing about you seeing random slowdowns... I've
>>> never seen that, I had to artificially limit the size of visible VRAM to
>>> 64 MB to make it significantly affect the benchmark result.
>>>
>>> How many times do you need to run the benchmark on average to hit a
>>> random slowdown? Which desktop environment and other X clients are
>>> running during the benchmark? Which tab is active in the Steam window
>>> while the benchmark runs?
>>>
>>> In my case, it's only xfwm4, xterm and steam on the Dirt Rally page in
>>> the library.
>>
>> Ubuntu Unity, Steam small mode (there are no tabs), Ultra settings in
>> Dirt Rally.
>>
>> Every single time I run the game with this series, I get 700-1000MB/s
>> of TTM BO moves. There doesn't seem to be any randomness.
>>
>> It was better without this series. (meaning it was sometimes OK, sometimes 
>> bad)
>
> Thanks for the additional details. I presume that in the bad case there
> are some BOs lying around in visible VRAM (e.g. from Unity), which
> causes some of Dirt Rally's BOs to go back and forth between GTT on CPU
> page faults and VRAM on GPU usage.
>
> This means at least patch 2 goes out the window. I'll see if I can
> salvage something out of patch 3.

I think the final solution (done in fault_reserve_notify) should be:
if (bo->num_cpu_page_faults++ > 20)
   bo->preferred_domain = GTT_WC;

Otherwise I think we'll be just going in circles and not get anywhere.
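
As a minimal standalone sketch of that check (userspace C, not the actual
amdgpu code): struct bo_sketch, enum bo_domain and bo_note_cpu_fault() are
hypothetical stand-ins, and only the counter-and-threshold logic is taken
from the snippet above.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real amdgpu definitions. */
enum bo_domain { DOMAIN_VRAM, DOMAIN_GTT_WC };

struct bo_sketch {
    uint32_t num_cpu_page_faults;    /* CPU page faults seen on this BO */
    enum bo_domain preferred_domain; /* where the BO should preferably live */
};

#define CPU_FAULT_THRESHOLD 20

/*
 * Called once per CPU page fault on the BO, modelling the check proposed
 * for fault_reserve_notify. The post-increment compares the count from
 * before this fault, so the switch happens on the 22nd fault.
 */
static inline void bo_note_cpu_fault(struct bo_sketch *bo)
{
    if (bo->num_cpu_page_faults++ > CPU_FAULT_THRESHOLD)
        bo->preferred_domain = DOMAIN_GTT_WC;
}
```

Because the counter is never reset in this sketch, the change of preferred
domain is permanent once the threshold is crossed.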

Marek


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-24 Thread Michel Dänzer
On 23/05/17 07:38 PM, Marek Olšák wrote:
> On Tue, May 23, 2017 at 2:45 AM, Michel Dänzer  wrote:
>> On 22/05/17 07:09 PM, Marek Olšák wrote:
>>> On Mon, May 22, 2017 at 12:00 PM, Michel Dänzer  wrote:
>>>> On 20/05/17 06:26 PM, Marek Olšák wrote:
>>>>> On May 20, 2017 3:26 AM, "Michel Dänzer" wrote:
>>>>>
>>>>> On 20/05/17 01:14 AM, Marek Olšák wrote:
>>>>> > Hi Michel,
>>>>> >
>>>>> > I've applied your series
>>>>>
>>>>> Thanks for testing it.
>>>>>
>>>>> > and it doesn't help with low Dirt Rally performance on Fiji. I see TTM
>>>>> > buffer moves at 800MB/s and many VRAM page faults.
>>>>>
>>>>> Did you see this:
>>>>>
>>>>> >> Note that there's only little if any improvement of the average framerate
>>>>> >> reported, but the minimum framerate as seen on the HUD goes from ~10 fps
>>>>> >> to ~17.
>>>>>
>>>>> I.e. it mostly affects the minimum framerate and smoothness for me as well.
>>>>>
>>>>>
>>>>> Without the series, I get 70 average fps. With the series, I get 30
>>>>> average fps. That might just be random bad luck. I don't know.
>>>>
>>>> Hmm, yeah, maybe that was just one of the random slowdowns you've been
>>>> talking about in other threads and on IRC?
>>>>
>>>> I can't reproduce any slowdown with these patches, even leaving visible
>>>> VRAM size at 256 MB.
>>>
>>> The random slowdowns with Dirt Rally are only caused by the pressure
>>> on visible VRAM. This whole thread is about those random slowdowns.
>>
>> No, this thread is about the scenario described in the cover letter of
>> this patch series.
>>
>>
>>> If you're saying "maybe it was just one of the random slowdowns", you're
>>> saying "maybe it was just the visible VRAM pressure". It's only
>>> random with Dirt Rally, which makes it difficult to believe statements
>>> such as "I can't reproduce any slowdown".
>>
>> I could say the same thing about you seeing random slowdowns... I've
>> never seen that, I had to artificially limit the size of visible VRAM to
>> 64 MB to make it significantly affect the benchmark result.
>>
>> How many times do you need to run the benchmark on average to hit a
>> random slowdown? Which desktop environment and other X clients are
>> running during the benchmark? Which tab is active in the Steam window
>> while the benchmark runs?
>>
>> In my case, it's only xfwm4, xterm and steam on the Dirt Rally page in
>> the library.
> 
> Ubuntu Unity, Steam small mode (there are no tabs), Ultra settings in
> Dirt Rally.
> 
> Every single time I run the game with this series, I get 700-1000MB/s
> of TTM BO moves. There doesn't seem to be any randomness.
> 
> It was better without this series. (meaning it was sometimes OK, sometimes 
> bad)

Thanks for the additional details. I presume that in the bad case there
are some BOs lying around in visible VRAM (e.g. from Unity), which
causes some of Dirt Rally's BOs to go back and forth between GTT on CPU
page faults and VRAM on GPU usage.

This means at least patch 2 goes out the window. I'll see if I can
salvage something out of patch 3.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-23 Thread Marek Olšák
On Tue, May 23, 2017 at 2:45 AM, Michel Dänzer  wrote:
> On 22/05/17 07:09 PM, Marek Olšák wrote:
>> On Mon, May 22, 2017 at 12:00 PM, Michel Dänzer  wrote:
>>> On 20/05/17 06:26 PM, Marek Olšák wrote:
>>>> On May 20, 2017 3:26 AM, "Michel Dänzer" wrote:
>>>>
>>>> On 20/05/17 01:14 AM, Marek Olšák wrote:
>>>> > Hi Michel,
>>>> >
>>>> > I've applied your series
>>>>
>>>> Thanks for testing it.
>>>>
>>>> > and it doesn't help with low Dirt Rally performance on Fiji. I see TTM
>>>> > buffer moves at 800MB/s and many VRAM page faults.
>>>>
>>>> Did you see this:
>>>>
>>>> >> Note that there's only little if any improvement of the average framerate
>>>> >> reported, but the minimum framerate as seen on the HUD goes from ~10 fps
>>>> >> to ~17.
>>>>
>>>> I.e. it mostly affects the minimum framerate and smoothness for me as well.
>>>>
>>>>
>>>> Without the series, I get 70 average fps. With the series, I get 30
>>>> average fps. That might just be random bad luck. I don't know.
>>>
>>> Hmm, yeah, maybe that was just one of the random slowdowns you've been
>>> talking about in other threads and on IRC?
>>>
>>> I can't reproduce any slowdown with these patches, even leaving visible
>>> VRAM size at 256 MB.
>>
>> The random slowdowns with Dirt Rally are only caused by the pressure
>> on visible VRAM. This whole thread is about those random slowdowns.
>
> No, this thread is about the scenario described in the cover letter of
> this patch series.
>
>
>> If you're saying "maybe it was just one of the random slowdowns", you're
>> saying "maybe it was just the visible VRAM pressure". It's only
>> random with Dirt Rally, which makes it difficult to believe statements
>> such as "I can't reproduce any slowdown".
>
> I could say the same thing about you seeing random slowdowns... I've
> never seen that, I had to artificially limit the size of visible VRAM to
> 64 MB to make it significantly affect the benchmark result.
>
> How many times do you need to run the benchmark on average to hit a
> random slowdown? Which desktop environment and other X clients are
> running during the benchmark? Which tab is active in the Steam window
> while the benchmark runs?
>
> In my case, it's only xfwm4, xterm and steam on the Dirt Rally page in
> the library.

Ubuntu Unity, Steam small mode (there are no tabs), Ultra settings in
Dirt Rally.

Every single time I run the game with this series, I get 700-1000MB/s
of TTM BO moves. There doesn't seem to be any randomness.

It was better without this series. (meaning it was sometimes OK, sometimes bad)

Marek


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-22 Thread Michel Dänzer
On 22/05/17 07:09 PM, Marek Olšák wrote:
> On Mon, May 22, 2017 at 12:00 PM, Michel Dänzer  wrote:
>> On 20/05/17 06:26 PM, Marek Olšák wrote:
>>> On May 20, 2017 3:26 AM, "Michel Dänzer" >> > wrote:
>>>
>>> On 20/05/17 01:14 AM, Marek Olšák wrote:
>>> > Hi Michel,
>>> >
>>> > I've applied your series
>>>
>>> Thanks for testing it.
>>>
>>> > and it doesn't help with low Dirt Rally performance on Fiji. I see TTM
>>> > buffer moves at 800MB/s and many VRAM page faults.
>>>
>>> Did you see this:
>>>
>>> >> Note that there's only little if any improvement of the average
>>> framerate
>>> >> reported, but the minimum framerate as seen on the HUD goes from
>>> ~10 fps
>>> >> to ~17.
>>>
>>> I.e. it mostly affects the minimum framerate and smoothness for me
>>> as well.
>>>
>>>
>>> Without the series, I get 70 average fps. With the series, I get 30
>>> average fps. That might just be random bad luck. I don't know.
>>
>> Hmm, yeah, maybe that was just one of the random slowdowns you've been
>> talking about in other threads and on IRC?
>>
>> I can't reproduce any slowdown with these patches, even leaving visible
>> VRAM size at 256 MB.
> 
> The random slowdowns with Dirt Rally are only caused by the pressure
> on visible VRAM. This whole thread is about those random slowdowns.

No, this thread is about the scenario described in the cover letter of
this patch series.


> If you're saying "maybe it was just one of the random slowdowns", you're
> saying "maybe it was just the visible VRAM pressure". It's only
> random with Dirt Rally, which makes it difficult to believe statements
> such as "I can't reproduce any slowdown".

I could say the same thing about you seeing random slowdowns... I've
never seen that, I had to artificially limit the size of visible VRAM to
64 MB to make it significantly affect the benchmark result.

How many times do you need to run the benchmark on average to hit a
random slowdown? Which desktop environment and other X clients are
running during the benchmark? Which tab is active in the Steam window
while the benchmark runs?

In my case, it's only xfwm4, xterm and steam on the Dirt Rally page in
the library.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-22 Thread John Brooks
On Mon, May 22, 2017 at 12:09:21PM +0200, Marek Olšák wrote:
> On Mon, May 22, 2017 at 12:00 PM, Michel Dänzer  wrote:
> > On 20/05/17 06:26 PM, Marek Olšák wrote:
> >> On May 20, 2017 3:26 AM, "Michel Dänzer"  >> > wrote:
> >>
> >> On 20/05/17 01:14 AM, Marek Olšák wrote:
> >> > Hi Michel,
> >> >
> >> > I've applied your series
> >>
> >> Thanks for testing it.
> >>
> >> > and it doesn't help with low Dirt Rally performance on Fiji. I see 
> >> TTM
> >> > buffer moves at 800MB/s and many VRAM page faults.
> >>
> >> Did you see this:
> >>
> >> >> Note that there's only little if any improvement of the average
> >> framerate
> >> >> reported, but the minimum framerate as seen on the HUD goes from
> >> ~10 fps
> >> >> to ~17.
> >>
> >> I.e. it mostly affects the minimum framerate and smoothness for me
> >> as well.
> >>
> >>
> >> Without the series, I get 70 average fps. With the series, I get 30
> >> average fps. That might just be random bad luck. I don't know.
> >
> > Hmm, yeah, maybe that was just one of the random slowdowns you've been
> > talking about in other threads and on IRC?
> >
> > I can't reproduce any slowdown with these patches, even leaving visible
> > VRAM size at 256 MB.
> 
> The random slowdowns with Dirt Rally are only caused by the pressure
> on visible VRAM. This whole thread is about those random slowdowns. If
> you're saying "maybe it was just one of the random slowdowns", you're
> saying "maybe it was just the visible VRAM pressure". It's only
> random with Dirt Rally, which makes it difficult to believe statements
> such as "I can't reproduce any slowdown". It's not random with Dying
> Light.
> 
> Marek

For what it's worth, the best place to reproduce it in Dying Light is the
courtyard where Zere's trailer is. It's the first place you go after leaving
the Tower for the first time at the start of the game. The Tower lobby is
another good place. Outside of these areas the slowdown may not be apparent at
all.

John



Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-22 Thread Marek Olšák
On Mon, May 22, 2017 at 12:00 PM, Michel Dänzer  wrote:
> On 20/05/17 06:26 PM, Marek Olšák wrote:
>> On May 20, 2017 3:26 AM, "Michel Dänzer" > > wrote:
>>
>> On 20/05/17 01:14 AM, Marek Olšák wrote:
>> > Hi Michel,
>> >
>> > I've applied your series
>>
>> Thanks for testing it.
>>
>> > and it doesn't help with low Dirt Rally performance on Fiji. I see TTM
>> > buffer moves at 800MB/s and many VRAM page faults.
>>
>> Did you see this:
>>
>> >> Note that there's only little if any improvement of the average
>> framerate
>> >> reported, but the minimum framerate as seen on the HUD goes from
>> ~10 fps
>> >> to ~17.
>>
>> I.e. it mostly affects the minimum framerate and smoothness for me
>> as well.
>>
>>
>> Without the series, I get 70 average fps. With the series, I get 30
>> average fps. That might just be random bad luck. I don't know.
>
> Hmm, yeah, maybe that was just one of the random slowdowns you've been
> talking about in other threads and on IRC?
>
> I can't reproduce any slowdown with these patches, even leaving visible
> VRAM size at 256 MB.

The random slowdowns with Dirt Rally are only caused by the pressure
on visible VRAM. This whole thread is about those random slowdowns. If
you're saying "maybe it was just one of the random slowdowns", you're
saying "maybe it was just the visible VRAM pressure". It's only
random with Dirt Rally, which makes it difficult to believe statements
such as "I can't reproduce any slowdown". It's not random with Dying
Light.

Marek


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-20 Thread Marek Olšák
On May 20, 2017 3:26 AM, "Michel Dänzer"  wrote:

> On 20/05/17 01:14 AM, Marek Olšák wrote:
> > Hi Michel,
> >
> > I've applied your series
>
> Thanks for testing it.
>
> > and it doesn't help with low Dirt Rally performance on Fiji. I see TTM
> > buffer moves at 800MB/s and many VRAM page faults.
>
> Did you see this:
>
> >> Note that there's only little if any improvement of the average framerate
> >> reported, but the minimum framerate as seen on the HUD goes from ~10 fps
> >> to ~17.
>
> I.e. it mostly affects the minimum framerate and smoothness for me as well.


Without the series, I get 70 average fps. With the series, I get 30 average
fps. That might just be random bad luck. I don't know. In any case, 30 fps
is really bad, so I don't think the series does what you think it does.

Marek



> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-19 Thread Michel Dänzer
On 20/05/17 01:14 AM, Marek Olšák wrote:
> Hi Michel,
> 
> I've applied your series

Thanks for testing it.

> and it doesn't help with low Dirt Rally performance on Fiji. I see TTM
> buffer moves at 800MB/s and many VRAM page faults.

Did you see this:

>> Note that there's only little if any improvement of the average framerate
>> reported, but the minimum framerate as seen on the HUD goes from ~10 fps
>> to ~17.

I.e. it mostly affects the minimum framerate and smoothness for me as well.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-19 Thread Marek Olšák
Hi Michel,

I've applied your series and it doesn't help with low Dirt Rally
performance on Fiji. I see TTM buffer moves at 800MB/s and many VRAM
page faults.

Marek

On Thu, May 18, 2017 at 11:08 AM, Michel Dänzer  wrote:
> From: Michel Dänzer 
>
> This series was developed and tested under the following scenario:
>
> Running the PTS dirt-rally benchmark (1920x1080, Ultra) on Tonga with
> 2G, with CPU visible VRAM artificially restricted to 64 MB.
>
> Without this series, there's a lot of stutter during about the first
> minute of a benchmark run. During this time there are significant amounts
> of buffer moves (starting from about 500 MB on the HUD) and evictions,
> gradually declining until the buffer moves settle around 8 MB on the HUD.
>
> With this series, there's only slight stutter during the first seconds
> after the car launches, even though the buffer move volume is about the
> same as without the series. Buffer evictions are eliminated almost
> completely, except for a few at the beginning. Buffer moves still settle
> around 8 MB on the HUD, but with less variance than before.
>
> Note that there's only little if any improvement of the average framerate
> reported, but the minimum framerate as seen on the HUD goes from ~10 fps
> to ~17.
>
>
> Patch 1 is a cleanup that I noticed along the way.
>
> Patch 2 makes the main difference for the above scenario.
>
> Patch 3 doesn't make as much difference, I'm fine with it not landing at
> least for now.
>
> Michel Dänzer (3):
>   drm/amdgpu: Drop useless loops for placement restrictions
>   drm/amdgpu: Don't evict other BOs from VRAM for page faults
>   drm/amdgpu: Try evicting from CPU visible to invisible VRAM first
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 42 ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    | 46 --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c    |  6 +---
>  3 files changed, 51 insertions(+), 43 deletions(-)
>
> --
> 2.11.0
>


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-19 Thread Marek Olšák
On Fri, May 19, 2017 at 5:27 PM, John Brooks  wrote:
> On Fri, May 19, 2017 at 05:24:36PM +0200, Marek Olšák wrote:
>> Where is your "attached" patch?
>>
>> Marek
>
> It's actually a reply to my message. Sorry if that was unclear.

That's OK, but I don't see any patch from you here.

Marek


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-19 Thread John Brooks
On Fri, May 19, 2017 at 05:24:36PM +0200, Marek Olšák wrote:
> Where is your "attached" patch?
> 
> Marek

It's actually a reply to my message. Sorry if that was unclear.

> 
> On Fri, May 19, 2017 at 5:04 AM, John Brooks  wrote:
> > I'm glad this is being worked on. However, somewhat to my surprise, this 
> > patch
> > series didn't help Dying Light's BO eviction problem. For those who don't 
> > know,
> > that game performs very badly in certain areas, and it is correlated with
> > increased TTM eviction rates. Relevant screenshots of gallium HUD and 
> > sysprof:
> >
> > http://www.fastquake.com/images/screen-dlgalliumhud1-20170513-171241.png
> > http://www.fastquake.com/images/screen-dlsysprof-20170515-225919.png
> >
> > I noticed last week that adding RADEON_DOMAIN_GTT to the domains in radeonsi
> > (patch: http://www.fastquake.com/files/text/radeon-gtt.txt ) greatly 
> > improved
> > performance in these areas, to the tune of about a 30fps increase. 
> > Obviously,
> > putting GTT in every buffer's domain is not a proper solution. But it led 
> > me
> > to believe that perhaps the problem wasn't just the swapping of resident 
> > BOs,
> > but the creation of new ones that only have VRAM in their domain, and they
> > cause existing BOs to be evicted from visible VRAM unconditionally.
> >
> > The attached patch assigns GTT as the busy placement for newly created BOs 
> > that
> > have the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag, so that they will go to
> > GTT if visible VRAM is full, instead of evicting established BOs. Since 
> > there
> > is no way to know what the usage patterns of a new BO will be, we shouldn't
> > evict established BOs (for which we have hypothetically had the opportunity 
> > to
> > gather usage data) from visible VRAM for new, unknown BOs.
> >
> > With this patch I get hugely improved performance in Dying Light just like with
> > the Mesa patch: I observed 30-40fps where I got 14 before, and 60fps where I
> > got 40 before. TTM evictions and bytes moved have dropped to zero where they
> > were exceedingly high before. Buffer evictions no longer dominate the prof
> > trace. Screenshots:
> >
> > http://www.fastquake.com/images/screen-dl-gtt_busy_only-20170518-192602.png
> > http://www.fastquake.com/images/screen-dlsysprof-gttpatch-20170518-223200.png
> >
> > --
> > John Brooks (Frogging101)

--
John Brooks
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-19 Thread Marek Olšák
Where is your "attached" patch?

Marek

On Fri, May 19, 2017 at 5:04 AM, John Brooks  wrote:
> I'm glad this is being worked on. However, somewhat to my surprise, this patch
> series didn't help Dying Light's BO eviction problem. For those who don't know,
> that game performs very badly in certain areas, and it is correlated with
> increased TTM eviction rates. Relevant screenshots of gallium HUD and sysprof:
>
> http://www.fastquake.com/images/screen-dlgalliumhud1-20170513-171241.png
> http://www.fastquake.com/images/screen-dlsysprof-20170515-225919.png
>
> I noticed last week that adding RADEON_DOMAIN_GTT to the domains in radeonsi
> (patch: http://www.fastquake.com/files/text/radeon-gtt.txt ) greatly improved
> performance in these areas, to the tune of about a 30fps increase. Obviously,
> putting GTT in every buffer's domain is not a proper solution. But it led me
> to believe that perhaps the problem wasn't just the swapping of resident BOs,
> but the creation of new ones that only have VRAM in their domain, and they
> cause existing BOs to be evicted from visible VRAM unconditionally.
>
> The attached patch assigns GTT as the busy placement for newly created BOs that
> have the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag, so that they will go to
> GTT if visible VRAM is full, instead of evicting established BOs. Since there
> is no way to know what the usage patterns of a new BO will be, we shouldn't
> evict established BOs (for which we have hypothetically had the opportunity to
> gather usage data) from visible VRAM for new, unknown BOs.
>
> With this patch I get hugely improved performance in Dying Light just like with
> the Mesa patch: I observed 30-40fps where I got 14 before, and 60fps where I
> got 40 before. TTM evictions and bytes moved have dropped to zero where they
> were exceedingly high before. Buffer evictions no longer dominate the prof
> trace. Screenshots:
>
> http://www.fastquake.com/images/screen-dl-gtt_busy_only-20170518-192602.png
> http://www.fastquake.com/images/screen-dlsysprof-gttpatch-20170518-223200.png
>
> --
> John Brooks (Frogging101)


Re: [PATCH 0/3] drm/amdgpu: Tweaks for high pressure on CPU visible VRAM

2017-05-18 Thread John Brooks
I'm glad this is being worked on. However, somewhat to my surprise, this patch
series didn't help Dying Light's BO eviction problem. For those who don't know,
that game performs very badly in certain areas, and it is correlated with
increased TTM eviction rates. Relevant screenshots of gallium HUD and sysprof:

http://www.fastquake.com/images/screen-dlgalliumhud1-20170513-171241.png
http://www.fastquake.com/images/screen-dlsysprof-20170515-225919.png

I noticed last week that adding RADEON_DOMAIN_GTT to the domains in radeonsi
(patch: http://www.fastquake.com/files/text/radeon-gtt.txt ) greatly improved
performance in these areas, to the tune of about a 30fps increase. Obviously,
putting GTT in every buffer's domain is not a proper solution. But it led me
to believe that perhaps the problem wasn't just the swapping of resident BOs,
but the creation of new ones that only have VRAM in their domain, and they
cause existing BOs to be evicted from visible VRAM unconditionally.

The attached patch assigns GTT as the busy placement for newly created BOs that
have the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag, so that they will go to
GTT if visible VRAM is full, instead of evicting established BOs. Since there
is no way to know what the usage patterns of a new BO will be, we shouldn't
evict established BOs (for which we have hypothetically had the opportunity to
gather usage data) from visible VRAM for new, unknown BOs.
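
To illustrate the idea (this is not the patch itself), here is a toy model of
the placement policy described above. It is only a sketch: VisibleVram,
place(), gtt_when_busy and simulate() are made-up names for this example and
do not correspond to amdgpu or TTM functions, and eviction is simplified to
plain oldest-first order.

```python
from collections import OrderedDict

VRAM, GTT = "VRAM", "GTT"


class VisibleVram:
    """Toy model of a CPU-visible VRAM pool (hypothetical, not TTM)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = OrderedDict()  # buffer name -> size, oldest first
        self.evictions = 0

    def used(self):
        return sum(self.resident.values())

    def place(self, name, size, gtt_when_busy):
        """Place a new buffer and return the domain it ends up in."""
        if size > self.capacity:
            raise ValueError("buffer is larger than the whole pool")
        if self.used() + size > self.capacity:
            if gtt_when_busy:
                # Pool is full: the new, unknown buffer goes to GTT and
                # the established buffers stay where they are.
                return GTT
            # VRAM-only placement: evict the oldest buffers to make room.
            while self.used() + size > self.capacity:
                self.resident.popitem(last=False)
                self.evictions += 1
        self.resident[name] = size
        return VRAM


def simulate(gtt_when_busy):
    """Fill the pool with 4 established buffers, then create 8 new ones."""
    pool = VisibleVram(capacity=256)
    for i in range(4):
        pool.place(f"old{i}", 64, gtt_when_busy)
    domains = [pool.place(f"new{i}", 64, gtt_when_busy) for i in range(8)]
    return pool.evictions, domains
```

In this model, simulate(False) evicts one established buffer for every new
one, while simulate(True) causes no evictions and sends the new buffers to
GTT. The numbers are arbitrary and say nothing about real-world performance.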

With this patch I get hugely improved performance in Dying Light just like with
the Mesa patch: I observed 30-40fps where I got 14 before, and 60fps where I
got 40 before. TTM evictions and bytes moved have dropped to zero where they
were exceedingly high before. Buffer evictions no longer dominate the prof
trace. Screenshots:

http://www.fastquake.com/images/screen-dl-gtt_busy_only-20170518-192602.png
http://www.fastquake.com/images/screen-dlsysprof-gttpatch-20170518-223200.png

--
John Brooks (Frogging101)
