Launchpad has imported 34 comments from the remote bug at
https://bugzilla.kernel.org/show_bug.cgi?id=201275.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2018-09-28T16:51:31+00:00 quirin.blaeser wrote:

Created attachment 278829
lsusb lspci cpuinfo config url sensors

sensors reports a power consumption of 13 W idle for 4.18.10
sensors reports a power consumption of 7 W idle for 4.18.9

Attached:
lsusb
lspci
cpuinfo
URL to mainboard and graphics card
kernel config
output of sensors while idle

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/0

------------------------------------------------------------------------
On 2018-09-28T17:11:57+00:00 alexdeucher wrote:

Can you bisect?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/1

------------------------------------------------------------------------
On 2018-09-28T18:14:20+00:00 alexdeucher wrote:

Git bisect howto:
https://www.kernel.org/doc/html/v4.18/admin-guide/bug-bisect.html

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/2

------------------------------------------------------------------------
On 2018-09-28T21:51:16+00:00 quirin.blaeser wrote:

On 28.09.18 at 20:14, bugzilla-dae...@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=201275
> 
> --- Comment #2 from Alex Deucher (alexdeuc...@gmail.com) ---
> Git bisect howto:
> https://www.kernel.org/doc/html/v4.18/admin-guide/bug-bisect.html
> 
Sounds practicable, but may take 1-2 days.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/3

------------------------------------------------------------------------
On 2018-09-29T02:44:41+00:00 Dieter wrote:

I can second that for RX580 (Polaris20).

Idle power rose to 60 W (from ~31-32 W with 4.18.9).

Bisect? Not so fast, because I use openSUSE Tumbleweed 'Kernel:stable'
when 'amd-staging-drm-next' does NOT work for me, and it has NOT worked
for me since 21/22 August, but that belongs in another ticket.

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:       +1.20 V  
fan1:         886 RPM
temp1:        +43.0°C  (crit = +94.0°C, hyst = -273.1°C)
power1:       59.16 W  (cap = 175.00 W)
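
For comparing kernels, the idle power reading can be pulled out of this kind of output with a short script. The following is only a minimal sketch in Python, assuming the text layout shown above; the function name `read_power_watts` is hypothetical and not part of lm-sensors.

```python
import re

# Sample text copied from the sensors output above.
SENSORS_OUTPUT = """\
amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:       +1.20 V
fan1:         886 RPM
temp1:        +43.0°C  (crit = +94.0°C, hyst = -273.1°C)
power1:       59.16 W  (cap = 175.00 W)
"""


def read_power_watts(text):
    """Return the power1 reading in watts, or None if no such line exists."""
    match = re.search(r"^power1:\s+([0-9.]+)\s+W", text, re.MULTILINE)
    return float(match.group(1)) if match else None


print(read_power_watts(SENSORS_OUTPUT))
```

Running the same function on output captured under 4.18.9 and 4.18.10 would give the two idle values to compare.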

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/4

------------------------------------------------------------------------
On 2018-09-29T21:20:33+00:00 quirin.blaeser wrote:

Created attachment 278841
bisect result

Includes bisect steps + sensors output

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/5

------------------------------------------------------------------------
On 2018-09-29T21:28:30+00:00 quirin.blaeser wrote:

RX560 is Polaris11, so the bug may have been ported from Polaris20.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/6

------------------------------------------------------------------------
On 2018-09-30T14:07:38+00:00 quirin.blaeser wrote:

For now I have resolved this problem by simply removing the patch from
4.18.11:

git diff 93b100ddda3be284be160e9ccba28c7f8f21ab73^1..93b100ddda3be284be160e9ccba28c7f8f21ab73
and patch -R -p1

or, without "-R":

git diff 93b100ddda3be284be160e9ccba28c7f8f21ab73..93b100ddda3be284be160e9ccba28c7f8f21ab73^1

Maybe the specs for Vega and Polaris just differ at this point?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/7

------------------------------------------------------------------------
On 2018-10-04T17:56:53+00:00 quirin.blaeser wrote:

4.18.12
Bug still present, removing 93b100ddda3be284be160e9ccba28c7f8f21ab73 solves 
this problem for now.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/8

------------------------------------------------------------------------
On 2018-10-04T20:21:41+00:00 alexdeucher wrote:

I don't think this is a bug.  The problem is, prior to that patch, the
display component was requesting minimum clocks that were 10x too low.
This saved power, but led to display problems on some systems because
the clocks were too low to sustain the display requirements.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/9

------------------------------------------------------------------------
On 2018-10-04T22:48:47+00:00 Dieter wrote:

(In reply to Alex Deucher from comment #9)
> I don't think this is a bug.  The problem is, prior to that patch, the
> display component was requesting minimum clocks that were 10x too low.  This
> saved power, but led to display problems on some systems because the clocks
> were too low to sustain the display requirements.

Sorry Alex,
what?

_All_ was fine _before_ this commit, for ages, with stable upstream and
all 'amd-staging-drm-next' kernels.

Now I get ~60 W, up from ~30 W, at 1920x1080 (even dual display was
fine before).

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/10

------------------------------------------------------------------------
On 2018-10-05T02:37:12+00:00 alexdeucher wrote:

Can you attach the output of `cat
/sys/kernel/debug/dri/0/amdgpu_pm_info` before and after the patch?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/11

------------------------------------------------------------------------
On 2018-10-05T03:04:04+00:00 Dieter wrote:

openSUSE Tumbleweed Kernel:stable 4.18.12-2.ga880bd8-default

After the patch.
(For 'before' I have to reboot to broken 'amd-staging-drm-next')
https://bugs.freedesktop.org/show_bug.cgi?id=108096

Clock Gating Flags Mask: 0x37bcf
        Graphics Medium Grain Clock Gating: On
        Graphics Medium Grain memory Light Sleep: On
        Graphics Coarse Grain Clock Gating: On
        Graphics Coarse Grain memory Light Sleep: On
        Graphics Coarse Grain Tree Shader Clock Gating: Off
        Graphics Coarse Grain Tree Shader Light Sleep: Off
        Graphics Command Processor Light Sleep: On
        Graphics Run List Controller Light Sleep: On
        Graphics 3D Coarse Grain Clock Gating: Off
        Graphics 3D Coarse Grain memory Light Sleep: Off
        Memory Controller Light Sleep: On
        Memory Controller Medium Grain Clock Gating: On
        System Direct Memory Access Light Sleep: Off
        System Direct Memory Access Medium Grain Clock Gating: On
        Bus Interface Medium Grain Clock Gating: Off
        Bus Interface Light Sleep: On
        Unified Video Decoder Medium Grain Clock Gating: On
        Video Compression Engine Medium Grain Clock Gating: On
        Host Data Path Light Sleep: Off
        Host Data Path Medium Grain Clock Gating: On
        Digital Right Management Medium Grain Clock Gating: Off
        Digital Right Management Light Sleep: Off
        Rom Medium Grain Clock Gating: On
        Data Fabric Medium Grain Clock Gating: Off

GFX Clocks and Power:
        2000 MHz (MCLK)
        1411 MHz (SCLK)
        600 MHz (PSTATE_SCLK)
        1000 MHz (PSTATE_MCLK)
        1200 mV (VDDGFX)
        61.254 W (average GPU)

GPU Temperature: 44 C
GPU Load: 0 %

UVD: Disabled

VCE: Disabled

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/12

------------------------------------------------------------------------
On 2018-10-05T03:41:47+00:00 Dieter wrote:

amd-staging-drm-next (with broken SDDM and then 'init 3')

Why is 'GPU Load:' so high?
 => take it with a grain of salt.

Clock Gating Flags Mask: 0x3fbcf
        Graphics Medium Grain Clock Gating: On
        Graphics Medium Grain memory Light Sleep: On
        Graphics Coarse Grain Clock Gating: On
        Graphics Coarse Grain memory Light Sleep: On
        Graphics Coarse Grain Tree Shader Clock Gating: Off
        Graphics Coarse Grain Tree Shader Light Sleep: Off
        Graphics Command Processor Light Sleep: On
        Graphics Run List Controller Light Sleep: On
        Graphics 3D Coarse Grain Clock Gating: Off
        Graphics 3D Coarse Grain memory Light Sleep: Off
        Memory Controller Light Sleep: On
        Memory Controller Medium Grain Clock Gating: On
        System Direct Memory Access Light Sleep: Off
        System Direct Memory Access Medium Grain Clock Gating: On
        Bus Interface Medium Grain Clock Gating: Off
        Bus Interface Light Sleep: On
        Unified Video Decoder Medium Grain Clock Gating: On
        Video Compression Engine Medium Grain Clock Gating: On
        Host Data Path Light Sleep: On
        Host Data Path Medium Grain Clock Gating: On
        Digital Right Management Medium Grain Clock Gating: Off
        Digital Right Management Light Sleep: Off
        Rom Medium Grain Clock Gating: On
        Data Fabric Medium Grain Clock Gating: Off

GFX Clocks and Power:
        300 MHz (MCLK)
        303 MHz (SCLK)
        600 MHz (PSTATE_SCLK)
        1000 MHz (PSTATE_MCLK)
        831 mV (VDDGFX)
        32.176 W (average GPU)

GPU Temperature: 29 C
GPU Load: 84 %

UVD: Disabled

VCE: Disabled

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/13

------------------------------------------------------------------------
On 2018-10-05T03:43:19+00:00 Dieter wrote:

Diff !!!

BAD
Host Data Path Light Sleep: Off

GOOD
Host Data Path Light Sleep: On
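
A comparison like this can be automated rather than done by eye. The following is only a sketch in Python, assuming the "Name: On/Off" layout of the two amdgpu_pm_info dumps above; the helper names `parse_flags` and `changed_flags` are hypothetical, and the sample strings are shortened excerpts of the dumps.

```python
def parse_flags(dump):
    """Map each 'Name: On|Off' line of an amdgpu_pm_info dump to a bool."""
    flags = {}
    for line in dump.splitlines():
        name, sep, value = line.strip().rpartition(": ")
        # Lines such as the mask or 'UVD: Disabled' are skipped here.
        if sep and value in ("On", "Off"):
            flags[name] = (value == "On")
    return flags


def changed_flags(bad_dump, good_dump):
    """Return {name: (bad, good)} for every flag whose state differs."""
    bad, good = parse_flags(bad_dump), parse_flags(good_dump)
    return {n: (bad[n], good[n]) for n in bad if n in good and bad[n] != good[n]}


BAD = """\
Clock Gating Flags Mask: 0x37bcf
        Host Data Path Light Sleep: Off
        Host Data Path Medium Grain Clock Gating: On
"""
GOOD = """\
Clock Gating Flags Mask: 0x3fbcf
        Host Data Path Light Sleep: On
        Host Data Path Medium Grain Clock Gating: On
"""

print(changed_flags(BAD, GOOD))
# XOR of the two masks shows how many bits differ between the dumps.
print(hex(0x37bcf ^ 0x3fbcf))
```

The XOR of the two masks has a single bit set, which is consistent with exactly one flag differing between the two dumps.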

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/14

------------------------------------------------------------------------
On 2018-10-05T03:52:06+00:00 Dieter wrote:

(In reply to Dieter Nützel from comment #14)
> Diff !!!
> 
> BAD
> Host Data Path Light Sleep: Off

card0/device> cat pp_dpm_mclk 
0: 300Mhz 
1: 1000Mhz 
2: 2000Mhz *

card0/device> cat pp_dpm_sclk 
0: 300Mhz 
1: 600Mhz 
2: 900Mhz 
3: 1145Mhz 
4: 1215Mhz 
5: 1257Mhz 
6: 1300Mhz 
7: 1411Mhz *


> GOOD
> Host Data Path Light Sleep: On

card0/device> cat pp_dpm_mclk
0: 300Mhz *
1: 1000Mhz
2: 2000Mhz

card0/device> cat pp_dpm_sclk
0: 300Mhz
1: 600Mhz *
2: 900Mhz
3: 1145Mhz
4: 1215Mhz
5: 1257Mhz
6: 1300Mhz
7: 1411Mhz

But SCLK changed a lot as well.
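
The active DPM level is the line marked with "*". The following is only a sketch in Python for extracting it, assuming the pp_dpm_sclk/pp_dpm_mclk text format shown above; the function name `active_level` is hypothetical.

```python
import re

# Samples copied from the pp_dpm_sclk listings above.
BAD_SCLK = """\
0: 300Mhz
1: 600Mhz
2: 900Mhz
3: 1145Mhz
4: 1215Mhz
5: 1257Mhz
6: 1300Mhz
7: 1411Mhz *
"""
GOOD_SCLK = """\
0: 300Mhz
1: 600Mhz *
2: 900Mhz
3: 1145Mhz
4: 1215Mhz
5: 1257Mhz
6: 1300Mhz
7: 1411Mhz
"""


def active_level(pp_dpm_text):
    """Return (index, mhz) of the level marked with '*', or None."""
    for line in pp_dpm_text.splitlines():
        m = re.match(r"\s*(\d+):\s*(\d+)Mhz\s*(\*)?\s*$", line)
        if m and m.group(3):
            return int(m.group(1)), int(m.group(2))
    return None


print(active_level(BAD_SCLK))
print(active_level(GOOD_SCLK))
```

On an idle system, a result at the top of the table (as in the BAD case) is the symptom discussed in this report.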

I badly need some sleep.
Leaving Saturday morning for a family vacation.

Greetings,
Dieter

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/15

------------------------------------------------------------------------
On 2018-10-05T08:25:45+00:00 quirin.blaeser wrote:

Created attachment 278933
amdgpu_pm_info

content of /sys/kernel/debug/dri/1/amdgpu_pm_info
for 4.18.12 +/- 93b100ddda3be284be160e9ccba28c7f8f21ab73

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/16

------------------------------------------------------------------------
On 2018-10-05T11:52:39+00:00 grmat wrote:

I can confirm the issue with Polaris10. Idle power consumption is
roughly 30 W higher than it used to be and than under Windows. DPM is
stuck in the highest power state for both SCLK and MCLK.

The reporter has already bisected so I haven't. If you still need more
info, please ping.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/17

------------------------------------------------------------------------
On 2018-10-05T13:50:25+00:00 quirin.blaeser wrote:

(In reply to Alex Deucher from comment #9)
> I don't think this is a bug.  The problem is, prior to that patch, the
> display component was requesting minimum clocks that were 10x too low.  This
> saved power, but led to display problems on some systems because the clocks
> were too low to sustain the display requirements.

So:
 - 93b100ddd... fools DC
 + 93b100ddd... fools PM

From my point of view, the clock values are simply scaled in the wrong place.
So we may have to find the different points in the code from which smuX_get_XXX
gets called by PM _or_ DC, or maybe look in the firmware.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/18

------------------------------------------------------------------------
On 2018-10-05T14:44:38+00:00 michel wrote:

People on the Phoronix forum mentioned that this doesn't seem to happen
with 4.19-rc kernels. If people here can confirm that, maybe there are
other corresponding changes that need to be backported as well.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/19

------------------------------------------------------------------------
On 2018-10-05T22:18:46+00:00 quirin.blaeser wrote:

(In reply to Michel Dänzer from comment #19)
> People on the Phoronix forum mentioned that this doesn't seem to happen with
> 4.19-rc kernels. If people here can confirm that, maybe there are other
> corresponding changes that need to be backported as well.

4.19-rc1:

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:       +0.81 V  
fan1:        1602 RPM
temp1:        +22.0°C  (crit = +94.0°C, hyst = -273.1°C)
power1:        6.10 W  (cap =  48.00 W)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/20

------------------------------------------------------------------------
On 2018-10-06T10:43:21+00:00 thomas-lange2 wrote:

I can confirm that only 4.18.x (x > 9) is affected. 4.19-rc6 reports the
same clock and power values as with 4.18.9. At least that's the case for
my RX 560.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/21

------------------------------------------------------------------------
On 2018-10-07T00:34:38+00:00 quirin.blaeser wrote:

Created attachment 278939
bisect

The author of 93b100ddda3be284be160e9ccba28c7f8f21ab73 simply forgot to remove
the scaling of values for powerplay.
Have a look at 23ec3d1479fd79658cd52c47618d8ddd2f32550b, where the same scaling
is applied to Vega.
You may have to patch drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_services.c
too.
Have a look at needed_patch.txt.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/22

------------------------------------------------------------------------
On 2018-10-07T09:20:01+00:00 quirin.blaeser wrote:

drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_services.c (v4.18.12)
There is a lot of work to do:

        for (i = 0; i < dc_clks->num_levels; i++) {
                DRM_INFO("DM_PPLIB:\t %d\n", pp_clks->clock[i]);
                /* translate 10kHz to kHz */
                dc_clks->clocks_in_khz[i] = pp_clks->clock[i] * 10;
        }

        for (i = 0; i < clk_level_info->num_levels; i++) {
                DRM_DEBUG("DM_PPLIB:\t %d in 10kHz\n",
                          pp_clks->data[i].clocks_in_khz);
                /* translate 10kHz to kHz */
                clk_level_info->data[i].clocks_in_khz
                        = pp_clks->data[i].clocks_in_khz * 10;
                clk_level_info->data[i].latency_in_us
                        = pp_clks->data[i].latency_in_us;
        }

and maybe

        /* Translate 10 kHz to kHz. */
        validation_clks.engine_max_clock *= 10;
        validation_clks.memory_max_clock *= 10;

These have been stubs since 2017-09-12 15:58:20:

bool dm_pp_get_clock_levels_by_type_with_voltage(
        const struct dc_context *ctx,
        enum dm_pp_clock_type clk_type,
        struct dm_pp_clock_levels_with_voltage *clk_level_info)
{
        /* TODO: to be implemented */
        return false;
}

bool dm_pp_notify_wm_clock_changes(
        const struct dc_context *ctx,
        struct dm_pp_wm_sets_with_clock_ranges *wm_with_clock_ranges)
{
        /* TODO: to be implemented */
        return false;
}

bool dm_pp_apply_power_level_change_request(
        const struct dc_context *ctx,
        struct dm_pp_power_level_change_request *level_change_req)
{
        /* TODO: to be implemented */
        return false;
}

bool dm_pp_apply_clock_for_voltage_request(
        const struct dc_context *ctx,
        struct dm_pp_clock_for_voltage_req *clock_for_voltage_req)
{
        /* TODO: to be implemented */
        return false;
}

bool dm_pp_get_static_clocks(
        const struct dc_context *ctx,
        struct dm_pp_static_clock_info *static_clk_info)
{
        /* TODO: to be implemented */
        return false;
}

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/23

------------------------------------------------------------------------
On 2018-10-08T14:26:57+00:00 alexdeucher wrote:

This patch shouldn't have been applied to 4.18.  It looks like it was 
autoselected:
https://lkml.org/lkml/2018/9/15/172
It should be reverted.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/24

------------------------------------------------------------------------
On 2018-10-08T19:34:55+00:00 quirin.blaeser wrote:

(In reply to Alex Deucher from comment #24)
> This patch shouldn't have been applied to 4.18.  It looks like it was
> autoselected:
> https://lkml.org/lkml/2018/9/15/172
> It should be reverted.

So "[...]the display component was requesting minimum clocks[...]" isn't an 
issue with Polaris?
Is there any QA left?
Avoiding unusual units is a good idea generally, but it should happen very 
early in development.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/29

------------------------------------------------------------------------
On 2018-10-08T19:42:16+00:00 alexdeucher wrote:

(In reply to quirin.blaeser from comment #25)
> (In reply to Alex Deucher from comment #24)
> > This patch shouldn't have been applied to 4.18.  It looks like it was
> > autoselected:
> > https://lkml.org/lkml/2018/9/15/172
> > It should be reverted.
> 
> So "[...]the display component was requesting minimum clocks[...]" isn't an
> issue with Polaris?
> Is there any QA left?
> Avoiding unusual units is a good idea generally, but it should happen very
> early in development.

It was a fix for fallout from an interface refactor we did in 4.19 that
mixed up the units between display and power.  We did not intend to have
the patch applied to 4.18 and we did not flag the patch for 4.18, it was
flagged for 4.18 by someone else outside of AMD.
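
To make the unit mix-up concrete: the amdgpu_dm_services.c excerpt quoted earlier in this thread translates clocks from units of 10 kHz to kHz by multiplying by 10. The following Python sketch is not kernel code; it only illustrates the arithmetic, using the 300 MHz level from the pp_dpm tables above. The function name `to_khz` is hypothetical, and which of the two failure cases applied to 4.18.10 is not stated in this thread.

```python
def to_khz(clock_in_10khz):
    """Convert a clock value given in units of 10 kHz to kHz."""
    return clock_in_10khz * 10


sclk_10khz = 30_000  # 300 MHz expressed in units of 10 kHz

# Converted exactly once: the intended value.
correct_khz = to_khz(sclk_10khz)

# Never converted, but read as kHz: 10x too low.
unconverted_khz = sclk_10khz

# Converted on both sides of the interface: 10x too high.
converted_twice_khz = to_khz(to_khz(sclk_10khz))

for label, khz in [("correct", correct_khz),
                   ("unconverted", unconverted_khz),
                   ("converted twice", converted_twice_khz)]:
    print(f"{label}: {khz / 1000:.0f} MHz")
```

A value that is 10x too low matches the display problems described in comment #9; a value that is 10x too high would ask for far more clock than an idle display needs.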

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/30

------------------------------------------------------------------------
On 2018-10-08T19:49:14+00:00 quirin.blaeser wrote:

I can't find "c3df50abc84b" from https://lkml.org/lkml/2018/9/15/172
but "drm/amd/pp: Convert clock unit to KHz as defined" is 
23ec3d1479fd79658cd52c47618d8ddd2f32550b

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/31

------------------------------------------------------------------------
On 2018-10-08T19:59:04+00:00 alexdeucher wrote:

(In reply to quirin.blaeser from comment #27)
> I can't find "c3df50abc84b" from https://lkml.org/lkml/2018/9/15/172
> but "drm/amd/pp: Convert clock unit to KHz as defined" is
> 23ec3d1479fd79658cd52c47618d8ddd2f32550b

That was the commit id in our amd-staging-drm-next branch:
https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-drm-next&id=c3df50abc84b289be8e7b96968d7d7e006576880

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/32

------------------------------------------------------------------------
On 2018-10-08T20:58:38+00:00 quirin.blaeser wrote:

Created attachment 278965
git diff -p 
93b100ddda3be284be160e9ccba28c7f8f21ab73..93b100ddda3be284be160e9ccba28c7f8f21ab73^1

git diff -p
93b100ddda3be284be160e9ccba28c7f8f21ab73..93b100ddda3be284be160e9ccba28c7f8f21ab73^1

Apply to v4.18.10 .. v4.18.12 to revert

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/33

------------------------------------------------------------------------
On 2018-10-08T20:59:36+00:00 quirin.blaeser wrote:

(In reply to Alex Deucher from comment #28)
> (In reply to quirin.blaeser from comment #27)
> > I can't find "c3df50abc84b" from https://lkml.org/lkml/2018/9/15/172
> > but "drm/amd/pp: Convert clock unit to KHz as defined" is
> > 23ec3d1479fd79658cd52c47618d8ddd2f32550b
> 
> That was the commit id in our amd-staging-drm-next branch:
> https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-drm-next&id=c3df50abc84b289be8e7b96968d7d7e006576880

thx

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/34

------------------------------------------------------------------------
On 2018-10-09T18:29:04+00:00 caravena wrote:

Hello,

This error is also registered on launchpad.net
https://bugs.launchpad.net/bugs/1796720

Best regards,
--
Cristian Aravena Romero (caravena)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/40

------------------------------------------------------------------------
On 2018-10-10T11:39:14+00:00 caravena wrote:

Hello,

Does kernel 4.18.13 work correctly?

Best regards,
--
Cristian Aravena Romero (caravena)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/42

------------------------------------------------------------------------
On 2018-10-10T13:42:12+00:00 harry.wentland wrote:

Yes, it should. GregKH reverted the offending commit in 4.18.13.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1796720/comments/43


** Changed in: linux
       Status: Unknown => Fix Released

** Changed in: linux
   Importance: Unknown => Low

** Bug watch added: freedesktop.org Bugzilla #108096
   https://bugs.freedesktop.org/show_bug.cgi?id=108096


Title:
  High GPU temperature after kernel 4.18.0.9.10 on AMD RX460
