[PATCH] drm/i915/fbc: Add sizes to info message about reducing fb size

2024-05-13 Thread Paul Menzel
The info message currently does not say by how much the stolen memory
size should be increased.

[drm] Reducing the compressed framebuffer size. This may lead to less power 
savings than a non-reduced-size. Try to increase stolen memory size if 
available in BIOS.

To be more useful to the user, add the sizes to the message.

Signed-off-by: Paul Menzel 
---
 drivers/gpu/drm/i915/display/intel_fbc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c b/drivers/gpu/drm/i915/display/intel_fbc.c
index b453fcbd67da..e4a5d251013f 100644
--- a/drivers/gpu/drm/i915/display/intel_fbc.c
+++ b/drivers/gpu/drm/i915/display/intel_fbc.c
@@ -801,7 +801,8 @@ static int intel_fbc_alloc_cfb(struct intel_fbc *fbc,
goto err_llb;
else if (ret > min_limit)
drm_info_once(&i915->drm,
- "Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.\n");
+ "Reducing the compressed framebuffer size from %d bytes to %d bytes. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.\n",
+ min_limit, ret);
 
fbc->limit = ret;
 
-- 
2.43.0
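
A note on the two values passed to the new format string: judging from the hunk (the result is stored in `fbc->limit`), `min_limit` and `ret` look like compression limits, that is divisors, rather than byte counts. If that reading is right, a message that reports bytes would first have to convert them. The following is only a userspace sketch under that assumption; `cfb_bytes_for_limit()` and `format_fbc_message()` are hypothetical names and not part of i915.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: bytes needed for the compressed framebuffer when
 * the uncompressed size is divided by a compression limit (1, 2, 4, ...).
 */
static size_t cfb_bytes_for_limit(size_t full_size, int limit)
{
	assert(limit > 0);
	return full_size / (size_t)limit;
}

/*
 * Hypothetical helper: format the info message with real byte counts.
 * Returns the snprintf() result.
 */
static int format_fbc_message(char *buf, size_t len, size_t full_size,
			      int min_limit, int limit)
{
	return snprintf(buf, len,
			"Reducing the compressed framebuffer size from %zu bytes to %zu bytes.",
			cfb_bytes_for_limit(full_size, min_limit),
			cfb_bytes_for_limit(full_size, limit));
}
```

For a 1920x1080 framebuffer at 4 bytes per pixel (8294400 bytes), limits 1 and 2 would be reported as 8294400 and 4147200 bytes.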



Re: Powered off Philips TV sends corrupt EDID causing flickering

2023-11-23 Thread Paul Menzel

Dear Jani,


Thank you for your reply.

Am 22.11.23 um 11:38 schrieb Jani Nikula:

On Tue, 21 Nov 2023, Paul Menzel wrote:



Connecting a USB Type-C port replicator [1] to the only USB Type-C port
of the Dell XPS 13 9360 with Debian sid/unstable and Debian’s Linux
kernel 6.5.10, and then connecting a Philips 40PFL5206H/12 TV device,
that is powered off or in standby, to the HDMI port, Linux logs:


[…]


Depending on how the port replicator works, this may not come from the
TV at all.

And all of this probably depends on GPU and driver, which are not
mentioned.


Sorry for just mentioning the laptop model. It uses the device below:

00:02.0 VGA compatible controller [0300]: Intel Corporation HD 
Graphics 620 [8086:5916] (rev 02) (prog-if 00 [VGA controller])



If it's i915, please see [1] on how to file a bug.


Thank you for taking the time to tell me the proper forum. I created the 
two issues below:


1.  EDID has corrupt header [2]
2.  No image on Philips TV when turning on while connected over 
HDMI/USB-C port replicator (`[drm] *ERROR* Link Training Unsuccessful`) [3]



Kind regards,

Paul



[1]: https://drm.pages.freedesktop.org/intel-docs/how-to-file-i915-bugs.html

[2]: https://gitlab.freedesktop.org/drm/intel/-/issues/9705
[3]: https://gitlab.freedesktop.org/drm/intel/-/issues/9707


Powered off Philips TV sends corrupt EDID causing flickering

2023-11-21 Thread Paul Menzel

Dear Linux folks,


Connecting a USB Type-C port replicator [1] to the only USB Type-C port 
of the Dell XPS 13 9360 with Debian sid/unstable and Debian’s Linux 
kernel 6.5.10, and then connecting a Philips 40PFL5206H/12 TV device, 
that is powered off or in standby, to the HDMI port, Linux logs:


```
[0.00] Linux version 6.5.0-4-amd64 
(debian-ker...@lists.debian.org) (gcc-13 (Debian 13.2.0-6) 13.2.0, GNU 
ld (GNU Binutils for Debian) 2.41) #1 SMP PREEMPT_DYNAMIC Debian 
6.5.10-1 (2023-11-03)

[…]
[0.00] DMI: Dell Inc. XPS 13 9360/0596KF, BIOS 2.21.0 06/02/2022
[…]
[  160.004836] EDID has corrupt header
[  160.004866]  [00] BAD  00 00 00 00 00 00 00 00 00 7f ff ff ff ff ff ff
[  160.004875]  [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  160.004881]  [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  160.004886]  [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  160.004905]  [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  160.004911]  [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  160.004917]  [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  160.004921]  [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  160.013662] Registered IR keymap rc-cec
[  160.014001] rc rc0: DP-1 as /devices/pci:00/:00:02.0/rc/rc0
[  160.014305] input: DP-1 as 
/devices/pci:00/:00:02.0/rc/rc0/input33

[  160.228342] EDID has corrupt header
[  160.408917] EDID has corrupt header
```

The internal display of the laptop also switches off briefly, but comes 
back after two or three seconds. This is very likely because I 
configured the internal display to turn off once an external display is 
used.


I have no idea why the TV is able to transmit anything at all when it is 
powered off. To avoid the flickering, would it be possible to 
ignore events from displays sending such incorrect EDID?
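
For reference, every EDID base block starts with the fixed 8-byte pattern `00 ff ff ff ff ff ff 00`, which the dump above clearly does not match. A minimal userspace sketch of such a header check follows; `edid_header_ok()` is a hypothetical helper, not the kernel's implementation, and the policy of ignoring hotplug events for such displays is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* The fixed 8-byte pattern that starts every EDID base block. */
static const uint8_t edid_header[8] = {
	0x00, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00
};

/* Hypothetical helper: true if the block starts with the EDID header. */
static bool edid_header_ok(const uint8_t *block, size_t len)
{
	assert(block != NULL);
	return len >= sizeof(edid_header) &&
	       memcmp(block, edid_header, sizeof(edid_header)) == 0;
}
```

Fed the first logged row (`00 00 00 00 00 00 00 00 00 7f ff ...`), this check fails at the second byte.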



Kind regards,

Paul


mt8183 (google/krane): Resolution over USB-C adapter limited to 1280x720

2023-08-17 Thread Paul Menzel

Dear Linux folks,


On the ten-inch tablet Lenovo IdeaPad Duet Chromebook 2in1 with 
recent ChromeOS, when a Dell DA200 (strange display chip inside) 
or Dell DA300z adapter is connected, the available resolutions are limited 
to 1280x720 instead of the supported 1920x1080. The Dell monitor is 
connected over HDMI to the adapter.


[0.00] Linux version 5.10.180-22631-gc8e37fc5f0ab 
(chrome-bot@chromeos-release-builder-us-central1-b-x32-66-okmh) 
(Chromium OS 17.0_pre496208_p20230501-r6 clang version 17.0.0 
(/mnt/host/source/src/third_party/llvm-project 
98f5a340975bc00197c57e39eb4ca26e2da0e8a2), LLD 17.0.0) #1 SMP PREEMPT 
Wed Jul 26 19:01:55 PDT 2023

[…]
[0.00] Machine model: MediaTek krane sku176 board

Please find the full output from `dmesg` attached. At 144.607419 the 
Dell DA200 is connected. At 691.133117 the Dell DA300z.


I also reported the issue to the Google bug tracker [1].


Kind regards,

Paul


[1]: https://issuetracker.google.com/issues/295666708

crosh> dmesg
[0.00] Booting Linux on physical CPU 0x00 [0x410fd034]
[0.00] Linux version 5.10.180-22631-gc8e37fc5f0ab 
(chrome-bot@chromeos-release-builder-us-central1-b-x32-66-okmh) (Chromium OS 
17.0_pre496208_p20230501-r6 clang version 17.0.0 
(/mnt/host/source/src/third_party/llvm-project 
98f5a340975bc00197c57e39eb4ca26e2da0e8a2), LLD 17.0.0) #1 SMP PREEMPT Wed Jul 
26 19:01:55 PDT 2023
[0.00] random: crng init done
[0.00] Machine model: MediaTek krane sku176 board
[0.00] Malformed early option 'console'
[0.00] Reserved memory: created DMA memory pool at 0x5000, 
size 41 MiB
[0.00] OF: reserved mem: initialized node scp_mem_region, compatible id 
shared-dma-pool
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x4000-0x]
[0.00]   DMA32empty
[0.00]   Normal   [mem 0x0001-0x00013fff]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x4000-0x4fff]
[0.00]   node   0: [mem 0x5000-0x528f]
[0.00]   node   0: [mem 0x5290-0x545f]
[0.00]   node   0: [mem 0x5470-0xffdf]
[0.00]   node   0: [mem 0x0001-0x00013fff]
[0.00] Initmem setup node 0 [mem 0x4000-0x00013fff]
[0.00] On node 0 totalpages: 1047808
[0.00]   DMA zone: 12288 pages used for memmap
[0.00]   DMA zone: 0 pages reserved
[0.00]   DMA zone: 785664 pages, LIFO batch:63
[0.00]   Normal zone: 4096 pages used for memmap
[0.00]   Normal zone: 262144 pages, LIFO batch:63
[0.00] On node 0, zone DMA: 256 pages in unavailable ranges
[0.00] On node 0, zone Normal: 512 pages in unavailable ranges
[0.00] psci: probing for conduit method from DT.
[0.00] psci: PSCIv1.1 detected in firmware.
[0.00] psci: Using standard PSCI v0.2 function IDs
[0.00] psci: MIGRATE_INFO_TYPE not supported.
[0.00] psci: SMC Calling Convention v1.2
[0.00] percpu: Embedded 32 pages/cpu s93336 r8192 d29544 u131072
[0.00] pcpu-alloc: s93336 r8192 d29544 u131072 alloc=32*4096
[0.00] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 [0] 4 [0] 5 [0] 6 [0] 7 
[0.00] Detected VIPT I-cache on CPU0
[0.00] CPU features: detected: ARM erratum 845719
[0.00] CPU features: detected: GIC system register CPU interface
[0.00] CPU features: kernel page table isolation forced ON by KASLR
[0.00] CPU features: detected: Kernel page table isolation (KPTI)
[0.00] Built 1 zonelists, mobility grouping on.  Total pages: 1031424
[0.00] Kernel command line: cros_secure console= loglevel=7 
init=/sbin/init cros_secure drm.trace=0x106 root=/dev/dm-0 rootwait ro 
dm_verity.error_behavior=3 dm_verity.max_bios=-1 dm_verity.dev_wait=1 dm="1 
vroot none ro 1,0 6144000 verity 
payload=PARTUUID=6054c8b5-2c17-4e3e-9f5f-bc2a96bf764b/PARTNROFF=1 
hashtree=PARTUUID=6054c8b5-2c17-4e3e-9f5f-bc2a96bf764b/PARTNROFF=1 
hashstart=6144000 alg=sha256 
root_hexdigest=8e442d4eb6f01e3bd21b5b42ede889cd67a316fd949cf05b380e2142181e440f 
salt=23867f54c1aa0a406880edb52304b63453d25a71950b6d213fce0b11185a03fb" noinitrd 
vt.global_cursor_default=0 kern_guid=6054c8b5-2c17-4e3e-9f5f-bc2a96bf764b 
cpuidle.governor=teo  
[0.00] Dentry cache hash table entries: 524288 (order: 10, 4194304 
bytes, linear)
[0.00] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes, 
linear)
[0.00] mem auto-init: stack:all(zero), heap alloc:on, heap free:off
[0.00] software IO TLB: mapped [mem 
0xfbe0-0xffe0] (64MB)
[0.00] Memory: 3977196K/4191232K available (1K kernel code, 2488K 
rwdata, 5172K rodata, 1408K init, 1073K bss, 214036K reserved, 0K cma-reserved)

[PATCH] drm/amdgpu: Log if device is unsupported

2023-06-05 Thread Paul Menzel
Since there is overlap in supported devices, both modules load, but only
one will bind to a particular device depending on the model and user's
configuration.

amdgpu binds to all display class devices with VID 0x1002 and then
determines whether or not to bind to a device based on whether the
individual device is supported by the driver or not. Log that case, so
users looking at the logs know what is going on.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2608
Signed-off-by: Paul Menzel 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 86fbb4138285..410ff918c350 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2062,8 +2062,10 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
 
/* skip devices which are owned by radeon */
for (i = 0; i < ARRAY_SIZE(amdgpu_unsupported_pciidlist); i++) {
-   if (amdgpu_unsupported_pciidlist[i] == pdev->device)
+   if (amdgpu_unsupported_pciidlist[i] == pdev->device) {
+   DRM_INFO("This hardware is only supported by radeon.");
return -ENODEV;
+   }
}
 
if (amdgpu_aspm == -1 && !pcie_aspm_enabled(pdev))
-- 
2.40.1



Problems with delivery to dal...@libc.org (was: [PATCH v2] matroxfb: G200eW: Increase max memory from 1 MB to 16 MB)

2023-01-02 Thread Paul Menzel

[Cc: Back to dal...@libc.org]

Dear Linux folks,


Please ignore version 2.


Am 02.01.23 um 15:02 schrieb Paul Menzel:

[…]


---
Update Rich’s address.


I should have read the undelivered message more carefully:

```
: host brightrain.aerifal.cx[216.12.86.13] said: 550-Message
blocked for policy reasons: 550-Your mail system is forging its hostname.
550 Message could not be delivered (in reply to end of DATA command)
```

I got the same for dal...@aerifal.cx. I have no idea what is supposedly 
wrong with our setup:


$ host mx.molgen.mpg.de
mx.molgen.mpg.de has address 141.14.17.8
$ host 141.14.17.8
8.17.14.141.in-addr.arpa domain name pointer mx.molgen.mpg.de.
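
The two `host` calls above are a forward lookup followed by a reverse (PTR) lookup. As a small illustration of the reverse step, the sketch below builds the `in-addr.arpa` query name from a dotted-quad IPv4 address. `ipv4_reverse_name()` is a hypothetical helper written for this note; it performs no DNS query and says nothing about what the remote server actually checks.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: write the in-addr.arpa name for a dotted-quad
 * IPv4 address into out. Returns 0 on success, -1 on a parse error or
 * if the buffer is too small.
 */
static int ipv4_reverse_name(const char *ipv4, char *out, size_t len)
{
	unsigned char a[4];
	int n;

	assert(ipv4 != NULL && out != NULL);

	/* inet_pton() stores the four octets in network byte order. */
	if (inet_pton(AF_INET, ipv4, a) != 1)
		return -1;

	/* The PTR name lists the octets in reverse order. */
	n = snprintf(out, len, "%u.%u.%u.%u.in-addr.arpa",
		     (unsigned int)a[3], (unsigned int)a[2],
		     (unsigned int)a[1], (unsigned int)a[0]);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For `141.14.17.8` this yields `8.17.14.141.in-addr.arpa`, the name shown in the `host` output.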


Kind regards,

Paul


[PATCH v2] matroxfb: G200eW: Increase max memory from 1 MB to 16 MB

2023-01-02 Thread Paul Menzel
Commit 62d89a7d49af ("video: fbdev: matroxfb: set maxvram of vbG200eW to
the same as vbG200 to avoid black screen") accidentally decreased the
maximum memory size for the Matrox G200eW (102b:0532) from 8 MB to 1 MB
by missing one zero. This caused the driver initialization to fail with
the messages below, as the minimum required VRAM size is 2 MB:

 [9.436420] matroxfb: Matrox MGA-G200eW (PCI) detected
 [9.444502] matroxfb: cannot determine memory size
 [9.449316] matroxfb: probe of :0a:03.0 failed with error -1

So, add the missing 0 to make it the intended 16 MB. Successfully tested on
the Dell PowerEdge R910/0KYD3D, BIOS 2.10.0 08/29/2013, that the warning is
gone.

While at it, add a leading 0 to the maxdisplayable entry, so it’s aligned
properly. The value could probably also be increased from 8 MB to 16 MB, as
the G200 uses the same values, but I have not checked any datasheet.

Note, matroxfb is obsolete and superseded by the maintained DRM driver
mgag200, which is used by default on most systems where both drivers are
available. Therefore, on most systems it was only a cosmetic issue.

Fixes: 62d89a7d49af ("video: fbdev: matroxfb: set maxvram of vbG200eW to the same as vbG200 to avoid black screen")
Link: https://lore.kernel.org/linux-fbdev/972999d3-b75d-5680-fcef-6e6905c52...@suse.de/T/#mb6953a9995ebd18acc8552f99d6db39787aec775
Cc: it+linux-fb...@molgen.mpg.de
Cc: Z. Liu 
Cc: Rich Felker 
Cc: sta...@vger.kernel.org
Signed-off-by: Paul Menzel 
---
Update Rich’s address.

 drivers/video/fbdev/matrox/matroxfb_base.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/video/fbdev/matrox/matroxfb_base.c b/drivers/video/fbdev/matrox/matroxfb_base.c
index 0d3cee7ae7268..a043a737ea9f7 100644
--- a/drivers/video/fbdev/matrox/matroxfb_base.c
+++ b/drivers/video/fbdev/matrox/matroxfb_base.c
@@ -1378,8 +1378,8 @@ static struct video_board vbG200 = {
.lowlevel = &matrox_G100
 };
 static struct video_board vbG200eW = {
-   .maxvram = 0x100000,
-   .maxdisplayable = 0x800000,
+   .maxvram = 0x1000000,
+   .maxdisplayable = 0x0800000,
.accelID = FB_ACCEL_MATROX_MGAG200,
.lowlevel = &matrox_G100
 };
-- 
2.39.0
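
The sizes in the commit message can be checked with a little arithmetic: 1 MB is 0x100000 bytes, 8 MB is 0x800000 and 16 MB is 0x1000000, so dropping one hex zero divides the limit by 16. The sketch below is a userspace illustration only; `bytes_to_mib()`, `maxvram_is_usable()` and `MIN_VRAM` are hypothetical names, and the 2 MB minimum is taken from the commit message rather than from the driver source.

```c
#include <assert.h>
#include <stdbool.h>

#define MIB(n) ((unsigned long)(n) << 20)

/* Assumed from the commit message: at least 2 MB of VRAM is required. */
#define MIN_VRAM MIB(2)

/* Convert a byte count that is a whole number of MiB into MiB. */
static unsigned long bytes_to_mib(unsigned long bytes)
{
	assert(bytes % MIB(1) == 0);
	return bytes >> 20;
}

/* Hypothetical check: can a board with this maxvram limit satisfy the minimum? */
static bool maxvram_is_usable(unsigned long maxvram)
{
	return maxvram >= MIN_VRAM;
}
```

With these definitions, a limit of 0x100000 (1 MB) is below the 2 MB minimum, while 0x1000000 (16 MB) is not, which matches the failure and the fix described above.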



[PATCH] matroxfb: G200eW: Increase max memory from 1 MB to 16 MB

2023-01-02 Thread Paul Menzel
Commit 62d89a7d49af ("video: fbdev: matroxfb: set maxvram of vbG200eW to
the same as vbG200 to avoid black screen") accidentally decreased the
maximum memory size for the Matrox G200eW (102b:0532) from 8 MB to 1 MB
by missing one zero. This caused the driver initialization to fail with
the messages below, as the minimum required VRAM size is 2 MB:

 [9.436420] matroxfb: Matrox MGA-G200eW (PCI) detected
 [9.444502] matroxfb: cannot determine memory size
 [9.449316] matroxfb: probe of :0a:03.0 failed with error -1

So, add the missing 0 to make it the intended 16 MB. Successfully tested on
the Dell PowerEdge R910/0KYD3D, BIOS 2.10.0 08/29/2013, that the warning is
gone.

While at it, add a leading 0 to the maxdisplayable entry, so it’s aligned
properly. The value could probably also be increased from 8 MB to 16 MB, as
the G200 uses the same values, but I have not checked any datasheet.

Note, matroxfb is obsolete and superseded by the maintained DRM driver
mgag200, which is used by default on most systems where both drivers are
available. Therefore, on most systems it was only a cosmetic issue.

Fixes: 62d89a7d49af ("video: fbdev: matroxfb: set maxvram of vbG200eW to the same as vbG200 to avoid black screen")
Link: https://lore.kernel.org/linux-fbdev/972999d3-b75d-5680-fcef-6e6905c52...@suse.de/T/#mb6953a9995ebd18acc8552f99d6db39787aec775
Cc: it+linux-fb...@molgen.mpg.de
Cc: Z. Liu 
Cc: Rich Felker 
Cc: sta...@vger.kernel.org
Signed-off-by: Paul Menzel 
---
 drivers/video/fbdev/matrox/matroxfb_base.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/video/fbdev/matrox/matroxfb_base.c b/drivers/video/fbdev/matrox/matroxfb_base.c
index 0d3cee7ae7268..a043a737ea9f7 100644
--- a/drivers/video/fbdev/matrox/matroxfb_base.c
+++ b/drivers/video/fbdev/matrox/matroxfb_base.c
@@ -1378,8 +1378,8 @@ static struct video_board vbG200 = {
.lowlevel = &matrox_G100
 };
 static struct video_board vbG200eW = {
-   .maxvram = 0x100000,
-   .maxdisplayable = 0x800000,
+   .maxvram = 0x1000000,
+   .maxdisplayable = 0x0800000,
.accelID = FB_ACCEL_MATROX_MGAG200,
.lowlevel = &matrox_G100
 };
-- 
2.39.0



Re: [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems

2022-05-03 Thread Paul Menzel

Dear Daniel,


Am 03.05.22 um 14:25 schrieb Daniel Stone:

On Sun, 1 May 2022 at 08:08, Paul Menzel  wrote:

Am 26.04.22 um 15:53 schrieb Gong, Richard:

I think so. We captured dmesg log.


Then the (whole) system did *not* freeze, if you could still log in
(maybe over network) and execute `dmesg`. Please also paste the
amdgpu(?) error logs in the commit message.


As mentioned early we need support from Intel on how to get ASPM working
for VI generation on Intel Alder Lake, but we don't know where things
currently stand.


Who is working on this, and knows?


This has gone beyond the point of a reasonable request. The amount of
detail you're demanding is completely unnecessary.


If a quirk is introduced that possibly leads to higher power consumption, 
especially on systems nobody has access to yet, then asking for the detail 
of where the system hangs/freezes is not unreasonable at all.


In the Linux logs from the issue there are messages like

[   58.101385] Freezing of tasks failed after 20.003 seconds (4 
tasks refusing to freeze, wq_busy=0):


[   78.278403] Freezing of tasks failed after 20.008 seconds (4 
tasks refusing to freeze, wq_busy=0):


and it looks like several suspend/resume cycles were done.

I see a lot of commit messages across the whole Linux kernel where this 
level of detail is provided (by default).


The second question was not for the commit message, but just for 
documentation purpose when the problem is going to be fixed properly. 
And it looks like (at least publicly) analyzing the root cause is not 
happening, and once the quirk lands, nobody is going to feel the 
pressure to work on it, as everyone’s plates are full.



Kind regards,

Paul


Re: [PATCHv5] drm/amdgpu: vi: disable ASPM on Intel Alder Lake based systems

2022-05-01 Thread Paul Menzel

Dear Richard,


Am 29.04.22 um 18:06 schrieb Richard Gong:

Active State Power Management (ASPM) feature is enabled since kernel 5.14.
There are some AMD Volcanic Islands (VI) GFX cards, such as the WX3200 and
RX640, that do not work with ASPM-enabled Intel Alder Lake based systems.
Using these GFX cards as video/display output, Intel Alder Lake based
systems will freeze after suspend/resume.


As replied in v4 just now, “freeze” is misleading if you can still run 
`dmesg` after resume.



Kind regards,

Paul



The issue was originally reported on one system (Dell Precision 3660 with
BIOS version 0.14.81), but was later confirmed to affect at least 4
pre-production Alder Lake based systems.

Add an extra check to disable ASPM on Intel Alder Lake based systems with
the problematic AMD Volcanic Islands GFX cards.

Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
Reported-by: kernel test robot 
Signed-off-by: Richard Gong 
---
v5: added vi to commit header and updated commit message
 rolled back guard with the preprocessor as did in v2 to correct build
 error on non-x86 systems
v4: s/CONFIG_X86_64/CONFIG_X86
 enhanced check logic
v3: s/intel_core_aspm_chk/aspm_support_quirk_check
 correct build error with W=1 option
v2: correct commit description
 move the check from chip family to problematic platform
---
  drivers/gpu/drm/amd/amdgpu/vi.c | 17 -
  1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
index 039b90cdc3bc..45f0188c4273 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -81,6 +81,10 @@
  #include "mxgpu_vi.h"
  #include "amdgpu_dm.h"
  
+#if IS_ENABLED(CONFIG_X86)
+#include <asm/intel-family.h>
+#endif
+
  #define ixPCIE_LC_L1_PM_SUBSTATE  0x100100C6
  #define PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK  
0x0001L
  #define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK  0x0002L
@@ -1134,13 +1138,24 @@ static void vi_enable_aspm(struct amdgpu_device *adev)
WREG32_PCIE(ixPCIE_LC_CNTL, data);
  }
  
+static bool aspm_support_quirk_check(void)
+{
+#if IS_ENABLED(CONFIG_X86)
+   struct cpuinfo_x86 *c = &cpu_data(0);
+
+   return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
+#else
+   return true;
+#endif
+}
+
  static void vi_program_aspm(struct amdgpu_device *adev)
  {
u32 data, data1, orig;
bool bL1SS = false;
bool bClkReqSupport = true;
  
-	if (!amdgpu_device_should_use_aspm(adev))

+   if (!amdgpu_device_should_use_aspm(adev) || !aspm_support_quirk_check())
return;
  
  	if (adev->flags & AMD_IS_APU ||
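
The logic of the quirk in the quoted patch can be illustrated outside the kernel. The sketch below is a userspace approximation only: `struct cpu_id`, `aspm_quirk_allows()` and `should_program_aspm()` are hypothetical names, and the model number 0x97 for Alder Lake is an assumption standing in for the kernel's `INTEL_FAM6_ALDERLAKE`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: family 6, model 0x97 identifies the affected Alder Lake parts. */
#define SKETCH_FAM6_ALDERLAKE 0x97

/* Hypothetical stand-in for the fields read from struct cpuinfo_x86. */
struct cpu_id {
	unsigned int family;
	unsigned int model;
};

/* true: ASPM may be programmed; false: the Alder Lake quirk applies. */
static bool aspm_quirk_allows(const struct cpu_id *c)
{
	assert(c != NULL);
	return !(c->family == 6 && c->model == SKETCH_FAM6_ALDERLAKE);
}

/*
 * Mirrors the early return in the patched vi_program_aspm(): ASPM is
 * only programmed if the device wants it and the quirk does not apply.
 */
static bool should_program_aspm(bool device_wants_aspm, const struct cpu_id *c)
{
	return device_wants_aspm && aspm_quirk_allows(c);
}
```

Note that in this form the quirk takes precedence over the device-level decision, so an explicit user request for ASPM would not override it.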


Re: [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems

2022-05-01 Thread Paul Menzel

Dear Richard,


Sorry for the late reply.

Am 26.04.22 um 15:53 schrieb Gong, Richard:


On 4/21/2022 12:35 AM, Paul Menzel wrote:



Am 21.04.22 um 03:12 schrieb Gong, Richard:


On 4/20/2022 3:29 PM, Paul Menzel wrote:



Am 19.04.22 um 23:46 schrieb Gong, Richard:


On 4/14/2022 2:52 AM, Paul Menzel wrote:

[Cc: -kernel test robot ]


[…]


Am 13.04.22 um 15:00 schrieb Alex Deucher:

On Wed, Apr 13, 2022 at 3:43 AM Paul Menzel wrote:



Thank you for sending out v4.

Am 12.04.22 um 23:50 schrieb Richard Gong:


[…]

I am still not clear, what “hang during suspend/resume” means. I 
guess
suspending works fine? During resume (S3 or S0ix?), where does 
it hang?

The system is functional, but there are only display problems?

System freeze after suspend/resume.


But you see certain messages still? At what point does it freeze 
exactly? In the bug report you posted Linux messages.


No, the system freeze then users have to recycle power to recover.


Then I misread the issue? Did you capture the messages over serial log 
then?


I think so. We captured dmesg log.


Then the (whole) system did *not* freeze, if you could still log in 
(maybe over network) and execute `dmesg`. Please also paste the 
amdgpu(?) error logs in the commit message.


As mentioned early we need support from Intel on how to get ASPM working 
for VI generation on Intel Alder Lake, but we don't know where things 
currently stand.


Who is working on this, and knows?


Kind regards,

Paul


Re: [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems

2022-04-20 Thread Paul Menzel

Dear Richard,


Am 21.04.22 um 03:12 schrieb Gong, Richard:


On 4/20/2022 3:29 PM, Paul Menzel wrote:



Am 19.04.22 um 23:46 schrieb Gong, Richard:


On 4/14/2022 2:52 AM, Paul Menzel wrote:

[Cc: -kernel test robot ]


[…]


Am 13.04.22 um 15:00 schrieb Alex Deucher:

On Wed, Apr 13, 2022 at 3:43 AM Paul Menzel wrote:



Thank you for sending out v4.

Am 12.04.22 um 23:50 schrieb Richard Gong:
Active State Power Management (ASPM) feature is enabled since 
kernel 5.14.
There are some AMD GFX cards (such as WX3200 and RX640) that 
won't work
with ASPM-enabled Intel Alder Lake based systems. Using these GFX 
cards as
video/display output, Intel Alder Lake based systems will hang 
during

suspend/resume.


[Your email program wraps lines in cited text for some reason, making 
the citation harder to read.]


Not sure why, I am using Mozilla Thunderbird for email. I am not using MS 
Outlook for upstream email.


Strange. No idea if there were bugs in Mozilla Thunderbird 91.2.0, 
released over half a year ago. The current version is 91.8.1. [1]


I am still not clear, what “hang during suspend/resume” means. I 
guess
suspending works fine? During resume (S3 or S0ix?), where does it 
hang?

The system is functional, but there are only display problems?

System freeze after suspend/resume.


But you see certain messages still? At what point does it freeze 
exactly? In the bug report you posted Linux messages.


No, the system freeze then users have to recycle power to recover.


Then I misread the issue? Did you capture the messages over serial log then?

The issue was initially reported on one system (Dell Precision 
3660 with
BIOS version 0.14.81), but was later confirmed to affect at least 
4 Alder

Lake based systems.

Add extra check to disable ASPM on Intel Alder Lake based systems.

Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
Link: 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1885data=05%7C01%7Crichard.gong%40amd.com%7Cce01de048c61456174ff08da230c750d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637860833680922036%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=vqhh3dTc%2FgBt7GrP9hKppWlrFy2F7DaivkNEuGekl0g%3Dreserved=0 



Thank you Microsoft Outlook for keeping us safe. :(

I am not using MS Outlook for the email exchanges.


I guess it is not the client but the Microsoft email service (Exchange?) 
that adds these protection links. (That makes it even harder for users 
to actually verify the domain. I have no idea who comes up with these 
ideas, or why customers actually accept them.)




Reported-by: kernel test robot 


This tag is a little confusing. Maybe clarify that it was for an 
issue

in a previous patch iteration?


I did describe in change-list version 3 below, which corrected the 
build error with W=1 option.


It is not good idea to add the description for that to the commit 
message, this is why I add descriptions on change-list version 3.


Do as you wish, but the current style is confusing, and readers of the 
commit are going to think, the kernel test robot reported the problem 
with AMD VI ASICs and Intel Alder Lake systems.





Signed-off-by: Richard Gong 
---
v4: s/CONFIG_X86_64/CONFIG_X86
  enhanced check logic
v3: s/intel_core_asom_chk/aspm_support_quirk_check
  correct build error with W=1 option
v2: correct commit description
  move the check from chip family to problematic platform
---
   drivers/gpu/drm/amd/amdgpu/vi.c | 17 -
   1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c 
b/drivers/gpu/drm/amd/amdgpu/vi.c

index 039b90cdc3bc..b33e0a9bee65 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -81,6 +81,10 @@
   #include "mxgpu_vi.h"
   #include "amdgpu_dm.h"

+#if IS_ENABLED(CONFIG_X86)
+#include <asm/intel-family.h>
+#endif
+
   #define ixPCIE_LC_L1_PM_SUBSTATE    0x100100C6
   #define 
PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK 0x0001L
   #define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK 
0x0002L
@@ -1134,13 +1138,24 @@ static void vi_enable_aspm(struct 
amdgpu_device *adev)

   WREG32_PCIE(ixPCIE_LC_CNTL, data);
   }

+static bool aspm_support_quirk_check(void)
+{
+ if (IS_ENABLED(CONFIG_X86)) {
+ struct cpuinfo_x86 *c = &cpu_data(0);
+
+ return !(c->x86 == 6 && c->x86_model == 
INTEL_FAM6_ALDERLAKE);

+ }
+
+ return true;
+}
+
   static void vi_program_aspm(struct amdgpu_device *adev)
   {
   u32 data, data1, orig;
   bool bL1SS = false;
   bool bClkReqSupport = true;

- if (!amdgpu_device_should_use_aspm(adev))
+ if (!amdgpu_device_should_use_aspm(adev) || 
!aspm_support_quirk_check())

   return;


Can users still forcefully enable ASPM with the parameter 
`amdgpu.aspm`?


As Mario mentioned in 

Re: [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems

2022-04-20 Thread Paul Menzel

Dear Richard,


Am 20.04.22 um 22:56 schrieb Gong, Richard:


On 4/20/2022 3:48 PM, Paul Menzel wrote:



Am 20.04.22 um 22:40 schrieb Alex Deucher:
On Wed, Apr 20, 2022 at 4:29 PM Paul Menzel  
wrote:



Am 19.04.22 um 23:46 schrieb Gong, Richard:


On 4/14/2022 2:52 AM, Paul Menzel wrote:

[Cc: -kernel test robot ]


[…]


Am 13.04.22 um 15:00 schrieb Alex Deucher:

On Wed, Apr 13, 2022 at 3:43 AM Paul Menzel wrote:



Thank you for sending out v4.

Am 12.04.22 um 23:50 schrieb Richard Gong:

Active State Power Management (ASPM) feature is enabled since
kernel 5.14.
There are some AMD GFX cards (such as WX3200 and RX640) that won't
work
with ASPM-enabled Intel Alder Lake based systems. Using these GFX
cards as
video/display output, Intel Alder Lake based systems will hang 
during

suspend/resume.


[Your email program wraps lines in cited text for some reason, making
the citation harder to read.]



I am still not clear, what “hang during suspend/resume” means. I 
guess
suspending works fine? During resume (S3 or S0ix?), where does 
it hang?

The system is functional, but there are only display problems?

System freeze after suspend/resume.


But you see certain messages still? At what point does it freeze
exactly? In the bug report you posted Linux messages.

The issue was initially reported on one system (Dell Precision 
3660

with
BIOS version 0.14.81), but was later confirmed to affect at 
least 4

Alder
Lake based systems.

Add extra check to disable ASPM on Intel Alder Lake based systems.

Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
Link:
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1885data=05%7C01%7Crichard.gong%40amd.com%7C487aaa63098b462e146a08da230f2319%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637860845178176835%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=3IVldn05qNa2XVp1Lu58SriS8k9mk4U9K9p3F3IYPe0%3Dreserved=0 



Thank you Microsoft Outlook for keeping us safe. :(



Reported-by: kernel test robot 


This tag is a little confusing. Maybe clarify that it was for an 
issue

in a previous patch iteration?


I did describe in change-list version 3 below, which corrected the 
build

error with W=1 option.

It is not good idea to add the description for that to the commit
message, this is why I add descriptions on change-list version 3.


Do as you wish, but the current style is confusing, and readers of the
commit are going to think, the kernel test robot reported the problem
with AMD VI ASICs and Intel Alder Lake systems.




Signed-off-by: Richard Gong 
---
v4: s/CONFIG_X86_64/CONFIG_X86
   enhanced check logic
v3: s/intel_core_asom_chk/aspm_support_quirk_check
   correct build error with W=1 option
v2: correct commit description
   move the check from chip family to problematic platform
---
    drivers/gpu/drm/amd/amdgpu/vi.c | 17 -
    1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c
b/drivers/gpu/drm/amd/amdgpu/vi.c
index 039b90cdc3bc..b33e0a9bee65 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -81,6 +81,10 @@
    #include "mxgpu_vi.h"
    #include "amdgpu_dm.h"

+#if IS_ENABLED(CONFIG_X86)
+#include <asm/intel-family.h>
+#endif
+
    #define ixPCIE_LC_L1_PM_SUBSTATE    0x100100C6
    #define 
PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK

0x0001L
    #define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK
0x0002L
@@ -1134,13 +1138,24 @@ static void vi_enable_aspm(struct
amdgpu_device *adev)
    WREG32_PCIE(ixPCIE_LC_CNTL, data);
    }

+static bool aspm_support_quirk_check(void)
+{
+ if (IS_ENABLED(CONFIG_X86)) {
+ struct cpuinfo_x86 *c = &cpu_data(0);
+
+ return !(c->x86 == 6 && c->x86_model ==
INTEL_FAM6_ALDERLAKE);
+ }
+
+ return true;
+}
+
    static void vi_program_aspm(struct amdgpu_device *adev)
    {
    u32 data, data1, orig;
    bool bL1SS = false;
    bool bClkReqSupport = true;

- if (!amdgpu_device_should_use_aspm(adev))
+ if (!amdgpu_device_should_use_aspm(adev) ||
!aspm_support_quirk_check())
    return;


Can users still forcefully enable ASPM with the parameter
`amdgpu.aspm`?

As Mario mentioned in a separate reply, we can't forcefully enable 
ASPM

with the parameter 'amdgpu.aspm'.


That would be a regression on systems where ASPM used to work. Hmm. I
guess, you could say, there are no such systems.



    if (adev->flags & AMD_IS_APU ||


If I remember correctly, there were also newer cards, where ASPM 
worked
with Intel Alder Lake, right? Can only the problematic 
generations for

WX3200 and RX640 be excluded from ASPM?


This patch only disables it for the generation that was 
problematic.


Could that please be made clear in the commit message summary, and
m

Re: [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems

2022-04-20 Thread Paul Menzel

Dear Alex,


Am 20.04.22 um 22:40 schrieb Alex Deucher:

On Wed, Apr 20, 2022 at 4:29 PM Paul Menzel  wrote:



Am 19.04.22 um 23:46 schrieb Gong, Richard:


On 4/14/2022 2:52 AM, Paul Menzel wrote:

[Cc: -kernel test robot ]


[…]


Am 13.04.22 um 15:00 schrieb Alex Deucher:

On Wed, Apr 13, 2022 at 3:43 AM Paul Menzel wrote:



Thank you for sending out v4.

Am 12.04.22 um 23:50 schrieb Richard Gong:

Active State Power Management (ASPM) feature is enabled since
kernel 5.14.
There are some AMD GFX cards (such as WX3200 and RX640) that won't
work
with ASPM-enabled Intel Alder Lake based systems. Using these GFX
cards as
video/display output, Intel Alder Lake based systems will hang during
suspend/resume.


[Your email program wraps lines in cited text for some reason, making
the citation harder to read.]



I am still not clear, what “hang during suspend/resume” means. I guess
suspending works fine? During resume (S3 or S0ix?), where does it hang?
The system is functional, but there are only display problems?

System freeze after suspend/resume.


But you see certain messages still? At what point does it freeze
exactly? In the bug report you posted Linux messages.


The issue was initially reported on one system (Dell Precision 3660
with
BIOS version 0.14.81), but was later confirmed to affect at least 4
Alder
Lake based systems.

Add extra check to disable ASPM on Intel Alder Lake based systems.

Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
Link:
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1885data=04%7C01%7Crichard.gong%40amd.com%7Ce7febed5d6a441c3a58008da1debb99c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637855195670542145%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=7cEnE%2BSM9e5IGFxSLloCLtCOxovBpaPz0Ns0Ta2vVlc%3Dreserved=0


Thank you Microsoft Outlook for keeping us safe. :(



Reported-by: kernel test robot 


This tag is a little confusing. Maybe clarify that it was for an issue
in a previous patch iteration?


I did describe in change-list version 3 below, which corrected the build
error with W=1 option.

It is not a good idea to add the description for that to the commit
message, which is why I added the descriptions to change-list version 3.


Do as you wish, but the current style is confusing, and readers of the
commit are going to think, the kernel test robot reported the problem
with AMD VI ASICs and Intel Alder Lake systems.




Signed-off-by: Richard Gong 
---
v4: s/CONFIG_X86_64/CONFIG_X86
   enhanced check logic
v3: s/intel_core_asom_chk/aspm_support_quirk_check
   correct build error with W=1 option
v2: correct commit description
   move the check from chip family to problematic platform
---
drivers/gpu/drm/amd/amdgpu/vi.c | 17 -
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c
b/drivers/gpu/drm/amd/amdgpu/vi.c
index 039b90cdc3bc..b33e0a9bee65 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -81,6 +81,10 @@
#include "mxgpu_vi.h"
#include "amdgpu_dm.h"

+#if IS_ENABLED(CONFIG_X86)
+#include <asm/intel-family.h>
+#endif
+
#define ixPCIE_LC_L1_PM_SUBSTATE    0x100100C6
#define PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK
0x0001L
#define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK
0x0002L
@@ -1134,13 +1138,24 @@ static void vi_enable_aspm(struct
amdgpu_device *adev)
WREG32_PCIE(ixPCIE_LC_CNTL, data);
}

+static bool aspm_support_quirk_check(void)
+{
+ if (IS_ENABLED(CONFIG_X86)) {
+ struct cpuinfo_x86 *c = &cpu_data(0);
+
+ return !(c->x86 == 6 && c->x86_model ==
INTEL_FAM6_ALDERLAKE);
+ }
+
+ return true;
+}
+
static void vi_program_aspm(struct amdgpu_device *adev)
{
u32 data, data1, orig;
bool bL1SS = false;
bool bClkReqSupport = true;

- if (!amdgpu_device_should_use_aspm(adev))
+ if (!amdgpu_device_should_use_aspm(adev) ||
!aspm_support_quirk_check())
return;


Can users still forcefully enable ASPM with the parameter
`amdgpu.aspm`?


As Mario mentioned in a separate reply, we can't forcefully enable ASPM
with the parameter 'amdgpu.aspm'.


That would be a regression on systems where ASPM used to work. Hmm. I
guess, you could say, there are no such systems.



if (adev->flags & AMD_IS_APU ||


If I remember correctly, there were also newer cards, where ASPM worked
with Intel Alder Lake, right? Can only the problematic generations for
WX3200 and RX640 be excluded from ASPM?


This patch only disables it for the generation that was problematic.


Could that please be made clear in the commit message summary, and
message?


Are you ok with the commit messages below?


Please change the commit message summary. Maybe:

drm/amdgpu: VI: Disa

Re: [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems

2022-04-20 Thread Paul Menzel

Dear Richard,


Am 19.04.22 um 23:46 schrieb Gong, Richard:


On 4/14/2022 2:52 AM, Paul Menzel wrote:

[Cc: -kernel test robot ]


[…]


Am 13.04.22 um 15:00 schrieb Alex Deucher:

On Wed, Apr 13, 2022 at 3:43 AM Paul Menzel wrote:



Thank you for sending out v4.

Am 12.04.22 um 23:50 schrieb Richard Gong:
Active State Power Management (ASPM) feature is enabled since 
kernel 5.14.
There are some AMD GFX cards (such as WX3200 and RX640) that won't 
work
with ASPM-enabled Intel Alder Lake based systems. Using these GFX 
cards as

video/display output, Intel Alder Lake based systems will hang during
suspend/resume.


[Your email program wraps lines in cited text for some reason, making 
the citation harder to read.]




I am still not clear, what “hang during suspend/resume” means. I guess
suspending works fine? During resume (S3 or S0ix?), where does it hang?
The system is functional, but there are only display problems?

System freeze after suspend/resume.


But you see certain messages still? At what point does it freeze 
exactly? In the bug report you posted Linux messages.


The issue was initially reported on one system (Dell Precision 3660 
with
BIOS version 0.14.81), but was later confirmed to affect at least 4 
Alder

Lake based systems.

Add extra check to disable ASPM on Intel Alder Lake based systems.

Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
Link: 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1885data=04%7C01%7Crichard.gong%40amd.com%7Ce7febed5d6a441c3a58008da1debb99c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637855195670542145%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=7cEnE%2BSM9e5IGFxSLloCLtCOxovBpaPz0Ns0Ta2vVlc%3Dreserved=0


Thank you Microsoft Outlook for keeping us safe. :(



Reported-by: kernel test robot 


This tag is a little confusing. Maybe clarify that it was for an issue
in a previous patch iteration?


I did describe in change-list version 3 below, which corrected the build 
error with W=1 option.


It is not a good idea to add the description for that to the commit 
message, which is why I added the descriptions to change-list version 3.


Do as you wish, but the current style is confusing, and readers of the 
commit are going to think, the kernel test robot reported the problem 
with AMD VI ASICs and Intel Alder Lake systems.





Signed-off-by: Richard Gong 
---
v4: s/CONFIG_X86_64/CONFIG_X86
  enhanced check logic
v3: s/intel_core_asom_chk/aspm_support_quirk_check
  correct build error with W=1 option
v2: correct commit description
  move the check from chip family to problematic platform
---
   drivers/gpu/drm/amd/amdgpu/vi.c | 17 -
   1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c 
b/drivers/gpu/drm/amd/amdgpu/vi.c

index 039b90cdc3bc..b33e0a9bee65 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -81,6 +81,10 @@
   #include "mxgpu_vi.h"
   #include "amdgpu_dm.h"

+#if IS_ENABLED(CONFIG_X86)
+#include <asm/intel-family.h>
+#endif
+
   #define ixPCIE_LC_L1_PM_SUBSTATE    0x100100C6
   #define PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK 
0x0001L
   #define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK 
0x0002L
@@ -1134,13 +1138,24 @@ static void vi_enable_aspm(struct 
amdgpu_device *adev)

   WREG32_PCIE(ixPCIE_LC_CNTL, data);
   }

+static bool aspm_support_quirk_check(void)
+{
+ if (IS_ENABLED(CONFIG_X86)) {
+ struct cpuinfo_x86 *c = &cpu_data(0);
+
+ return !(c->x86 == 6 && c->x86_model == 
INTEL_FAM6_ALDERLAKE);

+ }
+
+ return true;
+}
+
   static void vi_program_aspm(struct amdgpu_device *adev)
   {
   u32 data, data1, orig;
   bool bL1SS = false;
   bool bClkReqSupport = true;

- if (!amdgpu_device_should_use_aspm(adev))
+ if (!amdgpu_device_should_use_aspm(adev) || 
!aspm_support_quirk_check())

   return;


Can users still forcefully enable ASPM with the parameter 
`amdgpu.aspm`?


As Mario mentioned in a separate reply, we can't forcefully enable ASPM 
with the parameter 'amdgpu.aspm'.


That would be a regression on systems where ASPM used to work. Hmm. I 
guess, you could say, there are no such systems.
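The effect of the combined condition in the quoted hunk can be sketched in plain userspace C. This is only a model under stated assumptions: `struct demo_cpu`, `DEMO_MODEL_ALDERLAKE`, and the boolean `should_use_aspm` argument are hypothetical stand-ins for the kernel's `struct cpuinfo_x86`, `INTEL_FAM6_ALDERLAKE`, and the return value of `amdgpu_device_should_use_aspm()`. How the `amdgpu.aspm` parameter feeds into that return value is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct cpuinfo_x86. */
struct demo_cpu {
	unsigned int family;
	unsigned int model;
};

/* Placeholder model number for illustration; the kernel uses its own header value. */
#define DEMO_MODEL_ALDERLAKE 0x97

/* Mirrors the quoted helper: true means "no quirk, ASPM may be programmed". */
bool demo_aspm_support_quirk_check(const struct demo_cpu *c)
{
	return !(c->family == 6 && c->model == DEMO_MODEL_ALDERLAKE);
}

/*
 * Mirrors the early return in the quoted vi_program_aspm() hunk:
 * ASPM is programmed only if both checks pass.
 */
bool demo_will_program_aspm(bool should_use_aspm, const struct demo_cpu *c)
{
	if (!should_use_aspm || !demo_aspm_support_quirk_check(c))
		return false;
	return true;
}

/* Small self-check documenting the behaviour discussed in the thread. */
void demo_aspm_self_check(void)
{
	struct demo_cpu adl = { 6, DEMO_MODEL_ALDERLAKE };

	/* Even if the first check says "use ASPM", the quirk wins on Alder Lake. */
	assert(!demo_will_program_aspm(true, &adl));
}
```

In this model the quirk check is evaluated independently of the first operand, which matches the statement in the thread that ASPM cannot be forced on for the affected combination.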




   if (adev->flags & AMD_IS_APU ||


If I remember correctly, there were also newer cards, where ASPM worked
with Intel Alder Lake, right? Can only the problematic generations for
WX3200 and RX640 be excluded from ASPM?


This patch only disables it for the generation that was problematic.


Could that please be made clear in the commit message summary, and 
message?


Are you ok with the commit messages below?


Please change the commit message summary. Maybe:

drm/amdgpu: VI: Disable ASPM on Intel Alder Lake based systems


Active State Power Management (ASPM) featur

Re: [PATCH 1/2] Documentation/gpu: Add entries to amdgpu glossary

2022-04-17 Thread Paul Menzel

Dear Tales,


Thank you for your patch.

Am 15.04.22 um 21:50 schrieb Tales Lelo da Aparecida:

Add missing acronyms to the amdgpu glossary.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1939#note_1309737
Signed-off-by: Tales Lelo da Aparecida 
---
  Documentation/gpu/amdgpu/amdgpu-glossary.rst | 13 +
  1 file changed, 13 insertions(+)

diff --git a/Documentation/gpu/amdgpu/amdgpu-glossary.rst 
b/Documentation/gpu/amdgpu/amdgpu-glossary.rst
index 859dcec6c6f9..48829d097f40 100644
--- a/Documentation/gpu/amdgpu/amdgpu-glossary.rst
+++ b/Documentation/gpu/amdgpu/amdgpu-glossary.rst
@@ -8,12 +8,19 @@ we have a dedicated glossary for Display Core at
  
  .. glossary::
  
+active_cu_number

+  The number of CUs that are active on the system.  The number of active
+  CUs may be less than SE * SH * CU depending on the board configuration.
+
  CP
Command Processor
  
  CPLIB

Content Protection Library
  
+CU

+  Compute unit


Capitalize the U in *unit* as seems to be done in the rest of the files?


+
  DFS
Digital Frequency Synthesizer
  
@@ -74,6 +81,12 @@ we have a dedicated glossary for Display Core at

  SDMA
System DMA
  
+SE

+  Shader Engine
+
+SH
+  SHader array


No idea if the H should be capitalized.


+
  SMU
System Management Unit
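The relation stated in the proposed active_cu_number entry can be illustrated with a small userspace C sketch. Everything here is an assumption made for illustration: the per-shader-array bitmap representation, the function names, and the example counts are hypothetical and are not taken from the amdgpu driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Upper bound from the glossary text: SE * SH * CU. */
unsigned int demo_max_cu_count(unsigned int se, unsigned int sh_per_se,
			       unsigned int cu_per_sh)
{
	return se * sh_per_se * cu_per_sh;
}

/* Count the set bits of one bitmap (one bit per enabled CU, by assumption). */
unsigned int demo_popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Sum the enabled CUs over all shader arrays. */
unsigned int demo_active_cu_count(const uint32_t *bitmaps, size_t count)
{
	unsigned int total = 0;
	size_t i;

	assert(bitmaps != NULL || count == 0);
	for (i = 0; i < count; i++)
		total += demo_popcount32(bitmaps[i]);
	return total;
}
```

With two shader engines, one shader array each, and eight CUs per array, the upper bound is 16; if two CUs are disabled on one array, the active count is 14, which is the "may be less than" case the entry describes.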
  



Kind regards,

Paul


Re: [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems

2022-04-14 Thread Paul Menzel

[Cc: -kernel test robot ]

Dear Alex, dear Richard,


Am 13.04.22 um 15:00 schrieb Alex Deucher:

On Wed, Apr 13, 2022 at 3:43 AM Paul Menzel wrote:



Thank you for sending out v4.

Am 12.04.22 um 23:50 schrieb Richard Gong:

Active State Power Management (ASPM) feature is enabled since kernel 5.14.
There are some AMD GFX cards (such as WX3200 and RX640) that won't work
with ASPM-enabled Intel Alder Lake based systems. Using these GFX cards as
video/display output, Intel Alder Lake based systems will hang during
suspend/resume.


I am still not clear, what “hang during suspend/resume” means. I guess
suspending works fine? During resume (S3 or S0ix?), where does it hang?
The system is functional, but there are only display problems?


The issue was initially reported on one system (Dell Precision 3660 with
BIOS version 0.14.81), but was later confirmed to affect at least 4 Alder
Lake based systems.

Add extra check to disable ASPM on Intel Alder Lake based systems.

Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
Reported-by: kernel test robot 


This tag is a little confusing. Maybe clarify that it was for an issue
in a previous patch iteration?


Signed-off-by: Richard Gong 
---
v4: s/CONFIG_X86_64/CONFIG_X86
  enhanced check logic
v3: s/intel_core_asom_chk/aspm_support_quirk_check
  correct build error with W=1 option
v2: correct commit description
  move the check from chip family to problematic platform
---
   drivers/gpu/drm/amd/amdgpu/vi.c | 17 -
   1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
index 039b90cdc3bc..b33e0a9bee65 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -81,6 +81,10 @@
   #include "mxgpu_vi.h"
   #include "amdgpu_dm.h"

+#if IS_ENABLED(CONFIG_X86)
+#include <asm/intel-family.h>
+#endif
+
   #define ixPCIE_LC_L1_PM_SUBSTATE    0x100100C6
   #define PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK
0x0001L
   #define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK
0x0002L
@@ -1134,13 +1138,24 @@ static void vi_enable_aspm(struct amdgpu_device *adev)
   WREG32_PCIE(ixPCIE_LC_CNTL, data);
   }

+static bool aspm_support_quirk_check(void)
+{
+ if (IS_ENABLED(CONFIG_X86)) {
+ struct cpuinfo_x86 *c = &cpu_data(0);
+
+ return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
+ }
+
+ return true;
+}
+
   static void vi_program_aspm(struct amdgpu_device *adev)
   {
   u32 data, data1, orig;
   bool bL1SS = false;
   bool bClkReqSupport = true;

- if (!amdgpu_device_should_use_aspm(adev))
+ if (!amdgpu_device_should_use_aspm(adev) || !aspm_support_quirk_check())
   return;


Can users still forcefully enable ASPM with the parameter `amdgpu.aspm`?



   if (adev->flags & AMD_IS_APU ||


If I remember correctly, there were also newer cards, where ASPM worked
with Intel Alder Lake, right? Can only the problematic generations for
WX3200 and RX640 be excluded from ASPM?


This patch only disables it for the generation that was problematic.


Could that please be made clear in the commit message summary, and message?

Loosely related, is there a public (or internal issue) to analyze how to 
get ASPM working for VI generation devices with Intel Alder Lake?



Kind regards,

Paul


Re: [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems

2022-04-13 Thread Paul Menzel

Dear Richard,


Thank you for sending out v4.

Am 12.04.22 um 23:50 schrieb Richard Gong:

Active State Power Management (ASPM) feature is enabled since kernel 5.14.
There are some AMD GFX cards (such as WX3200 and RX640) that won't work
with ASPM-enabled Intel Alder Lake based systems. Using these GFX cards as
video/display output, Intel Alder Lake based systems will hang during
suspend/resume.


I am still not clear, what “hang during suspend/resume” means. I guess 
suspending works fine? During resume (S3 or S0ix?), where does it hang? 
The system is functional, but there are only display problems?



The issue was initially reported on one system (Dell Precision 3660 with
BIOS version 0.14.81), but was later confirmed to affect at least 4 Alder
Lake based systems.

Add extra check to disable ASPM on Intel Alder Lake based systems.

Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
Reported-by: kernel test robot 


This tag is a little confusing. Maybe clarify that it was for an issue 
in a previous patch iteration?



Signed-off-by: Richard Gong 
---
v4: s/CONFIG_X86_64/CONFIG_X86
 enhanced check logic
v3: s/intel_core_asom_chk/aspm_support_quirk_check
 correct build error with W=1 option
v2: correct commit description
 move the check from chip family to problematic platform
---
  drivers/gpu/drm/amd/amdgpu/vi.c | 17 -
  1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
index 039b90cdc3bc..b33e0a9bee65 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -81,6 +81,10 @@
  #include "mxgpu_vi.h"
  #include "amdgpu_dm.h"
  
+#if IS_ENABLED(CONFIG_X86)
+#include <asm/intel-family.h>
+#endif
+
  #define ixPCIE_LC_L1_PM_SUBSTATE  0x100100C6
  #define PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK  
0x0001L
  #define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK  0x0002L
@@ -1134,13 +1138,24 @@ static void vi_enable_aspm(struct amdgpu_device *adev)
WREG32_PCIE(ixPCIE_LC_CNTL, data);
  }
  
+static bool aspm_support_quirk_check(void)
+{
+   if (IS_ENABLED(CONFIG_X86)) {
+   struct cpuinfo_x86 *c = &cpu_data(0);
+
+   return !(c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
+   }
+
+   return true;
+}
+
  static void vi_program_aspm(struct amdgpu_device *adev)
  {
u32 data, data1, orig;
bool bL1SS = false;
bool bClkReqSupport = true;
  
-	if (!amdgpu_device_should_use_aspm(adev))
+   if (!amdgpu_device_should_use_aspm(adev) || !aspm_support_quirk_check())
return;


Can users still forcefully enable ASPM with the parameter `amdgpu.aspm`?

  
  	if (adev->flags & AMD_IS_APU ||


If I remember correctly, there were also newer cards, where ASPM worked 
with Intel Alder Lake, right? Can only the problematic generations for 
WX3200 and RX640 be excluded from ASPM?



Kind regards,

Paul


AMD Display Core (DC) patches (was: [PATCH 13/16] drm/amd/display: Revert FEC check in validation)

2022-04-12 Thread Paul Menzel
[Cc: +dri-devel@lists.freedesktop.org, +Daniel Vetter, +Alexander 
Deucher, +Greg KH]



Dear Alex,


I am a little confused and upset about how Display Core patches are 
handled in the Linux kernel.



Am 25.03.22 um 23:53 schrieb Alex Hung:

From: Martin Leung 


git puts a line “This reverts commit …” into the commit message when 
something is reverted. Why isn’t this here? Right now, commit 
7d56a154e22f, reverted here, is proposed for the stable series. I guess 
that is because these indicators and metadata are missing.



why and how:
causes failure on install on certain machines


Why are commit messages of this kind accepted? What does “failure on 
install” even mean? Why can’t the machine configuration be documented so 
it can be reproduced when necessary?


No less confusing, the date you posted it on amd-gfx is from March 25th, 
2022, but the author date of the commit in agd5f/amd-staging-drm-next is 
`Fri Mar 18 11:12:36 2022 -0400`. Why is the patch missing the Date 
field then?



Reviewed-by: George Shen 
Acked-by: Alex Hung 
Signed-off-by: Martin Leung 


Shouldn’t the Signed-off-by line by the author go first?

You committed this on `Mon Mar 28 08:26:48 2022 -0600`, while you posted 
the patch on amd-gfx on Friday. How should *proper* review happen over 
the weekend?



---
  drivers/gpu/drm/amd/display/dc/core/dc.c | 4 
  1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index f2ad8f58e69c..c436db416708 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1496,10 +1496,6 @@ bool dc_validate_boot_timing(const struct dc *dc,
if (!link->link_enc->funcs->is_dig_enabled(link->link_enc))
return false;
  
-	/* Check for FEC status*/
-   if (link->link_enc->funcs->fec_is_active(link->link_enc))
-   return false;
-
enc_inst = link->link_enc->funcs->get_dig_frontend(link->link_enc);
  
  	if (enc_inst == ENGINE_ID_UNKNOWN)


The patch reverted here also lacked proper review, had a commit message 
that left much to be desired, and did not follow the Linux kernel coding 
style (missing space before the comment terminator), so it should not 
have been committed in the first place.


Seeing how many people are in the Cc list, I would have hoped that 
somebody noticed and commented. The current state also makes it really 
hard for non-AMD employees to get the necessary information to do proper 
reviews, as the needed documentation and information is non-public. So 
good/excellent commit messages are a must. I seem to remember that you 
replied to me once that Display Core patches are also shared with the 
Microsoft Windows driver, restricting the workflow options. But I think 
the issues I mentioned are unrelated. I know graphics hardware is very 
complex, but if the quality of the commits and review were improved, 
it would hopefully save time for everyone in the end, as fewer bugs 
would be introduced.


Could the AMD team please address these issues as soon as possible?


Kind regards,

Paul


Re: [PATCHv2] drm/amdgpu: disable ASPM on Intel AlderLake based systems

2022-04-11 Thread Paul Menzel

Dear Richard,


Am 11.04.22 um 13:38 schrieb Gong, Richard:


On 4/11/2022 2:41 AM, Paul Menzel wrote:

[Cc: +]



Am 11.04.22 um 02:27 schrieb Gong, Richard:


On 4/8/2022 7:19 PM, Paul Menzel wrote:



Am 08.04.22 um 21:05 schrieb Richard Gong:
Active State Power Management (ASPM) feature is enabled since 
kernel 5.14.

There are some AMD GFX cards (such as WX3200 and RX640) that cannot be
used with Intel AlderLake based systems to enable ASPM. Using these 
GFX


Alder Lake

will correct in the next version.



cards as video/display output, Intel Alder Lake based systems will hang
during suspend/resume.


Please reflow for 75 characters per line.

Also please mention the exact system you had problems with (also 
firmware versions).




Add extra check to disable ASPM on Intel AlderLake based systems.


Is that a problem with Intel Alder Lake or the Dell system? 
Shouldn’t ASPM just be disabled for the problematic cards for the 
Dell system. You write newer cards worked fine.


There is a problem with Dell system (Dell Precision DT workstation), 
which is based on Intel Alder Lake.


ASPM works just fine on these GPU's. It's more of an issue with 
whether the underlying platform supports ASPM or not.


At least you didn’t document what the real issue is,


You can refer to bug tag from the comment messages.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885


No, the commit message should be self-contained, and reviewers and 
readers of the commit message should not be required to read the 
comments of bug reports. Please add the necessary information to the 
commit message.



Kind regards,

Paul


that ASPM does not work. With the current information (some graphics 
cards work with the Dell system and others don’t), it could be the GPU, 
the Dell system (firmware, …), a problem with the Alder Lake SoC, or 
another bug. I hope you are in contact with Dell to analyze it, so 
ASPM can be enabled again.


[…]


Kind regards,

Paul


Re: [PATCHv2] drm/amdgpu: disable ASPM on Intel AlderLake based systems

2022-04-11 Thread Paul Menzel

[Cc: +]

Dear Richard,


Am 11.04.22 um 02:27 schrieb Gong, Richard:


On 4/8/2022 7:19 PM, Paul Menzel wrote:



Am 08.04.22 um 21:05 schrieb Richard Gong:

Active State Power Management (ASPM) feature is enabled since kernel 5.14.
There are some AMD GFX cards (such as WX3200 and RX640) that cannot be
used with Intel AlderLake based systems to enable ASPM. Using these GFX


Alder Lake

will correct in the next version.



cards as video/display output, Intel Alder Lake based systems will hang
during suspend/resume.


Please reflow for 75 characters per line.

Also please mention the exact system you had problems with (also 
firmware versions).




Add extra check to disable ASPM on Intel AlderLake based systems.


Is that a problem with Intel Alder Lake or the Dell system? Shouldn’t 
ASPM just be disabled for the problematic cards for the Dell system. 
You write newer cards worked fine.


There is a problem with Dell system (Dell Precision DT workstation), 
which is based on Intel Alder Lake.


ASPM works just fine on these GPU's. It's more of an issue with whether 
the underlying platform supports ASPM or not.


At least you didn’t document what the real issue is, that ASPM does not 
work. With the current information (some graphics cards work with the 
Dell system and others don’t), it could be the GPU, the Dell system 
(firmware, …), a problem with the Alder Lake SoC, or another bug. I hope 
you are in contact with Dell to analyze it, so ASPM can be enabled again.


[…]


Kind regards,

Paul


Re: [PATCHv2] drm/amdgpu: disable ASPM on Intel AlderLake based systems

2022-04-11 Thread Paul Menzel

Dear Richard,


Thank you for your response, but please reply to your own reply next time.

Am 11.04.22 um 02:37 schrieb Gong, Richard:


On 4/8/2022 7:19 PM, Paul Menzel wrote:



Thank you for your patch.

Am 08.04.22 um 21:05 schrieb Richard Gong:
Active State Power Management (ASPM) feature is enabled since kernel 
5.14.

There are some AMD GFX cards (such as WX3200 and RX640) that cannot be
used with Intel AlderLake based systems to enable ASPM. Using these GFX


Alder Lake
Actually there are 2 formats (one with space, another is w/o space) in 
the upstream sources, so I will keep that unchanged and use the format 
w/o space.


Do you mean the Linux kernel sources? Anyway, please use the correct 
spelling [1].



Kind regards,

Paul


[1]: 
https://ark.intel.com/content/www/us/en/ark/products/codename/147470/products-formerly-alder-lake.html

[2]: https://en.wikipedia.org/wiki/Alder_Lake


Re: [PATCHv2] drm/amdgpu: disable ASPM on Intel AlderLake based systems

2022-04-08 Thread Paul Menzel

Dear Richard,


Thank you for your patch.

Am 08.04.22 um 21:05 schrieb Richard Gong:

Active State Power Management (ASPM) feature is enabled since kernel 5.14.
There are some AMD GFX cards (such as WX3200 and RX640) that cannot be
used with Intel AlderLake based systems to enable ASPM. Using these GFX


Alder Lake


cards as video/display output, Intel Alder Lake based systems will hang
during suspend/resume.


Please reflow for 75 characters per line.

Also please mention the exact system you had problems with (also 
firmware versions).




Add extra check to disable ASPM on Intel AlderLake based systems.


Is that a problem with Intel Alder Lake or the Dell system? Shouldn’t 
ASPM just be disabled for the problematic cards for the Dell system. You 
write newer cards worked fine.



Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885
Signed-off-by: Richard Gong 
---
v2: correct commit description
 move the check from chip family to problematic platform
---
  drivers/gpu/drm/amd/amdgpu/vi.c | 17 -
  1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
index 039b90cdc3bc..8b4eaf54b23e 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -81,6 +81,10 @@
  #include "mxgpu_vi.h"
  #include "amdgpu_dm.h"
  
+#if IS_ENABLED(CONFIG_X86_64)
+#include <asm/intel-family.h>
+#endif
+
  #define ixPCIE_LC_L1_PM_SUBSTATE  0x100100C6
  #define PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK  
0x0001L
  #define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK  0x0002L
@@ -1134,13 +1138,24 @@ static void vi_enable_aspm(struct amdgpu_device *adev)
WREG32_PCIE(ixPCIE_LC_CNTL, data);
  }
  
+static bool intel_core_apsm_chk(void)


aspm


+{
+#if IS_ENABLED(CONFIG_X86_64)
+   struct cpuinfo_x86 *c = &cpu_data(0);
+
+   return (c->x86 == 6 && c->x86_model == INTEL_FAM6_ALDERLAKE);
+#else
+   return false;
+#endif


Please do the check in C code and not the preprocessor.
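The difference between the two styles can be sketched in userspace C. This is a minimal illustration under stated assumptions: `DEMO_CONFIG_X86` is a hypothetical 0/1 constant standing in for a Kconfig symbol (the kernel's `IS_ENABLED()` macro is more general), and `struct demo_cpu_id` and `DEMO_MODEL_ALDERLAKE` are placeholders for the kernel's types and constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical 0/1 configuration constant standing in for a Kconfig symbol. */
#ifndef DEMO_CONFIG_X86
#define DEMO_CONFIG_X86 1
#endif

/* Hypothetical stand-in for the kernel's CPU description. */
struct demo_cpu_id {
	unsigned int family;
	unsigned int model;
};

/* Placeholder model number used only for this sketch. */
#define DEMO_MODEL_ALDERLAKE 0x97

/* Preprocessor style: the disabled branch is never seen by the compiler. */
bool demo_is_alder_lake_cpp(const struct demo_cpu_id *c)
{
#if DEMO_CONFIG_X86
	return c->family == 6 && c->model == DEMO_MODEL_ALDERLAKE;
#else
	(void)c;
	return false;
#endif
}

/*
 * C style: the condition is a compile-time constant, so the compiler can
 * drop the dead branch, but both branches are still parsed and type-checked.
 */
bool demo_is_alder_lake_c(const struct demo_cpu_id *c)
{
	assert(c != 0);
	if (DEMO_CONFIG_X86)
		return c->family == 6 && c->model == DEMO_MODEL_ALDERLAKE;
	return false;
}
```

The C form keeps the code visible to the compiler in every configuration, which catches build breakage early. The trade-off is that every identifier used inside the branch must be declared in all configurations, which is not automatically the case for architecture-specific types and headers.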


+}
+
  static void vi_program_aspm(struct amdgpu_device *adev)
  {
u32 data, data1, orig;
bool bL1SS = false;
bool bClkReqSupport = true;
  
-	if (!amdgpu_device_should_use_aspm(adev))
+   if (!amdgpu_device_should_use_aspm(adev) || intel_core_apsm_chk())
return;
  
  	if (adev->flags & AMD_IS_APU ||



Kind regards,

Paul


Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

2022-04-07 Thread Paul Menzel

Dear Arunpravin,


Thank you for your patch.

Am 07.04.22 um 07:46 schrieb Arunpravin Paneer Selvam:

- Switch to drm buddy allocator
- Add resource cursor support for drm buddy


I thought that, after the last long discussion, you would actually act 
on the review comments. Daniel wrote a good summary, which you could 
more or less copy and paste. So why didn’t you?


So, I really wish the patch were not committed as is.

The summary should also say something about using mutex over spinlocks. 
For me the version change summaries below are just for reviewers of 
earlier iterations to see what changed, and not something to be read easily.



Kind regards,

Paul



v2(Matthew Auld):
   - replace spinlock with mutex as we call kmem_cache_zalloc
 (..., GFP_KERNEL) in drm_buddy_alloc() function

   - lock drm_buddy_block_trim() function as it calls
 mark_free/mark_split, which are all globally visible

v3(Matthew Auld):
   - remove trim method error handling as we address the failure case
 at drm_buddy_block_trim() function

v4:
   - fix warnings reported by kernel test robot 

v5:
   - fix merge conflict issue

v6:
   - fix warnings reported by kernel test robot 

v7:
   - remove DRM_BUDDY_RANGE_ALLOCATION flag usage

v8:
   - keep DRM_BUDDY_RANGE_ALLOCATION flag usage
   - resolve conflicts created by drm/amdgpu: remove VRAM accounting v2

v9(Christian):
   - merged the below patch
  - drm/amdgpu: move vram inline functions into a header
   - rename label name as fallback
   - move struct amdgpu_vram_mgr to amdgpu_vram_mgr.h
   - remove unnecessary flags from struct amdgpu_vram_reservation
   - rewrite block NULL check condition
   - change else style as per coding standard
   - rewrite the node max size
   - add a helper function to fetch the first entry from the list

v10(Christian):
- rename amdgpu_get_node() function name as amdgpu_vram_mgr_first_block

v11:
- if size is not aligned with min_page_size, enable the is_contiguous flag,
  so that the size is rounded up to the next power of two and then trimmed
  to the original size.
v12:
- rename the function names having prefix as amdgpu_vram_mgr_*()
- modify the round_up() logic conforming to contiguous flag enablement
  or if size is not aligned to min_block_size
- modify the trim logic
- rename node as block wherever applicable
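The round-up-then-trim arithmetic mentioned in the v11 note can be sketched in userspace C. This only models the size calculation under stated assumptions: the function and struct names are hypothetical, and nothing here calls or reproduces the drm buddy allocator API.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two that is >= n (returns 1 for n == 0). */
uint64_t demo_roundup_pow2(uint64_t n)
{
	uint64_t p = 1;

	/* Guard against overflow of the shift below. */
	assert(n <= (UINT64_C(1) << 63));
	while (p < n)
		p <<= 1;
	return p;
}

/* Hypothetical result of a contiguous request. */
struct demo_alloc {
	uint64_t allocated;	/* size of the single power-of-two block */
	uint64_t trimmed;	/* tail handed back after trimming */
	uint64_t final_size;	/* what the caller ends up holding */
};

/* Round the request up to one power-of-two block, then trim the excess. */
struct demo_alloc demo_contiguous_alloc(uint64_t size)
{
	struct demo_alloc a;

	a.allocated = demo_roundup_pow2(size);
	a.trimmed = a.allocated - size;
	a.final_size = a.allocated - a.trimmed;
	return a;
}
```

For a request of three 4 KiB pages (12288 bytes), the sketch allocates one 16384-byte block and trims 4096 bytes, so the caller ends up with the originally requested size in one contiguous range.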

Signed-off-by: Arunpravin Paneer Selvam 
---
  drivers/gpu/drm/Kconfig   |   1 +
  .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |  10 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 359 ++
  4 files changed, 291 insertions(+), 176 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index f1422bee3dcc..5133c3f028ab 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -280,6 +280,7 @@ config DRM_AMDGPU
select HWMON
select BACKLIGHT_CLASS_DEVICE
select INTERVAL_TREE
+   select DRM_BUDDY
help
  Choose this option if you have a recent AMD Radeon graphics card.
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h

index acfa207cf970..6546552e596c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -30,12 +30,15 @@
  #include 
  #include 
  
+#include "amdgpu_vram_mgr.h"

+
  /* state back for walking over vram_mgr and gtt_mgr allocations */
  struct amdgpu_res_cursor {
        uint64_t                start;
        uint64_t                size;
        uint64_t                remaining;
-       struct drm_mm_node      *node;
+       void                    *node;
+       uint32_t                mem_type;
  };
  
  /**

@@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource 
*res,
uint64_t start, uint64_t size,
struct amdgpu_res_cursor *cur)
  {
+   struct drm_buddy_block *block;
+   struct list_head *head, *next;
struct drm_mm_node *node;
  
-	if (!res || res->mem_type == TTM_PL_SYSTEM) {

-   cur->start = start;
-   cur->size = size;
-   cur->remaining = size;
-   cur->node = NULL;
-   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
-   return;
-   }
+   if (!res)
+   goto fallback;
  
  	BUG_ON(start + size > res->num_pages << PAGE_SHIFT);
  
-	node = to_ttm_range_mgr_node(res)->mm_nodes;

-   while (start >= node->size << PAGE_SHIFT)
-   start -= node++->size << PAGE_SHIFT;
+   cur->mem_type = res->mem_type;
+
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   head = &to_amdgpu_vram_mgr_resource(res)->blocks;
+
+   block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+

Re: Public patches but non-public development branch

2022-04-05 Thread Paul Menzel

Dear Alex,


Am 05.04.22 um 16:13 schrieb Alex Deucher:

On Tue, Apr 5, 2022 at 10:03 AM Paul Menzel  wrote:



Am 05.04.22 um 15:14 schrieb Alex Deucher:

On Tue, Apr 5, 2022 at 3:02 AM Paul Menzel wrote:



Am 05.04.22 um 08:54 schrieb Christian König:

Am 05.04.22 um 08:45 schrieb Paul Menzel:



Am 04.04.22 um 23:42 schrieb Philip Yang:

bo_adev is NULL for system memory mapping to GPU.

Fixes: 05fe8eeca92 (drm/amdgpu: fix TLB flushing during eviction)


Sorry, where can I find that commit?


Well that's expected, the development branch is not public.


Well obviously, it was unexpected for me. How should I have known? Where
is that documented? If the patches are publicly posted to the mailing
list, why is that development branch not public?

The current situation is really frustrating for non-AMD employees. How
can the current situation be improved?


Our development branch
(https://gitlab.freedesktop.org/agd5f/linux/-/commits/amd-staging-drm-next)
is available publicly.  There can be a day or so of lag depending on
when it gets mirrored (e.g., over the weekend).


Thank you for the clarification. As can be seen at hand, it still causes
confusion though.

  commit 05fe8eeca927e29b81f3f2a799e9b9b88b0989a9
  Author: Christian König 
  AuthorDate: Wed Mar 30 10:53:15 2022 +0200
  Commit: Christian König 
  CommitDate: Fri Apr 1 11:05:51 2022 +0200

Today is Tuesday, so it wasn’t mirrored yesterday, on Monday.

To avoid this friction in the future, is there an automated way to
mirror the branches? git hooks should allow that to be done on every
push for example.


It's a bit more complicated than that since we have various CI systems
and IT security policies involved, but we can look into it.


That’d be awesome. Thank you!


Kind regards,

Paul


Re: Public patches but non-public development branch

2022-04-05 Thread Paul Menzel

Dear Alex,


Am 05.04.22 um 15:14 schrieb Alex Deucher:

On Tue, Apr 5, 2022 at 3:02 AM Paul Menzel wrote:



Am 05.04.22 um 08:54 schrieb Christian König:

Am 05.04.22 um 08:45 schrieb Paul Menzel:



Am 04.04.22 um 23:42 schrieb Philip Yang:

bo_adev is NULL for system memory mapping to GPU.

Fixes: 05fe8eeca92 (drm/amdgpu: fix TLB flushing during eviction)


Sorry, where can I find that commit?


Well that's expected, the development branch is not public.


Well obviously, it was unexpected for me. How should I have known? Where
is that documented? If the patches are publicly posted to the mailing
list, why is that development branch not public?

The current situation is really frustrating for non-AMD employees. How
can the current situation be improved?


Our development branch
(https://gitlab.freedesktop.org/agd5f/linux/-/commits/amd-staging-drm-next)
is available publicly.  There can be a day or so of lag depending on
when it gets mirrored (e.g., over the weekend).


Thank you for the clarification. As can be seen at hand, it still causes 
confusion though.


commit 05fe8eeca927e29b81f3f2a799e9b9b88b0989a9
Author: Christian König 
AuthorDate: Wed Mar 30 10:53:15 2022 +0200
Commit: Christian König 
CommitDate: Fri Apr 1 11:05:51 2022 +0200

Today is Tuesday, so it wasn’t mirrored yesterday, on Monday.

To avoid this friction in the future, is there an automated way to 
mirror the branches? git hooks should allow that to be done on every 
push for example.



Kind regards,

Paul


Public patches but non-public development branch (Re: [PATCH 1/1] drm/amdkfd: Add missing NULL check in svm_range_map_to_gpu)

2022-04-05 Thread Paul Menzel

Dear Christian,


Am 05.04.22 um 08:54 schrieb Christian König:

Am 05.04.22 um 08:45 schrieb Paul Menzel:



Am 04.04.22 um 23:42 schrieb Philip Yang:

bo_adev is NULL for system memory mapping to GPU.

Fixes: 05fe8eeca92 (drm/amdgpu: fix TLB flushing during eviction)


Sorry, where can I find that commit?


Well that's expected, the development branch is not public.


Well obviously, it was unexpected for me. How should I have known? Where 
is that documented? If the patches are publicly posted to the mailing 
list, why is that development branch not public?


The current situation is really frustrating for non-AMD employees. How 
can the current situation be improved?



Kind regards,

Paul


I do not see it in drm-next from agd5f git archive 
g...@gitlab.freedesktop.org:agd5f/linux.git.


    $ git log --oneline -1
    e45422695c19 (HEAD, agd5f/drm-next) drm/amdkfd: Create file 
descriptor after client is added to smi_clients list



Kind regards,

Paul



Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c

index 907b02045824..d3fb2d0b5a25 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1281,7 +1281,7 @@ svm_range_map_to_gpu(struct kfd_process_device 
*pdd, struct svm_range *prange,

 last_start, prange->start + i,
 pte_flags,
 last_start - prange->start,
-   bo_adev->vm_manager.vram_base_offset,
+   bo_adev ? bo_adev->vm_manager.vram_base_offset : 0,
 NULL, dma_addr, &vm->last_update);
    for (j = last_start - prange->start; j <= i; j++)


Re: [PATCH] drm/amdgpu/vcn: remove Unneeded semicolon

2022-03-31 Thread Paul Menzel

Dear Haowen,


Thank you for your patch.

Am 31.03.22 um 07:56 schrieb Haowen Bai:

In the commit message summary, please use:

Remove unneeded semicolon


report by coccicheck:
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:1951:2-3: Unneeded semicolon

fixed c543dcb ("drm/amdgpu/vcn: Add VCN ras error query support")


Please use

Fixes: …

and a commit hash length of 12 characters. (`scripts/checkpatch.pl …` 
should tell you about this.)



Kind regards,

Paul



Signed-off-by: Haowen Bai 
---
  drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
index 3e1de8c..17d44be 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
@@ -1948,7 +1948,7 @@ static uint32_t vcn_v2_6_query_poison_by_instance(struct 
amdgpu_device *adev,
break;
default:
break;
-   };
+   }
  
  	if (poison_stat)

dev_info(adev->dev, "Poison detected in VCN%d, sub_block%d\n",


Re: [PATCH 1/1] drm: add PSR2 support and capability definition as per eDP 1.5

2022-03-31 Thread Paul Menzel

Dear David,


Thank you for your patch.

Am 31.03.22 um 19:26 schrieb David Zhang:

[why & how]
In eDP 1.5 spec, some new DPCD bit fields are defined for PSR-SU
support.


You could be specific by using the exact number two. Maybe:

As per eDP 1.5 specification, add the two DPCD bit fields below for 
PSR-SU support:


1.  DP_PSR2_WITH_Y_COORD_ET_SUPPORTED
2.  DP_PSR2_SU_AUX_FRAME_SYNC_NOT_NEEDED


Kind regards,

Paul


Signed-off-by: David Zhang 
---
  include/drm/drm_dp_helper.h | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
index 30359e434c3f..ac7b7571ae66 100644
--- a/include/drm/drm_dp_helper.h
+++ b/include/drm/drm_dp_helper.h
@@ -361,6 +361,7 @@ struct drm_panel;
  # define DP_PSR_IS_SUPPORTED1
  # define DP_PSR2_IS_SUPPORTED 2   /* eDP 1.4 */
  # define DP_PSR2_WITH_Y_COORD_IS_SUPPORTED  3 /* eDP 1.4a */
+# define DP_PSR2_WITH_Y_COORD_ET_SUPPORTED  4  /* eDP 1.5, adopted eDP 1.4b SCR */
  
  #define DP_PSR_CAPS 0x071   /* XXX 1.2? */

  # define DP_PSR_NO_TRAIN_ON_EXIT1
@@ -375,6 +376,7 @@ struct drm_panel;
  # define DP_PSR_SETUP_TIME_SHIFT1
  # define DP_PSR2_SU_Y_COORDINATE_REQUIRED   (1 << 4)  /* eDP 1.4a */
  # define DP_PSR2_SU_GRANULARITY_REQUIRED(1 << 5)  /* eDP 1.4b */
+# define DP_PSR2_SU_AUX_FRAME_SYNC_NOT_NEEDED (1 << 6)/* eDP 1.5, adopted eDP 1.4b SCR */
  
  #define DP_PSR2_SU_X_GRANULARITY	0x072 /* eDP 1.4b */

  #define DP_PSR2_SU_Y_GRANULARITY  0x074 /* eDP 1.4b */
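
As a rough illustration of how the new capability bit from the patch might be consumed, here is a minimal sketch. The two defines are copied from the diff above; the helper name and the assumption that the caller has already read the byte at DPCD address `DP_PSR_CAPS` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the patch above. */
#define DP_PSR_CAPS                           0x071
#define DP_PSR2_SU_AUX_FRAME_SYNC_NOT_NEEDED  (1 << 6)

/*
 * Hypothetical helper: psr_caps is assumed to be the byte already read
 * from DPCD address DP_PSR_CAPS by the caller.
 */
static bool psr2_su_aux_frame_sync_not_needed(uint8_t psr_caps)
{
	return (psr_caps & DP_PSR2_SU_AUX_FRAME_SYNC_NOT_NEEDED) != 0;
}
```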


Re: [PATCH] fbdev: defio: fix the pagelist corruption

2022-03-31 Thread Paul Menzel

Dear Chuansheng,


Am 31.03.22 um 02:06 schrieb Liu, Chuansheng:


-Original Message-
From: Paul Menzel 
Sent: Thursday, March 31, 2022 12:47 AM


[…]


Am 29.03.22 um 01:58 schrieb Liu, Chuansheng:


-Original Message-
From: Paul Menzel
Sent: Monday, March 28, 2022 2:15 PM



Am 28.03.22 um 02:58 schrieb Liu, Chuansheng:


-Original Message-



Sent: Saturday, March 26, 2022 4:11 PM



Am 17.03.22 um 06:46 schrieb Chuansheng Liu:

Easily hit the below list corruption:
==
list_add corruption. prev->next should be next (c0ceb090), but
was ec604507edc8. (prev=ec604507edc8).
WARNING: CPU: 65 PID: 3959 at lib/list_debug.c:26
__list_add_valid+0x53/0x80
CPU: 65 PID: 3959 Comm: fbdev Tainted: G U
RIP: 0010:__list_add_valid+0x53/0x80
Call Trace:
 
 fb_deferred_io_mkwrite+0xea/0x150
 do_page_mkwrite+0x57/0xc0
 do_wp_page+0x278/0x2f0
 __handle_mm_fault+0xdc2/0x1590
 handle_mm_fault+0xdd/0x2c0
 do_user_addr_fault+0x1d3/0x650
 exc_page_fault+0x77/0x180
 ? asm_exc_page_fault+0x8/0x30
 asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x7fd98fc8fad1
==

The race happens when one process is adding &page->lru to the pagelist
tail in fb_deferred_io_mkwrite() while another process is re-initializing
the same &page->lru in fb_deferred_io_fault(), which is not protected by
the lock.

This fix initializes all the page lists once during initialization. It
not only fixes the list corruption, but also avoids redundant
INIT_LIST_HEAD() calls.

Fixes: 105a940416fc ("fbdev/defio: Early-out if page is already enlisted")
Cc: Thomas Zimmermann 
Signed-off-by: Chuansheng Liu 
---
 drivers/video/fbdev/core/fb_defio.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/core/fb_defio.c 
b/drivers/video/fbdev/core/fb_defio.c
index 98b0f23bf5e2..eafb66ca4f28 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -59,7 +59,6 @@ static vm_fault_t fb_deferred_io_fault(struct vm_fault *vmf)
printk(KERN_ERR "no mapping available\n");

BUG_ON(!page->mapping);
-   INIT_LIST_HEAD(&page->lru);
page->index = vmf->pgoff;

vmf->page = page;
@@ -220,6 +219,8 @@ static void fb_deferred_io_work(struct work_struct *work)
 void fb_deferred_io_init(struct fb_info *info)
 {
struct fb_deferred_io *fbdefio = info->fbdefio;
+   struct page *page;
+   int i;

BUG_ON(!fbdefio);
mutex_init(&fbdefio->lock);
@@ -227,6 +228,12 @@ void fb_deferred_io_init(struct fb_info *info)
INIT_LIST_HEAD(&fbdefio->pagelist);
if (fbdefio->delay == 0) /* set a default of 1 s */
fbdefio->delay = HZ;
+
+   /* initialize all the page lists one time */
+   for (i = 0; i < info->fix.smem_len; i += PAGE_SIZE) {
+   page = fb_deferred_io_page(info, i);
+   INIT_LIST_HEAD(&page->lru);
+   }
 }
 EXPORT_SYMBOL_GPL(fb_deferred_io_init);
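
The idea of the patch above (initialize every per-page list head once, and never re-initialize it while it may be linked) can be sketched in user space. This is only an illustration under assumptions: the list helpers are a minimal re-implementation, `struct fb`, `fb_init()`, `fb_mark_dirty()` and `fb_dirty_count()` are hypothetical names, the "already enlisted" check is modelled on the title of the commit named in the Fixes tag, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, in the style of the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

#define NPAGES 4

struct page {
	struct list_head lru;
};

/* Hypothetical stand-in for the framebuffer state (not the fbdev structs). */
struct fb {
	struct list_head pagelist;
	struct page pages[NPAGES];
};

/* One-time setup: every page's list head is initialized exactly once here. */
static void fb_init(struct fb *fb)
{
	init_list_head(&fb->pagelist);
	for (size_t i = 0; i < NPAGES; i++)
		init_list_head(&fb->pages[i].lru);
}

/* Enlist a page as dirty unless it is already on the pagelist. */
static void fb_mark_dirty(struct fb *fb, size_t idx)
{
	assert(idx < NPAGES);
	struct page *page = &fb->pages[idx];

	if (list_empty(&page->lru))
		list_add_tail(&page->lru, &fb->pagelist);
}

/* Count the pages currently linked into the pagelist. */
static size_t fb_dirty_count(const struct fb *fb)
{
	size_t n = 0;

	for (const struct list_head *p = fb->pagelist.next; p != &fb->pagelist; p = p->next)
		n++;
	return n;
}
```

If a fault path called `init_list_head()` on a page that is already linked, the neighbours would still point at it while it points only at itself, which is the kind of inconsistency the list debug check in the trace above reports.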


Applying your patch on top of current Linus’ master branch, tty0 is
unusable and looks frozen. Sometimes network card still works, sometimes
not.


I don't see how the patch would cause below BUG call stack, need some time to
debug. Just few comments:
1. Will the system work well without this patch?


Yes, the framebuffer works well without the patch.


2. When you are sure the patch causes the regression you saw, please feel free
to submit a revert patch, thanks : )

I think your patch wasn’t submitted yet – at least not pulled by Linus.

The patch has been in drm-tip, could you have a try with the latest drm-tip to 
see
if the Framebuffer works well, in that case, we could revert it in drm-tip then.


With drm-tip (drm-tip: 2022y-03m-29d-13h-14m-35s UTC integration
manifest) everything works fine. (I had to disable amdgpu driver, as it
failed to build.) Is anyone able to explain that?


My patch is for fixing another patch which is in the drm-tip at least,


The referenced commit 105a940416fc in the Fixes tag is also in Linus’ 
master branch.



so I assume applying my patch into Linus tree directly is not
completely proper. That's my intention of asking your help for
retesting drm-tip.
If there were such a relation, that would need to be documented in the 
commit message.



You mean everything working fine means another issue you hit is also
gone?

No, I just mean the hang when applying your patch.

Anyway, after figuring out, that drm-tip, is actually not behind Linus’ 
master branch, I tried to figure out the differences, and it turns out 
it’s also related to commit fac54e2bfb5b (x86/Kconfig: Select 
HAVE_ARCH_HUGE_VMALLOC with HAVE_ARCH_HUGE_VMAP) [1], which is in Linus’ 
master branch, but not drm-tip. Note, I am using a 32-bit user space and 
a 64-bit Linux kernel. With commit fac54e2bfb5b reverted and your 
patch applied, the hang is gone.


I am adding the people involved in th

Re: [PATCH] fbdev: defio: fix the pagelist corruption

2022-03-30 Thread Paul Menzel

[Cc: -jay...@intworks.biz as it bounces]

Dear Chuansheng,


Am 29.03.22 um 01:58 schrieb Liu, Chuansheng:


-Original Message-
From: Paul Menzel
Sent: Monday, March 28, 2022 2:15 PM



Am 28.03.22 um 02:58 schrieb Liu, Chuansheng:


-Original Message-



Sent: Saturday, March 26, 2022 4:11 PM



Am 17.03.22 um 06:46 schrieb Chuansheng Liu:

Easily hit the below list corruption:
==
list_add corruption. prev->next should be next (c0ceb090), but
was ec604507edc8. (prev=ec604507edc8).
WARNING: CPU: 65 PID: 3959 at lib/list_debug.c:26
__list_add_valid+0x53/0x80
CPU: 65 PID: 3959 Comm: fbdev Tainted: G U
RIP: 0010:__list_add_valid+0x53/0x80
Call Trace:

fb_deferred_io_mkwrite+0xea/0x150
do_page_mkwrite+0x57/0xc0
do_wp_page+0x278/0x2f0
__handle_mm_fault+0xdc2/0x1590
handle_mm_fault+0xdd/0x2c0
do_user_addr_fault+0x1d3/0x650
exc_page_fault+0x77/0x180
? asm_exc_page_fault+0x8/0x30
asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x7fd98fc8fad1
==

The race happens when one process is adding &page->lru to the pagelist
tail in fb_deferred_io_mkwrite() while another process is re-initializing
the same &page->lru in fb_deferred_io_fault(), which is not protected by
the lock.

This fix initializes all the page lists once during initialization. It
not only fixes the list corruption, but also avoids redundant
INIT_LIST_HEAD() calls.

Fixes: 105a940416fc ("fbdev/defio: Early-out if page is already enlisted")
Cc: Thomas Zimmermann 
Signed-off-by: Chuansheng Liu 
---
drivers/video/fbdev/core/fb_defio.c | 9 -
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/core/fb_defio.c

b/drivers/video/fbdev/core/fb_defio.c

index 98b0f23bf5e2..eafb66ca4f28 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -59,7 +59,6 @@ static vm_fault_t fb_deferred_io_fault(struct vm_fault *vmf)
printk(KERN_ERR "no mapping available\n");

BUG_ON(!page->mapping);
-   INIT_LIST_HEAD(&page->lru);
page->index = vmf->pgoff;

vmf->page = page;
@@ -220,6 +219,8 @@ static void fb_deferred_io_work(struct work_struct *work)
void fb_deferred_io_init(struct fb_info *info)
{
struct fb_deferred_io *fbdefio = info->fbdefio;
+   struct page *page;
+   int i;

BUG_ON(!fbdefio);
mutex_init(&fbdefio->lock);
@@ -227,6 +228,12 @@ void fb_deferred_io_init(struct fb_info *info)
INIT_LIST_HEAD(&fbdefio->pagelist);
if (fbdefio->delay == 0) /* set a default of 1 s */
fbdefio->delay = HZ;
+
+   /* initialize all the page lists one time */
+   for (i = 0; i < info->fix.smem_len; i += PAGE_SIZE) {
+   page = fb_deferred_io_page(info, i);
+   INIT_LIST_HEAD(&page->lru);
+   }
}
EXPORT_SYMBOL_GPL(fb_deferred_io_init);


Applying your patch on top of current Linus’ master branch, tty0 is
unusable and looks frozen. Sometimes network card still works, sometimes
not.


I don't see how the patch would cause below BUG call stack, need some time to
debug. Just few comments:
1. Will the system work well without this patch?


Yes, the framebuffer works well without the patch.


2. When you are sure the patch causes the regression you saw, please feel free
to submit a revert patch, thanks : )

I think your patch wasn’t submitted yet – at least not pulled by Linus.

The patch has been in drm-tip, could you have a try with the latest drm-tip to 
see if the
Framebuffer works well, in that case, we could revert it in drm-tip then.


With drm-tip (drm-tip: 2022y-03m-29d-13h-14m-35s UTC integration 
manifest) everything works fine. (I had to disable amdgpu driver, as it 
failed to build.) Is anyone able to explain that?



Kind regards,

Paul


Re: [PATCH v2] drm/amdgpu: fix that issue that the number of the crtc of the 3250c is not correct

2022-03-30 Thread Paul Menzel

[Cc: Remove undeliverable Chun Ming Zhou <mailto:david1.z...@amd.com>]

Am 30.03.22 um 08:34 schrieb Paul Menzel:

Dear Tsung-Hua,


Thank you for your patch.

Am 30.03.22 um 04:46 schrieb Ryan Lin:

The commit message summary is quite long and confusing. Maybe:

Use 3 CRTC for 3250c to get internal display working


[Why]
External displays take priority over internal display when there are fewer
display controllers than displays.


This causes the internal display to not work on the Chromebook google/zork.


[How]
The root cause is because of that number of the crtc is not correct.


The root cause is the incorrect number of four configured CRTCs.


The number of the crtc on the 3250c is 3, but on the 3500c is 4.
On the source code, we can see that number of the crtc has been fixed at 4.
Needs to set the num_crtc to 3 for 3250c platform.


Please do not wrap lines after each sentence, and use a text width of 75 
characters.



v2:
    - remove unnecessary comments and Id

Signed-off-by: Ryan Lin 

---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 +---
  1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index 40c91b448f7da..455a2c45e8cda 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2738,9 +2738,15 @@ static int dm_early_init(void *handle)
  break;
  #if defined(CONFIG_DRM_AMD_DC_DCN1_0)
  case CHIP_RAVEN:
-    adev->mode_info.num_crtc = 4;
-    adev->mode_info.num_hpd = 4;
-    adev->mode_info.num_dig = 4;
+    if (adev->rev_id >= 8) {


Is there some define for that number? Maybe add a comment, that it’s for 
3250c?



Kind regards,

Paul



+    adev->mode_info.num_crtc = 3;
+    adev->mode_info.num_hpd = 3;
+    adev->mode_info.num_dig = 3;
+    } else {
+    adev->mode_info.num_crtc = 4;
+    adev->mode_info.num_hpd = 4;
+    adev->mode_info.num_dig = 4;
+    }
  break;
  #endif
  #if defined(CONFIG_DRM_AMD_DC_DCN2_0)


Re: [PATCH v2] drm/amdgpu: fix that issue that the number of the crtc of the 3250c is not correct

2022-03-30 Thread Paul Menzel

Dear Tsung-Hua,


Thank you for your patch.

Am 30.03.22 um 04:46 schrieb Ryan Lin:

The commit message summary is quite long and confusing. Maybe:

Use 3 CRTC for 3250c to get internal display working


[Why]
External displays take priority over internal display when there are fewer
display controllers than displays.


This causes the internal display to not work on the Chromebook google/zork.


[How]
The root cause is because of that number of the crtc is not correct.


The root cause is the incorrect number of four configured CRTCs.


The number of the crtc on the 3250c is 3, but on the 3500c is 4.
On the source code, we can see that number of the crtc has been fixed at 4.
Needs to set the num_crtc to 3 for 3250c platform.


Please do not wrap lines after each sentence, and use a text width of 75 
characters.



v2:
- remove unnecessary comments and Id

Signed-off-by: Ryan Lin 

---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 +---
  1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 40c91b448f7da..455a2c45e8cda 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2738,9 +2738,15 @@ static int dm_early_init(void *handle)
break;
  #if defined(CONFIG_DRM_AMD_DC_DCN1_0)
case CHIP_RAVEN:
-   adev->mode_info.num_crtc = 4;
-   adev->mode_info.num_hpd = 4;
-   adev->mode_info.num_dig = 4;
+   if (adev->rev_id >= 8) {


Is there some define for that number? Maybe add a comment, that it’s for 
3250c?



Kind regards,

Paul



+   adev->mode_info.num_crtc = 3;
+   adev->mode_info.num_hpd = 3;
+   adev->mode_info.num_dig = 3;
+   } else {
+   adev->mode_info.num_crtc = 4;
+   adev->mode_info.num_hpd = 4;
+   adev->mode_info.num_dig = 4;
+   }
break;
  #endif
  #if defined(CONFIG_DRM_AMD_DC_DCN2_0)
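
The review question about a define for the magic number could be addressed along these lines. This is only a sketch: `RAVEN_3250C_MIN_REV_ID`, `struct mode_info` and `raven_set_display_counts()` are hypothetical names, and it assumes, as the review comment suggests, that `rev_id >= 8` is what identifies the 3250c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical name; the value 8 is the rev_id threshold used in the patch
 * above, assumed here to identify the 3250c.
 */
#define RAVEN_3250C_MIN_REV_ID 8

/* Stand-in for the three counters the patch sets (not the amdgpu struct). */
struct mode_info {
	int num_crtc;
	int num_hpd;
	int num_dig;
};

static void raven_set_display_counts(struct mode_info *mi, uint32_t rev_id)
{
	/* Per the commit message: 3 CRTCs on the 3250c, 4 on the 3500c. */
	int n = (rev_id >= RAVEN_3250C_MIN_REV_ID) ? 3 : 4;

	mi->num_crtc = n;
	mi->num_hpd = n;
	mi->num_dig = n;
}
```

A named constant plus the one-line comment documents why the threshold exists, which is what the question in the review is after.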


Re: Reply: Re: [PATCH] drm/amdgpu: resolve s3 hang for r7340

2022-03-29 Thread Paul Menzel

Dear 李真,


[Your mailer formatted the message oddly. Maybe configure it to use only 
plain text email with no HTML parts – common in Linux kernel community 
–, or, if not possible, switch to something else (Mozilla Thunderbird, …).]



Am 29.03.22 um 04:54 schrieb 李真能:

[…]


*Date:* 2022-03-28 15:38
*From:* Paul Menzel


[…]


Am 28.03.22 um 09:36 schrieb Paul Menzel:
  > Dear Zhenneng,
  >
  >
  > Thank you for your patch.
  >
  > Am 28.03.22 um 06:05 schrieb Zhenneng Li:
  >> This is a workaround for s3 hang for r7340(amdgpu).
  >
  > Is it hanging when resuming from S3?

Yes, this function is delayed work that runs after the graphics card is initialized.


Thanks for clarifying it.


  > Maybe also use the line below for
  > the commit message summary:
  >
  > drm/amdgpu: Add 1 ms delay to init handler to fix s3 resume hang
  >
  > Also, please add a space before the ( in “r7340(amdgpu)”.
  >
  >> When we test s3 with r7340 on arm64 platform, graphics card will hang up,
  >> the error message are as follows:
  >> Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.599374][ 7] [  
T291] amdgpu :02:00.0: fb0: amdgpudrmfb frame buffer device
  >> Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.612869][ 7] [  
T291] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP blockfailed 
-22
  >> Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.623392][ 7] [  
T291] amdgpu :02:00.0: amdgpu_device_ip_late_init failed
  >> Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.630696][ 7] [  
T291] amdgpu :02:00.0: Fatal error during GPU init
  >> Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.637477][ 7] [  
T291] [drm] amdgpu: finishing device.
  >
  > The prefix in the beginning is not really needed. Only the stuff after
  > `kernel: `.
  >
  > Maybe also add the output of `lspci -nn -s …` for that r7340 device.
  >
  >> Change-Id: I5048b3894c0ca9faf2f4847ddab61f9eb17b4823
  >
  > Without the Gerrit instance this belongs to, the Change-Id is of no use
  > in the public.
  >
  >> Signed-off-by: Zhenneng Li
  >> ---
  >>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
  >>   1 file changed, 2 insertions(+)
  >>
  >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
  >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
  >> index 3987ecb24ef4..1eced991b5b2 100644
  >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
  >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
  >> @@ -2903,6 +2903,8 @@ static void
  >> amdgpu_device_delayed_init_work_handler(struct work_struct *work)
  >>   container_of(work, struct amdgpu_device, delayed_init_work.work);
  >>   int r;
  >> +mdelay(1);
  >> +
  >

  > Wow, I wonder how long it took you to find that workaround.

About 3 months. I tried changing the delay of this delayed work
(amdgpu_device_delayed_init_work_handler) from 2000 ms to 2500 ms, and using
mb() instead of mdelay(1), but neither helped. I don't know the reason. The
occurrence probability of this bug is one in ten thousand. Do you know the
possible reasons?


Oh, it’s not even always reproducible. That is hard. Did you try another 
graphics card or another ARM board to rule out hardware specific issues?


Sorry, I do not. Maybe the developers with access to non-public 
datasheets and erratas know.



  >>   r = amdgpu_ib_ring_tests(adev);
  >>   if (r)
  >>   DRM_ERROR("ib ring test failed (%d).\n", r);



Kind regards,

Paul


Re: [PATCH] drm/amdgpu: resolve s3 hang for r7340

2022-03-28 Thread Paul Menzel

[Cc: -Jack Zhang (invalid address)]

Am 28.03.22 um 09:36 schrieb Paul Menzel:

Dear Zhenneng,


Thank you for your patch.

Am 28.03.22 um 06:05 schrieb Zhenneng Li:

This is a workaround for s3 hang for r7340(amdgpu).


Is it hanging when resuming from S3? Maybe also use the line below for 
the commit message summary:


drm/amdgpu: Add 1 ms delay to init handler to fix s3 resume hang

Also, please add a space before the ( in “r7340(amdgpu)”.


When we test s3 with r7340 on arm64 platform, graphics card will hang up,
the error message are as follows:
Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [    1.599374][ 7] [  T291] 
amdgpu :02:00.0: fb0: amdgpudrmfb frame buffer device
Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [    1.612869][ 7] [  T291] 
[drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block 
 failed -22
Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [    1.623392][ 7] [  T291] 
amdgpu :02:00.0: amdgpu_device_ip_late_init failed
Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [    1.630696][ 7] [  T291] 
amdgpu :02:00.0: Fatal error during GPU init
Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [    1.637477][ 7] [  T291] 
[drm] amdgpu: finishing device.


The prefix in the beginning is not really needed. Only the stuff after 
`kernel: `.


Maybe also add the output of `lspci -nn -s …` for that r7340 device.


Change-Id: I5048b3894c0ca9faf2f4847ddab61f9eb17b4823


Without the Gerrit instance this belongs to, the Change-Id is of no use 
in the public.



Signed-off-by: Zhenneng Li 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index 3987ecb24ef4..1eced991b5b2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2903,6 +2903,8 @@ static void 
amdgpu_device_delayed_init_work_handler(struct work_struct *work)

  container_of(work, struct amdgpu_device, delayed_init_work.work);
  int r;
+    mdelay(1);
+


Wow, I wonder how long it took you to find that workaround.


  r = amdgpu_ib_ring_tests(adev);
  if (r)
  DRM_ERROR("ib ring test failed (%d).\n", r);



Kind regards,

Paul


Re: [PATCH] drm/amdgpu: resolve s3 hang for r7340

2022-03-28 Thread Paul Menzel

Dear Zhenneng,


Thank you for your patch.

Am 28.03.22 um 06:05 schrieb Zhenneng Li:

This is a workaround for s3 hang for r7340(amdgpu).


Is it hanging when resuming from S3? Maybe also use the line below for 
the commit message summary:


drm/amdgpu: Add 1 ms delay to init handler to fix s3 resume hang

Also, please add a space before the ( in “r7340(amdgpu)”.


When we test s3 with r7340 on arm64 platform, graphics card will hang up,
the error message are as follows:
Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.599374][ 7] [  T291] 
amdgpu :02:00.0: fb0: amdgpudrmfb frame buffer device
Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.612869][ 7] [  T291] 
[drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block 
 failed -22
Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.623392][ 7] [  T291] 
amdgpu :02:00.0: amdgpu_device_ip_late_init failed
Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.630696][ 7] [  T291] 
amdgpu :02:00.0: Fatal error during GPU init
Mar  4 01:14:11 greatwall-GW-XX-XXX kernel: [1.637477][ 7] [  T291] 
[drm] amdgpu: finishing device.


The prefix in the beginning is not really needed. Only the stuff after 
`kernel: `.


Maybe also add the output of `lspci -nn -s …` for that r7340 device.


Change-Id: I5048b3894c0ca9faf2f4847ddab61f9eb17b4823


Without the Gerrit instance this belongs to, the Change-Id is of no use 
in the public.



Signed-off-by: Zhenneng Li 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 3987ecb24ef4..1eced991b5b2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2903,6 +2903,8 @@ static void 
amdgpu_device_delayed_init_work_handler(struct work_struct *work)
container_of(work, struct amdgpu_device, 
delayed_init_work.work);
int r;
  
+	mdelay(1);

+


Wow, I wonder how long it took you to find that workaround.


r = amdgpu_ib_ring_tests(adev);
if (r)
DRM_ERROR("ib ring test failed (%d).\n", r);



Kind regards,

Paul


Re: [PATCH] fbdev: defio: fix the pagelist corruption

2022-03-28 Thread Paul Menzel

Dear Chuansheng,


Am 28.03.22 um 02:58 schrieb Liu, Chuansheng:


-Original Message-



Sent: Saturday, March 26, 2022 4:11 PM



Am 17.03.22 um 06:46 schrieb Chuansheng Liu:

Easily hit the below list corruption:
==
list_add corruption. prev->next should be next (c0ceb090), but
was ec604507edc8. (prev=ec604507edc8).
WARNING: CPU: 65 PID: 3959 at lib/list_debug.c:26
__list_add_valid+0x53/0x80
CPU: 65 PID: 3959 Comm: fbdev Tainted: G U
RIP: 0010:__list_add_valid+0x53/0x80
Call Trace:
   
   fb_deferred_io_mkwrite+0xea/0x150
   do_page_mkwrite+0x57/0xc0
   do_wp_page+0x278/0x2f0
   __handle_mm_fault+0xdc2/0x1590
   handle_mm_fault+0xdd/0x2c0
   do_user_addr_fault+0x1d3/0x650
   exc_page_fault+0x77/0x180
   ? asm_exc_page_fault+0x8/0x30
   asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x7fd98fc8fad1
==

The race happens when one process is adding &page->lru to the pagelist
tail in fb_deferred_io_mkwrite() while another process is re-initializing
the same &page->lru in fb_deferred_io_fault(), which is not protected by
the lock.

This fix initializes all the page lists once during initialization. It
not only fixes the list corruption, but also avoids redundant
INIT_LIST_HEAD() calls.

Fixes: 105a940416fc ("fbdev/defio: Early-out if page is already enlisted")
Cc: Thomas Zimmermann 
Signed-off-by: Chuansheng Liu 
---
   drivers/video/fbdev/core/fb_defio.c | 9 -
   1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/core/fb_defio.c 
b/drivers/video/fbdev/core/fb_defio.c
index 98b0f23bf5e2..eafb66ca4f28 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -59,7 +59,6 @@ static vm_fault_t fb_deferred_io_fault(struct vm_fault *vmf)
printk(KERN_ERR "no mapping available\n");

BUG_ON(!page->mapping);
-   INIT_LIST_HEAD(&page->lru);
page->index = vmf->pgoff;

vmf->page = page;
@@ -220,6 +219,8 @@ static void fb_deferred_io_work(struct work_struct *work)
   void fb_deferred_io_init(struct fb_info *info)
   {
struct fb_deferred_io *fbdefio = info->fbdefio;
+   struct page *page;
+   int i;

BUG_ON(!fbdefio);
mutex_init(&fbdefio->lock);
@@ -227,6 +228,12 @@ void fb_deferred_io_init(struct fb_info *info)
INIT_LIST_HEAD(&fbdefio->pagelist);
if (fbdefio->delay == 0) /* set a default of 1 s */
fbdefio->delay = HZ;
+
+   /* initialize all the page lists one time */
+   for (i = 0; i < info->fix.smem_len; i += PAGE_SIZE) {
+   page = fb_deferred_io_page(info, i);
+   INIT_LIST_HEAD(&page->lru);
+   }
   }
   EXPORT_SYMBOL_GPL(fb_deferred_io_init);


Applying your patch on top of current Linus’ master branch, tty0 is
unusable and looks frozen. Sometimes network card still works, sometimes
not.


I don't see how the patch would cause below BUG call stack, need some time to
debug. Just few comments:
1. Will the system work well without this patch?


Yes, the framebuffer works well without the patch.


2. When you are sure the patch causes the regression you saw, please feel free
to submit a revert patch, thanks : )


I think your patch wasn’t submitted yet – at least not pulled by Linus.


  $ git log --oneline -nodecorate -2
  1b351a77ed33 (HEAD -> linus) fbdev: defio: fix the pagelist corruption
  52d543b5497c (origin/master, origin/HEAD) Merge tag 'for-linus-5.17-1' of 
https://github.com/cminyard/linux-ipmi

```
[5.256996] raw:    

[5.269582] page dumped because: VM_BUG_ON_PAGE(compound && 
compound_order(page) != order)
[5.279507] ------------[ cut here ]------------
[5.286406] kernel BUG at mm/page_alloc.c:1326!
[5.291814] invalid opcode:  [#1] PREEMPT SMP
[5.296350] CPU: 0 PID: 167 Comm: systemd-udevd Not tainted 
5.17.0-10753-g1b351a77ed33 #300
[5.304670] Hardware name: ASUS F2A85-M_PRO/F2A85-M_PRO, BIOS 
4.16-337-gb87986e67b 03/25/2022
[5.313163] RIP: 0010:free_pcp_prepare+0x295/0x400
[5.317930] Code: 00 01 00 75 0b 48 8b 45 08 45 31 ff a8 01 74 4b 48 8b 45 00 a9 
00 00 01 00 75 22 48 c7 c6 68 30 11 96 48 89 ef e8 cb 29 fd ff <0f> 0b 48 89 ef 
41 83 c6 01 e8 bd f5 ff ff e9 2e fe ff ff 0f 1f 44
[5.336650] RSP: 0018:a6634062f9c0 EFLAGS: 00010246
[5.341849] RAX: 004e RBX: e4be8000 RCX: 
[5.348957] RDX:  RSI: 96136a37 RDI: 
[5.356063] RBP: e4be840c R08:  R09: dfff
[5.363170] R10: a6634062f7f0 R11: 9652c4a8 R12: 
[5.370277] R13: 0009 R14: 91fd02ebc640 R15: e4be840c
[5.377384] FS:  () GS:91fd7b40(0063) 
knlGS:f7eea800
[5.385443] CS:  0010 DS: 002b ES: 002b CR0: 80050033
[5.391164] CR2: f6f0e840 CR3: 000106b6 CR4: 000406f0
[
```

Re: [PATCH] fbdev: defio: fix the pagelist corruption

2022-03-26 Thread Paul Menzel

Dear Chuansheng,


Am 17.03.22 um 06:46 schrieb Chuansheng Liu:

Easily hit the below list corruption:
==
list_add corruption. prev->next should be next (c0ceb090), but
was ec604507edc8. (prev=ec604507edc8).
WARNING: CPU: 65 PID: 3959 at lib/list_debug.c:26
__list_add_valid+0x53/0x80
CPU: 65 PID: 3959 Comm: fbdev Tainted: G U
RIP: 0010:__list_add_valid+0x53/0x80
Call Trace:
  
  fb_deferred_io_mkwrite+0xea/0x150
  do_page_mkwrite+0x57/0xc0
  do_wp_page+0x278/0x2f0
  __handle_mm_fault+0xdc2/0x1590
  handle_mm_fault+0xdd/0x2c0
  do_user_addr_fault+0x1d3/0x650
  exc_page_fault+0x77/0x180
  ? asm_exc_page_fault+0x8/0x30
  asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x7fd98fc8fad1
==

The race happens when one process is adding &page->lru to the
pagelist tail in fb_deferred_io_mkwrite() while another process is
re-initializing the same &page->lru in fb_deferred_io_fault(), which is
not protected by the lock.

This fix initializes all the page lists once during initialization;
it not only fixes the list corruption but also avoids redundant
INIT_LIST_HEAD() calls.

Fixes: 105a940416fc ("fbdev/defio: Early-out if page is already
enlisted")
Cc: Thomas Zimmermann 
Signed-off-by: Chuansheng Liu 
---
  drivers/video/fbdev/core/fb_defio.c | 9 ++++++++-
  1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/core/fb_defio.c 
b/drivers/video/fbdev/core/fb_defio.c
index 98b0f23bf5e2..eafb66ca4f28 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -59,7 +59,6 @@ static vm_fault_t fb_deferred_io_fault(struct vm_fault *vmf)
printk(KERN_ERR "no mapping available\n");
  
  	BUG_ON(!page->mapping);

-   INIT_LIST_HEAD(&page->lru);
page->index = vmf->pgoff;
  
  	vmf->page = page;

@@ -220,6 +219,8 @@ static void fb_deferred_io_work(struct work_struct *work)
  void fb_deferred_io_init(struct fb_info *info)
  {
struct fb_deferred_io *fbdefio = info->fbdefio;
+   struct page *page;
+   int i;
  
  	BUG_ON(!fbdefio);

mutex_init(&fbdefio->lock);
@@ -227,6 +228,12 @@ void fb_deferred_io_init(struct fb_info *info)
INIT_LIST_HEAD(&fbdefio->pagelist);
if (fbdefio->delay == 0) /* set a default of 1 s */
fbdefio->delay = HZ;
+
+   /* initialize all the page lists one time */
+   for (i = 0; i < info->fix.smem_len; i += PAGE_SIZE) {
+   page = fb_deferred_io_page(info, i);
+   INIT_LIST_HEAD(&page->lru);
+   }
  }
  EXPORT_SYMBOL_GPL(fb_deferred_io_init);
  
Applying your patch on top of Linus’ current master branch, tty0 is 
unusable and looks frozen. Sometimes the network card still works, sometimes 
not.


$ git log --oneline -nodecorate -2
1b351a77ed33 (HEAD -> linus) fbdev: defio: fix the pagelist corruption
52d543b5497c (origin/master, origin/HEAD) Merge tag 
'for-linus-5.17-1' of https://github.com/cminyard/linux-ipmi


```
[5.256996] raw:    

[5.269582] page dumped because: VM_BUG_ON_PAGE(compound && 
compound_order(page) != order)

[5.279507] ------------[ cut here ]------------
[5.286406] kernel BUG at mm/page_alloc.c:1326!
[5.291814] invalid opcode:  [#1] PREEMPT SMP
[5.296350] CPU: 0 PID: 167 Comm: systemd-udevd Not tainted 
5.17.0-10753-g1b351a77ed33 #300
[5.304670] Hardware name: ASUS F2A85-M_PRO/F2A85-M_PRO, BIOS 
4.16-337-gb87986e67b 03/25/2022

[5.313163] RIP: 0010:free_pcp_prepare+0x295/0x400
[5.317930] Code: 00 01 00 75 0b 48 8b 45 08 45 31 ff a8 01 74 4b 48 
8b 45 00 a9 00 00 01 00 75 22 48 c7 c6 68 30 11 96 48 89 ef e8 cb 29 fd 
ff <0f> 0b 48 89 ef 41 83 c6 01 e8 bd f5 ff ff e9 2e fe ff ff 0f 1f 44

[5.336650] RSP: 0018:a6634062f9c0 EFLAGS: 00010246
[5.341849] RAX: 004e RBX: e4be8000 RCX: 

[5.348957] RDX:  RSI: 96136a37 RDI: 

[5.356063] RBP: e4be840c R08:  R09: 
dfff
[5.363170] R10: a6634062f7f0 R11: 9652c4a8 R12: 

[5.370277] R13: 0009 R14: 91fd02ebc640 R15: 
e4be840c
[5.377384] FS:  () GS:91fd7b40(0063) 
knlGS:f7eea800

[5.385443] CS:  0010 DS: 002b ES: 002b CR0: 80050033
[5.391164] CR2: f6f0e840 CR3: 000106b6 CR4: 
000406f0

[5.398272] Call Trace:
[5.400697]  
[5.402778]  free_unref_page+0x1b/0xf0
[5.406505]  __vunmap+0x216/0x2c0
[5.409798]  drm_fbdev_cleanup+0x5f/0xb0
[5.413698]  drm_fbdev_fb_destroy+0x15/0x30
[5.417857]  unregister_framebuffer+0x2c/0x40
[5.422191]  drm_client_dev_unregister+0x69/0xe0
[5.422962] usb usb4: New USB device found, idVendor=1d6b, 
idProduct=0003, bcdDevice= 5.17

[5.426784]  drm_dev_unregister+0x2e/0x80
[5.439005]  drm_dev_unplug+0x21/0x40
[5.442645]  simpledrm_remove+0x11/0x20
[
```

Re: Commit messages

2022-03-25 Thread Paul Menzel

Dear Christian, dear Daniel, dear Alex,


Am 23.03.22 um 16:32 schrieb Christian König:

Am 23.03.22 um 16:24 schrieb Daniel Stone:

On Wed, 23 Mar 2022 at 15:14, Alex Deucher  wrote:
On Wed, Mar 23, 2022 at 11:04 AM Daniel Stone  
wrote:

That's not what anyone's saying here ...

No-one's demanding AMD publish RTL, or internal design docs, or
hardware specs, or URLs to JIRA tickets no-one can access.

This is a large and invasive commit with pretty big ramifications;
containing exactly two lines of commit message, one of which just
duplicates the subject.

It cannot be the case that it's completely impossible to provide any
justification, background, or details, about this commit being made.
Unless, of course, it's to fix a non-public security issue, that is
reasonable justification for eliding some of the details. But then
again, 'huge change which is very deliberately opaque' is a really
good way to draw a lot of attention to the commit, and it would be
better to provide more detail about the change to help it slip under
the radar.

If dri-devel@ isn't allowed to inquire about patches which are posted,
then CCing the list is just a façade; might as well just do it all
internally and periodically dump out pull requests.

I think we are in agreement. I think the withheld information
Christian was referring to was on another thread with Christian and
Paul discussing a workaround for a hardware bug:
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.spinics.net%2Flists%2Famd-gfx%2Fmsg75908.htmldata=04%7C01%7Cchristian.koenig%40amd.com%7C6a3f2815d83b4872577008da0ce1347a%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637836458652370599%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=QtNB0XHMhTgH%2FNHMwF23Qn%2BgSdYyHJSenbpP%2FHG%2BkxE%3Dreserved=0 


(Thank you Microsoft for keeping us safe.)

I guess it proves how assuming what other people should know or have 
read, especially across message threads, causes confusion and 
misunderstandings.



Right, that definitely seems like some crossed wires. I don't see
anything wrong with that commit at all: the commit message and a
comment notes that there is a hardware issue preventing Raven from
being able to do TMZ+GTT, and the code does the very straightforward
and obvious thing to ensure that on VCN 1.0, any TMZ buffer must be
VRAM-placed.


My questions were:

Where is that documented, and how can this be reproduced? 


Shouldn’t these be answered by the commit message? In five(?) years, 
somebody, maybe even with access to the currently non-public documents, 
may find a fault in the commit and would be helped by having a 
document/erratum number to look up. To verify the fix, the 
developer would need a method to reproduce the error, so why not just 
share it?


Also, I assume that workarounds often come with downsides, as otherwise 
it would have been programmed like this from the beginning, or instead 
of “workaround” it would be called “improvement”. Shouldn’t that also be 
answered?


So totally made-up example:

Currently, there is a graphics corruption running X on system Y. This is 
caused by a hardware bug in Raven ASIC (details internal document 
#/AMD-Jira #N), and can be worked around by [what is in the commit 
message].


The workaround does not affect the performance, and testing X shows the 
error is fixed.



This one, on the other hand, is much less clear ...


Yes, completely agree. I mean a good bunch of comments on commit 
messages are certainly valid and we could improve them.


That’d be great.

But this patch here was worked on by both AMD and Intel developers. 
Both sides, and I think even people from other companies, perfectly 
understand the why, what, how, etc.


When somebody now comes along and asks for a whole explanation of the 
context of why we do it, that sounds really strange to me.


The motivation should be part of the commit message. I didn’t mean for 
anyone to rewrite the Wikipedia article on buddy memory allocation [1]. But 
the commit message at hand for switching the allocator is definitely too terse.



Kind regards,

Paul


[1]: https://en.wikipedia.org/wiki/Buddy_memory_allocation


Commit messages (was: [PATCH v11] drm/amdgpu: add drm buddy support to amdgpu)

2022-03-23 Thread Paul Menzel

Dear Christian,


Am 23.03.22 um 08:42 schrieb Christian König:


Am 23.03.22 um 07:42 schrieb Paul Menzel:



Am 23.03.22 um 07:25 schrieb Arunpravin Paneer Selvam:

- Remove drm_mm references and replace with drm buddy functionalities


To me, the commit message summary suggested that you can somehow use both 
allocators now. Two suggestions below:


1.  Switch to drm buddy allocator
2.  Use drm buddy allocator


- Add res cursor support for drm buddy


As an allocator switch sounds invasive, could you please extend the 
commit message, briefly describing the current situation, saying what 
the downsides are, and why the buddy allocator is “better”.


Well, Paul please stop bothering developers with those requests.

It's my job as maintainer to supervise the commit messages and it is 
certainly NOT required to explain all the details of the current 
situation in a commit message. That is just overkill.


I did not request all the details, and I think my requests are totally 
reasonable. But let’s change the perspective. If there were no AMD 
graphics driver bugs, I would never have needed to look at the code and 
deal with it. Unfortunately, the AMD graphics driver situation (which 
has improved a lot in recent years), with no public documentation, 
proprietary firmware and complex devices, is still not optimal. A lot 
of bugs get reported, and I am also hit by bugs, which takes time to 
deal with, and maybe to report and help analyze. So, to keep your 
wording: if only you would stop bothering users with bugs and requesting 
their help in fixing them; asking the user to bisect the issue is often 
the first thing. Actually, it should not be unreasonable for customers 
buying an AMD device to expect to get bug-free drivers. It’s a strange 
and sad fact that the software industry succeeded in swaying that valid 
expectation, and customers now accept that they need to regularly 
install software updates and do not get, for example, a price reduction 
when there are bugs.


Also, as stated everywhere, reviewer time is scarce, so commit authors 
should make it easy to attract new folks.


A simple note that we are switching from the drm_mm backend to the buddy 
backend is sufficient, and that is exactly what the commit message is 
saying here.


Sorry, I disagree. The motivation needs to be part of the commit 
message. For example, see the recent discussion on the LWN article 
*Donenfeld: Random number generator enhancements for Linux 5.17 and 
5.18* [1].


How much the commit message should be extended, I do not know, but the 
current state is insufficient (too terse).



Kind regards,

Paul


[1]: https://lwn.net/Articles/888413/
 "Donenfeld: Random number generator enhancements for Linux 5.17 
and 5.18"


Re: [PATCH v11] drm/amdgpu: add drm buddy support to amdgpu

2022-03-23 Thread Paul Menzel

Dear Arunpravin,


Thank you for your patch.

Am 23.03.22 um 07:25 schrieb Arunpravin Paneer Selvam:

- Remove drm_mm references and replace with drm buddy functionalities


To me, the commit message summary suggested that you can somehow use both 
allocators now. Two suggestions below:


1.  Switch to drm buddy allocator
2.  Use drm buddy allocator


- Add res cursor support for drm buddy


As an allocator switch sounds invasive, could you please extend the 
commit message, briefly describing the current situation, saying what 
the downsides are, and why the buddy allocator is “better”.


How did you test it? How can one verify that there are no regressions?


v2(Matthew Auld):


Nit: I’d add a space before (.


Kind regards,

Paul



   - replace spinlock with mutex as we call kmem_cache_zalloc
 (..., GFP_KERNEL) in drm_buddy_alloc() function

   - lock drm_buddy_block_trim() function as it calls
 mark_free/mark_split are all globally visible

v3(Matthew Auld):
   - remove trim method error handling as we address the failure case
 at drm_buddy_block_trim() function

v4:
   - fix warnings reported by kernel test robot 

v5:
   - fix merge conflict issue

v6:
   - fix warnings reported by kernel test robot 

v7:
   - remove DRM_BUDDY_RANGE_ALLOCATION flag usage

v8:
   - keep DRM_BUDDY_RANGE_ALLOCATION flag usage
   - resolve conflicts created by drm/amdgpu: remove VRAM accounting v2

v9(Christian):
   - merged the below patch
  - drm/amdgpu: move vram inline functions into a header
   - rename label name as fallback
   - move struct amdgpu_vram_mgr to amdgpu_vram_mgr.h
   - remove unnecessary flags from struct amdgpu_vram_reservation
   - rewrite block NULL check condition
   - change else style as per coding standard
   - rewrite the node max size
   - add a helper function to fetch the first entry from the list

v10(Christian):
- rename amdgpu_get_node() function name as amdgpu_vram_mgr_first_block

v11:
- if size is not aligned with min_page_size, enable is_contiguous flag,
  therefore, the size round up to the power of two and trimmed to the
  original size.

Signed-off-by: Arunpravin Paneer Selvam 
---
  drivers/gpu/drm/Kconfig   |   1 +
  .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |  10 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 263 ++
  4 files changed, 234 insertions(+), 137 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index f1422bee3dcc..5133c3f028ab 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -280,6 +280,7 @@ config DRM_AMDGPU
select HWMON
select BACKLIGHT_CLASS_DEVICE
select INTERVAL_TREE
+   select DRM_BUDDY
help
  Choose this option if you have a recent AMD Radeon graphics card.
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h

index acfa207cf970..864c609ba00b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -30,12 +30,15 @@
  #include 
  #include 
  
+#include "amdgpu_vram_mgr.h"

+
  /* state back for walking over vram_mgr and gtt_mgr allocations */
  struct amdgpu_res_cursor {
uint64_tstart;
uint64_tsize;
uint64_tremaining;
-   struct drm_mm_node  *node;
+   void*node;
+   uint32_tmem_type;
  };
  
  /**

@@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource 
*res,
uint64_t start, uint64_t size,
struct amdgpu_res_cursor *cur)
  {
+   struct drm_buddy_block *block;
+   struct list_head *head, *next;
struct drm_mm_node *node;
  
-	if (!res || res->mem_type == TTM_PL_SYSTEM) {

-   cur->start = start;
-   cur->size = size;
-   cur->remaining = size;
-   cur->node = NULL;
-   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
-   return;
-   }
+   if (!res)
+   goto fallback;
  
  	BUG_ON(start + size > res->num_pages << PAGE_SHIFT);
  
-	node = to_ttm_range_mgr_node(res)->mm_nodes;

-   while (start >= node->size << PAGE_SHIFT)
-   start -= node++->size << PAGE_SHIFT;
+   cur->mem_type = res->mem_type;
+
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   head = &to_amdgpu_vram_mgr_node(res)->blocks;
+
+   block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!block)
+   goto fallback;
+
+   while (start >= amdgpu_node_size(block)) {
+   start -= 

Re: [PATCH] drm: add a check to verify the size alignment

2022-03-21 Thread Paul Menzel

Dear Arunpravin,


Am 21.03.22 um 06:59 schrieb Arunpravin Paneer Selvam:

add a simple check to reject any size not aligned to the
min_page_size.


Nit: I’d start sentences with a capital letter.

Could you please add a summary of the discussion to the commit message, 
so the question “Why?” is answered?



Kind regards,

Paul



Signed-off-by: Arunpravin Paneer Selvam 
---
  drivers/gpu/drm/drm_buddy.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 72f52f293249..b503c88786b0 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -661,6 +661,9 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
if (range_overflows(start, size, mm->size))
return -EINVAL;
  
+	if (WARN_ON(!IS_ALIGNED(size, min_page_size)))

+   return -EINVAL;
+
/* Actual range allocation */
if (start + size == end)
return __drm_buddy_alloc_range(mm, start, size, blocks);
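
As a side note on what the added check tests: a minimal sketch of a power-of-two alignment test, assuming `min_page_size` is a power of two. The function names and the wrapper are hypothetical and only illustrate the arithmetic; this is not the kernel's `IS_ALIGNED()` implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if size is a multiple of align; align must be a power of two. */
static bool is_aligned_pow2(uint64_t size, uint64_t align)
{
	return (size & (align - 1)) == 0;
}

/* Hypothetical wrapper mirroring the early reject added by the patch. */
static int check_request(uint64_t size, uint64_t min_page_size)
{
	if (!is_aligned_pow2(size, min_page_size))
		return -22; /* numeric value of -EINVAL on Linux */
	return 0;
}
```

For example, with a `min_page_size` of 4096, a request of 8192 bytes passes while a request of 6144 bytes is rejected.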


Re: [PATCH] drm/amd/display: Fixed the unused-but-set-variable warning

2022-03-17 Thread Paul Menzel

Dear Aashish,


Am 17.03.22 um 15:01 schrieb Aashish Sharma:

Thank you for your patch. If you are going to send a v2, please use 
imperative mood. Maybe:


drm/amd/display: Fix unused-but-set-variable warning



Fixed this kernel test robot warning:


Maybe:

Fix the kernel test robot warning below:


drivers/gpu/drm/amd/amdgpu/../display/dmub/inc/dmub_cmd.h:2893:12:
warning: variable 'temp' set but not used [-Wunused-but-set-variable]

Replaced the assignment to the unused temp variable with READ_ONCE()
macro to flush the writes.


Replace …

Excuse my ignorance regarding `READ_ONCE()`, but is that the reason you 
remove the volatile qualifier?


Some robots ask in their report to add a Found-by tag. If so, please add 
one.



Signed-off-by: Aashish Sharma 
---
  drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h 
b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
index 873ecd04e01d..b7981a781701 100644
--- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
+++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
@@ -2913,13 +2913,12 @@ static inline void dmub_rb_flush_pending(const struct 
dmub_rb *rb)
uint32_t wptr = rb->wrpt;
  
  	while (rptr != wptr) {

-   uint64_t volatile *data = (uint64_t volatile *)((uint8_t 
*)(rb->base_address) + rptr);
+   uint64_t *data = (uint64_t volatile *)((uint8_t 
*)(rb->base_address) + rptr);
//uint64_t volatile *p = (uint64_t volatile *)data;
-   uint64_t temp;
uint8_t i;
  
  		for (i = 0; i < DMUB_RB_CMD_SIZE / sizeof(uint64_t); i++)

-   temp = *data++;
+   (void)READ_ONCE(*data++);


Did you verify that the generated code is the same now, or what the 
differences are? If it differs from before, you should document in 
the commit message that this is intended; otherwise, it’s an invasive 
change just to fix a warning.
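
To make the question about the volatile qualifier concrete, here is a simplified userspace stand-in for what a `READ_ONCE()`-style macro does: it performs the load through a volatile-qualified lvalue, so the pointer itself does not need to be declared volatile for the read to be emitted. `READ_ONCE_SKETCH` and `flush_slot_sketch` are illustrative names, the macro relies on the GNU `__typeof__` extension, and the real kernel macro does more than this.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for READ_ONCE(): read through a volatile-qualified
 * lvalue so the compiler has to emit the load even if the value is unused.
 * Uses the GNU __typeof__ extension (GCC and Clang).
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Touch every 64-bit word of one slot; returns the number of loads issued. */
static size_t flush_slot_sketch(const uint64_t *data, size_t slot_bytes)
{
	size_t loads = 0;

	for (size_t i = 0; i < slot_bytes / sizeof(uint64_t); i++) {
		(void)READ_ONCE_SKETCH(*data++); /* value discarded on purpose */
		loads++;
	}
	return loads;
}
```

Comparing the disassembly of the old and the new loop (for example with `objdump -d`) would be one way to answer whether the generated code changed.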



rptr += DMUB_RB_CMD_SIZE;
if (rptr >= rb->capacity)



Kind regards,

Paul


Re: [PATCH] drm: Fix a infinite loop condition when order becomes 0

2022-03-16 Thread Paul Menzel

Dear Arunpravin,


Am 16.03.22 um 07:49 schrieb Arunpravin Paneer Selvam:


On 15/03/22 9:14 pm, Paul Menzel wrote:



Am 15.03.22 um 16:42 schrieb Arunpravin:


On 15/03/22 2:35 pm, Paul Menzel wrote:



Am 15.03.22 um 10:01 schrieb Arunpravin:


On 15/03/22 1:49 pm, Paul Menzel wrote:



Am 14.03.22 um 20:40 schrieb Arunpravin:

handle a situation in the condition order-- == min_order,
when order = 0, leading to order = -1, it now won't exit
the loop. To avoid this problem, added a order check in
the same condition, (i.e) when order is 0, we return
-ENOSPC

Signed-off-by: Arunpravin 


Please use your full name.

okay


You might also configure that in your email program.

yes


Not done yet though. ;-)


done in v2 :)

---
 drivers/gpu/drm/drm_buddy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 72f52f293249..5ab66aaf2bbd 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c


In what tree is that file?


drm-tip - 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm-tip%2Ftree%2Fdata=04%7C01%7CArunpravin.PaneerSelvam%40amd.com%7C3610aafe216d421c715c08da069ac1d7%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637829559006306914%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=GM3iXiDQCx%2BM4pD1nmivRFRvkehwTNd2Jtd713cF51g%3Dreserved=0
drm-misc-next - 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm%2Fdrm-misc%2Ftree%2Fdata=04%7C01%7CArunpravin.PaneerSelvam%40amd.com%7C3610aafe216d421c715c08da069ac1d7%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637829559006306914%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=i7pvmDJu310XRX7h3cQ344j5RYHq7fBZ520l%2F%2Br1%2BQU%3Dreserved=0


Thank Outlook. Now everybody feels safe.


@@ -685,7 +685,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
if (!IS_ERR(block))
break;
 
-			if (order-- == min_order) {

+   if (!order || order-- == min_order) {
err = -ENOSPC;
goto err_free;
}


Thank you for the hint. So the whole function is:

do {
order = min(order, (unsigned int)fls(pages) - 1);
BUG_ON(order > mm->max_order);
BUG_ON(order < min_order);

do {
if (flags & DRM_BUDDY_RANGE_ALLOCATION)
/* Allocate traversing within the range */
block = alloc_range_bias(mm, start, end, order);
else
/* Allocate from freelist */
block = alloc_from_freelist(mm, order, flags);

if (!IS_ERR(block))
break;

if (order-- == min_order) {
err = -ENOSPC;
goto err_free;
}
} while (1);

mark_allocated(block);
mm->avail -= drm_buddy_block_size(mm, block);
kmemleak_update_trace(block);
list_add_tail(&block->link, &allocated);

pages -= BIT(order);

if (!pages)
break;
} while (1);

Was the BUG_ON triggered for your case?

BUG_ON(order < min_order);

no, this BUG_ON is not triggered for this bug


Please give more details.


there is a chance when there is no space to allocate, order value
decrements and reaches to 0 at one point, here we should exit the loop,
otherwise, further order value decrements to -1 and do..while loop
doesn't exit. Hence added a check to exit the loop if order value becomes 0.


Sorry, I do not see it. How can that be with order ≥ min_order and the
check `order-- == min_order`? Is min_order 0? Please explain that in the
next commit message.


please check v2, yes when min_order is 0, the above said situation may
occur. And, since the order is unsigned int, I think it will not trigger
the BUG_ON(order < min_order) when order becomes -1. Hence I think we
needed a check !order to exit the loop.
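
The C semantics being discussed here can be checked in isolation. The sketch below is not the kernel code; the helper names are hypothetical, and whether the allocator can actually reach the loop with `order` below `min_order` is the open question in this thread. It only shows what a post-decrement does to an unsigned value of 0 and how the two exit tests behave.

```c
#include <limits.h>
#include <stdbool.h>

/* What a decrement does to an unsigned order that is already 0. */
static unsigned int decrement_order(unsigned int order)
{
	order--; /* well defined for unsigned types: 0 wraps to UINT_MAX */
	return order;
}

/* The original exit test: compares first, then decrements. */
static bool original_exit_test(unsigned int *order, unsigned int min_order)
{
	return (*order)-- == min_order;
}

/* The patched exit test: short-circuits before decrementing when order is 0. */
static bool patched_exit_test(unsigned int *order, unsigned int min_order)
{
	return !*order || (*order)-- == min_order;
}
```

With `order` and `min_order` both 0, the original test already returns true, because the comparison uses the value before the decrement. The wrap to `UINT_MAX` only keeps the loop going if `order` is 0 while `min_order` is larger, and in that case a later `order < min_order` comparison is false for the wrapped value.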


Thank you for clarifying this. I still do not understand it though. With

order = fls(pages) - 1;
min_order = ilog2(min_page_size) - ilog2(mm->chunk_size);

is `order` always non-negative? Let’s assume it is. Also, can min_order 
get “negative” (wrap around)?


I would add BUG_ON statements for these cases?

BUG_ON(fls(pages) - 1 < 1);
BUG_ON(ilog2(min_page_size) - ilog2(mm->chunk_size) < 1);

Assuming “negative” is not possible, your case can only happen if 
`order` and `min_order` are 0, right? If `order` is 

Re: [PATCH] drm: Fix a infinite loop condition when order becomes 0

2022-03-15 Thread Paul Menzel

Dear Arunpravin,


Am 15.03.22 um 16:42 schrieb Arunpravin:


On 15/03/22 2:35 pm, Paul Menzel wrote:



Am 15.03.22 um 10:01 schrieb Arunpravin:


On 15/03/22 1:49 pm, Paul Menzel wrote:



Am 14.03.22 um 20:40 schrieb Arunpravin:

handle a situation in the condition order-- == min_order,
when order = 0, leading to order = -1, it now won't exit
the loop. To avoid this problem, added a order check in
the same condition, (i.e) when order is 0, we return
-ENOSPC

Signed-off-by: Arunpravin 


Please use your full name.

okay


You might also configure that in your email program.

yes


Not done yet though. ;-)


---
drivers/gpu/drm/drm_buddy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 72f52f293249..5ab66aaf2bbd 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c


In what tree is that file?


drm-tip - 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm-tip%2Ftree%2Fdata=04%7C01%7CArunpravin.PaneerSelvam%40amd.com%7Cc456573102c04191cf9708da0662f798%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637829319396954551%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=5Bspe5QGjQ0KHfVI8%2F%2BXqxR45q6tOL4FE2fVD3uwL%2FM%3Dreserved=0
drm-misc-next - 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm%2Fdrm-misc%2Ftree%2Fdata=04%7C01%7CArunpravin.PaneerSelvam%40amd.com%7Cc456573102c04191cf9708da0662f798%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637829319396954551%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=g2S14TfsHF5ORo9jTZ3uA0l1BH8mnAxk2OWYJeF5i8k%3Dreserved=0


Thank Outlook. Now everybody feels safe.


@@ -685,7 +685,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
if (!IS_ERR(block))
break;

-			if (order-- == min_order) {

+   if (!order || order-- == min_order) {
err = -ENOSPC;
goto err_free;
}


Thank you for the hint. So the whole function is:

do {
order = min(order, (unsigned int)fls(pages) - 1);
BUG_ON(order > mm->max_order);
BUG_ON(order < min_order);

do {
if (flags & DRM_BUDDY_RANGE_ALLOCATION)
/* Allocate traversing within the range */
block = alloc_range_bias(mm, start, end, order);
else
/* Allocate from freelist */
block = alloc_from_freelist(mm, order, flags);

if (!IS_ERR(block))
break;

if (order-- == min_order) {
err = -ENOSPC;
goto err_free;
}
} while (1);

mark_allocated(block);
mm->avail -= drm_buddy_block_size(mm, block);
kmemleak_update_trace(block);
list_add_tail(&block->link, &allocated);

pages -= BIT(order);

if (!pages)
break;
} while (1);

Was the BUG_ON triggered for your case?

BUG_ON(order < min_order);

no, this BUG_ON is not triggered for this bug


Please give more details.


there is a chance when there is no space to allocate, order value
decrements and reaches to 0 at one point, here we should exit the loop,
otherwise, further order value decrements to -1 and do..while loop
doesn't exit. Hence added a check to exit the loop if order value becomes 0.


Sorry, I do not see it. How can that be with order ≥ min_order and the 
check `order-- == min_order`? Is min_order 0? Please explain that in the 
next commit message.



Kind regards,

Paul


[PATCH] drm/amdgpu: Use ternary operator in `vcn_v1_0_start()`

2022-03-15 Thread Paul Menzel
Remove the boilerplate of declaring a variable and using an if else
statement by using the ternary operator.

Signed-off-by: Paul Menzel 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index 3799226defc0..78ad85fdc769 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -1095,13 +1095,8 @@ static int vcn_v1_0_start_dpg_mode(struct amdgpu_device 
*adev)
 
 static int vcn_v1_0_start(struct amdgpu_device *adev)
 {
-   int r;
-
-   if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG)
-   r = vcn_v1_0_start_dpg_mode(adev);
-   else
-   r = vcn_v1_0_start_spg_mode(adev);
-   return r;
+   return (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) ?
+   vcn_v1_0_start_dpg_mode(adev) : vcn_v1_0_start_spg_mode(adev);
 }
 
 /**
-- 
2.35.1



Re: [PATCH] drm: Fix a infinite loop condition when order becomes 0

2022-03-15 Thread Paul Menzel

Dear Arunpravin,


Am 15.03.22 um 10:01 schrieb Arunpravin:


On 15/03/22 1:49 pm, Paul Menzel wrote:



Am 14.03.22 um 20:40 schrieb Arunpravin:

handle a situation in the condition order-- == min_order,
when order = 0, leading to order = -1, it now won't exit
the loop. To avoid this problem, added a order check in
the same condition, (i.e) when order is 0, we return
-ENOSPC

Signed-off-by: Arunpravin 


Please use your full name.

okay


You might also configure that in your email program.


---
   drivers/gpu/drm/drm_buddy.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 72f52f293249..5ab66aaf2bbd 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c


In what tree is that file?


drm-tip - https://cgit.freedesktop.org/drm-tip/tree/
drm-misc-next - https://cgit.freedesktop.org/drm/drm-misc/tree/


@@ -685,7 +685,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
if (!IS_ERR(block))
break;
   
-			if (order-- == min_order) {

+   if (!order || order-- == min_order) {
err = -ENOSPC;
goto err_free;
}


Thank you for the hint. So the whole function is:

do {
order = min(order, (unsigned int)fls(pages) - 1);
BUG_ON(order > mm->max_order);
BUG_ON(order < min_order);

do {
if (flags & DRM_BUDDY_RANGE_ALLOCATION)
/* Allocate traversing within the range */
block = alloc_range_bias(mm, start, end, order);
else
/* Allocate from freelist */
block = alloc_from_freelist(mm, order, flags);

if (!IS_ERR(block))
break;

if (order-- == min_order) {
err = -ENOSPC;
goto err_free;
}
} while (1);

mark_allocated(block);
mm->avail -= drm_buddy_block_size(mm, block);
kmemleak_update_trace(block);
list_add_tail(&block->link, &allocated);

pages -= BIT(order);

if (!pages)
break;
} while (1);

Was the BUG_ON triggered for your case?

BUG_ON(order < min_order);

Please give more details.


Kind regards,

Paul


Re: [PATCH] drm: Fix a infinite loop condition when order becomes 0

2022-03-15 Thread Paul Menzel

Dear Arunpravin,


Am 14.03.22 um 20:40 schrieb Arunpravin:

handle a situation in the condition order-- == min_order,
when order = 0, leading to order = -1, it now won't exit
the loop. To avoid this problem, added a order check in
the same condition, (i.e) when order is 0, we return
-ENOSPC

Signed-off-by: Arunpravin 


Please use your full name.


---
  drivers/gpu/drm/drm_buddy.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 72f52f293249..5ab66aaf2bbd 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c


In what tree is that file?


@@ -685,7 +685,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
if (!IS_ERR(block))
break;
  
-			if (order-- == min_order) {

+   if (!order || order-- == min_order) {
err = -ENOSPC;
goto err_free;
}


Kind regards,

Paul


Re: [PATCH v2] drm/amdgpu: check vm ready by amdgpu_vm->evicting flag

2022-02-22 Thread Paul Menzel

Dear Qiang,


Am 22.02.22 um 03:46 schrieb Qiang Yu:

Workstation application ANSA/META v21.1.4 get this error dmesg when
running CI test suite provided by ANSA/META:
[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)

This is caused by:
1. create a 256MB buffer in invisible VRAM
2. CPU map the buffer and access it causes vm_fault and try to move
it to visible VRAM
3. force visible VRAM space and traverse all VRAM bos to check if
evicting this bo is valuable
4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
will set amdgpu_vm->evicting, but latter due to not in visible
VRAM, won't really evict it so not add it to amdgpu_vm->evicted
5. before next CS to clear the amdgpu_vm->evicting, user VM ops
ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
but fail in amdgpu_vm_bo_update_mapping() (check
amdgpu_vm->evicting) and get this error log

This error won't affect functionality as next CS will finish the
waiting VM ops. But we'd better clear the error log by checking
the amdgpu_vm->evicting flag in amdgpu_vm_ready() to stop calling
amdgpu_vm_bo_update_mapping() latter.


later

Another reason is amdgpu_vm->evicted list holds all BOs (both
user buffer and page table), but only page table BOs' eviction
prevent VM ops. amdgpu_vm->evicting flag is set only for page
table BOs, so we should use evicting flag instead of evicted list
in amdgpu_vm_ready().

The side effect of This change is: previously blocked VM op (user


this


buffer in "evicted" list but no page table in it) gets done
immediately.

v2: update commit comments.

Reviewed-by: Christian König 
Signed-off-by: Qiang Yu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 37acd8911168..2cd9f1a2e5fa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -770,11 +770,16 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
   * Check if all VM PDs/PTs are ready for updates
   *
   * Returns:
- * True if eviction list is empty.
+ * True if VM is not evicting.
   */
  bool amdgpu_vm_ready(struct amdgpu_vm *vm)
  {
-   return list_empty(&vm->evicted);
+   bool ret;
+
+   amdgpu_vm_eviction_lock(vm);
+   ret = !vm->evicting;
+   amdgpu_vm_eviction_unlock(vm);
+   return ret;
  }
  
  /**
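
For the record, this is how I understand steps 4 and 5 of the commit message, as a simplified user-space model. All names (`vm_model`, `ready_by_list()`, `ready_by_flag()`, `update_mapping()`) are made up for illustration, and the eviction lock is left out, so please treat it as a sketch under those assumptions and not as the driver logic.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Simplified, hypothetical model of the two pieces of VM state in question. */
struct vm_model {
	bool evicting;        /* set while a page table BO is considered for eviction */
	unsigned int evicted; /* number of BOs on the "evicted" list */
};

/* Old readiness check: only looks at the evicted list. */
static bool ready_by_list(const struct vm_model *vm)
{
	return vm->evicted == 0;
}

/* New readiness check: only looks at the evicting flag. */
static bool ready_by_flag(const struct vm_model *vm)
{
	return !vm->evicting;
}

/* Stand-in for the mapping update, which refuses to run while evicting. */
static int update_mapping(const struct vm_model *vm)
{
	return vm->evicting ? -EBUSY : 0;
}
```

With `evicting` set and an empty evicted list, the old check passes and the update then fails with `-EBUSY` (the -16 from the log), whereas the new check stops the operation earlier. The reverse case (user buffer on the evicted list, flag not set) is the side effect mentioned in the commit message.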


Acked-by: Paul Menzel 


Kind regards,

Paul


Re: [PATCH] drm/amdgpu: check vm ready by evicting

2022-02-21 Thread Paul Menzel

Dear Qiang Yu,


Am 21.02.22 um 11:12 schrieb Qiang Yu:


Thank you for your patch. Reading the commit message summary, I have no 
idea what “check vm ready by evicting” means. Can you please rephrase it?



Workstation application ANSA/META get this error dmesg:


What version, and how can this be reproduced exactly? Just by starting 
the application?



[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)

This is caused by:
1. create a 256MB buffer in invisible VRAM
2. CPU map the buffer and access it causes vm_fault and try to move
it to visible VRAM
3. force visible VRAM space and traverse all VRAM bos to check if
evicting this bo is valuable
4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
will set amdgpu_vm->evicting, but latter due to not in visible
VRAM, won't really evict it so not add it to amdgpu_vm->evicted
5. before next CS to clear the amdgpu_vm->evicting, user VM ops
ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
but fail in amdgpu_vm_bo_update_mapping() (check
amdgpu_vm->evicting) and get this error log

This error won't affect functionality as next CS will finish the
waiting VM ops. But we'd better clear the error log by check the


s/check/checking/


evicting flag which really stop VM ops latter.


stop*s*?

Can you please elaborate? Christian’s and your discussion was quite 
long, so adding a summary of why this approach works and what possible 
regressions there might be would be warranted.



Kind regards,

Paul



Signed-off-by: Qiang Yu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 37acd8911168..2cd9f1a2e5fa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -770,11 +770,16 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
   * Check if all VM PDs/PTs are ready for updates
   *
   * Returns:
- * True if eviction list is empty.
+ * True if VM is not evicting.
   */
  bool amdgpu_vm_ready(struct amdgpu_vm *vm)
  {
-   return list_empty(&vm->evicted);
+   bool ret;
+
+   amdgpu_vm_eviction_lock(vm);
+   ret = !vm->evicting;
+   amdgpu_vm_eviction_unlock(vm);
+   return ret;
  }
  
  /**


[PATCH] drm/amdgpu: Fix typo in *whether* in comment

2022-02-18 Thread Paul Menzel
Signed-off-by: Paul Menzel 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 63a089992645..430e56583751 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -740,7 +740,7 @@ MODULE_PARM_DESC(debug_largebar,
  * systems with a broken CRAT table.
  *
  * Default is auto (according to asic type, iommu_v2, and crat table, to decide
- * whehter use CRAT)
+ * whether use CRAT)
  */
 int ignore_crat;
 module_param(ignore_crat, int, 0444);
-- 
2.35.1



Re: [PATCH linux-next] video: fbdev: fbmem: fix pointer reference to null device field

2022-02-11 Thread Paul Menzel

Dear Zhouyi,


Am 10.02.22 um 07:58 schrieb Zhouyi Zhou:

In function do_remove_conflicting_framebuffers, if device is NULL, there
will be null pointer reference. The patch add a check to the if expression.

Signed-off-by: Zhouyi Zhou 
---
Dear Linux folks

I discover this bug in the PowerPC VM provided by
Open source lab of Oregon State University:

https://lkml.org/lkml/2022/2/8/1145

I found that the root cause of null device field is in offb_init_fb:
info = framebuffer_alloc(sizeof(u32) * 16, NULL);

I have tested the patch in the PowerPC VM. Hope my patch can be correct.

Many Thanks
Zhouyi
--
  drivers/video/fbdev/core/fbmem.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
index 34d6bb1bf82e..422b1fc01722 100644
--- a/drivers/video/fbdev/core/fbmem.c
+++ b/drivers/video/fbdev/core/fbmem.c
@@ -1579,7 +1579,7 @@ static void do_remove_conflicting_framebuffers(struct 
apertures_struct *a,
 * If it's not a platform device, at least print a 
warning. A
 * fix would add code to remove the device from the 
system.
 */
-   if (dev_is_platform(device)) {
+   if (device && dev_is_platform(device)) {
registered_fb[i]->forced_out = true;

platform_device_unregister(to_platform_device(device));
} else {


Looks reasonable.
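
The fix relies on `&&` short-circuiting, so the helper is never called with a NULL pointer. A minimal user-space sketch of that pattern follows; `device_model`, `dev_is_platform_model()` and `should_unregister()` are made-up names, not the kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device. */
struct device_model {
	bool is_platform;
};

/* Counts how often the helper dereferenced its argument. */
static unsigned int deref_count;

/* Stand-in for dev_is_platform(); would fault if dev were NULL. */
static bool dev_is_platform_model(const struct device_model *dev)
{
	deref_count++;
	return dev->is_platform;
}

/* The patched condition: the right-hand side only runs for non-NULL dev. */
static bool should_unregister(const struct device_model *dev)
{
	return dev && dev_is_platform_model(dev);
}
```

Calling `should_unregister(NULL)` returns false without touching the helper, so a framebuffer without a device ends up in the `else` branch.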

Acked-by: Paul Menzel 


Kind regards,

Paul


PS: Please note, this should be unrelated to my problem though, as I 
didn’t use linux-next. (Let’s continue in the other thread though.)


Unable to unselect VGA_ARB (VGA Arbitration)

2022-01-11 Thread Paul Menzel

Dear Linux folks,


I am using Linux 5.16, and I am unable to unset `VGA_ARB` in Kconfig 
(`make menuconfig`). I have an Asus F2A85-M PRO with an AMD A6-6400K APU 
(integrated Radeon graphics device), so no legacy stuff.


From `drivers/gpu/vga/Kconfig`:

```
config VGA_ARB
bool "VGA Arbitration" if EXPERT
default y
depends on (PCI && !S390)
help
  […]

config VGA_ARB_MAX_GPUS
int "Maximum number of GPUs"
default 16
depends on VGA_ARB
help
  […]

config VGA_SWITCHEROO
bool "Laptop Hybrid Graphics - GPU switching support"
depends on X86
depends on ACPI
depends on PCI
depends on (FRAMEBUFFER_CONSOLE=n || FB=y)
select VGA_ARB
help
  […]
```

But in `make menuconfig` I am unable to unselect it.

-*- VGA Arbitration

and the help says:

Symbol: VGA_ARB [=y]
Type  : bool
  Depends on: HAS_IOMEM [=y] && PCI [=y] && !S390
  Visible if: HAS_IOMEM [=y] && PCI [=y] && !S390 && EXPERT [=n]
  Location:
Main menu
 -> Device Drivers
   -> Graphics support
Selected by [n]:
  - VGA_SWITCHEROO [=n] && HAS_IOMEM [=y] && X86 [=y] && ACPI [=y] 
&& PCI [=y] && (!FRAMEBUFFER_CONSOLE [=y] || FB [=y]=y)


So, VGA_SWITCHEROO is not set, and, therefore, as `Selected by [n]:` 
suggests, I thought I’d be able to deselect it.


It’d be great if you could help me out.


Kind regards,

Paul


Re: [PATCH v3 2/3] drm: Add Gamma and Degamma LUT sizes props to drm_crtc to validate.

2021-10-29 Thread Paul Menzel

Dear Mark,


On 26.10.21 21:21, Mark Yacoub wrote:

From: Mark Yacoub 

[Why]
1. drm_atomic_helper_check doesn't check for the LUT sizes of either Gamma
or Degamma props in the new CRTC state, allowing any invalid size to
be passed on.
2. Each driver has its own LUT size, which could also be different for
legacy users.

[How]
1. Create |degamma_lut_size| and |gamma_lut_size| to save the LUT sizes
assigned by the driver when it's initializing its color and CTM
management.
2. Create drm_atomic_helper_check_crtc which is called by
drm_atomic_helper_check to check the LUT sizes saved in drm_crtc that
they match the sizes in the new CRTC state.
3. As the LUT size check now happens in drm_atomic_helper_check, remove
the lut check in intel_color.c
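
Reading the debug message in the patch ("Should be %u (or %u for legacy)"), I assume the intent for the gamma LUT is to accept either size. A minimal user-space sketch of that reading follows; `check_gamma_lut()` and `lut_entries()` are made-up names, the entry layout mirrors `struct drm_color_lut` from the DRM UAPI header, and the sizes in the usage note are only example values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as struct drm_color_lut: four 16-bit values per entry. */
struct color_lut_entry {
	uint16_t red;
	uint16_t green;
	uint16_t blue;
	uint16_t reserved;
};

/* Number of LUT entries in a property blob of blob_bytes bytes. */
static size_t lut_entries(size_t blob_bytes)
{
	return blob_bytes / sizeof(struct color_lut_entry);
}

/*
 * Accept the blob if its entry count matches either the driver's gamma LUT
 * size or the legacy gamma size; reject everything else with -EINVAL.
 */
static int check_gamma_lut(size_t blob_bytes, uint32_t gamma_lut_size,
			   uint32_t legacy_gamma_size)
{
	size_t n = lut_entries(blob_bytes);

	if (n == gamma_lut_size || n == legacy_gamma_size)
		return 0;

	return -EINVAL;
}
```

For example, with an assumed driver size of 1024 and a legacy size of 256, blobs with 1024 or 256 entries pass, and a blob with 100 entries is rejected.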

Resolves: igt@kms_color@pipe-A-invalid-gamma-lut-sizes on MTK
Tested on Zork(amdgpu) and Jacuzzi(mediatek), volteer(TGL)


Only a minor thing: if you send another iteration, could you please add 
a space before the (?




v2:
1. Remove the rename to a parent commit.
2. Create a drm drm_check_lut_size instead of intel only function.

v1:
1. Fix typos
2. Remove the LUT size check from intel driver
3. Rename old LUT check to indicate it's a channel change

Signed-off-by: Mark Yacoub 
---
  drivers/gpu/drm/drm_atomic_helper.c| 56 ++
  drivers/gpu/drm/drm_color_mgmt.c   |  2 +
  drivers/gpu/drm/i915/display/intel_color.c | 39 ---
  include/drm/drm_atomic_helper.h|  1 +
  include/drm/drm_color_mgmt.h   | 13 +
  include/drm/drm_crtc.h | 11 +
  6 files changed, 102 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
b/drivers/gpu/drm/drm_atomic_helper.c
index bc3487964fb5e..c565b3516cce9 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -929,6 +929,58 @@ drm_atomic_helper_check_planes(struct drm_device *dev,
  }
  EXPORT_SYMBOL(drm_atomic_helper_check_planes);
  
+/**

+ * drm_atomic_helper_check_crtcs - validate state object for CRTC changes
+ * @state: the driver state object
+ *
+ * Check the CRTC state object such as the Gamma/Degamma LUT sizes if the new
+ * state holds them.
+ *
+ * RETURNS:
+ * Zero for success or -errno
+ */
+int drm_atomic_helper_check_crtcs(struct drm_atomic_state *state)
+{
+   struct drm_crtc *crtc;
+   struct drm_crtc_state *new_crtc_state;
+   int i;
+
+   for_each_new_crtc_in_state (state, crtc, new_crtc_state, i) {
+   if (new_crtc_state->color_mgmt_changed &&
+   new_crtc_state->gamma_lut) {
+   if (drm_check_lut_size(new_crtc_state->gamma_lut,
+  crtc->gamma_lut_size) ||
+   drm_check_lut_size(new_crtc_state->gamma_lut,
+  crtc->gamma_size)) {
+   drm_dbg_state(
+   state->dev,
+   "Invalid Gamma LUT size. Should be %u (or %u 
for legacy) but got %u.\n",
+   crtc->gamma_lut_size, crtc->gamma_size,
+   drm_color_lut_size(
+   new_crtc_state->gamma_lut));
+   return -EINVAL;
+   }
+   }
+
+   if (new_crtc_state->color_mgmt_changed &&
+   new_crtc_state->degamma_lut) {
+   if (drm_check_lut_size(new_crtc_state->degamma_lut,
+  crtc->degamma_lut_size)) {
+   drm_dbg_state(
+   state->dev,
+   "Invalid DeGamma LUT size. Should be %u but 
got %u.\n",
+   crtc->degamma_lut_size,
+   drm_color_lut_size(
+   new_crtc_state->degamma_lut));
+   return -EINVAL;
+   }
+   }
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL(drm_atomic_helper_check_crtcs);
+
  /**
   * drm_atomic_helper_check - validate state object
   * @dev: DRM device
@@ -974,6 +1026,10 @@ int drm_atomic_helper_check(struct drm_device *dev,
if (ret)
return ret;
  
+	ret = drm_atomic_helper_check_crtcs(state);

+   if (ret)
+   return ret;
+
if (state->legacy_cursor_update)
state->async_update = !drm_atomic_helper_async_check(dev, 
state);
  
diff --git a/drivers/gpu/drm/drm_color_mgmt.c b/drivers/gpu/drm/drm_color_mgmt.c

index 6f4e04746d90f..6bb59645a75bc 100644
--- a/drivers/gpu/drm/drm_color_mgmt.c
+++ b/drivers/gpu/drm/drm_color_mgmt.c
@@ -166,6 +166,7 @@ void drm_crtc_enable_color_mgmt(struct drm_crtc *crtc,
struct drm_mode_config 

Re: [PATCH 1/2] drm: Add Gamma and Degamma LUT sizes props to drm_crtc to validate.

2021-10-26 Thread Paul Menzel
 int drm_plane_create_color_properties(struct drm_plane 
*plane,
  enum drm_color_range default_range);
  
  /**

- * enum drm_color_lut_tests - hw-specific LUT tests to perform
+ * enum drm_color_lut_channels_tests - hw-specific LUT tests to perform
   *
   * The drm_color_lut_check() function takes a bitmask of the values here to
   * determine which tests to apply to a userspace-provided LUT.
   */
-enum drm_color_lut_tests {
+enum drm_color_lut_channels_tests {
/**
 * @DRM_COLOR_LUT_EQUAL_CHANNELS:
 *
@@ -119,5 +119,6 @@ enum drm_color_lut_tests {
DRM_COLOR_LUT_NON_DECREASING = BIT(1),
  };
  
-int drm_color_lut_check(const struct drm_property_blob *lut, u32 tests);

+int drm_color_lut_channels_check(const struct drm_property_blob *lut,
+u32 tests);
  #endif
diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
index 2deb15d7e1610..cabd3ef1a6e32 100644
--- a/include/drm/drm_crtc.h
+++ b/include/drm/drm_crtc.h
@@ -1072,6 +1072,17 @@ struct drm_crtc {
/** @funcs: CRTC control functions */
const struct drm_crtc_funcs *funcs;
  
+	/**

+* @degamma_lut_size: Size of degamma LUT.
+*/
+   uint32_t degamma_lut_size;
+
+   /**
+* @gamma_lut_size: Size of Gamma LUT. Not used by legacy userspace 
such as
+* X, which doesn't support large lut sizes.
+*/
+   uint32_t gamma_lut_size;
+
/**
 * @gamma_size: Size of legacy gamma ramp reported to userspace. Set up
 * by calling drm_mode_crtc_set_gamma_size().



Acked-by: Paul Menzel 


Kind regards,

Paul


[PATCH 1/2] drm/amdgpu: Clarify that TMZ unsupported message is due to hardware

2021-09-13 Thread Paul Menzel
The warning

amdgpu :05:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported

leaves the reader wondering whether anything can be done about it. As it’s
unsupported in the hardware, and nothing can be done about it, mention
that in the log message.

amdgpu :05:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not 
supported by hardware

Signed-off-by: Paul Menzel 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index c7797eac83c3..c4c56c57b0c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -599,7 +599,7 @@ void amdgpu_gmc_tmz_set(struct amdgpu_device *adev)
default:
adev->gmc.tmz_enabled = false;
dev_warn(adev->dev,
-"Trusted Memory Zone (TMZ) feature not supported\n");
+"Trusted Memory Zone (TMZ) feature not supported by 
hardware\n");
break;
}
 }
-- 
2.33.0



[PATCH 2/2] drm/amdgpu: Demote TMZ unsupported log message from warning to info

2021-09-13 Thread Paul Menzel
As the user cannot do anything about the unsupported Trusted Memory Zone
(TMZ) feature, do not warn about it, but make it informational, so
demote the log level from warning to info.

Signed-off-by: Paul Menzel 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index c4c56c57b0c0..bfa0275ff5d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -598,7 +598,7 @@ void amdgpu_gmc_tmz_set(struct amdgpu_device *adev)
break;
default:
adev->gmc.tmz_enabled = false;
-   dev_warn(adev->dev,
+   dev_info(adev->dev,
 "Trusted Memory Zone (TMZ) feature not supported by 
hardware\n");
break;
}
-- 
2.33.0



Re: [PATCH] drm/amdgpu:Fixed the wrong macro definition in amdgpu_trace.h

2020-12-24 Thread Paul Menzel

Dear Chenyang,


Am 23.12.20 um 02:19 schrieb Chenyang Li:

In line 24 "_AMDGPU_TRACE_H" is missing an underscore.


Nice catch. Could you please update the commit message summary, by 
adding a space after the prefix (colon), and using imperative mood [1]?



drm/amdgpu: Fix macro name _AMDGPU_TRACE_H_ in preprocessor if condition


If you can also add a Fixes tag, that would be even better.

Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)")


Signed-off-by: Chenyang Li 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


[…]


Kind regards,

Paul


[1]: https://chris.beams.io/posts/git-commit/
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


General protection fault: RIP: 0010:free_block+0xdc/0x1f0

2020-09-15 Thread Paul Menzel

Dear Andrew folks, dear Linux folks,


With Linux 5.9-rc4 on a Dell OptiPlex 5080 with Intel Core i7-10700 CPU 
@ 2.90GHz, and external


01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, 
Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611] (rev 87)


running the graphically demanding applications glmark2 [1] and the Phoronix 
Test Suite [2] benchmark *pts/desktop-graphics* [3]


$ git describe --tags
v10.0.0m1-13-g0b5ddc3c0

I got three general protection faults, and the system either restarted or 
froze (no input devices working, frozen screen, and even the network card 
unresponsive (no ping)).


Here the system restarted itself:


kernel: general protection fault, probably for non-canonical address 
0xdead0100:  [#1] SMP NOPTI
kernel: CPU: 2 PID: 9702 Comm: glmark2 Kdump: loaded Not tainted 
5.9.0-rc4.mx64.343 #1
kernel: Hardware name: Dell Inc. OptiPlex 5080/032W55, BIOS 1.1.7 08/17/2020
kernel: RIP: 0010:free_block+0xdc/0x1f0


Here it froze:


[14639.665745] general protection fault, probably for non-canonical address 
0xdead0100:  [#1] SMP NOPTI
[14639.675917] CPU: 15 PID: 23094 Comm: pvpython Kdump: loaded Not tainted 
5.9.0-rc4.mx64.343 #1
[14639.684431] Hardware name: Dell Inc. OptiPlex 5080/032W55, BIOS 1.1.7 
08/17/2020
[14639.691823] RIP: 0010:free_block+0xdc/0x1f0


Here it froze:


kernel: general protection fault, probably for non-canonical address 
0xdead0100:  [#1] SMP NOPTI
kernel: CPU: 15 PID: 23094 Comm: pvpython Kdump: loaded Not tainted 
5.9.0-rc4.mx64.343 #1
kernel: Hardware name: Dell Inc. OptiPlex 5080/032W55, BIOS 1.1.7 08/17/2020
kernel: RIP: 0010:free_block+0xdc/0x1f0


Running `scripts/decode_stacktrace.sh`:


linux-5.9_rc4-343.x86_64/source$ scripts/decode_stacktrace.sh vmlinux < 
optiplex-5080-linux-5.9-rc4-gp-pvpython.txt
[14528.718656] cgroup: fork rejected by pids controller in 
/user.slice/user-5272.slice/session-c6.scope
[14639.665745] general protection fault, probably for non-canonical address 
0xdead0100:  [#1] SMP NOPTI
[14639.675917] CPU: 15 PID: 23094 Comm: pvpython Kdump: loaded Not tainted 
5.9.0-rc4.mx64.343 #1
[14639.684431] Hardware name: Dell Inc. OptiPlex 5080/032W55, BIOS 1.1.7 
08/17/2020
[14639.691823] RIP: 0010:free_block (./include/linux/list.h:112 ./include/linux/list.h:135 ./include/linux/list.h:146 mm/slab.c:3336) 
[14639.696006] Code: 00 48 01 d0 48 c1 e8 0c 48 c1 e0 06 4c 01 e8 48 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 48 08 48 8b 50 10 4c 8d 78 08 <48> 89 51 08 48 89 0a 4c 89 da 48 2b 50 28 4c 89 60 08 48 89 68 10

All code

   0:   00 48 01add%cl,0x1(%rax)
   3:   d0 48 c1rorb   -0x3f(%rax)
   6:   e8 0c 48 c1 e0  callq  0xe0c14817
   b:	06   	(bad)  
   c:	4c 01 e8 	add%r13,%rax

   f:   48 8b 50 08 mov0x8(%rax),%rdx
  13:   48 8d 4a ff lea-0x1(%rdx),%rcx
  17:   83 e2 01and$0x1,%edx
  1a:   48 0f 45 c1 cmovne %rcx,%rax
  1e:   48 8b 48 08 mov0x8(%rax),%rcx
  22:   48 8b 50 10 mov0x10(%rax),%rdx
  26:   4c 8d 78 08 lea0x8(%rax),%r15
  2a:*  48 89 51 08 mov%rdx,0x8(%rcx)   <-- trapping 
instruction
  2e:   48 89 0amov%rcx,(%rdx)
  31:   4c 89 damov%r11,%rdx
  34:   48 2b 50 28 sub0x28(%rax),%rdx
  38:   4c 89 60 08 mov%r12,0x8(%rax)
  3c:   48 89 68 10 mov%rbp,0x10(%rax)

Code starting with the faulting instruction
===
   0:   48 89 51 08 mov%rdx,0x8(%rcx)
   4:   48 89 0amov%rcx,(%rdx)
   7:   4c 89 damov%r11,%rdx
   a:   48 2b 50 28 sub0x28(%rax),%rdx
   e:   4c 89 60 08 mov%r12,0x8(%rax)
  12:   48 89 68 10 mov%rbp,0x10(%rax)
[14639.714747] RSP: 0018:c9001c26fab8 EFLAGS: 00010046
[14639.719970] RAX: ea000d193600 RBX: 8000 RCX: dead0100
[14639.727099] RDX: dead0122 RSI: 88842d5f3ef0 RDI: 88842b440300
[14639.734225] RBP: dead0122 R08: c9001c26fb30 R09: 88842b441280
[14639.741351] R10: 000f R11: 8883464d80c0 R12: dead0100
[14639.748477] R13: ea00 R14: 88842d5f3ff0 R15: ea000d193608
[14639.755604] FS:  7fd3b7e8f040() GS:88842d5c() 
knlGS:
[14639.763692] CS:  0010 DS:  ES:  CR0: 80050033
[14639.769430] CR2: 7fd344233548 CR3: 0002f46aa003 CR4: 007706e0
[14639.776556] PKRU: 5554
[14639.779265] Call Trace:
[14639.781717] ___cache_free (mm/slab.c:3389 mm/slab.c:3455) 
[14639.785463] kfree (./arch/x86/include/asm/irqflags.h:41 ./arch/x86/include/asm/irqflags.h:84 mm/slab.c:3757) 
[14639.788432] kmem_freepages (mm/slab.h:266 mm/slab.h:437 mm/slab.c:1406) 

Re: [PATCH] amdgpu_dm: fix nonblocking atomic commit use-after-free

2020-07-28 Thread Paul Menzel

Dear Linux folks,


Am 25.07.20 um 07:20 schrieb Mazin Rezk:

On Saturday, July 25, 2020 12:59 AM, Duncan wrote:


On Sat, 25 Jul 2020 03:03:52 + Mazin Rezk wrote:


Am 24.07.20 um 19:33 schrieb Kees Cook:


There was a fix to disable the async path for this driver that
worked around the bug too, yes? That seems like a safer and more
focused change that doesn't revert the SLUB defense for all
users, and would actually provide a complete, I think, workaround


That said, I haven't seen the async disabling patch. If you could
link to it, I'd be glad to test it out and perhaps we can use that
instead.


I'm confused. Not to put words in Kees' mouth; /I/ am confused (which
admittedly could well be just because I make no claims to be a
coder and am simply reading the bug and thread, but I'd appreciate some
"unconfusing" anyway).

My interpretation of the "async disabling" reference was that it was to
comment #30 on the bug:

https://bugzilla.kernel.org/show_bug.cgi?id=207383#c30

... which (if I'm not confused on this point too) appears to be yours.
There it was stated...

I've also found that this bug exclusively occurs when commit_work is on
the workqueue. After forcing drm_atomic_helper_commit to run all of the
commits without adding to the workqueue and running the OS, the issue
seems to have disappeared.


Would not forcing all commits to run directly, without placing them on
the workqueue, be "async disabling"? That's what I /thought/ he was
referencing.


Oh, I thought he was referring to a different patch. Kees, could I get
your confirmation on this?

The change I made actually affected all of the DRM code, although this could
easily be changed to be specific to amdgpu. (By forcing blocking on
amdgpu_dm's non-blocking commit code)

That said, I'd still need to test further because I only did test it for a
couple of hours then. Although it should work in theory.


OTOH your base/context swap idea sounds like a possibly "less
disturbance" workaround, if it works, and given the point in the
commit cycle... (But if it's out Sunday it's likely too late to test
and get it in now anyway; if it's another week, tho...)


The base/context swap idea should make the use-after-free behave how it
did in 5.6. Since the bug doesn't cause an issue in 5.6, it's less of a
"less disturbance" workaround and more of a "no disturbance" workaround.


Sorry for bothering, but is there now a solution, besides reverting the 
commits, to avoid freezes/crashes *without* performance regressions?



Kind regards,

Paul


Re: [PATCH] amdgpu_dm: fix nonblocking atomic commit use-after-free

2020-07-26 Thread Paul Menzel


Dear Kees,


Am 24.07.20 um 19:33 schrieb Kees Cook:

On Fri, Jul 24, 2020 at 09:45:18AM +0200, Paul Menzel wrote:

Am 24.07.20 um 00:32 schrieb Kees Cook:

On Thu, Jul 23, 2020 at 09:10:15PM +, Mazin Rezk wrote:

As Linux 5.8-rc7 is going to be released this Sunday, I wonder, if commit
3202fa62f ("slub: relocate freelist pointer to middle of object") should be
reverted for now to fix the regression for the users according to Linux’ no
regression policy. Once the AMDGPU/DRM driver issue is fixed, it can be
reapplied. I know it’s not optimal, but as some testing is going to be
involved for the fix, I’d argue it’s the best option for the users.


Well, the SLUB defense was already released in v5.7, so I'm not sure it
really helps for amdgpu_dm users seeing it there too.


In my opinion, it would help, as the stable release could pick up the 
revert, once it’s in Linus’ master branch.



There was a fix to disable the async path for this driver that worked
around the bug too, yes? That seems like a safer and more focused
change that doesn't revert the SLUB defense for all users, and would
actually provide a complete, I think, workaround whereas reverting
the SLUB change means the race still exists. For example, it would be
hit with slab poisoning, etc.


I do not know. If there is such a fix, that would be great. But if you 
do not know, how should a normal user? ;-)



Kind regards,

Paul




Re: [PATCH] amdgpu_dm: fix nonblocking atomic commit use-after-free

2020-07-26 Thread Paul Menzel

Dear Kees,


Am 24.07.20 um 00:32 schrieb Kees Cook:

On Thu, Jul 23, 2020 at 09:10:15PM +, Mazin Rezk wrote:

When amdgpu_dm_atomic_commit_tail is running in the workqueue,
drm_atomic_state_put will get called while amdgpu_dm_atomic_commit_tail is
running, causing a race condition where state (and then dm_state) is
sometimes freed while amdgpu_dm_atomic_commit_tail is running. This bug has
occurred since 5.7-rc1 and is well documented among polaris11 users [1].

Prior to 5.7, this was not a noticeable issue since the freelist pointer
was stored at the beginning of dm_state (base), which was unused. After
changing the freelist pointer to be stored in the middle of the struct, the
freelist pointer overwrote the context, causing dc_state to become garbage
data and made the call to dm_enable_per_frame_crtc_master_sync dereference
a freelist pointer.

This patch fixes the aforementioned issue by calling drm_atomic_state_get
in amdgpu_dm_atomic_commit before drm_atomic_helper_commit is called and
drm_atomic_state_put after amdgpu_dm_atomic_commit_tail is complete.
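
The get/put pairing described in this paragraph can be sketched in user space as follows. This is a single-threaded model that only replays the problematic ordering; `state_model`, `state_get()`, `state_put()`, `commit_tail()` and `nonblocking_commit()` are made-up names and not the DRM API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reference-counted state object. */
struct state_model {
	int refcount;
	bool freed;
};

static void state_get(struct state_model *s)
{
	assert(!s->freed);
	s->refcount++;
}

static void state_put(struct state_model *s)
{
	assert(s->refcount > 0);
	if (--s->refcount == 0)
		s->freed = true; /* models the actual free */
}

/*
 * Deferred work that uses the state. Returns false if the state was
 * already freed when the work ran, i.e. a use-after-free.
 */
static bool commit_tail(struct state_model *s, bool holds_ref)
{
	bool ok = !s->freed;

	if (holds_ref)
		state_put(s); /* drop the reference taken for the worker */

	return ok;
}

/* Replays the ordering in which the caller's put runs before the worker. */
static bool nonblocking_commit(bool take_ref)
{
	struct state_model s = { .refcount = 1, .freed = false };

	if (take_ref)
		state_get(&s); /* extra reference held for the worker */

	state_put(&s); /* caller drops its reference early */

	return commit_tail(&s, take_ref);
}
```

Without the extra reference, the worker in this model sees a freed state (`nonblocking_commit(false)` returns false). With it, the state stays alive until the worker has finished.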

According to my testing on 5.8.0-rc6, this should fix bug 207383 on
Bugzilla [1].

[1] https://bugzilla.kernel.org/show_bug.cgi?id=207383


Nice work tracking this down!


Fixes: 3202fa62f ("slub: relocate freelist pointer to middle of object")


I do, however, object to this Fixes tag. :) The flaw appears to have
been with amdgpu_dm's reference tracking of "state" in the nonblocking
case. (How this reference counting is supposed to work correctly, though,
I'm not sure.) If I look at where the drm helper was split from being
the default callback, it looks like this was what introduced the bug:

da5c47f682ab ("drm/amd/display: Remove acrtc->stream")

? 3202fa62f certainly exposed it much more quickly, but there was a race
even without 3202fa62f where something could have realloced the memory
and written over it.


I understand the Fixes tag mainly as a help when backporting commits.

As Linux 5.8-rc7 is going to be released this Sunday, I wonder, if 
commit 3202fa62f ("slub: relocate freelist pointer to middle of object") 
should be reverted for now to fix the regression for the users according 
to Linux’ no regression policy. Once the AMDGPU/DRM driver issue is 
fixed, it can be reapplied. I know it’s not optimal, but as some testing 
is going to be involved for the fix, I’d argue it’s the best option for 
the users.



Kind regards,

Paul


drm: BUG: unable to handle page fault for address: 17ec6000

2020-07-09 Thread Paul Menzel

Dear Linux folks,


After building Linux v5.8-rc4-25-gbfe91da29bfad with Clang/LLD 
1:11~++20200701093119+ffee8040534-1~exp1 from Debian experimental for 
32-bit (`ARCH=i386`), starting Weston (Wayland) or X.Org Server results 
in a non-working screen, and Linux logs the trace below [1].



[  502.044997] BUG: unable to handle page fault for address: 17ec6000
[  502.045650] #PF: supervisor write access in kernel mode
[  502.046301] #PF: error_code(0x0002) - not-present page
[  502.046956] *pde =  
[  502.047612] Oops: 0002 [#1] SMP

[  502.048269] CPU: 0 PID: 2125 Comm: Xorg.wrap Not tainted 
5.8.0-rc4-00105-g4da71f1ee6263 #141
[  502.048967] Hardware name: System manufacturer System Product Name/F2A85-M 
PRO, BIOS 6601 11/25/2014
[  502.049686] EIP: __srcu_read_lock+0x11/0x20
[  502.050413] Code: 83 e0 03 50 56 68 72 c6 99 dd 68 46 c6 99 dd e8 3a c8 fe ff 83 
c4 10 eb ce 0f 1f 44 00 00 55 89 e5 8b 48 68 8b 40 7c 83 e1 01 <64> ff 04 88 f0 
83 44 24 fc 00 89 c8 5d c3 90 0f 1f 44 00 00 55 89
[  502.052027] EAX:  EBX: f36671b8 ECX:  EDX: 0286
[  502.052856] ESI: f3f94eb8 EDI: f3e51c00 EBP: f303dd9c ESP: f303dd9c
[  502.053695] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010246
[  502.054543] CR0: 80050033 CR2: 17ec6000 CR3: 2eea2000 CR4: 000406d0
[  502.055402] Call Trace:
[  502.056275]  drm_minor_acquire+0x6f/0x140 [drm]
[  502.057162]  drm_stub_open+0x2e/0x110 [drm]
[  502.058049]  chrdev_open+0xdd/0x1e0
[  502.058937]  do_dentry_open+0x21d/0x330
[  502.059828]  vfs_open+0x23/0x30
[  502.060718]  path_openat+0x947/0xd60
[  502.061610]  ? unlink_anon_vmas+0x53/0x120
[  502.062504]  do_filp_open+0x6d/0x100
[  502.063404]  ? __alloc_fd+0x73/0x140
[  502.064305]  do_sys_openat2+0x1b3/0x2a0
[  502.065217]  __ia32_sys_openat+0x90/0xb0
[  502.066128]  ? prepare_exit_to_usermode+0xa/0x20
[  502.067046]  do_fast_syscall_32+0x68/0xd0
[  502.067970]  do_SYSENTER_32+0x12/0x20
[  502.068902]  entry_SYSENTER_32+0x9f/0xf2
[  502.069839] EIP: 0xb7ef14f9
[  502.070764] Code: Bad RIP value.
[  502.071689] EAX: ffda EBX: ff9c ECX: bfa6a2ac EDX: 8002
[  502.072654] ESI:  EDI: b7ed1000 EBP: bfa6b2c8 ESP: bfa6a1c0
[  502.073630] DS: 007b ES: 007b FS:  GS: 0033 SS: 007b EFLAGS: 0246
[  502.074615] Modules linked in: af_packet k10temp r8169 realtek i2c_piix4 
snd_hda_codec_realtek snd_hda_codec_generic ohci_pci ohci_hcd ehci_pci 
snd_hda_codec_hdmi ehci_hcd radeon i2c_algo_bit snd_hda_intel ttm 
snd_intel_dspcfg snd_hda_codec drm_kms_helper snd_hda_core snd_pcm cfbimgblt 
cfbcopyarea cfbfillrect snd_timer sysimgblt syscopyarea sysfillrect snd 
fb_sys_fops xhci_pci xhci_hcd soundcore acpi_cpufreq drm 
drm_panel_orientation_quirks agpgart ipv6 nf_defrag_ipv6
[  502.077895] CR2: 17ec6000
[  502.079050] ---[ end trace ced4517b63a6db26 ]---
[  502.080214] EIP: __srcu_read_lock+0x11/0x20
[  502.081392] Code: 83 e0 03 50 56 68 72 c6 99 dd 68 46 c6 99 dd e8 3a c8 fe ff 83 
c4 10 eb ce 0f 1f 44 00 00 55 89 e5 8b 48 68 8b 40 7c 83 e1 01 <64> ff 04 88 f0 
83 44 24 fc 00 89 c8 5d c3 90 0f 1f 44 00 00 55 89
[  502.083891] EAX:  EBX: f36671b8 ECX:  EDX: 0286
[  502.085148] ESI: f3f94eb8 EDI: f3e51c00 EBP: f303dd9c ESP: f303dd9c
[  502.086406] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010246
[  502.087675] CR0: 80050033 CR2: 17ec6000 CR3: 2eea2000 CR4: 000406d0



$ dmesg | ./scripts/decodecode
[ 55.784870] Code: 83 e0 03 50 56 68 ca c6 99 cf 68 9e c6 99 cf e8 3a c8 fe ff 83 c4 
10 eb ce 0f 1f 44 00 00 55 89 e5 8b 48 68 8b 40 7c 83 e1 01 <64> ff 04 88 f0 83 
44 24 fc 00 89 c8 5d c3 90 0f 1f 44 00 00 55 89
All code

   0:   83 e0 03                and    $0x3,%eax
   3:   50                      push   %eax
   4:   56                      push   %esi
   5:   68 ca c6 99 cf          push   $0xcf99c6ca
   a:   68 9e c6 99 cf          push   $0xcf99c69e
   f:   e8 3a c8 fe ff          call   0xfffec84e
  14:   83 c4 10                add    $0x10,%esp
  17:   eb ce                   jmp    0xffe7
  19:   0f 1f 44 00 00          nopl   0x0(%eax,%eax,1)
  1e:   55                      push   %ebp
  1f:   89 e5                   mov    %esp,%ebp
  21:   8b 48 68                mov    0x68(%eax),%ecx
  24:   8b 40 7c                mov    0x7c(%eax),%eax
  27:   83 e1 01                and    $0x1,%ecx
  2a:*  64 ff 04 88             incl   %fs:(%eax,%ecx,4)        <-- trapping instruction
  2e:   f0 83 44 24 fc 00       lock addl $0x0,-0x4(%esp)
  34:   89 c8                   mov    %ecx,%eax
  36:   5d                      pop    %ebp
  37:   c3                      ret
  38:   90                      nop
  39:   0f 1f 44 00 00          nopl   0x0(%eax,%eax,1)
  3e:   55                      push   %ebp
  3f:   89                      .byte 0x89

Code starting with the faulting instruction
===
   0:   64 ff 04 88             incl   %fs:(%eax,%ecx,4)
   4:   f0 83 44 24 fc 00       lock addl 
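
For orientation, the trapping `incl %fs:(%eax,%ecx,4)` looks like an increment of a per-CPU counter, indexed by the low bit of a value loaded just before it (`and $0x1,%ecx`). The user-space C sketch below only models that access pattern. The struct layout and the names `srcu_model`, `percpu_counters` and `model_read_lock` are hypothetical and are not taken from the kernel sources.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU data the instruction indexes. */
struct percpu_counters {
	unsigned int lock_count[2];
};

/* Hypothetical stand-in for the structure whose address is in %eax. */
struct srcu_model {
	int idx;                      /* loaded by: mov 0x68(%eax),%ecx */
	struct percpu_counters *sda;  /* loaded by: mov 0x7c(%eax),%eax */
};

/*
 * Models the disassembled sequence: mask the index to 0 or 1, bump the
 * matching counter, and return the index. Returns -1 when the counter
 * pointer is NULL; in the disassembly there is no such check, so an
 * invalid pointer would make the increment itself fault.
 */
int model_read_lock(struct srcu_model *s)
{
	int idx = s->idx & 0x1;           /* and $0x1,%ecx */

	if (s->sda == NULL)
		return -1;

	s->sda->lock_count[idx]++;        /* incl %fs:(%eax,%ecx,4) */
	return idx;                       /* mov %ecx,%eax; ret */
}
```

In this model, a caller passing a structure whose `sda` pointer was never set up gets -1 back, which corresponds to the point where the oops above was raised.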

[PATCH 2/3] gpu/drm: Fix spelling of *frequency*

2020-05-11 Thread Paul Menzel
Fix all occurrences with the command below.

git grep -l frequencey | xargs sed -i 's/frequencey/frequency/g'

Cc: Rob Clark 
Cc: Sean Paul 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
Cc: Alex Deucher 
Cc: Christian König 
Cc: David (ChunMing) Zhou 
Cc: amd-...@lists.freedesktop.org
Signed-off-by: Paul Menzel 
---
 drivers/gpu/drm/amd/include/atombios.h | 4 ++--
 drivers/gpu/drm/msm/dsi/dsi_host.c | 2 +-
 drivers/gpu/drm/radeon/atombios.h  | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/include/atombios.h 
b/drivers/gpu/drm/amd/include/atombios.h
index afef574c3b88b..7fe1d0d66701c 100644
--- a/drivers/gpu/drm/amd/include/atombios.h
+++ b/drivers/gpu/drm/amd/include/atombios.h
@@ -6138,7 +6138,7 @@ ucLVDSOffToOnDelay_in4Ms: LVDS power down 
sequence time in unit of 4ms.
 
 ucMinAllowedBL_Level: Lowest LCD backlight PWM level. This is 
customer platform specific parameters. By default it is 0.
 
-ulNbpStateMemclkFreq[4]:  system memory clock frequencey in unit of 
10Khz in different NB pstate.
+ulNbpStateMemclkFreq[4]:  system memory clock frequency in unit of 
10Khz in different NB pstate.
 
 
**/
 
@@ -6346,7 +6346,7 @@ ucMinAllowedBL_Level: Lowest LCD backlight 
PWM level. This is custom
 
 ulLCDBitDepthControlVal:  GPU display control encoder bit dither 
control setting, used to program register mmFMT_BIT_DEPTH_CONTROL
 
-ulNbpStateMemclkFreq[4]:  system memory clock frequencey in unit of 
10Khz in different NB P-State(P0, P1, P2 & P3).
+ulNbpStateMemclkFreq[4]:  system memory clock frequency in unit of 
10Khz in different NB P-State(P0, P1, P2 & P3).
 ulNbpStateNClkFreq[4]:NB P-State NClk frequency in different NB 
P-State
 usNBPStateVoltage[4]: NB P-State (P0/P1 & P2/P3) voltage; NBP3 
refers to lowes voltage
 usBootUpNBVoltage:NB P-State voltage during boot up before 
driver loaded
diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c 
b/drivers/gpu/drm/msm/dsi/dsi_host.c
index 11ae5b8444c32..7b50c2b7af74f 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -743,7 +743,7 @@ int dsi_calc_clk_rate_v2(struct msm_dsi_host *msm_host, 
bool is_dual_dsi)
 * esc clock is byte clock followed by a 4 bit divider,
 * we need to find an escape clock frequency within the
 * mipi DSI spec range within the maximum divider limit
-* We iterate here between an escape clock frequencey
+* We iterate here between an escape clock frequency
 * between 20 Mhz to 5 Mhz and pick up the first one
 * that can be supported by our divider
 */
diff --git a/drivers/gpu/drm/radeon/atombios.h 
b/drivers/gpu/drm/radeon/atombios.h
index 4d0f6de32957f..b9d7d54e537cf 100644
--- a/drivers/gpu/drm/radeon/atombios.h
+++ b/drivers/gpu/drm/radeon/atombios.h
@@ -5206,7 +5206,7 @@ ucLVDSOffToOnDelay_in4Ms: LVDS power down 
sequence time in unit of 4ms.
 
 ucMinAllowedBL_Level: Lowest LCD backlight PWM level. This is 
customer platform specific parameters. By default it is 0. 
 
-ulNbpStateMemclkFreq[4]:  system memory clock frequencey in unit of 
10Khz in different NB pstate. 
+ulNbpStateMemclkFreq[4]:  system memory clock frequency in unit of 
10Khz in different NB pstate. 
 
 
**/
 
@@ -5413,7 +5413,7 @@ ucMinAllowedBL_Level: Lowest LCD backlight 
PWM level. This is custom
 
 ulLCDBitDepthControlVal:  GPU display control encoder bit dither 
control setting, used to program register mmFMT_BIT_DEPTH_CONTROL
 
-ulNbpStateMemclkFreq[4]:  system memory clock frequencey in unit of 
10Khz in different NB P-State(P0, P1, P2 & P3).
+ulNbpStateMemclkFreq[4]:  system memory clock frequency in unit of 
10Khz in different NB P-State(P0, P1, P2 & P3).
 ulNbpStateNClkFreq[4]:NB P-State NClk frequency in different NB 
P-State
 usNBPStateVoltage[4]: NB P-State (P0/P1 & P2/P3) voltage; NBP3 
refers to lowes voltage
 usBootUpNBVoltage:NB P-State voltage during boot up before 
driver loaded 
-- 
2.26.2

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 1/3] gpu/drm: Fix spelling of *sequence* and *frequency*

2020-05-11 Thread Paul Menzel
Fix all occurrences with the commands below.

$ git grep -l equnce drivers/gpu/ | xargs sed -i 's/equnce/equence/g'
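
Note that the pattern `equnce` also matches inside the misspelling `frequncey`, so the command turns it into `frequencey` rather than `frequency`; the second patch of the series then corrects that spelling. The C sketch below reproduces the substitution on plain strings. `replace_all` is a hypothetical helper written for this illustration and is not part of any tool used here.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of `text` with every occurrence of `from`
 * replaced by `to`, similar in effect to sed 's/from/to/g' on a fixed
 * string. `from` must be non-empty. The caller frees the result.
 * Returns NULL on an empty pattern or allocation failure.
 */
char *replace_all(const char *text, const char *from, const char *to)
{
	size_t from_len = strlen(from);
	size_t to_len = strlen(to);
	size_t count = 0;
	const char *p;
	char *out, *dst;

	if (from_len == 0)
		return NULL;

	/* First pass: count matches to size the output buffer. */
	for (p = text; (p = strstr(p, from)) != NULL; p += from_len)
		count++;

	out = malloc(strlen(text) - count * from_len + count * to_len + 1);
	if (!out)
		return NULL;

	/* Second pass: copy the text, substituting each match. */
	dst = out;
	while ((p = strstr(text, from)) != NULL) {
		size_t prefix = (size_t)(p - text);

		memcpy(dst, text, prefix);
		dst += prefix;
		memcpy(dst, to, to_len);
		dst += to_len;
		text = p + from_len;
	}
	strcpy(dst, text);
	return out;
}
```

Calling `replace_all("clock frequncey", "equnce", "equence")` yields `clock frequencey`, and a second call replacing `frequencey` with `frequency` gives the intended spelling.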

Cc: Alex Deucher 
Cc: Christian König 
Cc: David (ChunMing) Zhou 
Cc: amd-...@lists.freedesktop.org
Signed-off-by: Paul Menzel 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 4 ++--
 drivers/gpu/drm/amd/include/atombios.h | 6 +++---
 drivers/gpu/drm/radeon/atombios.h  | 6 +++---
 drivers/gpu/drm/radeon/cik.c   | 4 ++--
 drivers/gpu/drm/radeon/radeon_fence.c  | 6 +++---
 5 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index 733d398c61ccb..3da40056c6c5d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -2176,7 +2176,7 @@ static void gfx_v7_0_ring_emit_vgt_flush(struct 
amdgpu_ring *ring)
  * @adev: amdgpu_device pointer
  * @fence: amdgpu fence object
  *
- * Emits a fence sequnce number on the gfx ring and flushes
+ * Emits a fence sequence number on the gfx ring and flushes
  * GPU caches.
  */
 static void gfx_v7_0_ring_emit_fence_gfx(struct amdgpu_ring *ring, u64 addr,
@@ -2217,7 +2217,7 @@ static void gfx_v7_0_ring_emit_fence_gfx(struct 
amdgpu_ring *ring, u64 addr,
  * @adev: amdgpu_device pointer
  * @fence: amdgpu fence object
  *
- * Emits a fence sequnce number on the compute ring and flushes
+ * Emits a fence sequence number on the compute ring and flushes
  * GPU caches.
  */
 static void gfx_v7_0_ring_emit_fence_compute(struct amdgpu_ring *ring,
diff --git a/drivers/gpu/drm/amd/include/atombios.h 
b/drivers/gpu/drm/amd/include/atombios.h
index 8ba21747b40a3..afef574c3b88b 100644
--- a/drivers/gpu/drm/amd/include/atombios.h
+++ b/drivers/gpu/drm/amd/include/atombios.h
@@ -6138,7 +6138,7 @@ ucLVDSOffToOnDelay_in4Ms: LVDS power down 
sequence time in unit of 4ms.
 
 ucMinAllowedBL_Level: Lowest LCD backlight PWM level. This is 
customer platform specific parameters. By default it is 0.
 
-ulNbpStateMemclkFreq[4]:  system memory clock frequncey in unit of 
10Khz in different NB pstate.
+ulNbpStateMemclkFreq[4]:  system memory clock frequencey in unit of 
10Khz in different NB pstate.
 
 
**/
 
@@ -6346,7 +6346,7 @@ ucMinAllowedBL_Level: Lowest LCD backlight 
PWM level. This is custom
 
 ulLCDBitDepthControlVal:  GPU display control encoder bit dither 
control setting, used to program register mmFMT_BIT_DEPTH_CONTROL
 
-ulNbpStateMemclkFreq[4]:  system memory clock frequncey in unit of 
10Khz in different NB P-State(P0, P1, P2 & P3).
+ulNbpStateMemclkFreq[4]:  system memory clock frequencey in unit of 
10Khz in different NB P-State(P0, P1, P2 & P3).
 ulNbpStateNClkFreq[4]:NB P-State NClk frequency in different NB 
P-State
 usNBPStateVoltage[4]: NB P-State (P0/P1 & P2/P3) voltage; NBP3 
refers to lowes voltage
 usBootUpNBVoltage:NB P-State voltage during boot up before 
driver loaded
@@ -8902,7 +8902,7 @@ typedef struct _ATOM_XTMDS_INFO
   ATOM_I2C_ID_CONFIG_ACCESS  sucI2cId;   //Point the ID on which I2C 
is used to control external chip
   UCHAR  ucXtransimitterID;
   UCHAR  ucSupportedLink;// Bit field, bit0=1, single 
link supported;bit1=1,dual link supported
-  UCHAR  ucSequnceAlterID;   // Even with the same 
external TMDS asic, it's possible that the program seqence alters
+  UCHAR  ucSequenceAlterID;   // Even with the same 
external TMDS asic, it's possible that the program seqence alters
  // due to design. This ID is 
used to alert driver that the sequence is not "standard"!
   UCHAR  ucMasterAddress;// Address to control Master 
xTMDS Chip
   UCHAR  ucSlaveAddress; // Address to control Slave 
xTMDS Chip
diff --git a/drivers/gpu/drm/radeon/atombios.h 
b/drivers/gpu/drm/radeon/atombios.h
index 4b86e8b450090..4d0f6de32957f 100644
--- a/drivers/gpu/drm/radeon/atombios.h
+++ b/drivers/gpu/drm/radeon/atombios.h
@@ -5206,7 +5206,7 @@ ucLVDSOffToOnDelay_in4Ms: LVDS power down 
sequence time in unit of 4ms.
 
 ucMinAllowedBL_Level: Lowest LCD backlight PWM level. This is 
customer platform specific parameters. By default it is 0. 
 
-ulNbpStateMemclkFreq[4]:  system memory clock frequncey in unit of 
10Khz in different NB pstate. 
+ulNbpStateMemclkFreq[4]:  system memory clock frequencey in unit of 
10Khz in different NB pstate. 
 
 
**/
 
@@ -5413,7 +5413,7 @@ ucMinAllowedBL_Level: Lowest LCD backlight 
PWM level. This is custom
 
 u

Re: [regression 5.7-rc1] System does not power off, just halts

2020-04-14 Thread Paul Menzel

Dear Prike, dear Alex, dear Linux folks,


On 13.04.20 at 10:44, Paul Menzel wrote:

A regression in 5.7-rc1 causes a system with the AMD board MSI B350M MORTAR 
(MS-7A37) and an AMD Ryzen 3 2200G to no longer power off, but just 
to halt.


The regression is introduced in 9ebe5422ad6c..b032227c6293. I am in the 
process of bisecting this, but maybe somebody already has an idea.


I found the Easter egg:


commit 487eca11a321ef33bcf4ca5adb3c0c4954db1b58
Author: Prike Liang 
Date:   Tue Apr 7 20:21:26 2020 +0800

drm/amdgpu: fix gfx hang during suspend with video playback (v2)

The system will be hang up during S3 suspend because of SMU is pending

for GC not respose the register CP_HQD_ACTIVE access request.This issue
root cause of accessing the GC register under enter GFX CGGPG and can
be fixed by disable GFX CGPG before perform suspend.

v2: Use disable the GFX CGPG instead of RLC safe mode guard.

Signed-off-by: Prike Liang 

Tested-by: Mengbing Wang 
Reviewed-by: Huang Rui 
Signed-off-by: Alex Deucher 
Cc: sta...@vger.kernel.org


It reverts cleanly on top of 5.7-rc1, and this fixes the issue.

Greg, please do not apply this to the stable series. The commit message 
doesn’t even reference an issue/bug report, and doesn’t give a detailed 
problem description. What system is it?


Dave, Alex, how to proceed? Revert? I created issue 1094 [1].


Kind regards,

Paul


[1]: https://gitlab.freedesktop.org/drm/amd/-/issues/1094


[PATCH] Initialize ATA before graphics

2020-02-24 Thread Paul Menzel
From: Arjan van de Ven 
Date: Thu, 2 Jun 2016 23:36:32 -0500

ATA init is the long pole in the boot process, and it's asynchronous.
Move the graphics init after it, so that ATA and graphics initialize
in parallel.

Signed-off-by: Paul Menzel 
---

1.  Taken from Clear Linux: 
https://github.com/clearlinux-pkgs/linux/commits/master/0110-Initialize-ata-before-graphics.patch
2.  Arjan, can you please add your Signed-off-by line?

 drivers/Makefile | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/Makefile b/drivers/Makefile
index aaef17c..d08f3a3 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -58,15 +58,8 @@ obj-y+= char/
 # iommu/ comes before gpu as gpu are using iommu controllers
 obj-y  += iommu/
 
-# gpu/ comes after char for AGP vs DRM startup and after iommu
-obj-y  += gpu/
-
 obj-$(CONFIG_CONNECTOR)+= connector/
 
-# i810fb and intelfb depend on char/agp/
-obj-$(CONFIG_FB_I810)   += video/fbdev/i810/
-obj-$(CONFIG_FB_INTEL)  += video/fbdev/intelfb/
-
 obj-$(CONFIG_PARPORT)  += parport/
 obj-$(CONFIG_NVM)  += lightnvm/
 obj-y  += base/ block/ misc/ mfd/ nfc/
@@ -79,6 +72,14 @@ obj-$(CONFIG_IDE)+= ide/
 obj-y  += scsi/
 obj-y  += nvme/
 obj-$(CONFIG_ATA)  += ata/
+
+# gpu/ comes after char for AGP vs DRM startup and after iommu
+obj-y  += gpu/
+
+# i810fb and intelfb depend on char/agp/
+obj-$(CONFIG_FB_I810)   += video/fbdev/i810/
+obj-$(CONFIG_FB_INTEL)  += video/fbdev/intelfb/
+
 obj-$(CONFIG_TARGET_CORE)  += target/
 obj-$(CONFIG_MTD)  += mtd/
 obj-$(CONFIG_SPI)  += spi/
-- 
https://clearlinux.org





Re: [Regression drm-tip] Internal audio device missing

2019-12-27 Thread Paul Menzel

Dear Takashi,


On 26.12.19 at 11:03, Takashi Iwai wrote:

On Thu, 26 Dec 2019 10:47:18 +0100, Paul Menzel wrote:



With

 $ git describe --tags drm-tip/drm-tip
 v5.5-rc3-1481-ga20d8cd6901a

the internal audio device is not available, and just a dummy device.

Running `alsa-info.sh` [1], the messages below are shown with the
problematic Linux kernel.

 alsactl: get_controls:567: snd_ctl_open error: Sound protocol is
not compatible
 cat: /tmp/alsa-info.ateDlDjrZX/alsactl.tmp: No such file or directory


That's an unexpected side-effect of the recent protocol version bump
in sound.git for-next branch.  It seems that we can't change the minor
version unless we really want to break something.

Below is the fix patch.  Please give it a try.


Thank you for the quick reply and fix.


-- 8< --
From: Takashi Iwai 
Subject: [PATCH] ALSA: control: Fix incompatible protocol error

The recent change to bump the ALSA control API protocol version from
2.0.7 to 2.1.0 caused a regression on user-space; while the user-space
expects both the major and the minor versions to be identical with the
supported numbers, we changed the minor number from 0 to 1.

For recovering from the incompatibility, this patch changes the
protocol version again to 2.0.8, which is compatible, but yet higher
than the original number 2.0.7, indicating that the protocol change.

Fixes: bd3eb4e87eb3 ("ALSA: ctl: bump protocol version up to v2.1.0")
Reported-by: Paul Menzel 
Signed-off-by: Takashi Iwai 
---
  include/uapi/sound/asound.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h
index e36dadaf84ba..30ebb2a42983 100644
--- a/include/uapi/sound/asound.h
+++ b/include/uapi/sound/asound.h
@@ -936,7 +936,7 @@ struct snd_timer_tread {
   *  *
   /
  
-#define SNDRV_CTL_VERSION		SNDRV_PROTOCOL_VERSION(2, 1, 0)
+#define SNDRV_CTL_VERSION		SNDRV_PROTOCOL_VERSION(2, 0, 8)
  
  struct snd_ctl_card_info {
  	int card;			/* card number */
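
The compatibility rule described in the commit message can be sketched in C as follows. The packing of the major, minor and micro numbers into one integer and the names `PROTO_VERSION`, `proto_major`, `proto_minor` and `proto_incompatible` are assumptions made for this illustration; they follow the description above and are not copied from the ALSA headers.

```c
/* Hypothetical packing: major in bits 16 and up, minor in bits 8-15, micro in bits 0-7. */
#define PROTO_VERSION(major, minor, micro) \
	(((major) << 16) | ((minor) << 8) | (micro))

static inline int proto_major(int v) { return (v >> 16) & 0xffff; }
static inline int proto_minor(int v) { return (v >> 8) & 0xff; }

/*
 * User space that insists on identical major and minor numbers rejects
 * the kernel when either differs; a micro bump alone passes the check.
 */
int proto_incompatible(int kernel_version, int user_version)
{
	return proto_major(kernel_version) != proto_major(user_version) ||
	       proto_minor(kernel_version) != proto_minor(user_version);
}
```

Under these assumptions, a kernel at 2.1.0 is rejected by user space built for 2.0.7, while 2.0.8 is accepted and still compares as newer than 2.0.7.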



Tested-by: Paul Menzel 

Are there CI systems that should have caught this problem?

Which user-space component should forward this problem to the user 
(desktop environment displaying a warning)?



Kind regards,

Paul


[Regression drm-tip] Internal audio device missing

2019-12-26 Thread Paul Menzel

Dear Linux folks,


With

$ git describe --tags drm-tip/drm-tip
v5.5-rc3-1481-ga20d8cd6901a

the internal audio device is not available, and just a dummy device.

Running `alsa-info.sh` [1], the messages below are shown with the 
problematic Linux kernel.


alsactl: get_controls:567: snd_ctl_open error: Sound protocol is 
not compatible

cat: /tmp/alsa-info.ateDlDjrZX/alsactl.tmp: No such file or directory

Please find the output of `alsa-info.sh` with Linux 5.5-rc3 and drm-tip 
attached.



Kind regards,

Paul


[1]: https://www.alsa-project.org/alsa-info.sh
upload=true&script=true&cardinfo=
!!
!!ALSA Information Script v 0.4.64
!!

!!Script ran on: Thu Dec 26 09:26:49 UTC 2019


!!Linux Distribution
!!--

Debian GNU/Linux bullseye/sid \n \l PRETTY_NAME="Debian GNU/Linux bullseye/sid" 
NAME="Debian GNU/Linux" ID=debian HOME_URL="https://www.debian.org/" 
SUPPORT_URL="https://www.debian.org/support" 
BUG_REPORT_URL="https://bugs.debian.org/"


!!DMI Information
!!---

Manufacturer:  Micro-Star International Co., Ltd.
Product Name:  MS-7A37
Product Version:   1.0
Firmware Version:  1.MR
Board Vendor:  MSI
Board Name:B350M MORTAR (MS-7A37)


!!ACPI Device Status Information
!!---

/sys/bus/acpi/devices/AMDI0030:00/status 15
/sys/bus/acpi/devices/AMDIF030:00/status 15
/sys/bus/acpi/devices/LNXVIDEO:01/status 15
/sys/bus/acpi/devices/PNP0103:00/status  15
/sys/bus/acpi/devices/PNP0501:00/status  15
/sys/bus/acpi/devices/PNP0A08:00/status  15
/sys/bus/acpi/devices/PNP0C01:00/status  15
/sys/bus/acpi/devices/PNP0C02:02/status  15
/sys/bus/acpi/devices/PNP0C02:04/status  15
/sys/bus/acpi/devices/PNP0C0C:00/status  11
/sys/bus/acpi/devices/PNP0C0F:00/status  11
/sys/bus/acpi/devices/PNP0C0F:01/status  11
/sys/bus/acpi/devices/PNP0C0F:02/status  11
/sys/bus/acpi/devices/PNP0C0F:03/status  11
/sys/bus/acpi/devices/PNP0C0F:04/status  11
/sys/bus/acpi/devices/PNP0C0F:05/status  11
/sys/bus/acpi/devices/PNP0C0F:06/status  11
/sys/bus/acpi/devices/PNP0C0F:07/status  11
/sys/bus/acpi/devices/device:11/status   11
/sys/bus/acpi/devices/device:12/status   11
/sys/bus/acpi/devices/device:23/status   15


!!Kernel Information
!!--

Kernel release:5.5.0-rc3-01525-gf5797232f233
Operating System:  GNU/Linux
Architecture:  x86_64
Processor: unknown
SMP Enabled:   Yes


!!ALSA Version
!!

Driver version: k5.5.0-rc3-01525-gf5797232f233
Library version:1.1.9
Utilities version:  1.1.9


!!Loaded ALSA modules
!!---

snd_hda_intel
snd_hda_intel


!!Sound Servers on this system
!!

Pulseaudio:
  Installed - Yes (/usr/bin/pulseaudio)
  Running - Yes


!!Soundcards recognised by ALSA
!!-

 0 [Generic]: HDA-Intel - HD-Audio Generic
  HD-Audio Generic at 0xfcc88000 irq 54
 1 [Generic_1  ]: HDA-Intel - HD-Audio Generic
  HD-Audio Generic at 0xfcc8 irq 55


!!PCI Soundcards installed in the system
!!--

26:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] 
Raven/Raven2/Fenghuang HDMI/DP Audio Controller
26:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 
10h-1fh) HD Audio Controller


!!Advanced information - PCI Vendor/Device/Subsystem ID's
!!---

26:00.1 0403: 1002:15de
Subsystem: 1002:15de
--
26:00.6 0403: 1022:15e3
Subsystem: 1462:fa37


!!Modprobe options (Sound related)
!!

snd_pcsp: index=-2
snd_usb_audio: index=-2
snd_atiixp_modem: index=-2
snd_intel8x0m: index=-2
snd_via82xx_modem: index=-2


!!Loaded sound module options
!!---

!!Module: snd_hda_intel
align_buffer_size : -1
bdl_pos_adj : 
-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1
beep_mode : 
N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N
dsp_driver : Y
enable : Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y
enable_msi : -1
id : 
(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null),(null)
index : 
-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1
jackpoll_ms : 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
model : 

Re: [PATCH 3/4] drm/amd/display: In VRR mode, do DRM core vblank handling at end of vblank.

2019-03-19 Thread Paul Menzel

Dear Mario,


On 18.03.19 18:19, Mario Kleiner wrote:

In VRR mode, proper vblank/pageflip timestamps can only be computed
after the display scanout position has left front-porch. Therefore
delay calls to drm_crtc_handle_vblank(), and thereby calls to
drm_update_vblank_count() and pageflip event delivery, to after the
end of front-porch when in VRR mode.

We add a new vupdate irq, which triggers at the end of the vupdate
interval, ie. at the end of vblank, and calls the core vblank handler
function. The new irq handler is not executed in standard non-VRR
mode, so vblank handling for fixed refresh rate mode is identical
to the past implementation.

Signed-off-by: Mario Kleiner 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|   1 +
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 129 -
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h  |   9 ++
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c  |  22 
  .../amd/display/dc/irq/dce110/irq_service_dce110.c |   7 +-
  .../amd/display/dc/irq/dce120/irq_service_dce120.c |   7 +-
  .../amd/display/dc/irq/dce80/irq_service_dce80.c   |   6 +-
  .../amd/display/dc/irq/dcn10/irq_service_dcn10.c   |  40 +--
  8 files changed, 205 insertions(+), 16 deletions(-)


[…]

`scripts/checkpatch.pl` shows two errors on your commit.


+   /* Use VUPDATE interrupt */
+   for (i = VISLANDS30_IV_SRCID_D1_V_UPDATE_INT; i <= 
VISLANDS30_IV_SRCID_D6_V_UPDATE_INT; i+=2) {


ERROR: spaces required around that '+=' (ctx:VxV)
#107: FILE: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:1490:
+	for (i = VISLANDS30_IV_SRCID_D1_V_UPDATE_INT; i <= 
VISLANDS30_IV_SRCID_D6_V_UPDATE_INT; i+=2) {
 	 
   ^


[…]


  static inline int dm_set_vblank(struct drm_crtc *crtc, bool enable)
  {
enum dc_irq_source irq_source;
struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
struct amdgpu_device *adev = crtc->dev->dev_private;
+   struct dm_crtc_state *acrtc_state = to_dm_crtc_state(crtc->state);
+   int rc = 0;
+
+   if (enable) {
+   /* vblank irq on -> Only need vupdate irq in vrr mode */
+   if (amdgpu_dm_vrr_active(acrtc_state))
+   rc = dm_set_vupdate_irq(crtc, true);
+   }
+   else {
+   /* vblank irq off -> vupdate irq off */
+   rc = dm_set_vupdate_irq(crtc, false);
+   }


ERROR: else should follow close brace '}'
#198: FILE: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:3346:
+   }
+   else {

Also:

WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
#258: FILE: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c:679:
+  unsigned crtc_id,

[…]


Kind regards,

Paul

Re: randr: Virtual monitor not present with MST display

2019-03-18 Thread Paul Menzel


Dear Harry,


On 18.03.19 21:55, Wentland, Harry wrote:

On 2019-03-08 4:11 a.m., Michel Dänzer wrote:

On 2019-03-06 5:35 p.m., Paul Menzel wrote:

On 03/06/19 15:55, Michel Dänzer wrote:

On 2019-03-06 1:41 p.m., Paul Menzel wrote:

On 03/05/19 20:07, Alex Deucher wrote:

On Tue, Mar 5, 2019 at 1:16 PM Paul Menzel wrote:



Using the MST display Dell UP3214Q (two panels) with an AMD system,
the virtual monitor object is not created. GDM and Xfce consider both
panels as separate screens (`xrandr --listmonitors`).

[...]


I didn’t provide the output of xrandr in my previous message.

 $ xrandr --listmonitors
  Monitors: 2
  0: +DisplayPort-9 1920/698x2160/392+0+0  DisplayPort-9
  1: +DisplayPort-10 1920/698x2160/392+1920+0  DisplayPort-10

Please find the X.Org X Server log attached.


With an Intel system, the monitor object is shown.


To clarify, the modesetting driver is used with the Intel hardware.


Does this work better with the modesetting driver on the AMD system?


With Linux 4.19.19, there was the same problem with the modesetting driver
during my limited testing.

Updating to Linux 4.20.13, it worked with the modesetting driver, but the
AMDGPU driver still failed to properly utilize the MST monitor.


Does
https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu/merge_requests/32
help?


Michel, do you know if this is supposed to work with
xf86-video-amdgpu? When I've tried it before I didn't have any luck but
didn't have time to look into it.


Sorry, what is your question? With the commit from the merge request 
applied, it works here. Also, the commit was added to the master branch 
in the meantime.



Kind regards,

Paul

Re: randr: Virtual monitor not present with MST display

2019-03-06 Thread Paul Menzel
Dear Alex,


On 03/05/19 20:07, Alex Deucher wrote:
> On Tue, Mar 5, 2019 at 1:16 PM Paul Menzel wrote:

>> Using the MST display Dell UP3214Q (two panels) with an AMD system,
>> the virtual monitor object is not created. GDM and Xfce consider both
>> panels as separate screens (`xrandr --listmonitors`).
>>
>> [0.00] Linux version 4.20.13.mx64.248 
>> (r...@holidayincambodia.molgen.mpg.de) (gcc version 7.3.0 (GCC)) #1 SMP Wed 
>> Feb 27 14:10:55 CET 2019
>>
>> [   79.494297] [drm] DM_MST: stopping TM on aconnector: a8109331 
>> [id: 56]
>> [   79.494362] [drm] DM_MST: Disabling connector: 776ea22b [id: 63] 
>> [master: a8109331]
>> [   79.494406] [drm] DM_MST: Disabling connector: 057ebdbb [id: 67] 
>> [master: a8109331]
>> [   79.781882] snd_hda_intel :00:1f.3: spurious response 0x0:0x2, last 
>> cmd=0x201f0500
>> [   86.028806] [drm] DM_MST: starting TM on aconnector: a8109331 
>> [id: 56]
>> [   86.053072] [drm] DM_MST: added connector: 1c9b49ed [id: 71] 
>> [master: a8109331]
>> [   86.108540] [drm] SADs count is: -2, don't need to read it
>> [   86.386661] [drm:create_stream_for_sink [amdgpu]] *ERROR* Failed to 
>> create stream for sink!
>> [   87.237878] [drm] DM_MST: added connector: c3ffcbbb [id: 80] 
>> [master: a8109331]
>> [   87.293028] [drm] SADs count is: -2, don't need to read it
>> [  206.993344] [drm] DM_MST: stopping TM on aconnector: a8109331 
>> [id: 56]
>> [  206.993423] [drm] DM_MST: Disabling connector: 1c9b49ed [id: 71] 
>> [master: a8109331]
>> [  206.993456] [drm] DM_MST: Disabling connector: c3ffcbbb [id: 80] 
>> [master: a8109331]
>> [  207.548051] [drm:create_stream_for_sink [amdgpu]] *ERROR* Failed to 
>> create stream for sink!
>> [  207.603193] [drm:create_stream_for_sink [amdgpu]] *ERROR* Failed to 
>> create stream for sink!
>> [  207.762388] traps: xfdesktop[2225] general protection fault 
>> ip:7f588981226c sp:7ffee65af370 error:0 in 
>> libgobject-2.0.so.0.5800.1[7f58897da000+56000]
>> [  210.320612] [drm] DM_MST: starting TM on aconnector: b456cd59 
>> [id: 62]
>> [  210.343497] [drm] DM_MST: added connector: 735839d5 [id: 73] 
>> [master: b456cd59]
>> [  210.399168] [drm] SADs count is: -2, don't need to read it
>> [  210.404454] [drm] DM_MST: added connector: cccb0c2d [id: 88] 
>> [master: b456cd59]
>> [  210.675589] [drm] SADs count is: -2, don't need to read it

I didn’t provide the output of xrandr in my previous message.

$ xrandr --listmonitors
 Monitors: 2
 0: +DisplayPort-9 1920/698x2160/392+0+0  DisplayPort-9
 1: +DisplayPort-10 1920/698x2160/392+1920+0  DisplayPort-10

Please find the X.Org X Server log attached.

>> With an Intel system, the monitor object is shown.

To clarify, the modesetting driver is used with the Intel hardware.

>> $ xrandr --listmonitors
>> Monitors: 1
>>  0: +Auto-Monitor-1 3840/698x2160/392+0+0  DP-1-9 DP-1-8
>>
>> Do you have an idea what the AMD driver does differently, and how
>> to fix this?
> 
> + dri-devel
> 
> My understanding is that this is handled at the Desktop level rather
> than the driver.  X exposes the tile info via randr 1.5 and the
> desktop environment should handle it as a single monitor.  Mutter
> handles this in GNOME for example.
> 
> https://cgit.freedesktop.org/xorg/xserver/commit/?id=7e1f86d42b54fb7f6492875e47a718eaeca3069b
> https://lists.x.org/archives/xorg-announce/2015-May/002605.html
> https://mail.gnome.org/archives/desktop-devel-list/2015-November/msg00018.html

As it works with the modesetting driver, it looks to me like Xfce
supports it, and it’s a driver issue. The Linux error messages
also do not look promising. Any idea, if they are related?
Lastly, shouldn’t at least the output of `xrandr` be correct, if
everything is set up correctly?


Kind regards,

Paul
[57.665] 
X.Org X Server 1.20.4
X Protocol Version 11, Revision 0
[57.665] Build Operating System: Linux 4.19.19.mx64.244 x86_64 
[57.665] Current Operating System: Linux inbetweenmove.molgen.mpg.de 4.19.19.mx64.244 #1 SMP Tue Feb 5 13:01:13 CET 2019 x86_64
[57.665] Kernel command line: BOOT_IMAGE=/boot/bzImage-4.19.19.mx64.244 crashkernel=256M root=LABEL=root ro console=ttyS1,115200n8 console=tty0 init=/bin/systemd audit=0
[57.665] Build Date: 27 February 2019  12:32:24PM
[57.665]  
[57.665] Current version of pixman: 0.38.0
[57.665] 	Before reporting problems, check http://wiki.x.org
	to make sure that you have the latest version.
[57.665] Marke

Re: [PATCH] drm: Require PCI for selecting VGA_ARB.

2019-01-09 Thread Paul Menzel
Dear Maarten,


Thank you very much for the quick response.

On 01/08/19 16:37, Maarten Lankhorst wrote:
> On 08-01-2019 at 16:07, Paul Menzel wrote:

>> Building Linux 5.0-rc1 fails with the errors below. Please find the
>> configuration file attached.
>>
>> ```
>> $ make -j120
>> […]
>> drivers/gpu/vga/vgaarb.c: In function ‘__vga_tryget’:
>> drivers/gpu/vga/vgaarb.c:286:14: error: ‘PCI_VGA_STATE_CHANGE_DECODES’ 
>> undeclared (first use in this function); did you mean 
>> ‘PCI_SUBTRACTIVE_DECODE’?
>>  flags |= PCI_VGA_STATE_CHANGE_DECODES;
>>   ^~~~
>>   PCI_SUBTRACTIVE_DECODE
>> drivers/gpu/vga/vgaarb.c:286:14: note: each undeclared identifier is 
>> reported only once for each function it appears in
>>   CC [M]  net/netfilter/xt_realm.o
>>   CC  drivers/hid/hid-cherry.o
>> drivers/gpu/vga/vga_switcheroo.c: In function 
>> ‘vga_switcheroo_runtime_suspend’:
>> drivers/gpu/vga/vga_switcheroo.c:1053:2: error: implicit declaration of 
>> function ‘pci_bus_set_current_state’; did you mean ‘__set_current_state’? 
>> [-Werror=implicit-function-declaration]
>>   pci_bus_set_current_state(pdev->bus, PCI_D3cold);
>>   ^
>>   __set_current_state
>> drivers/gpu/vga/vgaarb.c:291:13: error: ‘PCI_VGA_STATE_CHANGE_BRIDGE’ 
>> undeclared (first use in this function); did you mean 
>> ‘PCI_VGA_STATE_CHANGE_DECODES’?
>> flags |= PCI_VGA_STATE_CHANGE_BRIDGE;
>>  ^~~
>>  PCI_VGA_STATE_CHANGE_DECODES
>>   CC  fs/reiserfs/dir.o
>>   LD [M]  net/tipc/tipc.o
>> drivers/gpu/vga/vga_switcheroo.c: In function 
>> ‘vga_switcheroo_runtime_resume’:
>> drivers/gpu/vga/vga_switcheroo.c:1067:2: error: implicit declaration of 
>> function ‘pci_wakeup_bus’; did you mean ‘__wake_up_bit’? 
>> [-Werror=implicit-function-declaration]
>>   pci_wakeup_bus(pdev->bus);
>>   ^~
>>   __wake_up_bit
>> drivers/gpu/vga/vgaarb.c:293:3: error: implicit declaration of function 
>> ‘pci_set_vga_state’; did you mean ‘pci_set_power_state’? 
>> [-Werror=implicit-function-declaration]
>>pci_set_vga_state(conflict->pdev, false, pci_bits, flags);
>>^
>>pci_set_power_state
>>   CC  fs/read_write.o
>>   CC  drivers/hid/hid-chicony.o
>>   CC  drivers/hid/hid-cypress.o
>> drivers/gpu/vga/vgaarb.c: In function ‘vga_arb_device_init’:
>> drivers/gpu/vga/vgaarb.c:1495:25: error: ‘pci_bus_type’ undeclared (first 
>> use in this function); did you mean ‘pci_pcie_type’?
>>   bus_register_notifier(&pci_bus_type, &pci_notifier);
>>  ^~~~
>>  pci_pcie_type
>> cc1: some warnings being treated as errors
>> make[3]: *** [scripts/Makefile.build:277: drivers/gpu/vga/vgaarb.o] Error 1
>> make[3]: *** Waiting for unfinished jobs
>> […]
>> ```

> WARNING: unmet direct dependencies detected for VGA_ARB
>   Depends on [n]: HAS_IOMEM [=y] && PCI [=n] && !S390
>   Selected by [y]:
>   - VGA_SWITCHEROO [=y] && HAS_IOMEM [=y] && X86 [=y] && ACPI [=y]
> 
> So I guess you need to enable PCI, probably not a common config you're using. 
> :)
> 
> Especially since you selected EXPERT.

We have the attached defconfig, which is then integrated using
`make olddefconfig`.

> Oh well, apply this with git am --scissors?
> -8<
> When configuring the kernel without PCI we can still enable VGA switcheroo,
> which is not possible because VGA_ARB cannot be selected.
> 
> Remove this by depending on PCI for !S390.
> 
> Reported-by: Paul Menzel 
> Signed-off-by: Maarten Lankhorst 
> ---
> diff --git a/drivers/gpu/vga/Kconfig b/drivers/gpu/vga/Kconfig
> index b677e5d524e6..ef5671475870 100644
> --- a/drivers/gpu/vga/Kconfig
> +++ b/drivers/gpu/vga/Kconfig
> @@ -21,6 +21,7 @@ config VGA_SWITCHEROO
>   bool "Laptop Hybrid Graphics - GPU switching support"
>   depends on X86
>   depends on ACPI
> + depends on (PCI && !S390) # For VGA_ARB
>   select VGA_ARB
>   help
> Many laptops released in 2008/9/10 have two GPUs with a multiplexer

Is this an effect of commit eb01d42a (PCI: consolidate PCI config entry in
drivers/pci) as the `default y` is missing now?
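
The warning quoted above follows from how Kconfig handles `select`: the selected symbol is forced on without its own `depends on` line being evaluated. Below is a minimal Python sketch of that behaviour. It is a toy model, not Kconfig's implementation; `unmet_selects` and the dictionaries are hypothetical names used only for illustration, and the dependency expressions are copied from the warning and from the proposed patch.

```python
# Toy model of Kconfig "select" semantics (illustration only, not kconfig code).

def unmet_selects(config, depends, selects):
    """Return (selector, target) pairs where an enabled symbol selects a
    target whose own dependencies are not satisfied."""
    always = lambda c: True
    problems = []
    for selector, targets in selects.items():
        # The selector only counts if it is requested and its dependencies hold.
        if not (config.get(selector, False) and depends.get(selector, always)(config)):
            continue
        for target in targets:
            if not depends.get(target, always)(config):
                problems.append((selector, target))
    return problems


# Symbol values taken from the warning: PCI is off, everything else is on.
config = {"HAS_IOMEM": True, "PCI": False, "S390": False,
          "X86": True, "ACPI": True, "VGA_SWITCHEROO": True}
selects = {"VGA_SWITCHEROO": ["VGA_ARB"]}
vga_arb_deps = lambda c: c["HAS_IOMEM"] and c["PCI"] and not c["S390"]

# Before the patch: VGA_SWITCHEROO depends only on X86 and ACPI.
before = {"VGA_ARB": vga_arb_deps,
          "VGA_SWITCHEROO": lambda c: c["X86"] and c["ACPI"]}
# After the patch: VGA_SWITCHEROO additionally depends on PCI && !S390.
after = {"VGA_ARB": vga_arb_deps,
         "VGA_SWITCHEROO": lambda c: c["X86"] and c["ACPI"] and c["PCI"] and not c["S390"]}

print(unmet_selects(config, before, selects))  # one unmet select
print(unmet_selects(config, after, selects))   # none
```

With the extra dependency, VGA_SWITCHEROO can no longer be enabled when PCI is off, so it never gets the chance to select VGA_ARB in that configuration.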


Kind regards,

Paul
cat >.config <<-EOF
CONFIG_LOCALVERSION="$KERNELLOCAL"
CONFIG_KERNEL_BZIP2=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_AUDIT=y
CONFIG_HIGH

5.0-rc1 fails to build with vga/vgaarb.c:286:14: error: ‘PCI_VGA_STATE_CHANGE_DECODES’ undeclared

2019-01-09 Thread Paul Menzel
Dear Linux folks,


Building Linux 5.0-rc1 fails with the errors below. Please find the
configuration file attached.

```
$ make -j120
[…]
drivers/gpu/vga/vgaarb.c: In function ‘__vga_tryget’:
drivers/gpu/vga/vgaarb.c:286:14: error: ‘PCI_VGA_STATE_CHANGE_DECODES’ 
undeclared (first use in this function); did you mean ‘PCI_SUBTRACTIVE_DECODE’?
 flags |= PCI_VGA_STATE_CHANGE_DECODES;
  ^~~~
  PCI_SUBTRACTIVE_DECODE
drivers/gpu/vga/vgaarb.c:286:14: note: each undeclared identifier is reported 
only once for each function it appears in
  CC [M]  net/netfilter/xt_realm.o
  CC  drivers/hid/hid-cherry.o
drivers/gpu/vga/vga_switcheroo.c: In function ‘vga_switcheroo_runtime_suspend’:
drivers/gpu/vga/vga_switcheroo.c:1053:2: error: implicit declaration of 
function ‘pci_bus_set_current_state’; did you mean ‘__set_current_state’? 
[-Werror=implicit-function-declaration]
  pci_bus_set_current_state(pdev->bus, PCI_D3cold);
  ^
  __set_current_state
drivers/gpu/vga/vgaarb.c:291:13: error: ‘PCI_VGA_STATE_CHANGE_BRIDGE’ 
undeclared (first use in this function); did you mean 
‘PCI_VGA_STATE_CHANGE_DECODES’?
flags |= PCI_VGA_STATE_CHANGE_BRIDGE;
 ^~~
 PCI_VGA_STATE_CHANGE_DECODES
  CC  fs/reiserfs/dir.o
  LD [M]  net/tipc/tipc.o
drivers/gpu/vga/vga_switcheroo.c: In function ‘vga_switcheroo_runtime_resume’:
drivers/gpu/vga/vga_switcheroo.c:1067:2: error: implicit declaration of 
function ‘pci_wakeup_bus’; did you mean ‘__wake_up_bit’? 
[-Werror=implicit-function-declaration]
  pci_wakeup_bus(pdev->bus);
  ^~
  __wake_up_bit
drivers/gpu/vga/vgaarb.c:293:3: error: implicit declaration of function 
‘pci_set_vga_state’; did you mean ‘pci_set_power_state’? 
[-Werror=implicit-function-declaration]
   pci_set_vga_state(conflict->pdev, false, pci_bits, flags);
   ^
   pci_set_power_state
  CC  fs/read_write.o
  CC  drivers/hid/hid-chicony.o
  CC  drivers/hid/hid-cypress.o
drivers/gpu/vga/vgaarb.c: In function ‘vga_arb_device_init’:
drivers/gpu/vga/vgaarb.c:1495:25: error: ‘pci_bus_type’ undeclared (first use 
in this function); did you mean ‘pci_pcie_type’?
  bus_register_notifier(&pci_bus_type, &pci_notifier);
 ^~~~
 pci_pcie_type
cc1: some warnings being treated as errors
make[3]: *** [scripts/Makefile.build:277: drivers/gpu/vga/vgaarb.o] Error 1
make[3]: *** Waiting for unfinished jobs
[…]
```


Kind regards,

Paul
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 5.0.0-rc1 Kernel Configuration
#

#
# Compiler: gcc (GCC) 7.3.0
#
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=70300
CONFIG_CLANG_VERSION=0
CONFIG_CC_HAS_ASM_GOTO=y
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=".mx64.239"
CONFIG_LOCALVERSION_AUTO=y
CONFIG_BUILD_SALT=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
# CONFIG_KERNEL_GZIP is not set
CONFIG_KERNEL_BZIP2=y
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_GENERIC_IRQ_MIGRATION=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# CONFIG_GENERIC_IRQ_DEBUGFS is not set
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_ARCH_CLOCKSOURCE_INIT=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_HZ_PERIODIC=y
# CONFIG_NO_HZ_IDLE is not set
# CONFIG_NO_HZ_FULL is not set
# CONFIG_NO_HZ is not set
CONFIG_HIGH_RES_TIMERS=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
# CONFIG_PSI is not set
CONFIG_CPU_ISOLATION=y

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y

Re: [PATCH v3 03/13] fbdev: add kerneldoc do remove_conflicting_framebuffers()

2018-09-05 Thread Paul Menzel
Dear Michał,


Thank you for documenting the function. Do you mean *to* instead of *do*
in the commit message summary?


On 09/01/18 16:08, Michał Mirosław wrote:
> Document remove_conflicting_framebuffers() behaviour.
> 
> Signed-off-by: Michał Mirosław 
> ---
>  drivers/video/fbdev/core/fbmem.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/video/fbdev/core/fbmem.c 
> b/drivers/video/fbdev/core/fbmem.c
> index 0df148eb4699..2de93b5014e3 100644
> --- a/drivers/video/fbdev/core/fbmem.c
> +++ b/drivers/video/fbdev/core/fbmem.c
> @@ -1775,6 +1775,16 @@ int unlink_framebuffer(struct fb_info *fb_info)
>  }
>  EXPORT_SYMBOL(unlink_framebuffer);
>  
> +/**
> + * remove_conflicting_framebuffers - remove firmware-configured framebuffers
> + * @a: memory range, users of which are to be removed
> + * @name: requesting driver name
> + * @primary: also kick vga16fb if present
> + *
> + * This function removes framebuffer devices (initialized by 
> firmware/bootloader)
> + * which use memory range described by @a. If @a is NULL all such devices are
> + * removed.
> + */
>  int remove_conflicting_framebuffers(struct apertures_struct *a,
>   const char *name, bool primary)
>  {

Acked-by: Paul Menzel 


Kind regards,

Paul



___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Decrease boot time by not probing unconnected ports to improve desktop user experience

2018-07-30 Thread Paul Menzel

Dear Linux folks,


It looks like, when a graphics driver is loaded, the DRM methods probe
each connector, whether or not anything is attached to it.


For example, the Asus F2A85-M PRO has four monitor ports, but only one 
monitor is connected over DVI. Here is the log excerpt with Linux 4.18-rc6+.



[0.200631] [drm] DFP2: INTERNAL_UNIPHY
[0.228298] [drm:radeon_atom_encoder_dpms] encoder dpms 33 to mode 3, 
devices 0008, active_devices 
[0.228302] [drm:radeon_atom_encoder_dpms] encoder dpms 33 to mode 3, 
devices 0001, active_devices 
[0.238911] [drm:radeon_atom_encoder_dpms] encoder dpms 30 to mode 3, 
devices 0080, active_devices 
[0.239041] [drm:drm_setup_crtcs] 
[0.239043] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:47:DP-1]

[0.243296] [drm:radeon_dp_aux_transfer_native] dp_aux_ch flags not zero: 
0201
[0.244282] [drm:radeon_dp_aux_transfer_native] dp_aux_ch flags not zero: 
0201
[0.245281] [drm:radeon_dp_aux_transfer_native] dp_aux_ch flags not zero: 
0201
[0.246282] [drm:radeon_dp_aux_transfer_native] dp_aux_ch flags not zero: 
0201
[0.247283] [drm:radeon_dp_aux_transfer_native] dp_aux_ch flags not zero: 
0201
[0.248283] [drm:radeon_dp_aux_transfer_native] dp_aux_ch flags not zero: 
0201
[0.249284] [drm:radeon_dp_aux_transfer_native] dp_aux_ch flags not zero: 
0201
[0.250284] [drm:radeon_dp_aux_transfer_native] dp_aux_ch flags not zero: 
0201
[0.251285] [drm:radeon_dp_aux_transfer_native] dp_aux_ch flags not zero: 
0201
[0.252287] [drm:radeon_dp_aux_transfer_native] dp_aux_ch flags not zero: 
0201
[0.274294] [drm:drm_dp_dpcd_access] Too many retries, giving up. First 
error: -5
[0.274307] [drm:radeon_atombios_connected_scratch_regs] DFP1 disconnected
[0.274311] [drm:drm_dp_i2c_do_msg] transaction failed: -5
[0.274313] [drm:drm_dp_i2c_do_msg] transaction failed: -5
[0.274316] [drm:drm_dp_i2c_do_msg] transaction failed: -5
[0.274318] [drm:drm_dp_i2c_do_msg] transaction failed: -5
[0.274320] [drm:drm_dp_i2c_do_msg] transaction failed: -5
[0.274322] [drm:drm_dp_i2c_do_msg] transaction failed: -5
[0.274324] [drm:drm_dp_i2c_do_msg] transaction failed: -5
[0.274326] [drm:drm_dp_i2c_do_msg] transaction failed: -5
[0.274328] [drm:drm_dp_i2c_do_msg] transaction failed: -5
[0.274330] [drm:drm_dp_i2c_do_msg] transaction failed: -5
[0.274334] [drm:drm_helper_probe_single_connector_modes] 
[CONNECTOR:47:DP-1] status updated from unknown to disconnected
[0.274335] [drm:drm_helper_probe_single_connector_modes] 
[CONNECTOR:47:DP-1] disconnected
[0.274338] [drm:drm_helper_probe_single_connector_modes] 
[CONNECTOR:50:VGA-1]
[0.276306] [drm:radeon_dp_getdpcd] DPCD: 11 0a 84 01 00 0b 01 01 02 00 00 
00 00 00 00
[0.293299] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[0.294299] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[0.315804] [drm:radeon_atom_dig_detect] Bios 0 scratch 2 0001
[0.315809] [drm:radeon_atombios_connected_scratch_regs] CRT1 disconnected
[0.315812] [drm:radeon_atombios_connected_scratch_regs] CRT1 disconnected
[0.316304] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[0.317305] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[0.318305] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[0.319305] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[0.320305] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[0.321306] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[0.322307] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[0.323307] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[0.324307] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[0.325308] [drm:drm_dp_i2c_do_msg] I2C nack (result=0, size=0
[0.325321] [drm:drm_helper_probe_single_connector_modes] 
[CONNECTOR:50:VGA-1] status updated from unknown to disconnected
[0.325323] [drm:drm_helper_probe_single_connector_modes] 
[CONNECTOR:50:VGA-1] disconnected
[0.325326] [drm:drm_helper_probe_single_connector_modes] 
[CONNECTOR:52:DVI-D-1]
[0.354644] [drm:radeon_atombios_connected_scratch_regs] DFP2 connected
[0.354646] [drm:drm_helper_probe_single_connector_modes] 
[CONNECTOR:52:DVI-D-1] status updated from unknown to connected
[0.354648] [drm:drm_add_display_info] non_desktop set to 0
[0.354652] [drm:drm_add_edid_modes] ELD: no CEA Extension found
[0.354653] [drm:drm_add_display_info] non_desktop set to 0
[0.354743] [drm:drm_helper_probe_single_connector_modes] 
[CONNECTOR:52:DVI-D-1] probed modes :
[0.354746] [drm:drm_mode_debug_printmodeline] Modeline 54:"1280x1024" 60 
108000 1280 1328 1440 1688 1024 1025 1028 1066 0x48 0x5
[0.354748] [drm:drm_mode_debug_printmodeline] Modeline 64:"1280x1024" 75 
135000 1280 1296 1440 1688 1024 1025 1028 1066 0x40 0x5
[0.354750] [drm:drm_mode_debug_printmodeline] Modeline 55:"1280x960" 60 
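
To put numbers on the cost of probing each port, the timestamps in an excerpt like the one above can be compared per connector. The following is a small Python sketch under stated assumptions: `probe_durations` and `PROBE_RE` are hypothetical helper names, and the sample lines are copied from the excerpt (only the first and last probe message of the two disconnected connectors).

```python
import re

# Matches "[<timestamp>] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:<id>:<name>]".
# \s+ also spans the line breaks that the mail archive inserted into long lines.
PROBE_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+"
    r"\[drm:drm_helper_probe_single_connector_modes\]\s+"
    r"\[CONNECTOR:\d+:([\w-]+)\]"
)


def probe_durations(log_text):
    """Map connector name -> seconds between its first and last probe message."""
    first, last = {}, {}
    for timestamp, name in PROBE_RE.findall(log_text):
        t = float(timestamp)
        first.setdefault(name, t)  # remember only the earliest timestamp
        last[name] = t             # keep overwriting with the latest one
    return {name: last[name] - first[name] for name in first}


# Lines copied from the excerpt above, including the archive's line wrapping.
SAMPLE = """\
[0.239043] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:47:DP-1]
[0.274335] [drm:drm_helper_probe_single_connector_modes]
[CONNECTOR:47:DP-1] disconnected
[0.274338] [drm:drm_helper_probe_single_connector_modes]
[CONNECTOR:50:VGA-1]
[0.325323] [drm:drm_helper_probe_single_connector_modes]
[CONNECTOR:50:VGA-1] disconnected
"""

for name, seconds in probe_durations(SAMPLE).items():
    print(f"{name}: {seconds * 1000:.1f} ms")
```

On the timestamps above this gives roughly 35 ms for the unconnected DP-1 port and roughly 51 ms for the unconnected VGA-1 port.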

Re: [4.17-rc4+ regression] X server does not start anymore with segmentation fault in `r600_dri.so`

2018-05-16 Thread Paul Menzel

Dear Michel,


Am 15.05.2018 um 10:41 schrieb Michel Dänzer:

On 2018-05-15 08:38 AM, Paul Menzel wrote:

On 2018-05-14 10:44, Michel Dänzer wrote:

On 2018-05-13 11:01 AM, Paul Menzel wrote:



There is a regression introduced by a commit after Linux 4.17-rc4
that causes the X.Org X server to fail to start with the Radeon module
loaded on Debian Sid/unstable. The same Linux kernel build works with the
modesetting driver on the same system (no module *radeon* loaded) and
with i915 and the modesetting driver on a different system with Debian
9.4 (Stretch/stable).


[    16.263] xf86EnableIOPorts: failed to set IOPL for I/O (Operation
not permitted)
[…]
[    16.765] (EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x50)
[0x5b4e60]
[    16.766] (EE) 1: /usr/lib/xorg/Xorg (0x40d000+0x1abd92) [0x5b8d92]
[    16.766] (EE) 2: linux-gate.so.1 (__kernel_rt_sigreturn+0x0)
[0xb7f2ad5c]
[    16.766] (EE) 3: /lib/i386-linux-gnu/libc.so.6
(0xb78a+0x140099) [0xb79e0099]
[    16.766] (EE) 4: /usr/lib/i386-linux-gnu/dri/r600_dri.so
(0xb62f9000+0x6698fd) [0xb69628fd]


Crashes in r600_dri.so => most likely a Mesa bug. Can you get a gdb
backtrace of the crash?


```
#0  0xb7f1bd45 in __kernel_vsyscall ()
#1  0xb78bd5b2 in __libc_signal_restore_set (set=0xbf93883c) at
../sysdeps/unix/sysv/linux/nptl-signals.h:80
#2  __GI_raise (sig=6) at ../sysdeps/unix/sysv/linux/raise.c:48
#3  0xb78be9d1 in __GI_abort () at abort.c:79
#4  0x0061bf45 in OsAbort () at ../../../../os/utils.c:1361
#5  0x004ec96c in ddxGiveUp (error=EXIT_ERR_ABORT) at
../../../../../../hw/xfree86/common/xf86Init.c:1011
#6  0x004eca05 in AbortDDX (error=EXIT_ERR_ABORT) at
../../../../../../hw/xfree86/common/xf86Init.c:1055
#7  0x00621c6f in AbortServer () at ../../../../os/log.c:874
#8  0x00622654 in FatalError (f=0x650110 "Caught signal %d (%s). Server
aborting\n") at ../../../../os/log.c:1015
#9  0x00618def in OsSigHandler (signo=11, sip=0xbf938b4c,
unused=0xbf938bcc) at ../../../../os/osinit.c:154
#10 <signal handler called>
#11 __memcpy_ssse3 () at ../sysdeps/i386/i686/multiarch/memcpy-ssse3.S:144
#12 0xb69518fd in memcpy (__len=48, __src=,
__dest=) at
/usr/include/i386-linux-gnu/bits/string_fortified.h:34
#13 r600_create_vertex_fetch_shader (ctx=0xf08e40, count=2,
elements=0xbf93933c) at
../../../../../src/gallium/drivers/r600/r600_asm.c:2701
```


Does
https://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes&id=76ef6b28ea4f81c3d511866a9b31392caa833126
help?


Yes, it does help. The problem is no longer reproducible with
v4.17-rc5-20-g21b9f1c7e319.



BTW, if the CPU in this system is 64-bit capable, I'd recommend running
a 64-bit X server.

I still have a Lenovo X60, which is 32-bit. As the SSD is easily
portable, I use the same system installation for the Lenovo X60 and the
ASRock E350M1.



Kind regards,

Paul


Re: [PATCH v2] drm/vc4: Fix sleeps during the IRQ handler for DSI transactions.

2017-10-14 Thread Paul Menzel
Dear Eric,


Some nitpicks where the patch deviates from the coding style.

Am Freitag, den 13.10.2017, 17:12 -0700 schrieb Eric Anholt:
> VC4's DSI1 has a bug where the AXI connection is broken for 32-bit
> writes from the CPU, so we use the DMA engine to DMA 32-bit values
> into registers instead.  That sleeps, so we can't do it from the top
> half.
> 
> As a solution, use an interrupt thread so that all our writes happen
> when sleeping is is allowed.
> 
> v2: Use IRQF_ONESHOT (suggested by Boris)
> 
> Signed-off-by: Eric Anholt 
> ---
> 
> Boris, that cleanup ended up working and it looks great.  Thanks!
> 
>  drivers/gpu/drm/vc4/vc4_dsi.c | 31 +--
>  1 file changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vc4/vc4_dsi.c b/drivers/gpu/drm/vc4/vc4_dsi.c
> index 554605af344e..3b74fda5662d 100644
> --- a/drivers/gpu/drm/vc4/vc4_dsi.c
> +++ b/drivers/gpu/drm/vc4/vc4_dsi.c
> @@ -1360,6 +1360,25 @@ static void dsi_handle_error(struct vc4_dsi *dsi,
>   *ret = IRQ_HANDLED;
>  }
>  
> +/* Initial handler for port 1 where we need the reg_dma workaround.
> + * The register DMA writes sleep, so we can't do it in the top half.
> + * Instead we use IRQF_ONESHOT so that the IRQ gets disabled in the
> + * parent interrupt contrller until our interrupt thread is done.
> + */

There should be a “blank comment line” at the beginning [1].

> The preferred style for long (multi-line) comments is:
> 
> /*
>  * This is the preferred style for multi-line
>  * comments in the Linux kernel source code.
>  * Please use it consistently.
>  *
>  * Description:  A column of asterisks on the left side,
>  * with beginning and ending almost-blank lines.
>  */

> +static irqreturn_t vc4_dsi_irq_defer_to_thread_handler(int irq, void *data)
> +{
> + struct vc4_dsi *dsi = data;
> + u32 stat = DSI_PORT_READ(INT_STAT);
> +
> + if (!stat)
> + return IRQ_NONE;
> +
> + return IRQ_WAKE_THREAD;
> +}
> +
> +/* Normal IRQ handler for port 0, or the threaded IRQ handler for port
> + * 1 where we need the reg_dma workaround.
> + */
>  static irqreturn_t vc4_dsi_irq_handler(int irq, void *data)
>  {
>   struct vc4_dsi *dsi = data;
> @@ -1539,8 +1558,16 @@ static int vc4_dsi_bind(struct device *dev, struct 
> device *master, void *data)
>   /* Clear any existing interrupt state. */
>   DSI_PORT_WRITE(INT_STAT, DSI_PORT_READ(INT_STAT));
>  
> - ret = devm_request_irq(dev, platform_get_irq(pdev, 0),
> -vc4_dsi_irq_handler, 0, "vc4 dsi", dsi);
> + if (dsi->reg_dma_mem) {
> + ret = devm_request_threaded_irq(dev, platform_get_irq(pdev, 0),
> + 
> vc4_dsi_irq_defer_to_thread_handler,
> + vc4_dsi_irq_handler,
> + IRQF_ONESHOT,
> + "vc4 dsi", dsi);
> + } else {
> + ret = devm_request_irq(dev, platform_get_irq(pdev, 0),
> +vc4_dsi_irq_handler, 0, "vc4 dsi", dsi);
> + }

The braces should be removed [2].

> Do not unnecessarily use braces where a single statement will do.

>   if (ret) {
>   if (ret != -EPROBE_DEFER)
>   dev_err(dev, "Failed to get interrupt: %d\n", ret);


Thanks,

Paul


[1] https://www.kernel.org/doc/html/v4.13/process/coding-style.html#commenting
[2] https://www.kernel.org/doc/html/v4.13/process/coding-style.html#placing-braces-and-spaces



Re: dix/dispatch.c:4049: AttachOffloadGPU: Assertion `new->current_master == pScreen' failed.

2017-08-15 Thread Paul Menzel

Dear Michel,


On 2017-08-15 03:19, Michel Dänzer wrote:

On 15/08/17 02:48 AM, Paul Menzel wrote:

On 08/14/17 03:15, Michel Dänzer wrote:

On 11/08/17 01:02 AM, Paul Menzel wrote:

On 08/10/17 15:14, Paul Menzel wrote:


```
Xorg:
/dev/shm/bee-root/xorg-server/xorg-server-1.19.3-0/source/dix/dispatch.c:4049:
AttachOffloadGPU: Assertion `new->current_master == pScreen' failed.
```

The USB keyboard does not work, and the system has to be controlled
remotely over SSH. When plugging the DisplayPort cable into the internal
Intel graphics port, the Linux messages can be seen, but nothing else
works. Stopping GDM does not bring back the TTY screen. The keyboard
still does not work. Unplugging the USB cable and plugging it back in
does not fix it either.

Trying to disable the internal Intel graphics card did not fix the
problem either.


Blacklisting the module *i915* works around the problem.

Please tell me if you want me to create a bug report for this issue,
and, if yes, what part in the graphics stack this problem belongs to.


It's hard to be sure without seeing a backtrace of the assertion
failure, but I'd say start at
https://bugs.freedesktop.org/enter_bug.cgi?product=xorg&component=Server/Ext/RandR.



Do you know why the stack trace is not present in the X server log?


Because your xserver build doesn't have
https://cgit.freedesktop.org/xorg/xserver/commit/?id=27a6b9f7c84c914d0f5909ec1069d72f5035bc04
(it's only in Git master yet).


With this patch the stack trace is indeed in the log.

```
[…]
Xorg: 
/dev/shm/bee-root/xorg-server/xorg-server-1.19.3-1/source/dix/dispatch.c:4049: 
AttachOffloadGPU: Assertion `new->current_master == pScreen' failed.

(EE)
(EE) Backtrace:
(EE) 0: /usr/libexec/Xorg (xorg_backtrace+0x41) [0x580461]
(EE) 1: /usr/libexec/Xorg (0x40+0x183e09) [0x583e09]
(EE) 2: /lib/libpthread.so.0 (0x7f6f957d3000+0x11a50) [0x7f6f957e4a50]
(EE) 3: /lib/libc.so.6 (gsignal+0x104) [0x7f6f954682f4]
(EE) 4: /lib/libc.so.6 (abort+0x16a) [0x7f6f9546975a]
(EE) 5: /lib/libc.so.6 (0x7f6f95435000+0x2c027) [0x7f6f95461027]
(EE) 6: /lib/libc.so.6 (0x7f6f95435000+0x2c0d2) [0x7f6f954610d2]
(EE) 7: /usr/libexec/Xorg (AttachOffloadGPU+0x6e) [0x4357fe]
(EE) 8: /usr/libexec/Xorg (0x40+0xa8b76) [0x4a8b76]
(EE) 9: /usr/libexec/Xorg (InitOutput+0x5b3) [0x4775b3]
(EE) 10: /usr/libexec/Xorg (0x40+0x38cb6) [0x438cb6]
(EE) 11: /lib/libc.so.6 (__libc_start_main+0xf0) [0x7f6f95455520]
(EE) 12: /usr/libexec/Xorg (_start+0x2a) [0x42421a]
(EE)
(EE)
Fatal server error:
(EE) Caught signal 6 (Aborted). Server aborting
(EE)
(EE)
[…]
```
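
For longer logs, the symbolized frames of such a backtrace can be pulled out mechanically to see which functions lead up to the abort. The following is a minimal Python sketch, assuming the `(EE) N: binary (symbol+offset) [address]` line format shown above; `named_frames` and `FRAME_RE` are hypothetical helper names, and the sample lines are copied from the backtrace.

```python
import re

# One backtrace line looks like:
# "(EE) 7: /usr/libexec/Xorg (AttachOffloadGPU+0x6e) [0x4357fe]"
FRAME_RE = re.compile(r"\(EE\) (\d+): (\S+) \((.+?)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]")


def named_frames(log_text):
    """Return (index, binary, symbol) for frames that carry a symbol name."""
    frames = []
    for index, binary, symbol in FRAME_RE.findall(log_text):
        if not symbol.startswith("0x"):  # skip frames given only as base+offset
            frames.append((int(index), binary, symbol))
    return frames


# Lines copied from the backtrace above.
SAMPLE = """\
(EE) 5: /lib/libc.so.6 (0x7f6f95435000+0x2c027) [0x7f6f95461027]
(EE) 6: /lib/libc.so.6 (0x7f6f95435000+0x2c0d2) [0x7f6f954610d2]
(EE) 7: /usr/libexec/Xorg (AttachOffloadGPU+0x6e) [0x4357fe]
(EE) 8: /usr/libexec/Xorg (0x40+0xa8b76) [0x4a8b76]
(EE) 9: /usr/libexec/Xorg (InitOutput+0x5b3) [0x4775b3]
"""

for index, binary, symbol in named_frames(SAMPLE):
    print(index, binary, symbol)
```

On the sample this keeps frames 7 (`AttachOffloadGPU`) and 9 (`InitOutput`), which matches the assertion message: the failure happens during server initialization.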


I was able to get a stack trace as root and with GDB. Issue #10
(dix/dispatch.c:4049: AttachOffloadGPU: Assertion `new->current_master
== pScreen' failed.) tracks this problem now, and the stack traces are
attached there.


Thanks, though it looks like it might actually be a downstream issue.


I don’t have my login credentials handy right now, so I attach the X
server log here and will go looking for the credentials.


Kind regards,

Paul

[388062.443]
X.Org X Server 1.19.3
Release Date: 2017-03-15
[388062.443] X Protocol Version 11, Revision 0
[388062.443] Build Operating System: Linux 4.9.38.mx64.164 x86_64 
[388062.443] Current Operating System: Linux curie.molgen.mpg.de 
4.12.5.mx64.169 #1 SMP Tue Aug 8 11:22:02 CEST 2017 x86_64
[388062.443] Kernel command line: init=/bin/systemd 
BOOT_IMAGE=/boot/bzImage-4.12.5.mx64.169 crashkernel=256M root=LABEL=root ro 
console=ttyS1,115200n8 console=tty0
[388062.443] Build Date: 15 August 2017  05:40:26AM
[388062.443]  
[388062.443] Current version of pixman: 0.34.0
[388062.443]Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.
[388062.443] Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[388062.443] (==) Log file: "/var/log/Xorg.5.log", Time: Tue Aug 15 05:44:59 
2017
[388062.444] (==) Using config file: "/etc/X11/xorg.conf"
[388062.444] (==) Using config directory: "/etc/X11/xorg.conf.d"
[388062.444] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[388062.444] (==) No Layout section.  Using the first Screen section.
[388062.444] (==) No screen section available. Using defaults.
[388062.444] (**) |-->Screen "Default Screen Section" (0)
[388062.444] (**) |   |-->Monitor ""
[388062.444] (==) No monitor specified for screen "Default Screen Section".
Using a default monitor configuration.
[388062.444] (==) Automatically adding devices
[388062.444] (==) Automatically enabling devices
[388062.444] (==) Automatically adding GPU devices
[388062.444] (==) Max clients allowed: 256, resource mask: 0x1f
[388062.444] (==) FontPath set to:
/usr/share/fonts/X11/misc/,
/usr/share/fonts/X11/TTF/,
/usr/share/fonts/X11/OTF/,
  

Re: dix/dispatch.c:4049: AttachOffloadGPU: Assertion `new->current_master == pScreen' failed.

2017-08-14 Thread Paul Menzel

Dear Michel,


Thank you for your reply.


On 08/14/17 03:15, Michel Dänzer wrote:

On 11/08/17 01:02 AM, Paul Menzel wrote:



On 08/10/17 15:14, Paul Menzel wrote:


On Dell Precision 3620 systems with an Intel Skylake i7-6700K and an
external AMD graphics card, the X.Org server does not start and hits an
assertion. The software stack is Linux 4.9.38 and 4.12.5, X.Org server
1.19.3 and xf86-video-amdgpu 1.3.0.

```
$ lspci -nn -s 00:01.0
00:01.0 PCI bridge [0604]: Intel Corporation Skylake PCIe Controller
(x16) [8086:1901] (rev 07)
$ lspci -nn -s 1:00.0
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc.
[AMD/ATI] Bonaire [FirePro W5100] [1002:6649]
```

The monitor’s DVI cable is plugged, via a DVI-to-DP adapter, into the
DisplayPort port of the AMD graphics card. The TTY login screen can be
seen briefly before GDM tries to start.

Please find the last X.Org server messages below.

```
[…]
(II) Loading /usr/lib/xorg/modules/libfb.so
(II) Module fb: vendor="X.Org Foundation"
  compiled for 1.19.3, module version = 1.0.0
(--) Depth 24 pixmap format is 32 bpp
(==) modeset(G0): Backing store enabled
(==) modeset(G0): Silken mouse enabled
(II) modeset(G0): RandR 1.2 enabled, ignore the following RandR
disabled message.
(==) modeset(G0): DPMS enabled
(II) modeset(G0): [DRI2] Setup complete
(II) modeset(G0): [DRI2]   DRI driver: i965
(II) modeset(G0): [DRI2]   VDPAU driver: i965
(II) AMDGPU(0): [DRI2] Setup complete
(II) AMDGPU(0): [DRI2]   DRI driver: radeonsi
(II) AMDGPU(0): [DRI2]   VDPAU driver: radeonsi
(II) AMDGPU(0): Front buffer pitch: 6912 bytes
(II) AMDGPU(0): SYNC extension fences enabled
(II) AMDGPU(0): Present extension enabled
(==) AMDGPU(0): DRI3 enabled
(==) AMDGPU(0): Backing store enabled
(II) AMDGPU(0): Direct rendering enabled
(II) AMDGPU(0): Use GLAMOR acceleration.
(II) AMDGPU(0): Acceleration enabled
(==) AMDGPU(0): DPMS enabled
(==) AMDGPU(0): Silken mouse enabled
(II) AMDGPU(0): Set up textured video (glamor)
(II) AMDGPU(0): RandR 1.2 enabled, ignore the following RandR disabled
message.
(--) RandR disabled
Xorg:
/dev/shm/bee-root/xorg-server/xorg-server-1.19.3-0/source/dix/dispatch.c:4049:
AttachOffloadGPU: Assertion `new->current_master == pScreen' failed.
```

The USB keyboard does not work, and the system has to be controlled
remotely over SSH. When plugging the DisplayPort cable into the internal
Intel graphics port, the Linux messages can be seen, but nothing else
works. Stopping GDM does not bring back the TTY screen. The keyboard
still does not work. Unplugging the USB cable and plugging it back in
does not fix it either.

Trying to disable the internal Intel graphics card did not fix the
problem either.


Blacklisting the module *i915* works around the problem.

Please tell me if you want me to create a bug report for this issue,
and, if yes, what part in the graphics stack this problem belongs to.


It's hard to be sure without seeing a backtrace of the assertion
failure, but I'd say start at
https://bugs.freedesktop.org/enter_bug.cgi?product=xorg&component=Server/Ext/RandR.


Do you know why the stack trace is not present in the X server log? I 
was able to get a stack trace as root and with GDB. Issue #10 
(dix/dispatch.c:4049: AttachOffloadGPU: Assertion `new->current_master 
== pScreen' failed.) tracks this problem now, and the stack traces are 
attached there.



Kind regards,

Paul


[1] https://bugs.freedesktop.org/show_bug.cgi?id=10


Re: dix/dispatch.c:4049: AttachOffloadGPU: Assertion `new->current_master == pScreen' failed.

2017-08-12 Thread Paul Menzel

Dear Linux folks,


On 08/10/17 15:14, Paul Menzel wrote:

On Dell Precision 3620 systems with an Intel Skylake i7-6700K and an
external AMD graphics card, the X.Org server does not start and hits an
assertion. The software stack is Linux 4.9.38 and 4.12.5, X.Org server
1.19.3 and xf86-video-amdgpu 1.3.0.


```
$ lspci -nn -s 00:01.0
00:01.0 PCI bridge [0604]: Intel Corporation Skylake PCIe Controller 
(x16) [8086:1901] (rev 07)

$ lspci -nn -s 1:00.0
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. 
[AMD/ATI] Bonaire [FirePro W5100] [1002:6649]

```

The monitor’s DVI cable is plugged, via a DVI-to-DP adapter, into the
DisplayPort port of the AMD graphics card. The TTY login screen can be
seen briefly before GDM tries to start.


Please find the last X.Org server messages below.

```
[…]
(II) Loading /usr/lib/xorg/modules/libfb.so
(II) Module fb: vendor="X.Org Foundation"
 compiled for 1.19.3, module version = 1.0.0
(--) Depth 24 pixmap format is 32 bpp
(==) modeset(G0): Backing store enabled
(==) modeset(G0): Silken mouse enabled
(II) modeset(G0): RandR 1.2 enabled, ignore the following RandR disabled 
message.

(==) modeset(G0): DPMS enabled
(II) modeset(G0): [DRI2] Setup complete
(II) modeset(G0): [DRI2]   DRI driver: i965
(II) modeset(G0): [DRI2]   VDPAU driver: i965
(II) AMDGPU(0): [DRI2] Setup complete
(II) AMDGPU(0): [DRI2]   DRI driver: radeonsi
(II) AMDGPU(0): [DRI2]   VDPAU driver: radeonsi
(II) AMDGPU(0): Front buffer pitch: 6912 bytes
(II) AMDGPU(0): SYNC extension fences enabled
(II) AMDGPU(0): Present extension enabled
(==) AMDGPU(0): DRI3 enabled
(==) AMDGPU(0): Backing store enabled
(II) AMDGPU(0): Direct rendering enabled
(II) AMDGPU(0): Use GLAMOR acceleration.
(II) AMDGPU(0): Acceleration enabled
(==) AMDGPU(0): DPMS enabled
(==) AMDGPU(0): Silken mouse enabled
(II) AMDGPU(0): Set up textured video (glamor)
(II) AMDGPU(0): RandR 1.2 enabled, ignore the following RandR disabled 
message.

(--) RandR disabled
Xorg: 
/dev/shm/bee-root/xorg-server/xorg-server-1.19.3-0/source/dix/dispatch.c:4049: 
AttachOffloadGPU: Assertion `new->current_master == pScreen' failed.

```

The USB keyboard does not work, and the system has to be controlled
remotely over SSH. When plugging the DisplayPort cable into the internal
Intel graphics port, the Linux messages can be seen, but nothing else
works. Stopping GDM does not bring back the TTY screen. The keyboard
still does not work. Unplugging the USB cable and plugging it back in
does not fix it either.


Trying to disable the internal Intel graphics card did not fix the
problem either.


Blacklisting the module *i915* works around the problem.

Please tell me if you want me to create a bug report for this issue, 
and, if yes, what part in the graphics stack this problem belongs to.



Kind regards,

Paul


dix/dispatch.c:4049: AttachOffloadGPU: Assertion `new->current_master == pScreen' failed.

2017-08-12 Thread Paul Menzel

Dear Linux folks,


On Dell Precision 3620 systems with an Intel Skylake i7-6700K and an
external AMD graphics card, the X.Org server does not start and hits an
assertion. The software stack is Linux 4.9.38 and 4.12.5, X.Org server
1.19.3 and xf86-video-amdgpu 1.3.0.


```
$ lspci -nn -s 00:01.0
00:01.0 PCI bridge [0604]: Intel Corporation Skylake PCIe Controller 
(x16) [8086:1901] (rev 07)

$ lspci -nn -s 1:00.0
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. 
[AMD/ATI] Bonaire [FirePro W5100] [1002:6649]

```

The monitor’s DVI cable is plugged, via a DVI-to-DP adapter, into the
DisplayPort port of the AMD graphics card. The TTY login screen can be
seen briefly before GDM tries to start.


Please find the last X.Org server messages below.

```
[…]
(II) Loading /usr/lib/xorg/modules/libfb.so
(II) Module fb: vendor="X.Org Foundation"
compiled for 1.19.3, module version = 1.0.0
(--) Depth 24 pixmap format is 32 bpp
(==) modeset(G0): Backing store enabled
(==) modeset(G0): Silken mouse enabled
(II) modeset(G0): RandR 1.2 enabled, ignore the following RandR disabled message.
(==) modeset(G0): DPMS enabled
(II) modeset(G0): [DRI2] Setup complete
(II) modeset(G0): [DRI2]   DRI driver: i965
(II) modeset(G0): [DRI2]   VDPAU driver: i965
(II) AMDGPU(0): [DRI2] Setup complete
(II) AMDGPU(0): [DRI2]   DRI driver: radeonsi
(II) AMDGPU(0): [DRI2]   VDPAU driver: radeonsi
(II) AMDGPU(0): Front buffer pitch: 6912 bytes
(II) AMDGPU(0): SYNC extension fences enabled
(II) AMDGPU(0): Present extension enabled
(==) AMDGPU(0): DRI3 enabled
(==) AMDGPU(0): Backing store enabled
(II) AMDGPU(0): Direct rendering enabled
(II) AMDGPU(0): Use GLAMOR acceleration.
(II) AMDGPU(0): Acceleration enabled
(==) AMDGPU(0): DPMS enabled
(==) AMDGPU(0): Silken mouse enabled
(II) AMDGPU(0): Set up textured video (glamor)
(II) AMDGPU(0): RandR 1.2 enabled, ignore the following RandR disabled message.
(--) RandR disabled
Xorg: /dev/shm/bee-root/xorg-server/xorg-server-1.19.3-0/source/dix/dispatch.c:4049: AttachOffloadGPU: Assertion `new->current_master == pScreen' failed.

```

The USB keyboard does not work, and the system has to be controlled 
remotely over SSH. After plugging the DisplayPort cable into one of the 
internal Intel graphics ports, the Linux messages can be seen, but nothing 
else works. Stopping GDM does not bring back the TTY screen. The keyboard 
still does not work. Unplugging the USB cable and plugging it back in 
doesn’t fix it either.


Trying to disable the internal Intel graphics card did not fix the 
problem either.



Kind regards,

Paul


OT: Patches for Chromium browser to use VAAPI available

2017-07-31 Thread Paul Menzel
Dear Linux folks,


Just a small note that Intel pushed patches for the Chromium browser
for review [1] to use VAAPI [2] along with vaapi-driver to run all
supported media use cases with hardware acceleration.

This will hopefully let GNU/Linux desktop systems catch up further
with Microsoft Windows systems, especially on mobile devices like laptops,
where this makes a difference in how long the battery lasts.

Unfortunately, Mozilla Firefox also still lacks support for this. Bug
report #1210727 [3] has been open for several years. Maybe Intel would
be so kind as to do that work as well. ;-)


Thanks,

Paul


[1] https://chromium-review.googlesource.com/c/532294
[2] https://01.org/linuxgraphics/community/vaapi
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=1210727



Re: [PATCH 11/14] drm/mgag200: Consolidate depth/bpp handling

2017-07-20 Thread Paul Menzel
Dear Takashi,


Thank you for posting these patches for review.


Am Dienstag, den 18.07.2017, 16:43 +0200 schrieb Takashi Iwai:
> From: Egbert Eich 
> 
> The depth/bpp handling for chips with limited memory in commit
> 918be888d613 ("drm/mgag200: on cards with < 2MB VRAM default to
> 16-bit") was incomplete: the bpp limits were applied to mode
> validation.
> 
> This consolidates dpeth/bpp handling, adds it to mode validation

depth

> and moves the code which reads the command line specified depth
> into the correct location.
> 
> Fixes: 918be888d613 ("drm/mgag200: on cards with < 2MB VRAM default to 
> 16-bit")

```
$ git tag --contains 918be888d613 | head -1
v3.14
```

Could you please also tag this for stable, then?

Also, could the original commit author time-stamps be preserved if you
have them in your SUSE repositories? The `Date` line could be added
below the `From` line. That would help to know for how long the changes
have been tested.

> Signed-off-by: Egbert Eich 
> Signed-off-by: Takashi Iwai 
> ---
>  drivers/gpu/drm/mgag200/mgag200_drv.h  |  2 ++
>  drivers/gpu/drm/mgag200/mgag200_fb.c   |  7 +--
>  drivers/gpu/drm/mgag200/mgag200_main.c |  9 ++---
>  drivers/gpu/drm/mgag200/mgag200_mode.c | 14 +++---
>  4 files changed, 16 insertions(+), 16 deletions(-)

[…]

Otherwise this looks fine.


Thanks,

Paul



Re: [PATCH] dim: Switch Link: tags to https://

2017-07-16 Thread Paul Menzel
Dear Daniel,


Am Sonntag, den 16.07.2017, 12:26 +0200 schrieb Daniel Vetter:
> http: gets a "301 moved permanently" reply.
> 
> Reported-by: Paul Menzel <paulepan...@users.sourceforge.net>
> Cc: Paul Menzel <paulepan...@users.sourceforge.net>
> Signed-off-by: Daniel Vetter <daniel.vet...@intel.com>

Reviewed-by: Paul Menzel <paulepan...@users.sourceforge.net>


Thanks,

Paul



Please use HTTPS URLs for Patchwork references

2017-07-16 Thread Paul Menzel
Dear Linux folks,


Currently, commits contain non-HTTPS URLs like the one below.

```
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-7-ville.syrj...@linux.intel.com
```

I guess these are generated by scripts. Could you please update them to
generate HTTPS URLs, as that’s where the “non-secure” URLs are
redirected to?

```
$ curl -I http://patchwork.freedesktop.org
HTTP/1.1 301 Moved Permanently
Date: Sun, 16 Jul 2017 09:25:15 GMT
Server: Apache/2.4.10 (Debian)
Location: https://patchwork.freedesktop.org/
Content-Type: text/html; charset=iso-8859-1
```
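
To illustrate the requested change, here is a minimal sketch of a rewrite that could be applied to existing commit messages. It is not the code of the script that generates the tags; the helper name `use_https_links` and the placeholder message ID are made up for this example, and it only touches `Link:` lines pointing at the Patchwork host shown above.

```python
import re

# Matches a "Link:" tag whose URL uses plain http:// on the Patchwork host
# from the example above. Everything else in the message is left untouched.
_LINK_RE = re.compile(
    r"^(Link:[ \t]*)http://(patchwork\.freedesktop\.org/)", re.MULTILINE
)


def use_https_links(commit_message: str) -> str:
    """Return the message with http:// Patchwork Link: tags switched to https://."""
    return _LINK_RE.sub(r"\1https://\2", commit_message)


if __name__ == "__main__":
    # "<message-id>" is a placeholder, not a real patch reference.
    message = (
        "drm/i915: example subject\n"
        "\n"
        "Link: http://patchwork.freedesktop.org/patch/msgid/<message-id>\n"
    )
    print(use_https_links(message))
```

Lines that already use `https://`, or that mention the host outside a `Link:` tag, are returned unchanged.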


Thanks,

Paul



Caching of EDID for X server to decrease startup time of X server

2015-08-08 Thread Paul Menzel
Dear DRI folks,


I am trying to speed up booting to the graphical login screen. The goal
is that, after entering the LUKS password, the graphical login manager
(GDM in my case) appears in less than one second.

I am using Debian 8.1 (Jessie) with Linux 3.16.7 and the X server
1.16.4.

Currently, the X server takes quite some time: the following command
takes about 0.7 seconds. Note that this includes the shutdown time
too.

$ time sudo X -pogo :1 >& /dev/null # from within a running GNOME session

real    0m0.709s
user    0m0.068s
sys     0m0.128s

I’ll paste the content of `/var/log/Xorg.1.log` below. The time there
goes from 231.235 to 231.922, which is 687 ms.
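
The 687 ms figure can be reproduced from the log itself. The following is only a sketch: the function name `elapsed_ms` is made up, and it assumes that every relevant log line starts with a bracketed timestamp in seconds, as in the excerpt below.

```python
import re

# Log lines start with a bracketed timestamp in seconds, e.g. "[   231.235] ...".
_TIMESTAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")


def elapsed_ms(log_text: str) -> int:
    """Return the milliseconds between the first and last timestamped line."""
    stamps = [
        float(match.group(1))
        for line in log_text.splitlines()
        if (match := _TIMESTAMP_RE.match(line))
    ]
    if not stamps:
        raise ValueError("no timestamped lines found")
    return round((stamps[-1] - stamps[0]) * 1000)


if __name__ == "__main__":
    # Two-line stand-in for the real log: first timestamp from the paste below,
    # last timestamp as quoted above (the text of that last line is a placeholder).
    sample = (
        "[   231.235] X Protocol Version 11, Revision 0\n"
        "[   231.922] (placeholder for the last log line)\n"
    )
    print(elapsed_ms(sample))
```

Continuation lines without a timestamp are simply skipped, so the wrapped log lines do not affect the result.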

[   231.235] 
X.Org X Server 1.16.4
Release Date: 2014-12-20
[   231.235] X Protocol Version 11, Revision 0
[   231.235] Build Operating System: Linux 3.16.0-4-amd64 x86_64 Debian
[   231.235] Current Operating System: Linux mattotaupa 3.16.0-4-amd64 #1 
SMP Debian 3.16.7-ckt11-1+deb8u3 (2015-08-04) x86_64
[   231.235] Kernel command line: BOOT_IMAGE=/vmlinuz-3.16.0-4-amd64 
root=UUID=1b45d72e-7bd8-490f-bd9e-7e5990859148 ro quiet noisapnp 
drm_kms_helper.poll=0 radeon.dpm=1 pcie_aspm=force pcie_aspm.policy=powersave 
nmi_watchdog=0
[   231.235] Build Date: 11 February 2015  12:32:02AM
[   231.235] xorg-server 2:1.16.4-1 (http://www.debian.org/support) 
[   231.235] Current version of pixman: 0.32.6
[   231.235]Before reporting problems, check 
http://wiki.x.org
to make sure that you have the latest version.
[   231.235] Markers: (--) probed, (**) from config file, (==) default 
setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[   231.235] (==) Log file: "/var/log/Xorg.1.log", Time: Sat Aug  8 
10:01:09 2015
[   231.236] (==) Using config directory: "/etc/X11/xorg.conf.d"
[   231.236] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[   231.236] (==) No Layout section.  Using the first Screen section.
[   231.236] (==) No screen section available. Using defaults.
[   231.236] (**) |-->Screen "Default Screen Section" (0)
[   231.236] (**) |   |-->Monitor ""
[   231.236] (==) No device specified for screen "Default Screen Section".
Using the first device section listed.
[   231.236] (**) |   |-->Device "RS780 [Radeon HD 3200]"
[   231.236] (==) No monitor specified for screen "Default Screen Section".
Using a default monitor configuration.
[   231.236] (==) Automatically adding devices
[   231.236] (==) Automatically enabling devices
[   231.236] (==) Automatically adding GPU devices
[   231.236] (WW) The directory "/usr/share/fonts/X11/misc" does not exist.
[   231.236]Entry deleted from font path.
[   231.236] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not 
exist.
[   231.236]Entry deleted from font path.
[   231.236] (WW) The directory "/usr/share/fonts/X11/100dpi/" does not 
exist.
[   231.236]Entry deleted from font path.
[   231.236] (WW) The directory "/usr/share/fonts/X11/100dpi" does not 
exist.
[   231.236]Entry deleted from font path.
[   231.236] (==) FontPath set to:
/usr/share/fonts/X11/75dpi/:unscaled,
/usr/share/fonts/X11/Type1,
/usr/share/fonts/X11/75dpi,
built-ins
[   231.236] (==) ModulePath set to "/usr/lib/xorg/modules"
[   231.236] (II) The server relies on udev to provide the list of input 
devices.
If no devices become available, reconfigure udev or disable 
AutoAddDevices.
[   231.236] (II) Loader magic: 0x7f7e2ab51d80
[   231.236] (II) Module ABI versions:
[   231.236]X.Org ANSI C Emulation: 0.4
[   231.236]X.Org Video Driver: 18.0
[   231.236]X.Org XInput driver : 21.0
[   231.236]X.Org Server Extension : 8.0
[   231.236] (II) xfree86: Adding drm device (/dev/dri/card0)
[   231.236] (EE) /dev/dri/card0: failed to set DRM interface version 1.4: 
Permission denied
[   231.238] (--) PCI:*(0:1:5:0) 1002:9610:1849:9610 rev 0, Mem @ 
0xf400/67108864, 0xfdff/65536, 0xfde0/1048576, I/O @ 0xc000/256
[   231.238] (II) LoadModule: "glx"
[   231.238] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[   231.239] (II) Module glx: vendor="X.Org Foundation"
[   231.239]compiled for 1.16.4, module version = 1.0.0
[   231.239]ABI class: X.Org Server Extension, version 8.0
[   231.239] (==) AIGLX enabled
[   231.239] (II) LoadModule: "radeon"
[   231.240] (II) Loading /usr/lib/xorg/modules/drivers/radeon_drv.so
[   231.240] (II) Module radeon: vendor="X.Org Foundation"
[   231.240]compiled for 

plugin-containe[…]: segfault in r600_dri.so[93c80000+812000]

2014-12-21 Thread Paul Menzel
Dear DRI folks,


my X session crashed.

$ more /var/log/Xorg.0.log.old
[…]
[  2509.130] (EE) 
[  2509.130] (EE) Backtrace:
[  2509.626] (EE) 0: /usr/bin/Xorg (xorg_backtrace+0x52) [0xb76e9262]
[  2509.627] (EE) 1: /usr/bin/Xorg (0xb7543000+0x1aa502) [0xb76ed502]
[  2509.627] (EE) 2: linux-gate.so.1 (__kernel_rt_sigreturn+0x0) 
[0xb751fd24]
[  2509.627] (EE) 3: /usr/bin/Xorg (miDoCopy+0x5f) [0xb76c76df]
[  2509.627] (EE) 4: /usr/lib/xorg/modules/libexa.so 
(0xb6b0d000+0x7a09) [0xb6b14a09]
[  2509.627] (EE) 5: /usr/bin/Xorg (0xb7543000+0x12b325) [0xb766e325]
[  2509.628] (EE) 6: /usr/lib/xorg/modules/drivers/radeon_drv.so 
(0xb6b28000+0x3d60d) [0xb6b6560d]
[  2509.628] (EE) 7: /usr/lib/xorg/modules/drivers/radeon_drv.so 
(0xb6b28000+0x3da4c) [0xb6b65a4c]
[  2509.628] (EE) 8: /usr/lib/xorg/modules/drivers/radeon_drv.so 
(0xb6b28000+0x41f24) [0xb6b69f24]
[  2509.628] (EE) 9: /usr/lib/i386-linux-gnu/libdrm.so.2 
(drmHandleEvent+0xec) [0xb73dc46c]
[  2509.628] (EE) 10: /usr/lib/xorg/modules/drivers/radeon_drv.so 
(0xb6b28000+0x427ad) [0xb6b6a7ad]
[  2509.628] (EE) 11: /usr/bin/Xorg (WakeupHandler+0x64) [0xb7584a54]
[  2509.628] (EE) 12: /usr/bin/Xorg (WaitForSomething+0x1b3) 
[0xb76e6443]
[  2509.629] (EE) 13: /usr/bin/Xorg (0xb7543000+0x3ca4e) [0xb757fa4e]
[  2509.629] (EE) 14: /usr/bin/Xorg (0xb7543000+0x40eca) [0xb7583eca]
[  2509.629] (EE) 15: /usr/bin/Xorg (0xb7543000+0x2abca) [0xb756dbca]
[  2509.629] (EE) 16: /lib/i386-linux-gnu/i686/cmov/libc.so.6 
(__libc_start_main+0xf3) [0xb70daa63]
[  2509.630] (EE) 17: /usr/bin/Xorg (0xb7543000+0x2ac08) [0xb756dc08]
[  2509.630] (EE) 
[  2509.630] (EE) Segmentation fault at address 0x10
[  2509.630] (EE) 
Fatal server error:
[  2509.630] (EE) Caught signal 11 (Segmentation fault). Server aborting
[  2509.630] (EE) 
[  2509.630] (EE) 
Please consult the The X.Org Foundation support 
 at http://wiki.x.org
 for help. 
[  2509.630] (EE) Please also check the log file at 
"/var/log/Xorg.0.log" for additional information.
[  2509.630] (EE) 
[  2509.631] (II) AIGLX: Suspending AIGLX clients for VT switch
[  2509.643] (EE) Server terminated with error (1). Closing log file.

I think it is due to Iceweasel/Firefox, but I think I deleted the
correct core dump file. There is also a core dump file for
`/usr/bin/Xorg`. Do you see anything helpful in the pasted backtrace?


Thanks,

Paul


Thread 2 (Thread 0xb3ed7b40 (LWP 1705)):
#0  0xb751fd3c in __kernel_vsyscall ()
No symbol table info available.
#1  0xb70a6c4b in pthread_cond_wait@@GLIBC_2.3.2 () at 
../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_wait.S:187
No locals.
#2  0xb71b798c in __pthread_cond_wait (cond=0xb94e703c, mutex=0xb94e7024) at 
forward.c:149
__p = 
#3  0xb64f4fed in cnd_wait (mtx=0xb94e7024, cond=0xb94e703c) at 
../../../../../../../include/c11/threads_posix.h:154
No locals.
#4  pipe_semaphore_wait (sema=0xb94e7024) at 
../../../../../../../src/gallium/auxiliary/os/os_thread.h:248
No locals.
#5  radeon_drm_cs_emit_ioctl (param=0xb94e6da8) at 
../../../../../../../src/gallium/winsys/radeon/drm/radeon_drm_winsys.c:595
ws = 0xb94e6da8
cs = 
i = 
#6  0xb64f4705 in impl_thrd_routine (p=0xb94e7788) at 
../../../../../../../include/c11/threads_posix.h:87
pack = {func = 0xb64f4e90 , arg = 0xb94e6da8}
#7  0xb70a2efb in start_thread (arg=0xb3ed7b40) at pthread_create.c:309
__res = 
pd = 0xb3ed7b40
now = 
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {-1223995392, -1276282048, 
4001536, -1276284248, 1676094040, -1386653103}, 
  mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = 
{prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = 
pagesize_m1 = 
sp = 
freesize = 
__PRETTY_FUNCTION__ = "start_thread"
#8  0xb71aadfe in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:129
No locals.

Thread 1 (Thread 0xb6efd880 (LWP 1153)):
#0  0xb751fd3c in __kernel_vsyscall ()
No symbol table info available.
#1  0xb70ef307 in __GI_raise (sig=sig at entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
resultvar = 
resultvar = 
pid = -113632
selftid = 1153
#2  0xb70f09c3 in __GI_abort () at abort.c:89
save_stage = 2
act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, 
sa_mask = {__val = {0, 171515904, 0, 3075743744, 3075746096, 
  5, 3218070208, 3075669151, 3075746536, 3069363272, 1, 5, 0, 0, 0, 
0, 3218070180, 3078287392, 3075809736, 3075767512, 0, 0, 
  0, 3071045912, 0, 0, 0, 3078287360, 1, 3078325456, 3078325360, 
3075694624}}, sa_flags = -1216630208, 
  sa_restorer = 

Brightness too high on internal AMD/ATI graphics device, but fine with external AMD/ATI graphics card

2014-10-26 Thread Paul Menzel
Am Montag, den 07.07.2014, 17:40 -0400 schrieb Alex Deucher:
> On Sat, Jul 5, 2014 at 3:42 AM, Paul Menzel wrote:

> > connecting a VGA monitor to the internal graphics device of the ASRock
> > E350M1 [1], the brightness is much too high.

[…]

> It sounds like the DAC bg/adj values are wrong on that board (they are
> tuned on a per board basis and stored in the bios).  You might try a
> bios upgrade on the board if there is one available.  If not, please
> file a bug (https://bugs.freedesktop.org) and attach a copy of your
> dmesg output, the vbios from the problematic board, and if possible
> the output of `radeonreg regs dce4`.
> 
> To get a copy of your vbios:
> (as root)
> (use lspci to get the bus id)
> cd /sys/bus/pci/devices/
> echo 1 > rom
> cat rom > /tmp/vbios.rom
> echo 0 > rom
> 
> You can get radeonreg here:
> http://cgit.freedesktop.org/~airlied/radeontool/

Alex, thank you for your reply! The bug I filed was assigned the number
85469 [1].


Thanks,

Paul


[1] https://bugs.freedesktop.org/show_bug.cgi?id=85469


Brightness too high on internal AMD/ATI graphics device, but fine with external AMD/ATI graphics card

2014-07-05 Thread Paul Menzel
Dear Linux folks,


connecting a VGA monitor to the internal graphics device of the ASRock
E350M1 [1], the brightness is much too high.

$ lspci -tvnn
[…]
+-01.0  Advanced Micro Devices [AMD] nee ATI Wrestler [Radeon HD 6310] [1002:9802]
+-01.1  Advanced Micro Devices [AMD] nee ATI Wrestler HDMI Audio [Radeon HD 6250/6310] [1002:1314]
[…]

With an external AMD/ATI graphics card plugged in and the VGA monitor
connected to it, the brightness is normal, as expected.

$ lspci -nn
[…]
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Cedar [Radeon HD 5000/6000/7350/8350 Series] [1002:68f9]
01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cedar HDMI Audio [Radeon HD 5400/6300 Series] [1002:aa68]
[…]

The software stack is the same. Any idea how I can configure or debug
that? This has been verified with 3.2.x to 3.14.9.


Thanks,

Paul



Is a stale image but moving mouse cursor a driver problem?

2014-01-20 Thread Paul Menzel
Dear DRI folks,


using Debian Sid/unstable with Linux 3.12.6, GNOME 3.8 and the Radeon
stack, the following problem happened, and I do not know what to make of it.

Under unknown circumstances the GNOME Screensaver on tty7 “freezes”. The
big clock in the background can be seen, but nothing can be entered (or
there is no feedback). Only the mouse cursor can be moved.

Switching to tty1 works, but after switching back to tty7 the image from
tty1 is still shown (distribution information in the first line and
`hostname login:` in the second) and the mouse cursor can still be moved.

There are no messages shown in `.xsession-errors`, `/var/log/syslog` or
`/var/log/Xorg.0.log`.

Asking the GNOME folks about it in #gnome-shell on  they
replied it might be a driver bug. Can you confirm that?

Are there any log files I can look at or other things I can do to find
out what is wrong?


Thanks,

Paul



[PATCH] intel: Track whether a buffer is idle to avoid trips to the kernel.

2014-01-20 Thread Paul Menzel
Dear Eric,


thank you for the patch. I noticed one typo.


Am Mittwoch, den 15.01.2014, 00:38 -0800 schrieb Eric Anholt:
> I've seen a number of apps spending unreasonable amounts of time in
> drm_intel_bo_busy during the buffer mapping process.
> 
> We can't track idleness in general, in the case of buffers shared
> across processes.  But this should significantly reduce our overhead
> for checking for busy on things like VBOs.
> 
> Improves (unoptimized) glamor x11perf -f8text by 0.243334% +/-
> 0.161498% (n=1549), which has formerly been spending about .5% of its
> time hitting the kernel for drm_intel_gem_bo_busy().
> ---
> 
> I've still got a patch outstanding on the list for valgrind-cleaning
> the modesetting paths.  Since we're probably rolling a release soon,
> that might be nice to get in.
> 
>  intel/intel_bufmgr_gem.c | 23 ++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/intel/intel_bufmgr_gem.c b/intel/intel_bufmgr_gem.c
> index 75e95e6..27ad576 100644
> --- a/intel/intel_bufmgr_gem.c
> +++ b/intel/intel_bufmgr_gem.c
> @@ -212,6 +212,15 @@ struct _drm_intel_bo_gem {
>   bool reusable;
>  
>   /**
> +  * Boolean of whether the GPU is definitely not accessing the buffer.
> +  *
> +  * This is only valid when reusable, since non-reusable
> +  * buffers are those that have been shared wth other

w*i*th

> +  * processes, so we don't know their state.
> +  */
> + bool idle;
> +
> + /**
>* Size in bytes of this buffer and its relocation descendents.
>*
>* Used to avoid costly tree walking in

The rest looks fine.
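
For readers who do not know the code, the idea in the quoted comment can be sketched roughly as below. This is a toy model and not the libdrm implementation: the `Buffer` class, its methods, and the query counter are all hypothetical and only simulate the trip to the kernel.

```python
class Buffer:
    """Hypothetical stand-in for a buffer object; not the libdrm API."""

    def __init__(self, reusable: bool = True):
        self.reusable = reusable  # False would mean shared with other processes
        self.idle = False         # True only after the "kernel" reported not busy
        self.kernel_queries = 0   # counts simulated trips to the kernel
        self._gpu_busy = False    # simulated GPU state

    def submit(self) -> None:
        """Queue GPU work: the buffer may be busy again, so drop the cached flag."""
        self._gpu_busy = True
        self.idle = False

    def gpu_finished(self) -> None:
        """Simulate the GPU completing its work on this buffer."""
        self._gpu_busy = False

    def _query_kernel(self) -> bool:
        self.kernel_queries += 1
        return self._gpu_busy

    def busy(self) -> bool:
        # The cached flag is only trusted for reusable (unshared) buffers.
        if self.reusable and self.idle:
            return False
        is_busy = self._query_kernel()
        self.idle = not is_busy
        return is_busy


if __name__ == "__main__":
    bo = Buffer()
    bo.submit()
    bo.busy()          # asks the kernel: still busy
    bo.gpu_finished()
    bo.busy()          # asks the kernel: idle, flag gets cached
    bo.busy()          # answered from the cached flag
    print(bo.kernel_queries)
```

A buffer created with `reusable=False` never takes the shortcut, which mirrors the restriction described in the comment.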


Thanks,

Paul



Asus U38N: Black screen with Radeon driver in Linux 3.10, 3.11 and 3.12

2014-01-13 Thread Paul Menzel
Dear Alex,


thank you for your reply.


Am Sonntag, den 12.01.2014, 18:36 -0500 schrieb Alex Deucher:
> On Sat, Jan 11, 2014 at 4:32 PM, Paul Menzel wrote:

> > as reported in the channel #radeon on , with the
> > laptop Asus U38N-C4010H with an AMD Radeon HD 7600G
> > (Trinity A8-4555M) I am unable to get something displayed on the screen
> > with modesetting enabled. The backlight of the screen is on, but nothing
> > is shown on the screen. Booting with `radeon.modeset=0` on the Linux
> > kernel command line works. I was able to reproduce this problem with
> > Grml with Linux 3.10 and 3.11 and Debian Sid/unstable with Linux 3.12.6.
> >
> > Attaching a VGA monitor to the mini VGA connector using an adapter,
> > nothing was shown on the monitor either.
> 
> Please file a bug: https://bugs.freedesktop.org (Product: DRI,
> Component: DRM/Radeon) and attach your xorg log and dmesg output.

I created ticket #73530 [1].


Thanks,

Paul


[1] https://bugs.freedesktop.org/show_bug.cgi?id=73530


Asus U38N: Black screen with Radeon driver in Linux 3.10, 3.11 and 3.12

2014-01-11 Thread Paul Menzel
Dear Linux folks,


as reported in the channel #radeon on , with the
laptop Asus U38N-C4010H with an AMD Radeon HD 7600G
(Trinity A8-4555M) I am unable to get something displayed on the screen
with modesetting enabled. The backlight of the screen is on, but nothing
is shown on the screen. Booting with `radeon.modeset=0` on the Linux
kernel command line works. I was able to reproduce this problem with
Grml with Linux 3.10 and 3.11 and Debian Sid/unstable with Linux 3.12.6.

Attaching a VGA monitor to the mini VGA connector using an adapter,
nothing was shown on the monitor either.

The following is the output I get when starting with `radeon.modeset=0`
and then doing `sudo modprobe -r radeon` and `sudo modprobe radeon
modeset=1`.

Jan 11 06:42:24 myhostname kernel: [ 1060.871165] [drm] radeon kernel 
modesetting enabled.
Jan 11 06:42:24 myhostname kernel: [ 1060.871244] checking generic 
(d000 1f) vs hw (d000 1000)
Jan 11 06:42:24 myhostname kernel: [ 1060.871247] fb: conflicting fb hw 
usage radeondrmfb vs EFI VGA - removing generic driver
Jan 11 06:42:24 myhostname kernel: [ 1060.871322] Console: switching to 
colour dummy device 80x25
Jan 11 06:42:24 myhostname kernel: [ 1060.871836] [drm] initializing 
kernel modesetting (ARUBA 0x1002:0x9908 0x1043:0x1557).
Jan 11 06:42:24 myhostname kernel: [ 1060.871882] [drm] register mmio 
base: 0xFEB0
Jan 11 06:42:24 myhostname kernel: [ 1060.871884] [drm] register mmio 
size: 262144
Jan 11 06:42:24 myhostname kernel: [ 1060.871913] [drm] ACPI VFCT 
contains a BIOS for 00:01.0 1002:9908, size 19968
Jan 11 06:42:24 myhostname kernel: [ 1060.871925] ATOM BIOS: 113
Jan 11 06:42:24 myhostname kernel: [ 1060.871997] radeon :00:01.0: 
VRAM: 768M 0x - 0x2FFF (768M used)
Jan 11 06:42:24 myhostname kernel: [ 1060.872000] radeon :00:01.0: 
GTT: 1024M 0x3000 - 0x6FFF
Jan 11 06:42:24 myhostname kernel: [ 1060.872003] [drm] Detected VRAM 
RAM=768M, BAR=256M
Jan 11 06:42:24 myhostname kernel: [ 1060.872005] [drm] RAM width 
64bits DDR
Jan 11 06:42:24 myhostname kernel: [ 1060.872088] [TTM] Zone  kernel: 
Available graphics memory: 1581256 kiB
Jan 11 06:42:24 myhostname kernel: [ 1060.872091] [TTM] Initializing 
pool allocator
Jan 11 06:42:24 myhostname kernel: [ 1060.872101] [TTM] Initializing 
DMA pool allocator
Jan 11 06:42:24 myhostname kernel: [ 1060.872129] [drm] radeon: 768M of 
VRAM memory ready
Jan 11 06:42:24 myhostname kernel: [ 1060.872133] [drm] radeon: 1024M 
of GTT memory ready.
Jan 11 06:42:24 myhostname kernel: [ 1060.874660] radeon :00:01.0: 
firmware: direct-loading firmware radeon/TAHITI_uvd.bin
Jan 11 06:42:24 myhostname kernel: [ 1060.874989] [drm] GART: num cpu 
pages 262144, num gpu pages 262144
Jan 11 06:42:24 myhostname kernel: [ 1060.882325] [drm] Loading ARUBA 
Microcode
Jan 11 06:42:24 myhostname kernel: [ 1060.882749] radeon :00:01.0: 
firmware: direct-loading firmware radeon/ARUBA_pfp.bin
Jan 11 06:42:24 myhostname kernel: [ 1060.883100] radeon :00:01.0: 
firmware: direct-loading firmware radeon/ARUBA_me.bin
Jan 11 06:42:24 myhostname kernel: [ 1060.883456] radeon :00:01.0: 
firmware: direct-loading firmware radeon/ARUBA_rlc.bin
Jan 11 06:42:24 myhostname kernel: [ 1060.884967] [drm] PCIE GART of 
1024M enabled (table at 0x00276000).
Jan 11 06:42:24 myhostname kernel: [ 1060.885159] radeon :00:01.0: 
WB enabled
Jan 11 06:42:24 myhostname kernel: [ 1060.885166] radeon :00:01.0: 
fence driver on ring 0 use gpu addr 0x3c00 and cpu addr 
0x880129f11c00
Jan 11 06:42:24 myhostname kernel: [ 1060.885905] radeon :00:01.0: 
fence driver on ring 5 use gpu addr 0x00075a18 and cpu addr 
0xc90011db5a18
Jan 11 06:42:24 myhostname kernel: [ 1060.885910] radeon :00:01.0: 
fence driver on ring 1 use gpu addr 0x3c04 and cpu addr 
0x880129f11c04
Jan 11 06:42:24 myhostname kernel: [ 1060.885914] radeon :00:01.0: 
fence driver on ring 2 use gpu addr 0x3c08 and cpu addr 
0x880129f11c08
Jan 11 06:42:24 myhostname kernel: [ 1060.885918] radeon :00:01.0: 
fence driver on ring 3 use gpu addr 0x3c0c and cpu addr 
0x880129f11c0c
Jan 11 06:42:24 myhostname kernel: [ 1060.885922] radeon :00:01.0: 
fence driver on ring 4 use gpu addr 0x3c10 and cpu addr 
0x880129f11c10
Jan 11 06:42:24 myhostname kernel: [ 1060.885929] [drm] Supports vblank 
timestamp caching Rev 1 (10.10.2010).
Jan 11 06:42:24 myhostname kernel: [ 1060.885931] [drm] Driver supports 
precise vblank timestamp query.
Jan 11 06:42:24 myhostname kernel: [ 1060.885958] radeon :00:01.0: 
irq 52 for MSI/MSI-X
Jan 

Asus U38N: Unable to get Video BIOS on Radeon HD 7600G

2014-01-11 Thread Paul Menzel
Dear Linux developers,


as reported in the channel #radeon on , I am not able
to dump the Video BIOS on my Asus U38N-C4010H with AMD Radeon HD 7600G
(Trinity A8-4555M).

I am using Debian Sid/unstable with Linux version 3.12-1-amd64
(3.12.6-2).

$ cd /sys/devices/pci\:00/\:00\:01.0/
$ echo 1 | sudo tee rom
1
$ sudo cat rom > /tmp/vbios.rom
cat: rom: Input/output error

Am I doing something wrong or is this an error in the code?


Thanks,

Paul



drm fixes for 3.11: Tag more patches for stable? (was: [git pull] drm fixes)

2013-08-09 Thread Paul Menzel
Dear Dave,


Am Freitag, den 09.08.2013, 05:53 +0100 schrieb Dave Airlie:

[…]

> The following changes since commit c095ba7224d8edc71dcef0d655911399a8bd4a3f:
> 
>   Linux 3.11-rc4 (2013-08-04 13:46:46 -0700)
> 
> are available in the git repository at:
> 
>   git://people.freedesktop.org/~airlied/linux drm-fixes
> 
> for you to fetch changes up to e42f5814212079aecd5826dba10588a896ac0862:
> 
>   Merge tag 'drm-intel-fixes-2013-08-08' of 
> git://people.freedesktop.org/~danvet/drm-intel into drm-fixes (2013-08-09 
> 09:09:37 +1000)
> 
> 
> 
> Aaron Lu (1):
>   drm/i915: avoid brightness overflow when doing scale
> 
> Alex Deucher (11):
>   drm/radeon: properly handle pm on gpu reset
>   drm/radeon: select audio dto based on encoder id for DCE3
>   drm/radeon/dpm: adjust thermal protection requirements
>   drm/radeon/dpm: fix spread spectrum setup (v2)
>   drm/radeon/dpm: adjust power state properly for UVD on SI
>   drm/radeon/dpm: disable sclk ss on rv6xx
>   drm/radeon: fix audio dto calculation on DCE3+ (v3)
>   drm/radeon: always program the MC on startup
>   drm/radeon/cik: use a mutex to properly lock srbm instanced registers
>   drm/radeon/dpm: require rlc for dpm
>   drm/radeon: make missing smc ucode non-fatal
> 
> Christian König (5):
>   drm/radeon: fix halting UVD
>   drm/radeon: only save UVD bo when we have open handles
>   drm/radeon: stop sending invalid UVD destroy msg
>   drm/radeon: add more UVD CS checking
>   drm/radeon: remove unnecessary unpin

as UVD is also in 3.10, should more of these be tagged
`stable@vger.kernel.org` too? I only checked

drm/radeon: add more UVD CS checking

and it did not have that tag.
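
Checking the remaining commits by hand is tedious, so here is a rough sketch of how the check could be automated on a commit message. The helper name `has_stable_tag` is made up, and the sketch assumes the tag appears as a `Cc:` line naming the stable address, which is the usual convention.

```python
import re

# Hypothetical helper: looks for a "Cc:" line that names the stable list.
_STABLE_RE = re.compile(
    r"^Cc:.*\bstable@vger\.kernel\.org\b", re.IGNORECASE | re.MULTILINE
)


def has_stable_tag(commit_message: str) -> bool:
    """Return True if the commit message carries a Cc: stable line."""
    return _STABLE_RE.search(commit_message) is not None


if __name__ == "__main__":
    # Made-up message body; only the subject line is taken from the list above.
    message = (
        "drm/radeon: add more UVD CS checking\n"
        "\n"
        "Signed-off-by: Example Author\n"
    )
    print(has_stable_tag(message))
```

The message text could come from `git log -1 --format=%B <commit>` for each commit in the pull request.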

[…]


Thanks,

Paul



[PATCH] [RFC] drm/nouveau: bring back hdmi audio device after switcheroo power down

2013-07-24 Thread Paul Menzel
Am Mittwoch, den 24.07.2013, 17:13 +1000 schrieb Dave Airlie:
> After a full device powerdown via the optimus power switch, we seem
> to lose the HDMI device completely on power on, this keep track of

keep*s*

> whether we had a hdmi audio sub function device at power on, and
> pokes a magic register to make it reappear after the optimus power
> switch is thrown.
> 
> This at least works on my NVC4 machine, probably needs testing on
> a few other laptops with other nvidia GPUs.
> 
> Signed-off-by: Dave Airlie 
> ---
>  drivers/gpu/drm/nouveau/nouveau_drm.c | 32 
>  drivers/gpu/drm/nouveau/nouveau_drm.h |  2 ++
>  drivers/gpu/drm/nouveau/nouveau_vga.c | 17 +
>  3 files changed, 51 insertions(+)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
> b/drivers/gpu/drm/nouveau/nouveau_drm.c
> index 6197266..12a6240 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> @@ -296,6 +296,31 @@ static int nouveau_drm_probe(struct pci_dev *pdev,
>   return 0;
>  }
>  
> +#define PCI_CLASS_MULTIMEDIA_HD_AUDIO 0x0403

Should that go into some header?

> +
> +static void
> +nouveau_get_hdmi_dev(struct drm_device *dev)
> +{
> + struct nouveau_drm *drm = dev->dev_private;
> + struct pci_dev *pdev = dev->pdev;
> +
> + /* subfunction one is a hdmi audio device? */

Just function, as in domain:bus:slot.func?
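
For context, here is a minimal user-space sketch of how a devfn value packs slot and function in the domain:bus:slot.func notation, which is why the quoted call looks up function 1 of the GPU's own slot. The DEMO_ macros mirror the encoding of the kernel's PCI_DEVFN/PCI_SLOT/PCI_FUNC but are renamed so the snippet compiles on its own; the helper names and the example address are made up for illustration.

```c
#include <stdio.h>

/* Same bit layout as the kernel macros: 5 bits of slot, 3 bits of function. */
#define DEMO_PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define DEMO_PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)
#define DEMO_PCI_FUNC(devfn)       ((devfn) & 0x07)

/* Hypothetical helper: devfn of function `func` in the same slot as `devfn`. */
static unsigned int sibling_devfn(unsigned int devfn, unsigned int func)
{
	return DEMO_PCI_DEVFN(DEMO_PCI_SLOT(devfn), func);
}

/* Hypothetical helper: format an address as domain:bus:slot.func. */
static int format_pci_addr(char *buf, size_t len, unsigned int domain,
			   unsigned int bus, unsigned int devfn)
{
	return snprintf(buf, len, "%04x:%02x:%02x.%x", domain, bus,
			DEMO_PCI_SLOT(devfn), DEMO_PCI_FUNC(devfn));
}
```

With a GPU at 0000:01:00.0, `sibling_devfn(devfn, 1)` gives the devfn that formats as 0000:01:00.1.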

> + drm->hdmi_device = pci_get_bus_and_slot((unsigned int)pdev->bus->number,
> + 
> PCI_DEVFN(PCI_SLOT(pdev->devfn), 1));
> +
> + if (!drm->hdmi_device) {
> + DRM_INFO("hdmi device  not found %d %d %d\n", 
> pdev->bus->number, PCI_SLOT(pdev->devfn), 1);

Just one space after device?

> + return;
> + }
> +
> + if ((drm->hdmi_device->class >> 8) != PCI_CLASS_MULTIMEDIA_HD_AUDIO) {
> + DRM_INFO("possible hdmi device  not audio %d\n", 
> drm->hdmi_device->class);
> + pci_dev_put(drm->hdmi_device);
> + drm->hdmi_device = NULL;
> + return;
> + }
> +}
> +
>  static int
>  nouveau_drm_load(struct drm_device *dev, unsigned long flags)
>  {
> @@ -314,6 +339,8 @@ nouveau_drm_load(struct drm_device *dev, unsigned long 
> flags)
>   INIT_LIST_HEAD(&drm->clients);
>   spin_lock_init(&drm->tile.lock);
>  
> + nouveau_get_hdmi_dev(dev);
> +
>   /* make sure AGP controller is in a consistent state before we
>* (possibly) execute vbios init tables (see nouveau_agp.h)
>*/
> @@ -400,6 +427,9 @@ fail_ttm:
>   nouveau_agp_fini(drm);
>   nouveau_vga_fini(drm);
>  fail_device:
> + if (drm->hdmi_device)
> + pci_dev_put(drm->hdmi_device);
> +
>   nouveau_cli_destroy(&drm->client);
>   return ret;
>  }
> @@ -424,6 +454,8 @@ nouveau_drm_unload(struct drm_device *dev)
>   nouveau_agp_fini(drm);
>   nouveau_vga_fini(drm);
>  
> + if (drm->hdmi_device)
> + pci_dev_put(drm->hdmi_device);
>   nouveau_cli_destroy(&drm->client);
>   return 0;
>  }
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.h 
> b/drivers/gpu/drm/nouveau/nouveau_drm.h
> index 41ff7e0..f276e37 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.h
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.h
> @@ -129,6 +129,8 @@ struct nouveau_drm {
>  
>   /* power management */
>   struct nouveau_pm *pm;
> +
> + struct pci_dev *hdmi_device;
>  };
>  
>  static inline struct nouveau_drm *
> diff --git a/drivers/gpu/drm/nouveau/nouveau_vga.c 
> b/drivers/gpu/drm/nouveau/nouveau_vga.c
> index 25d3495..d8af49c 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_vga.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_vga.c
> @@ -27,6 +27,22 @@ nouveau_vga_set_decode(void *priv, bool state)
>  }
>  
>  static void
> +nouveau_reenable_hdmi_device(struct drm_device *dev)
> +{
> + struct nouveau_drm *drm = nouveau_drm(dev);
> + struct nouveau_device *device = nv_device(drm->device);
> + uint32_t val;
> +
> + if (!drm->hdmi_device)
> + return;
> +
> + /* write magic value into magic place */
> + val = nv_rd32(device, 0x88488);
> + val |= (1 << 25);
> + nv_wr32(device, 0x88488, val);

Use a define for this nevertheless?
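
Something along these lines, as a rough user-space sketch. The macro and helper names are hypothetical: the patch only gives the bare values 0x88488 and 1 << 25, and the register's actual name is not stated anywhere in it. The class check from the earlier hunk is included with the value the patch defines locally.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the values used in the quoted patch. */
#define DEMO_HDMI_AUDIO_REENABLE_REG 0x88488u
#define DEMO_HDMI_AUDIO_REENABLE_BIT (UINT32_C(1) << 25)

/* Value the patch defines locally: class 0x04, subclass 0x03. */
#define DEMO_PCI_CLASS_MULTIMEDIA_HD_AUDIO 0x0403u

/* Read-modify-write step: set the re-enable bit, keep all other bits. */
static uint32_t hdmi_reenable_value(uint32_t old)
{
	return old | DEMO_HDMI_AUDIO_REENABLE_BIT;
}

/*
 * The 24-bit class field is class:subclass:prog-if, so dropping the
 * low byte leaves the 16-bit class/subclass pair to compare against.
 */
static bool is_hd_audio_class(uint32_t class_code)
{
	return (class_code >> 8) == DEMO_PCI_CLASS_MULTIMEDIA_HD_AUDIO;
}
```

In the driver, the register write would then read `nv_wr32(device, <register define>, val | <bit define>)`, which at least documents the intent even while the register stays undocumented.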

> +}
> +
> +static void
>  nouveau_switcheroo_set_state(struct pci_dev *pdev,
>enum vga_switcheroo_state state)
>  {
> @@ -37,6 +53,7 @@ nouveau_switcheroo_set_state(struct pci_dev *pdev,
>   dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
>   nouveau_pmops_resume(&pdev->dev);
>   drm_kms_helper_poll_enable(dev);
> + nouveau_reenable_hdmi_device(dev);
>   dev->switch_power_state = DRM_SWITCH_POWER_ON;
>   } else {
>   printk(KERN_ERR "VGA switcheroo: switched nouveau off\n");

Otherwise this looks good,


Thanks,

Paul



Radeon HD 6310 (AMD Wrestler): [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!

2013-07-11 Thread Paul Menzel
Dear Linux folks,


With Linux 3.10, the drm-next-3.11 tree from Alex Deucher merged, and
built with `make deb-pkg`, UVD initialization failed on the last boot.

[drm:evergreen_startup] *ERROR* radeon: error initializing UVD (-1).

The strange thing is that it worked the last time I tried with the same
Linux kernel image, as I posted to the list [1]. In between, I only
booted an old 3.2.46 distribution kernel. (Modesetting did not seem to
work when doing so, but I did not look into it further.)

$ uname -a
Linux myhostname 3.10.0+ #105 SMP Sat Jul 6 13:33:47 CEST 2013 i686 GNU/Linux
$ lspci -s 01.0
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Wrestler [Radeon HD 6310]
$ md5sum /lib/firmware/3.10.0+/radeon/SUMO_uvd.bin
51d9e0e2247c313c5bfc8fa7bb5b213d  /lib/firmware/3.10.0+/radeon/SUMO_uvd.bin
$ cut -d " " -f 6- /var/log/kern.log
[…]
[0.152534] calling  trace_init_flags_sys_exit+0x0/0xd @ 1
[0.152540] initcall trace_init_flags_sys_exit+0x0/0xd returned 0 after 0 usecs
[0.152545] calling  trace_init_flags_sys_enter+0x0/0xd @ 1
[0.152550] initcall trace_init_flags_sys_enter+0x0/0xd returned 0 after 0 usecs
[0.152555] calling  init_hw_perf_events+0x0/0x47e @ 1
[0.152557] Performance Events: AMD PMU driver.
[0.152565] ... version:0
[0.152567] ... bit width:  48
[0.152569] ... generic registers:  4
[0.152572] ... value mask: 
[0.152575] ... max period: 7fff
[0.152577] ... fixed-purpose events:   0
[0.152580] ... event mask: 000f
[0.152597] initcall init_hw_perf_events+0x0/0x47e returned 0 after 0 usecs
[0.152603] calling  register_trigger_all_cpu_backtrace+0x0/0xf @ 1
[0.152609] initcall register_trigger_all_cpu_backtrace+0x0/0xf returned 0 after 0 usecs
[0.152615] calling  spawn_ksoftirqd+0x0/0x1d @ 1
[0.152657] initcall spawn_ksoftirqd+0x0/0x1d returned 0 after 0 usecs
[0.152662] calling  init_workqueues+0x0/0x289 @ 1
[0.152800] initcall init_workqueues+0x0/0x289 returned 0 after 0 usecs
[0.152805] calling  check_cpu_stall_init+0x0/0x12 @ 1
[0.152810] initcall check_cpu_stall_init+0x0/0x12 returned 0 after 0 usecs
[0.152814] calling  migration_init+0x0/0x55 @ 1
[0.152820] initcall migration_init+0x0/0x55 returned 0 after 0 usecs
[0.152826] calling  cpu_stop_init+0x0/0x57 @ 1
[0.152863] initcall cpu_stop_init+0x0/0x57 returned 0 after 0 usecs
[0.152868] calling  rcu_register_oom_notifier+0x0/0xd @ 1
[0.152874] initcall rcu_register_oom_notifier+0x0/0xd returned 0 after 0 usecs
[0.152879] calling  rcu_scheduler_really_started+0x0/0xd @ 1
[0.152883] initcall rcu_scheduler_really_started+0x0/0xd returned 0 after 0 usecs
[0.152888] calling  rcu_spawn_gp_kthread+0x0/0x6b @ 1
[0.152936] initcall rcu_spawn_gp_kthread+0x0/0x6b returned 0 after 0 usecs
[0.152941] calling  relay_init+0x0/0xd @ 1
[0.152946] initcall relay_init+0x0/0xd returned 0 after 0 usecs
[0.152950] calling  tracer_alloc_buffers+0x0/0x18b @ 1
[0.153030] initcall tracer_alloc_buffers+0x0/0x18b returned 0 after 0 usecs
[0.153034] calling  init_events+0x0/0x57 @ 1
[0.153041] initcall init_events+0x0/0x57 returned 0 after 0 usecs
[0.153046] calling  init_trace_printk+0x0/0xa @ 1
[0.153051] initcall init_trace_printk+0x0/0xa returned 0 after 0 usecs
[0.153055] calling  event_trace_memsetup+0x0/0x4a @ 1
[0.153067] initcall event_trace_memsetup+0x0/0x4a returned 0 after 0 usecs
[0.153158] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
[0.153361] CPU 1 irqstacks, hard=f40ae000 soft=f40b
[0.153365] smpboot: Booting Node   0, Processors  #1 OK
[0.163398] Initializing CPU#1
[0.166591] Brought up 2 CPUs
[0.166596] smpboot: Total of 2 processors activated (6399.62 BogoMIPS)
[0.168897] devtmpfs: initialized
[0.169277] calling  ipc_ns_init+0x0/0xd @ 1
[0.169284] initcall ipc_ns_init+0x0/0xd returned 0 after 0 usecs
[0.169289] calling  init_mmap_min_addr+0x0/0xd @ 1
[0.169294] initcall init_mmap_min_addr+0x0/0xd returned 0 after 0 usecs
[0.169302] calling  init_cpufreq_transition_notifier_list+0x0/0x14 @ 1
[0.169315] initcall init_cpufreq_transition_notifier_list+0x0/0x14 returned 0 after 0 usecs
[0.169320] calling  net_ns_init+0x0/0xca @ 1
[
