Re: [REGRESSION]: acpi/nouveau: Hardware unavailable upon resume or suspend fails

2023-11-16 Thread Owen T. Heisler

On 11/12/23 14:43, Hans de Goede wrote:

Owen, Kai-Heng thank you for testing. I've submitted these patches
to Rafael (the ACPI maintainer) now (with you on the Cc).
Hopefully they will get merged soon.


That's great, thanks!

Owen



Re: [REGRESSION]: acpi/nouveau: Hardware unavailable upon resume or suspend fails

2023-11-12 Thread Hans de Goede
Hi,

On 11/10/23 17:58, Owen T. Heisler wrote:
> Hi everyone,
> 
> On 11/10/23 06:52, Kai-Heng Feng wrote:
>> On Fri, Nov 10, 2023 at 2:19 PM Hans de Goede  wrote:
>>> On 11/10/23 07:09, Kai-Heng Feng wrote:
 On Fri, Nov 10, 2023 at 5:55 AM Owen T. Heisler  wrote:
> #regzbot introduced: 89c290ea758911e660878e26270e084d862c03b0
> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/273
> #regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=218124

 Thanks for the bug report. Do you prefer to continue the discussion
 here, on gitlab or on bugzilla?
> 
> Kai-Heng, you're welcome and thank you too. By email is fine with me.
> 
>>> Owen, as Kai-Heng said thank you for reporting this.
> 
> Hans, you're welcome, and thanks for your help too.
> 
> ## Reproducing
>
> 1. Boot system to framebuffer console.
> 2. Run `systemctl suspend`. If undocked without secondary display,
> suspend fails. If docked with secondary display, suspend succeeds.
> 3. Resume from suspend if applicable.
> 4. System is now in a broken state.

 So I guess we need to put those devices to ACPI D3 for suspend. Let's
 discuss this on your preferred platform.
>>>
>>> Ok, so I was already sort of afraid we might see something like this
>>> happening because of:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=89c290ea758911e660878e26270e084d862c03b0
>>>
>>> As I mentioned during the review of that, it might be better to
>>> not touch the video-card ACPI power-state at all and instead
>>> only do acpi_device_fix_up_power() on the child devices.
>>
>> Or the child devices need to be put to D3 during suspend.
>>
>>> Owen, attached are 2 patches which implement only
>>> calling acpi_device_fix_up_power() on the child devices,
>>> can you build a v6.6 kernel with these 2 patches added
>>> on top please and see if that fixes things ?
> 
> Yes, with those patches v6.6 suspend works normally. That's great, thanks!
> 
> I tested with v6.6 with the 2 patches at 
> 
>  using 
> .
>  I tested both docked and un-docked, just in case.
> 
> Tested-by: Owen T. Heisler 
> 
>>> Kai-Heng can you test that the issue on the HP ZBook Fury 16 G10
>>> is still resolved after applying these patches ?
>>
>> Yes. Thanks for the patch.
>>
>> If this patch also fixes Owen's issue, then
>> Tested-by: Kai-Heng Feng 

Re: [REGRESSION]: acpi/nouveau: Hardware unavailable upon resume or suspend fails

2023-11-11 Thread Owen T. Heisler

Hi everyone,

On 11/10/23 06:52, Kai-Heng Feng wrote:

On Fri, Nov 10, 2023 at 2:19 PM Hans de Goede  wrote:

On 11/10/23 07:09, Kai-Heng Feng wrote:

On Fri, Nov 10, 2023 at 5:55 AM Owen T. Heisler  wrote:

#regzbot introduced: 89c290ea758911e660878e26270e084d862c03b0
#regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/273
#regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=218124


Thanks for the bug report. Do you prefer to continue the discussion
here, on gitlab or on bugzilla?


Kai-Heng, you're welcome and thank you too. By email is fine with me.


Owen, as Kai-Heng said thank you for reporting this.


Hans, you're welcome, and thanks for your help too.


## Reproducing

1. Boot system to framebuffer console.
2. Run `systemctl suspend`. If undocked without secondary display,
suspend fails. If docked with secondary display, suspend succeeds.
3. Resume from suspend if applicable.
4. System is now in a broken state.


So I guess we need to put those devices to ACPI D3 for suspend. Let's
discuss this on your preferred platform.


Ok, so I was already sort of afraid we might see something like this
happening because of:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=89c290ea758911e660878e26270e084d862c03b0

As I mentioned during the review of that, it might be better to
not touch the video-card ACPI power-state at all and instead
only do acpi_device_fix_up_power() on the child devices.


Or the child devices need to be put to D3 during suspend.


Owen, attached are 2 patches which implement only
calling acpi_device_fix_up_power() on the child devices,
can you build a v6.6 kernel with these 2 patches added
on top please and see if that fixes things ?


Yes, with those patches v6.6 suspend works normally. That's great, thanks!

I tested with v6.6 with the 2 patches at 
 
using 
. 
I tested both docked and un-docked, just in case.


Tested-by: Owen T. Heisler 


Kai-Heng can you test that the issue on the HP ZBook Fury 16 G10
is still resolved after applying these patches ?


Yes. Thanks for the patch.

If this patch also fixes Owen's issue, then
Tested-by: Kai-Heng Feng 

Please let me know if anything else is needed from me.

Many thanks,
Owen


Re: [REGRESSION]: acpi/nouveau: Hardware unavailable upon resume or suspend fails

2023-11-10 Thread Kai-Heng Feng
Hi Hans,

On Fri, Nov 10, 2023 at 2:19 PM Hans de Goede  wrote:
>
> Hi All,
>
> On 11/10/23 07:09, Kai-Heng Feng wrote:
> > Hi Owen,
> >
> > On Fri, Nov 10, 2023 at 5:55 AM Owen T. Heisler  wrote:
> >>
> >> #regzbot introduced: 89c290ea758911e660878e26270e084d862c03b0
> >> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/273
> >> #regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=218124
> >
> > Thanks for the bug report. Do you prefer to continue the discussion
> > here, on gitlab or on bugzilla?
>
> Owen, as Kai-Heng said thank you for reporting this.
>
> >> ## Reproducing
> >>
> >> 1. Boot system to framebuffer console.
> >> 2. Run `systemctl suspend`. If undocked without secondary display,
> >> suspend fails. If docked with secondary display, suspend succeeds.
> >> 3. Resume from suspend if applicable.
> >> 4. System is now in a broken state.
> >
> > So I guess we need to put those devices to ACPI D3 for suspend. Let's
> > discuss this on your preferred platform.
>
> Ok, so I was already sort of afraid we might see something like this
> happening because of:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=89c290ea758911e660878e26270e084d862c03b0
>
> As I mentioned during the review of that, it might be better to
> not touch the video-card ACPI power-state at all and instead
> only do acpi_device_fix_up_power() on the child devices.

Or the child devices need to be put to D3 during suspend.

>
> Owen, attached are 2 patches which implement only
> calling acpi_device_fix_up_power() on the child devices,
> can you build a v6.6 kernel with these 2 patches added
> on top please and see if that fixes things ?
>
> Kai-Heng can you test that the issue on the HP ZBook Fury 16 G10
> is still resolved after applying these patches ?

Yes. Thanks for the patch.

If this patch also fixes Owen's issue, then
Tested-by: Kai-Heng Feng 


>
> Regards,
>
> Hans
>


Re: [REGRESSION]: acpi/nouveau: Hardware unavailable upon resume or suspend fails

2023-11-10 Thread Hans de Goede
Hi All,

On 11/10/23 07:09, Kai-Heng Feng wrote:
> Hi Owen,
> 
> On Fri, Nov 10, 2023 at 5:55 AM Owen T. Heisler  wrote:
>>
>> #regzbot introduced: 89c290ea758911e660878e26270e084d862c03b0
>> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/273
>> #regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=218124
> 
> Thanks for the bug report. Do you prefer to continue the discussion
> here, on gitlab or on bugzilla?

Owen, as Kai-Heng said thank you for reporting this.

>> ## Reproducing
>>
>> 1. Boot system to framebuffer console.
>> 2. Run `systemctl suspend`. If undocked without secondary display,
>> suspend fails. If docked with secondary display, suspend succeeds.
>> 3. Resume from suspend if applicable.
>> 4. System is now in a broken state.
> 
> So I guess we need to put those devices to ACPI D3 for suspend. Let's
> discuss this on your preferred platform.

Ok, so I was already sort of afraid we might see something like this
happening because of:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=89c290ea758911e660878e26270e084d862c03b0

As I mentioned during the review of that, it might be better to
not touch the video-card ACPI power-state at all and instead
only do acpi_device_fix_up_power() on the child devices.

Owen, attached are 2 patches which implement only
calling acpi_device_fix_up_power() on the child devices,
can you build a v6.6 kernel with these 2 patches added
on top please and see if that fixes things ?

Kai-Heng can you test that the issue on the HP ZBook Fury 16 G10
is still resolved after applying these patches ?

Regards,

Hans

From 68a819101c580bb89f34a31196ace81244ca8eb5 Mon Sep 17 00:00:00 2001
From: Hans de Goede 
Date: Fri, 10 Nov 2023 12:52:39 +0100
Subject: [PATCH 1/2] ACPI: PM: Add acpi_device_fix_up_power_children()
 function

In some cases it is necessary to fix-up the power-state of an ACPI
device's children without touching the ACPI device itself add
a new acpi_device_fix_up_power_children() function for this.

Signed-off-by: Hans de Goede 
---
 drivers/acpi/device_pm.c | 13 +
 include/acpi/acpi_bus.h  |  1 +
 2 files changed, 14 insertions(+)

diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
index f007116a8427..3b4d048c4941 100644
--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -397,6 +397,19 @@ void acpi_device_fix_up_power_extended(struct acpi_device *adev)
 }
 EXPORT_SYMBOL_GPL(acpi_device_fix_up_power_extended);
 
+/**
+ * acpi_device_fix_up_power_children - Force a device's children into D0.
+ * @adev: Parent device object whose children's power state is to be fixed up.
+ *
+ * Call acpi_device_fix_up_power() for @adev's children so long as they
+ * are reported as present and enabled.
+ */
+void acpi_device_fix_up_power_children(struct acpi_device *adev)
+{
+	acpi_dev_for_each_child(adev, fix_up_power_if_applicable, NULL);
+}
+EXPORT_SYMBOL_GPL(acpi_device_fix_up_power_children);
+
 int acpi_device_update_power(struct acpi_device *device, int *state_p)
 {
 	int state;
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index 254685085c82..0b7eab0ef7d7 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -539,6 +539,7 @@ int acpi_device_set_power(struct acpi_device *device, int state);
 int acpi_bus_init_power(struct acpi_device *device);
 int acpi_device_fix_up_power(struct acpi_device *device);
 void acpi_device_fix_up_power_extended(struct acpi_device *adev);
+void acpi_device_fix_up_power_children(struct acpi_device *adev);
 int acpi_bus_update_power(acpi_handle handle, int *state_p);
 int acpi_device_update_power(struct acpi_device *device, int *state_p);
 bool acpi_bus_power_manageable(acpi_handle handle);
-- 
2.41.0

From 33e2d55917ac7082e8d98dc2a678a856f8d48352 Mon Sep 17 00:00:00 2001
From: Hans de Goede 
Date: Fri, 10 Nov 2023 13:10:02 +0100
Subject: [PATCH 2/2] ACPI: video: Use acpi_device_fix_up_power_children()

Commit 89c290ea7589 ("ACPI: video: Put ACPI video and its child devices
into D0 on boot") introduced calling acpi_device_fix_up_power_extended()
on the video card for which the ACPI video bus is the companion device.

This unnecessarily touches the power-state of the GPU itself, while
the issue it tries to address only requires calling _PS0 on the child
devices.

Touching the power-state of the GPU itself is causing suspend / resume
issues on e.g. a Lenovo ThinkPad W530.

Instead use acpi_device_fix_up_power_children(), which only touches
the child devices, to fix this.

Fixes: 89c290ea7589 ("ACPI: video: Put ACPI video and its child devices into D0 on boot")
Reported-by: Owen T. Heisler 
Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/273
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218124
Signed-off-by: Hans de Goede 
---
 drivers/acpi/acpi_video.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c
index b411948594ff..4e868454b38d 

[REGRESSION]: acpi/nouveau: Hardware unavailable upon resume or suspend fails

2023-11-10 Thread Owen T. Heisler

#regzbot introduced: 89c290ea758911e660878e26270e084d862c03b0
#regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/273
#regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=218124

## Reproducing

1. Boot system to framebuffer console.
2. Run `systemctl suspend`. If undocked without secondary display, 
suspend fails. If docked with secondary display, suspend succeeds.

3. Resume from suspend if applicable.
4. System is now in a broken state.

## Testing

- culprit commit is 89c290ea758911e660878e26270e084d862c03b0
- v6.6 fails
- v6.6 with culprit commit reverted does not fail
- Compiled with 



## Hardware

- ThinkPad W530 2438-52U
- Dock with Nvidia-connected DVI ports
- Secondary display connected via DVI
- Nvidia Optimus GPU switching system

```console
$ lspci | grep -i vga
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core 
processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107GLM [Quadro 
K2000M] (rev a1)

```

## Decoded logs from v6.6

- System is not docked and fails to suspend: 

- System is docked and fails after resume: 



Re: [REGRESSION]: acpi/nouveau: Hardware unavailable upon resume or suspend fails

2023-11-09 Thread Kai-Heng Feng
Hi Owen,

On Fri, Nov 10, 2023 at 5:55 AM Owen T. Heisler  wrote:
>
> #regzbot introduced: 89c290ea758911e660878e26270e084d862c03b0
> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/273
> #regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=218124

Thanks for the bug report. Do you prefer to continue the discussion
here, on gitlab or on bugzilla?

>
> ## Reproducing
>
> 1. Boot system to framebuffer console.
> 2. Run `systemctl suspend`. If undocked without secondary display,
> suspend fails. If docked with secondary display, suspend succeeds.
> 3. Resume from suspend if applicable.
> 4. System is now in a broken state.

So I guess we need to put those devices to ACPI D3 for suspend. Let's
discuss this on your preferred platform.

Kai-Heng

>
> ## Testing
>
> - culprit commit is 89c290ea758911e660878e26270e084d862c03b0
> - v6.6 fails
> - v6.6 with culprit commit reverted does not fail
> - Compiled with
> 
>
> ## Hardware
>
> - ThinkPad W530 2438-52U
> - Dock with Nvidia-connected DVI ports
> - Secondary display connected via DVI
> - Nvidia Optimus GPU switching system
>
> ```console
> $ lspci | grep -i vga
> 00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core
> processor Graphics Controller (rev 09)
> 01:00.0 VGA compatible controller: NVIDIA Corporation GK107GLM [Quadro
> K2000M] (rev a1)
> ```
>
> ## Decoded logs from v6.6
>
> - System is not docked and fails to suspend:
> 
> - System is docked and fails after resume:
>