On Thu, Mar 17, 2022 at 12:42 PM Hans de Goede <hdego...@redhat.com> wrote: > > Hi Daniel, > > On 3/17/22 14:28, Daniel Dadap wrote: > > > >> On Mar 17, 2022, at 07:17, Hans de Goede <hdego...@redhat.com> wrote: > >> > >> Hi, > >> > >>> On 3/16/22 21:33, Daniel Dadap wrote: > >>> Some notebook systems with EC-driven backlight control appear to have a > >>> firmware bug which causes the system to use GPU-driven backlight control > >>> upon a fresh boot, but then switches to EC-driven backlight control > >>> after completing a suspend/resume cycle. All the while, the firmware > >>> reports that the backlight is under EC control, regardless of what is > >>> actually controlling the backlight brightness. > >>> > >>> This leads to the following behavior: > >>> > >>> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the > >>> WMI-wrapped ACPI method erroneously reporting EC control. > >>> * nvidia-wmi-ec-backlight does not work until after a suspend/resume > >>> cycle, due to the backlight control actually being GPU-driven. > >>> * GPU drivers also register their own backlight handlers: in the case > >>> of the notebook system where this behavior has been observed, both > >>> amdgpu and the NVIDIA proprietary driver register backlight handlers. > >>> * The GPU which has backlight control upon a fresh boot (amdgpu in the > >>> case observed so far) can successfully control the backlight through > >>> its backlight driver's sysfs interface, but stops working after the > >>> first suspend/resume cycle. > >>> * nvidia-wmi-ec-backlight is unable to control the backlight upon a > >>> fresh boot, but begins to work after the first suspend/resume cycle. > >>> * The GPU which does not have backlight control (NVIDIA in this case) > >>> is not able to control the backlight at any point while the system > >>> is in operation. On similar hybrid systems with an EC-controlled > >>> backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver > >>> does not register its backlight handler. It has not been determined > >>> whether the non-functional handler registered by the NVIDIA driver > >>> is due to another firmware bug, or a bug in the NVIDIA driver. > >>> > >>> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type > >>> device, it takes precedence over the BACKLIGHT_RAW devices registered > >>> by the GPU drivers. This in turn leads to backlight control appearing > >>> to be non-functional until after completing a suspend/resume cycle. > >>> However, it is still possible to control the backlight through direct > >>> interaction with the working GPU driver's backlight sysfs interface. > >>> > >>> These systems also appear to have a second firmware bug which resets > >>> the EC's brightness level to 100% on resume, but leaves the state in > >>> the kernel at the pre-suspend level. This causes attempts to save > >>> and restore the backlight level across the suspend/resume cycle to > >>> fail, due to the level appearing not to change even though it did. > >>> > >>> In order to work around these issues, add a quirk table to detect > >>> systems that are known to show these behaviors. So far, there is > >>> only one known system that requires these workarounds, and both > >>> issues are present on that system, but the quirks are tracked > >>> separately to make it easier to add them to other systems which > >>> may exhibit one of the bugs, but not the other. The original systems > >>> that this driver was tested on during development do not exhibit > >>> either of these quirks. > >>> > >>> If a system with the "GPU driver has backlight control" quirk is > >>> detected, nvidia-wmi-ec-backlight will grab a reference to the working > >>> (when freshly booted) GPU backlight handler and relays any backlight > >>> brightness level change requests directed at the EC to also be applied > >>> to the GPU backlight interface. This leads to redundant updates > >>> directed at the GPU backlight driver after a suspend/resume cycle, but > >>> it does allow the EC backlight control to work when the system is > >>> freshly booted. > >> > >> Ugh, I'm really not a fan of the backlight proxy plan here. I have > >> plans to clean-up the whole x86 backlight mess soon and an important part > >> of that is to stop registering multiple backlight interfaces for the > >> same panel/screen. > >> > >> Where as going with this workaround requires us to have 2 active > >> backlight interfaces active. Also this will very likely work to > >> (subtly) different backlight behavior before and after the first > >> suspend/resume. > > > > I understand. Having multiple backlight devices for the same panel is > > indeed annoying. Out of curiosity, what is the plan for determining that > > multiple backlight interfaces are all supposed to control the same panel? > > ATM the kernel basically only supports a bunch of different methods > to control the backlight of 1 internal panel. The plan is to tie this > to the panel from a userspace pov by making the brightness + > max_brightness properties on the drm_connector object for the > internal-panel. > > The in kernel tying of the backlight device to the internal panel > will be done hardcoded inside the drm driver(s) based on the > drivers already knowing which connector is the internal panel. > > This all naively assumes there is only 1 internal panel, which > for the majority of cases is true. My plan for devices with > 2 internal panels is to cross that bridge when we get there > (I expect those mostly in phone/tablet like devices for now > which will likely use devicetree where solving this is trivial). > > I do realize we will eventually get some x86/acpi device with > 2 internal panels. Hopefully we can just figure out what > the Windows drivers are doing there and parse e.g. the ACPI > info which Windows is using for this. > > As part of the move to properties on the drm_connector object > the /sys/class/backlight interface will become deprecated, > but will be kept for backward compat and will eventually > be put behind a Kconfig option. > > The kernel internal backlight_device stuff will be kept > since we need some internal representation anyways and > I don't see much value in reworking that, esp. since > we need to have /sys/class/backlight backward compat. > > Note this is all based on discussions which I had with > mainly Daniel Vetter @plumbers 2019 in Lisabon. I have > never gotten around to actually start working on this, > but this has resurfaced recently and I plan to actually > take a stab at implementing this plan sometime during 2022. > > > I’m not sure I’m aware of any driver-agnostic ways of identifying a > > particular panel instance uniquely, and if there is one, that would > > actually help with something else I’m working on at the moment. If the idea > > involves e.g. the EDID, that could be troublesome for backlight drivers > > such as this one which aren’t associated with any GPU. > > Right, see above the main idea is to make this > "the kernel's problem" and I expect us to fix this in > the kernel in a variety of different ways depending on > the actual hardware. > > As for "troublesome for backlight drivers such as this one > which aren’t associated with any GPU.", the idea is that: > > 1. E.g the i915 driver (which I have the most experience with) > knows which connector is the internal panel > > 2. The acpi_video_get_backlight_type() helper from > drivers/acpi/video_detect.c will get extended to make sure > that there is always only *1* /sys/class/backlight device. > > To be specific atm code supporting old vendor specific backlight > fw interfaces, e.g. drivers/platform/x86/dell-laptop.c: > already does: > > if (acpi_video_get_backlight_type() != acpi_backlight_vendor) > return 0; > > And drivers/acpi/acpi_video.c also already does: > > if (acpi_video_get_backlight_type() != acpi_backlight_video) > return 0; > > Currently looking at the 3 main x86 backlight interfaces: vendor, > generic-ACPI and native-drm-driver, only the native driver's > backlight registers unconditionally. The plan is to make those also > do a similar check (*) and to also add special backlight drivers like > nvidia-wmi-ec-backlight and drivers/video/backlight/apple_bl.c > to this mechanism. > > 3. 1 + 2 means that the drm_driver can just tie the single > backlight_device which will be registered on the system to > the internal panel. > > Again I'm completely ignoring dual-internal-panel devices here > for simplicity's sake. > > Note this is getting a bit off-topic, but if you have insights > in this, or already can think of ways how this is not going to > work :) please let me know. > > > *) And adding that check + the presence of nvidia-wmi-ec-backlight > support will make the native drm-driver not register it's > backlight_device at all at which point the backlight-proxy workaround > from this patch breaks. > > > > This also gives me an idea for another experiment I didn’t think to try > > earlier. Alex: what happens if you hack > > amdgpu_atombios_encoder_init_backlight() in the amdgpu driver to just > > return right away? I wonder if the AMD GPU’s attempt to take over backlight > > control is what makes the firmware give control to the GPU rather than the > > EC initially. > > > > Regardless of the backlight proxy workaround, does the force refresh one > > seem reasonable? That one at least addresses a condition that happens at > > every suspend/resume cycle. >
Sorry for jumping in here, but I can't seem to find the original thread with this comment. amdgpu_atombios_encoder_init_backlight() is not applicable to these systems. That is the old pre-DC code path. You want amdgpu_dm_register_backlight_device() for modern hardware. Alex > Good question, I must admit I stopped reading the patch after seeing > the proxy thing. > > I see that you are using a pm_notifer for this. I wonder if you > can try (on your own system) to add a pm_ops struct and make > wmi_driver.driver.pm point to that and check if that gets called > by adding e.g. a pr_info (I don't see why it would not get called). > > And assuming that works, using that would be a bit cleaner IMHO. > Although that does have resume-ordering implications. But I would > expect the EC to basically be always ready to get talked to at > the point in the resume cycle where normal (non early) resume > handlers are called. > > To be clear the idea would be to always have the suspend handler > (so that the driver and pm_ops structs can be const) and to check > a quirk flag inside the resume handler. Or maybe even just always > read back the brightness from the hw and check if it has changed? > Does this need to be behind a quirk ? > > >> Is there no other way to solve this issue? Maybe we need to poke > >> vgaswitcheroo to set the current GPU mode even though this is > >> already reported as active to get things to switch to the ECs > >> control right away ? > > > > There isn’t a vgaswitcheroo handler for this particular mux device (yet), > > but there are separate ACPI interfaces for the mux itself. Poking the mux > > *shouldn’t* have any effect on what device is controlling the backlight for > > the panel, since when the system is in dynamic mux mode the EC is always > > supposed to be in control, but this system is already showing some weird > > behavior, so it doesn’t hurt to try. > > Right, as you said the EC is always supposed to be in control, but > it is not. I would not be surprised if making the ACPI call to put > things in dynamic mode (even though they already are) fixes this, > assuming there is such an ACPI call... > > >> I'm pretty certain that Windows is not doing this backlight proxying, > >> IMHO we need to figure out what causes the switch after suspend/resume > >> and then do that thing at boot. > > > > I’ll put together an instrumented driver for Alex to try on his system, to > > capture some more data and see if I can get some more insight into that. I > > also have a dump of his ACPI tables, and can check there for some other > > potential leads. Hopefully whatever changes the state across the > > suspend/resume cycle is response to something that Linux does or doesn’t > > do, and not some state that is only handled internally within the firmware. > > Great, thank you. > > Regards, > > Hans > > > > >>> If a system with the "backlight level reset to full on resume" quirk > >>> is detected, nvidia-wmi-ec-backlight will register a PM notifier to > >>> reset the backlight to the previous level upon resume. > >>> > >>> These workarounds are also plumbed through to kernel module parameters, > >>> to make it easier for users who suspect they may be affected by one or > >>> both of these bugs to test whether these workarounds are effective on > >>> their systems as well. > >>> > >>> Signed-off-by: Daniel Dadap <dda...@nvidia.com> > >>> Tested-by: Alexandru Dinu <alex.din...@gmail.com> > >>> --- > >>> Note: the Tested-by: line above applies to the previous version of this > >>> patch; an explicit ACK from the tester is required for it to apply to > >>> the current version. > >>> > >>> v2: > >>> * Add readable sysfs files for module params, use linear interpolation > >>> from fixp-arith.h, fix return value of notifier callback, use devm_*() > >>> for kzalloc and put_device. (Barnabás Pőcze <po...@protonmail.com>) > >>> * Add comment to denote known firmware versions that exhibit the bugs. > >>> (Mario Limonciello <mario.limoncie...@amd.com>) > >>> * Unify separate per-quirk tables. (Hans de Goede <hdego...@redhat.com>) > >>> > >>> .../platform/x86/nvidia-wmi-ec-backlight.c | 196 +++++++++++++++++- > >>> 1 file changed, 194 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c > >>> b/drivers/platform/x86/nvidia-wmi-ec-backlight.c > >>> index 61e37194df70..95e1ddf780fc 100644 > >>> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c > >>> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c > >>> @@ -3,8 +3,12 @@ > >>> * Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. > >>> */ > >>> > >>> +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n" > >>> + > >>> #include <linux/acpi.h> > >>> #include <linux/backlight.h> > >>> +#include <linux/dmi.h> > >>> +#include <linux/fixp-arith.h> > >>> #include <linux/mod_devicetable.h> > >>> #include <linux/module.h> > >>> #include <linux/types.h> > >>> @@ -75,6 +79,73 @@ struct wmi_brightness_args { > >>> u32 ignored[3]; > >>> }; > >>> > >>> +/** > >>> + * struct nvidia_wmi_ec_backlight_priv - driver private data > >>> + * @bl_dev: the associated backlight device > >>> + * @proxy_target: backlight device which receives relayed brightness > >>> changes > >>> + * @notifier: notifier block for resume callback > >>> + */ > >>> +struct nvidia_wmi_ec_backlight_priv { > >>> + struct backlight_device *bl_dev; > >>> + struct backlight_device *proxy_target; > >>> + struct notifier_block nb; > >>> +}; > >>> + > >>> +static char *backlight_proxy_target; > >>> +module_param(backlight_proxy_target, charp, 0444); > >>> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change > >>> requests to the named backlight driver, on systems which erroneously > >>> report EC backlight control."); > >>> + > >>> +static int max_reprobe_attempts = 128; > >>> +module_param(max_reprobe_attempts, int, 0444); > >>> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when > >>> relaying brightness change requests."); > >>> + > >>> +static bool restore_level_on_resume; > >>> +module_param(restore_level_on_resume, bool, 0444); > >>> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level > >>> when resuming from suspend, on systems which reset the EC's backlight > >>> level on resume."); > >>> + > >>> +/* Bit field values for quirks table */ > >>> + > >>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME BIT(0) > >>> + > >>> +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device > >>> names */ > >>> + > >>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1 BIT(8) > >>> + > >>> +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name > >>> +#define HAS_QUIRK(data, name) (((long) data) & QUIRK(name)) > >>> + > >>> +static int assign_quirks(const struct dmi_system_id *id) > >>> +{ > >>> + if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME)) > >>> + restore_level_on_resume = 1; > >>> + > >>> + /* If the module parameter is set, override the quirks table */ > >>> + if (!backlight_proxy_target) { > >>> + if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1)) > >>> + backlight_proxy_target = "amdgpu_bl1"; > >>> + } > >>> + > >>> + return true; > >>> +} > >>> + > >>> +#define QUIRK_ENTRY(vendor, product, quirks) { \ > >>> + .callback = assign_quirks, \ > >>> + .matches = { \ > >>> + DMI_MATCH(DMI_SYS_VENDOR, vendor), \ > >>> + DMI_MATCH(DMI_PRODUCT_VERSION, product) \ > >>> + }, \ > >>> + .driver_data = (void *)(quirks) \ > >>> +} > >>> + > >>> +static const struct dmi_system_id quirks_table[] = { > >>> + QUIRK_ENTRY( > >>> + /* This quirk is preset as of firmware revision HACN31WW */ > >>> + "LENOVO", "Legion S7 15ACH6", > >>> + QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1) > >>> + ), > >>> + { } > >>> +}; > >>> + > >>> /** > >>> * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI > >>> method > >>> * @w: Pointer to the struct wmi_device identified by > >>> %WMI_BRIGHTNESS_GUID > >>> @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device > >>> *w, enum wmi_brightness_metho > >>> return 0; > >>> } > >>> > >>> +/* Scale the current brightness level of 'from' to the range of 'to'. */ > >>> +static int scale_backlight_level(const struct backlight_device *from, > >>> + const struct backlight_device *to) > >>> +{ > >>> + int from_max = from->props.max_brightness; > >>> + int from_level = from->props.brightness; > >>> + int to_max = to->props.max_brightness; > >>> + > >>> + return fixp_linear_interpolate(0, 0, from_max, to_max, from_level); > >>> +} > >>> + > >>> static int nvidia_wmi_ec_backlight_update_status(struct backlight_device > >>> *bd) > >>> { > >>> struct wmi_device *wdev = bl_get_data(bd); > >>> + struct nvidia_wmi_ec_backlight_priv *priv = > >>> dev_get_drvdata(&wdev->dev); > >>> + struct backlight_device *proxy_target = priv->proxy_target; > >>> + > >>> + if (proxy_target) { > >>> + int level = scale_backlight_level(bd, proxy_target); > >>> + > >>> + if (backlight_device_set_brightness(proxy_target, level)) > >>> + pr_warn("Failed to relay backlight update to \"%s\"", > >>> + backlight_proxy_target); > >>> + } > >>> > >>> return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL, > >>> WMI_BRIGHTNESS_MODE_SET, > >>> @@ -147,13 +239,78 @@ static const struct backlight_ops > >>> nvidia_wmi_ec_backlight_ops = { > >>> .get_brightness = nvidia_wmi_ec_backlight_get_brightness, > >>> }; > >>> > >>> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block > >>> *nb, unsigned long event, void *d) > >>> +{ > >>> + > >>> + /* > >>> + * On some systems, the EC backlight level gets reset to 100% when > >>> + * resuming from suspend, but the backlight device state still > >>> reflects > >>> + * the pre-suspend value. Refresh the existing state to sync the EC's > >>> + * state back up with the kernel's. > >>> + */ > >>> + if (event == PM_POST_SUSPEND) { > >>> + struct nvidia_wmi_ec_backlight_priv *p; > >>> + int ret; > >>> + > >>> + p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb); > >>> + ret = backlight_update_status(p->bl_dev); > >>> + > >>> + if (ret) > >>> + pr_warn("failed to refresh backlight level: %d", ret); > >>> + > >>> + return NOTIFY_OK; > >>> + } > >>> + > >>> + return NOTIFY_DONE; > >>> +} > >>> + > >>> +static void putdev(void *data) > >>> +{ > >>> + struct device *dev = data; > >>> + > >>> + put_device(dev); > >>> +} > >>> + > >>> static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const > >>> void *ctx) > >>> { > >>> + struct backlight_device *bdev, *target = NULL; > >>> + struct nvidia_wmi_ec_backlight_priv *priv; > >>> struct backlight_properties props = {}; > >>> - struct backlight_device *bdev; > >>> u32 source; > >>> int ret; > >>> > >>> + /* > >>> + * Check quirks tables to see if this system needs any of the > >>> firmware > >>> + * bug workarounds. > >>> + */ > >>> + dmi_check_system(quirks_table); > >>> + > >>> + if (backlight_proxy_target && backlight_proxy_target[0]) { > >>> + static int num_reprobe_attempts; > >>> + > >>> + target = backlight_device_get_by_name(backlight_proxy_target); > >>> + > >>> + if (target) { > >>> + ret = devm_add_action_or_reset(&wdev->dev, putdev, > >>> + &target->dev); > >>> + if (ret) > >>> + return ret; > >>> + } else { > >>> + /* > >>> + * The target backlight device might not be ready; > >>> + * try again and disable backlight proxying if it > >>> + * fails too many times. > >>> + */ > >>> + if (num_reprobe_attempts < max_reprobe_attempts) { > >>> + num_reprobe_attempts++; > >>> + return -EPROBE_DEFER; > >>> + } > >>> + > >>> + pr_warn("Unable to acquire %s after %d attempts. Disabling > >>> backlight proxy.", > >>> + backlight_proxy_target, max_reprobe_attempts); > >>> + } > >>> + } > >>> + > >>> ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE, > >>> WMI_BRIGHTNESS_MODE_GET, &source); > >>> if (ret) > >>> @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct > >>> wmi_device *wdev, const void *ct > >>> &wdev->dev, wdev, > >>> &nvidia_wmi_ec_backlight_ops, > >>> &props); > >>> - return PTR_ERR_OR_ZERO(bdev); > >>> + > >>> + if (IS_ERR(bdev)) > >>> + return PTR_ERR(bdev); > >>> + > >>> + priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL); > >>> + if (!priv) > >>> + return -ENOMEM; > >>> + > >>> + priv->bl_dev = bdev; > >>> + > >>> + dev_set_drvdata(&wdev->dev, priv); > >>> + > >>> + if (target) { > >>> + int level = scale_backlight_level(target, bdev); > >>> + > >>> + if (backlight_device_set_brightness(bdev, level)) > >>> + pr_warn("Unable to import initial brightness level from %s.", > >>> + backlight_proxy_target); > >>> + priv->proxy_target = target; > >>> + } > >>> + > >>> + if (restore_level_on_resume) { > >>> + priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier; > >>> + register_pm_notifier(&priv->nb); > >>> + } > >>> + > >>> + return 0; > >>> +} > >>> + > >>> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev) > >>> +{ > >>> + struct nvidia_wmi_ec_backlight_priv *priv = > >>> dev_get_drvdata(&wdev->dev); > >>> + > >>> + if (priv->nb.notifier_call) > >>> + unregister_pm_notifier(&priv->nb); > >>> } > >>> > >>> #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7" > >>> @@ -204,6 +395,7 @@ static struct wmi_driver > >>> nvidia_wmi_ec_backlight_driver = { > >>> .name = "nvidia-wmi-ec-backlight", > >>> }, > >>> .probe = nvidia_wmi_ec_backlight_probe, > >>> + .remove = nvidia_wmi_ec_backlight_remove, > >>> .id_table = nvidia_wmi_ec_backlight_id_table, > >>> }; > >>> module_wmi_driver(nvidia_wmi_ec_backlight_driver); > >> >