Re: [PATCH v2 3/3] ACPI / button: Send "open" state after boot/resume

2016-06-01 Thread Benjamin Tissoires
On Wed, Jun 1, 2016 at 9:51 AM, Zheng, Lv  wrote:
> Hi,
>
> [cut]
>> > We could have a parameter (say "send_lid_open_on_resume" or
>> > "use_bios_lid_value") in acpi/button.c which would allow people to
>> > choose if they want the "new" behavior or the old one. And we could
>> > also add some DMI matching for the messed up laptops/tablets where
>> we
>> > force one or the other quirk. When we agree that user space now
>> > behaves gently with us, we could then remove entirely the quirk and
>> > the dmi matching.
>> >
>> > How does that sound?
>> [Lv Zheng]
>> The choices are in my first revision.
>> I could restore it back and re-send this series.
>> Also I need to update PATCH 02 to eliminate wrong IS_ERR_VALUE().
> [Lv Zheng]
> I forgot to mention.
> IMO, if the issue is because of uncertain gaps, not a confirmed BIOS bug, or 
> a confirmed gap that has to be solved in a spirit of compromise, we should 
> not add quirks.
> We could just provide boot parameters for users to choose.
> Because the quirk table could grow bigger and bigger, exceeding our control.
>
> So I probably would just implement the parameter part, without adding the DMI 
> entries.
>

Works for me. I think in Fedora, we might default to keep the current
behavior, but document that some tablets need the parameter to be set.
As soon as systemd changes its policy, we can change the default
value.


As for the Surface3, I think the LID state and irq is just not enough
robust to be used through the ACPI node PNP0C0D. I get all the
information from the touchscreen ACPI Node and the WMI, so I think
there might be a way to remove the standard ACPI LID and use our own,
or simply override the ACPI call by a different one (through the WMI).
It's a shame logind only expects one LID event node :).

Cheers,
Benjamin


RE: [PATCH v2 3/3] ACPI / button: Send "open" state after boot/resume

2016-06-01 Thread Zheng, Lv
Hi,

[cut]
> > We could have a parameter (say "send_lid_open_on_resume" or
> > "use_bios_lid_value") in acpi/button.c which would allow people to
> > choose if they want the "new" behavior or the old one. And we could
> > also add some DMI matching for the messed up laptops/tablets where
> we
> > force one or the other quirk. When we agree that user space now
> > behaves gently with us, we could then remove entirely the quirk and
> > the dmi matching.
> >
> > How does that sound?
> [Lv Zheng]
> The choices are in my first revision.
> I could restore it back and re-send this series.
> Also I need to update PATCH 02 to eliminate wrong IS_ERR_VALUE().
[Lv Zheng] 
I forgot to mention.
IMO, if the issue is because of uncertain gaps, not a confirmed BIOS bug, or a 
confirmed gap that has to be solved in a spirit of compromise, we should not 
add quirks.
We could just provide boot parameters for users to choose.
Because the quirk table could grow bigger and bigger, exceeding our control.

So I probably would just implement the parameter part, without adding the DMI 
entries.

Thanks and best regards
-Lv



RE: [PATCH v2 3/3] ACPI / button: Send "open" state after boot/resume

2016-05-31 Thread Zheng, Lv
Hi,

> From: Benjamin Tissoires [mailto:benjamin.tissoi...@gmail.com]
> Subject: Re: [PATCH v2 3/3] ACPI / button: Send "open" state after
> boot/resume
> 
> Hi Lv,
> 
> On Tue, May 31, 2016 at 4:55 AM, Zheng, Lv  wrote:
> > Hi,
> >
> >> From: Benjamin Tissoires [mailto:benjamin.tissoi...@gmail.com]
> >> Subject: Re: [PATCH v2 3/3] ACPI / button: Send "open" state after
> >> boot/resume
> >>
> >> On Fri, May 27, 2016 at 9:16 AM, Lv Zheng 
> wrote:
> [snipped
> ]>> As Valdis replied on 0/3, I don't think this is a good solution (even
> >> temporary). Linux should not assume the current state of a input
> >> device, and sending unconditionally 1 here is wrong. If the device is
> >> on a docking station, you will wake up the wrong monitor and screw
> the
> >> user session (and this will be a regression).
> > [Lv Zheng]
> > We are doing the test to see how this behaves on several different
> platforms.
> >
> >>
> >> How about we simply send the current LID state stored in the ACPI?
> >> something like calling acpi_lid_send_state() directly?
> > [Lv Zheng]
> > This is what we are going to eliminate in [PATCH 01].
> > We have several real bugs related to sending a wrong state to the
> userspace.
> > Userspace will suspend right after resume because of the 'close' state.
> 
> On the other hand, you are trying to remove 23de5d9ef2a4bbc4f733f, a
> patch that has been around for 9 years and we only start seeing
> devices where this logic is not working...
> 
> I am not saying your approach is wrong, I am just saying that instead
> of a plain revert, we should probably be more conservative and add a
> quirk for those buggy machines. Ideally, we should try to understand
> why there is such an issue that Windows doesn't have (the solution
> might just be that given Windows doesn't care, we are screwed).
> 
> BTW, on the Surface 3, there is a WMI
> (f7cc25ec-d20b-404c-8903-0ed4359c18ae -> WQHE) which returns the
> actual value of the LID, without using PNP0C0D at all. I have a
> feeling that Windows might use it when it is in trouble or in an
> unsure state. I couldn't find this WMI on the 2 other systems so that
> may also be just a one shot for the Surface 3.
[Lv Zheng] 
Thanks for the information.
But it seems Surface 3 is not a good example for this.
It is a runtime idle platform.
And the root cause of the Surface 3 issue should be in the freeze code.
After waking the system up via LID irq, the irq is dropped.
But I guess it is risky to invoke irq handler right there, doing so could break 
may existing drivers.


> 
> [snipped]
> > [Lv Zheng]
> > The understanding here is incorrect.
> > We have 3 bogus devices.
> > 1 of them is surface 3 which is a hardware reduced platform.
> > The others are all traditional platforms.
> >
> > =
> > The facts are:
> >
> > Both the platforms return cached lid state from _LID.
> > The cached value will be updated by lid irq (via GPIO IRQ, GPE, or EC
> event).
> > AML tables will send lid notification in the irq handler.
> >
> > Some AML tables will update the cached value in _WAK (I'll describe why
> it is necessary below).
> > But updating the cached value in _WAK is not guaranteed by all AML
> tables.
> >
> > For the 'close' state irq, all tables will send lid close notification.
> > For the 'open' state irq, it seems there are tables never sending lid open
> notification (sounds like Windows do not care about lid open).
> > =
> >
> > Surface 3 is entirely a different case.
> > It is a runtime idle system and hardware reduced.
> > On that kind of system, lid open is handled by OS not by BIOS.
> > Surface 3 is exactly the platform that doesn't send lid open notification.
> > I guess the AML is intentionally written in this way to be compliant to
> the traditional platforms.
> >
> > While on the traditional platforms:
> > When lid is opened, BIOS handles the lid irq and wakes the system from
> the FACS waking vector.
> > So it is likely that there is no lid open irq after the system is resumed.
> > BIOS may forget to update the cached lid value in the _WAK or some
> other control methods that could be executed after resuming.
> > Then if we send _LID result to the user space, the cached value could
> apparently be 'close'.
> >
> > That explains why there is no "lid open" configuration in the "Windows
> Device Manager".
> >
> >>
> >> I propose as a workaround 

Re: [PATCH v2 3/3] ACPI / button: Send "open" state after boot/resume

2016-05-31 Thread Benjamin Tissoires
Hi Lv,

On Tue, May 31, 2016 at 4:55 AM, Zheng, Lv  wrote:
> Hi,
>
>> From: Benjamin Tissoires [mailto:benjamin.tissoi...@gmail.com]
>> Subject: Re: [PATCH v2 3/3] ACPI / button: Send "open" state after
>> boot/resume
>>
>> On Fri, May 27, 2016 at 9:16 AM, Lv Zheng  wrote:
[snipped
]>> As Valdis replied on 0/3, I don't think this is a good solution (even
>> temporary). Linux should not assume the current state of a input
>> device, and sending unconditionally 1 here is wrong. If the device is
>> on a docking station, you will wake up the wrong monitor and screw the
>> user session (and this will be a regression).
> [Lv Zheng]
> We are doing the test to see how this behaves on several different platforms.
>
>>
>> How about we simply send the current LID state stored in the ACPI?
>> something like calling acpi_lid_send_state() directly?
> [Lv Zheng]
> This is what we are going to eliminate in [PATCH 01].
> We have several real bugs related to sending a wrong state to the userspace.
> Userspace will suspend right after resume because of the 'close' state.

On the other hand, you are trying to remove 23de5d9ef2a4bbc4f733f, a
patch that has been around for 9 years and we only start seeing
devices where this logic is not working...

I am not saying your approach is wrong, I am just saying that instead
of a plain revert, we should probably be more conservative and add a
quirk for those buggy machines. Ideally, we should try to understand
why there is such an issue that Windows doesn't have (the solution
might just be that given Windows doesn't care, we are screwed).

BTW, on the Surface 3, there is a WMI
(f7cc25ec-d20b-404c-8903-0ed4359c18ae -> WQHE) which returns the
actual value of the LID, without using PNP0C0D at all. I have a
feeling that Windows might use it when it is in trouble or in an
unsure state. I couldn't find this WMI on the 2 other systems so that
may also be just a one shot for the Surface 3.

[snipped]
> [Lv Zheng]
> The understanding here is incorrect.
> We have 3 bogus devices.
> 1 of them is surface 3 which is a hardware reduced platform.
> The others are all traditional platforms.
>
> =
> The facts are:
>
> Both the platforms return cached lid state from _LID.
> The cached value will be updated by lid irq (via GPIO IRQ, GPE, or EC event).
> AML tables will send lid notification in the irq handler.
>
> Some AML tables will update the cached value in _WAK (I'll describe why it is 
> necessary below).
> But updating the cached value in _WAK is not guaranteed by all AML tables.
>
> For the 'close' state irq, all tables will send lid close notification.
> For the 'open' state irq, it seems there are tables never sending lid open 
> notification (sounds like Windows do not care about lid open).
> =
>
> Surface 3 is entirely a different case.
> It is a runtime idle system and hardware reduced.
> On that kind of system, lid open is handled by OS not by BIOS.
> Surface 3 is exactly the platform that doesn't send lid open notification.
> I guess the AML is intentionally written in this way to be compliant to the 
> traditional platforms.
>
> While on the traditional platforms:
> When lid is opened, BIOS handles the lid irq and wakes the system from the 
> FACS waking vector.
> So it is likely that there is no lid open irq after the system is resumed.
> BIOS may forget to update the cached lid value in the _WAK or some other 
> control methods that could be executed after resuming.
> Then if we send _LID result to the user space, the cached value could 
> apparently be 'close'.
>
> That explains why there is no "lid open" configuration in the "Windows Device 
> Manager".
>
>>
>> I propose as a workaround to enable a kthread that will monitor the
>> lid state and update the correct value to userspace (5 sec of polling
>> time should be enough given that systemd checks every 20 sec).
>> We should probably have this workaround only for a set of known
>> devices, as it might just be temporary for those until the actual
>> underlying problem is fixed (wrong DSDT in the Surface 3 case that
>> doesn't notify at all, issue in the EC for the Surface Pro 1 and the
>> Samsung N210).
> [Lv Zheng]
> That cannot help to solve the issue/gap.
>
> The problem is Linux userspace has a facility re-checking lid state when lid 
> state is "close".
> If it failed to update the lid state to "open" for a timeout period, it would 
> suspend the platform.
>
> We actually are recommending Linus userspace to introduce a new lid handling 
> mode to only handle "lid close" event an

RE: [PATCH v2 3/3] ACPI / button: Send "open" state after boot/resume

2016-05-30 Thread Zheng, Lv
Hi,

> From: Benjamin Tissoires [mailto:benjamin.tissoi...@gmail.com]
> Subject: Re: [PATCH v2 3/3] ACPI / button: Send "open" state after
> boot/resume
> 
> On Fri, May 27, 2016 at 9:16 AM, Lv Zheng  wrote:
> > Linux userspace (systemd-logind) keeps on rechecking lid state when the
> > lid state is closed. If it failed to update the lid state to open after
> > boot/resume, the system suspending right after the boot/resume could
> be
> > resulted.
> > Graphics drivers also uses the lid notifications to implment
> > MODESET_ON_LID_OPEN option.
> >
> > Before the situation is improved from the userspace and from the
> graphics
> > driver, simply send initial "open" lid state to avoid issues. After this is
> > improved from the userspace and from the graphics driver, Linux kernel
> > could simply revert this minimal commit.
> >
> > Link 1: https://lkml.org/2016/3/7/460
> > Link 2: https://github.com/systemd/systemd/issues/2087
> > Signed-off-by: Lv Zheng 
> > Cc: Bastien Nocera: 
> > ---
> >  drivers/acpi/button.c |3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> > index e706e4b..6e77312 100644
> > --- a/drivers/acpi/button.c
> > +++ b/drivers/acpi/button.c
> > @@ -342,6 +342,8 @@ static int acpi_button_resume(struct device
> *dev)
> > struct acpi_button *button = acpi_driver_data(device);
> >
> > button->suspended = false;
> > +   if (button->type == ACPI_BUTTON_TYPE_LID)
> > +   return acpi_lid_notify_state(device, 1);
> 
> As Valdis replied on 0/3, I don't think this is a good solution (even
> temporary). Linux should not assume the current state of a input
> device, and sending unconditionally 1 here is wrong. If the device is
> on a docking station, you will wake up the wrong monitor and screw the
> user session (and this will be a regression).
[Lv Zheng] 
We are doing the test to see how this behaves on several different platforms.

> 
> How about we simply send the current LID state stored in the ACPI?
> something like calling acpi_lid_send_state() directly?
[Lv Zheng] 
This is what we are going to eliminate in [PATCH 01].
We have several real bugs related to sending a wrong state to the userspace.
Userspace will suspend right after resume because of the 'close' state.


> 
> [15 min later, after writing a long email]
> 
> Well, it looks like we already have that in the kernel and for a long
> time apparently.
> 
> So, I think the issue is that Microsoft does not wake up the system by
> lid opening (seen in one of the comments in the mentioned bugs and by
> looking at the behavior on the surface devices). It must be just
> querying it's state on resume or might even not care at all until it
> receives a close event.
> 
> If I read correctly, we managed to get the 3 bogus devices to a
> correct state at the ACPI level (/proc/acpi/button/lid/LID0/state),
> but we did not get the notification. Given that the LID state is
> triggered by an ACPI operation region, there is no guarantee that the
> resume of the acpi/button.c driver will be called after the region has
> been updated/called.
[Lv Zheng] 
The understanding here is incorrect.
We have 3 bogus devices.
1 of them is surface 3 which is a hardware reduced platform.
The others are all traditional platforms.

=
The facts are:

Both the platforms return cached lid state from _LID.
The cached value will be updated by lid irq (via GPIO IRQ, GPE, or EC event).
AML tables will send lid notification in the irq handler.

Some AML tables will update the cached value in _WAK (I'll describe why it is 
necessary below).
But updating the cached value in _WAK is not guaranteed by all AML tables.

For the 'close' state irq, all tables will send lid close notification.
For the 'open' state irq, it seems there are tables never sending lid open 
notification (sounds like Windows do not care about lid open).
=

Surface 3 is entirely a different case.
It is a runtime idle system and hardware reduced.
On that kind of system, lid open is handled by OS not by BIOS.
Surface 3 is exactly the platform that doesn't send lid open notification.
I guess the AML is intentionally written in this way to be compliant to the 
traditional platforms.

While on the traditional platforms:
When lid is opened, BIOS handles the lid irq and wakes the system from the FACS 
waking vector.
So it is likely that there is no lid open irq after the system is resumed.
BIOS may forget to update the cached lid value in the _WAK or some other 
control methods that could be executed after resuming.
Then if we send _LID result to the user space

Re: [PATCH v2 3/3] ACPI / button: Send "open" state after boot/resume

2016-05-30 Thread Benjamin Tissoires
On Fri, May 27, 2016 at 9:16 AM, Lv Zheng  wrote:
> Linux userspace (systemd-logind) keeps on rechecking lid state when the
> lid state is closed. If it failed to update the lid state to open after
> boot/resume, the system suspending right after the boot/resume could be
> resulted.
> Graphics drivers also uses the lid notifications to implment
> MODESET_ON_LID_OPEN option.
>
> Before the situation is improved from the userspace and from the graphics
> driver, simply send initial "open" lid state to avoid issues. After this is
> improved from the userspace and from the graphics driver, Linux kernel
> could simply revert this minimal commit.
>
> Link 1: https://lkml.org/2016/3/7/460
> Link 2: https://github.com/systemd/systemd/issues/2087
> Signed-off-by: Lv Zheng 
> Cc: Bastien Nocera: 
> ---
>  drivers/acpi/button.c |3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> index e706e4b..6e77312 100644
> --- a/drivers/acpi/button.c
> +++ b/drivers/acpi/button.c
> @@ -342,6 +342,8 @@ static int acpi_button_resume(struct device *dev)
> struct acpi_button *button = acpi_driver_data(device);
>
> button->suspended = false;
> +   if (button->type == ACPI_BUTTON_TYPE_LID)
> +   return acpi_lid_notify_state(device, 1);

As Valdis replied on 0/3, I don't think this is a good solution (even
temporary). Linux should not assume the current state of a input
device, and sending unconditionally 1 here is wrong. If the device is
on a docking station, you will wake up the wrong monitor and screw the
user session (and this will be a regression).

How about we simply send the current LID state stored in the ACPI?
something like calling acpi_lid_send_state() directly?

[15 min later, after writing a long email]

Well, it looks like we already have that in the kernel and for a long
time apparently.

So, I think the issue is that Microsoft does not wake up the system by
lid opening (seen in one of the comments in the mentioned bugs and by
looking at the behavior on the surface devices). It must be just
querying it's state on resume or might even not care at all until it
receives a close event.

If I read correctly, we managed to get the 3 bogus devices to a
correct state at the ACPI level (/proc/acpi/button/lid/LID0/state),
but we did not get the notification. Given that the LID state is
triggered by an ACPI operation region, there is no guarantee that the
resume of the acpi/button.c driver will be called after the region has
been updated/called.

I propose as a workaround to enable a kthread that will monitor the
lid state and update the correct value to userspace (5 sec of polling
time should be enough given that systemd checks every 20 sec).
We should probably have this workaround only for a set of known
devices, as it might just be temporary for those until the actual
underlying problem is fixed (wrong DSDT in the Surface 3 case that
doesn't notify at all, issue in the EC for the Surface Pro 1 and the
Samsung N210).

Cheers,
Benjamin


> return 0;
>  }
>  #endif
> @@ -422,6 +424,7 @@ static int acpi_button_add(struct acpi_device *device)
> if (error)
> goto err_remove_fs;
> if (button->type == ACPI_BUTTON_TYPE_LID) {
> +   (void)acpi_lid_notify_state(device, 1);
> /*
>  * This assumes there's only one lid device, or if there are
>  * more we only care about the last one...
> --
> 1.7.10
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html