RE: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-21 Thread Zheng, Lv
Hi,

> From: Bastien Nocera [mailto:had...@hadess.net]
> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch 
> exported by ACPI
> 
> On Tue, 2017-06-20 at 02:45 +, Zheng, Lv wrote:
> > Hi,
> >
> > > From: Bastien Nocera [mailto:had...@hadess.net]
> > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable
> > > LID switch exported by ACPI
> > >
> > > On Mon, 2017-06-19 at 01:43 +, Zheng, Lv wrote:
> > > > 
> > > > >
> > > > > If you implement it in such a way that GNOME settings daemon
> > > > > behaves weirdly, you'll get my revert
> > > > > request in the mail. Do. Not. Ever. Lie.
> > > >
> > > > First, I don't know what should be reverted...
> > > > I have 2 solutions here for review, and Benjamin has 1.
> > > > And none of them has been upstreamed.
> > > > We are just discussing.
> > >
> > > The discussion is getting tiring quite frankly. We've been over
> > > this
> > > for nearly a year now, and with no end in sight.
> >
> > We have concerns to introduce too complicated logics to such a
> > simple button driver especially the logics are related to platform
> > firmware, input ABI and user space behaviors.
> >
> > I understand the situation.
> > Anyway this shouldn't be a big deal.
> > Let's prepare a smarter series to collect all fixes and solutions
> > with runtime configurables and get that to the end users.
> > So that we can figure out which is the simplest solution.
> >
> > But before that, let me ask several questions about gnome-setting-
> > deamon.
> >
> > >
> > > > However we need to get 1 of them upstreamed in next cycle.
> > > >
> > > > I think users won't startup gnome-setting-daemon right after
> > > > resume.
> > > > It should have already been started.
> > > >
> > > > There is only 1 platform may see delayed state update after
> > > > resume.
> > > > Let's see if there is a practical issue.
> > > > 1. Before suspend, the "lid state" is "close", and
> > > > 2. After resume, the state might remain "close" for a while
> > > >Since libinput won't deliver close to userspace,
> > > >and gnome-setting-daemon listens to key switches, there is no
> > > > wrong behavior.
> > >
> > > It doesn't. It listens to UPower, which tells user-space whether
> > > there
> > > is a lid switch, and whether it's opened or closed.
> >
> > Thanks for the information.
> > However I don't see differences here.
> >
> > >
> > > > 3. Then after several seconds, "open" arrives.
> > > >gnome-setting-daemon re-arrange monitors and screen layouts in
> > > > response to the new event.
> > >
> > > Just how is anyone supposed to know that there is an event coming?
> >
> > Will UPower deliver EV_SW key events to gnome-setting-daemon?
> >
> > >
> > > > So there is no problem. IMO, there is no need to improve for
> > > > post-
> > > > resume case.
> > > >
> > > > Users will just startup gnome-setting-daemon once after boot.
> > > > And it's likely that when it is started, the state is correct.
> > >
> > > You cannot rely on when gnome-settings-daemon will be started to
> > > make
> > > *any* decision. Certainly not decisions on how the kernel should
> > > behave.
> >
> > My bad wording, I just meant:
> > When gnome-settings-daemon is started is not related to what we are
> > discussing.
> >
> > Do you want to fix regressions?
> > Or you want to fix new issues on recent platforms?
> > If you want to fix regressions, I think Benjamin has submitted a
> > revision
> > to use old method mode, there shouldn't be regressions for
> > gnome-settings-daemon.
> >
> > What else we want to do is to fix regressions related to systemd when
> > we go back to default method mode. Since there is no issue with
> > systemd
> > 233 and after just applying a small change, systemd 229 can also be
> > worked around, I mean dynamically add/remove input node is not
> > strictly
> > required for achieving our purposes.
> >
> > But if you want to fix new issues on new platforms, we can discuss
> > further an

Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-21 Thread Bastien Nocera
On Tue, 2017-06-20 at 02:45 +, Zheng, Lv wrote:
> Hi,
> 
> > From: Bastien Nocera [mailto:had...@hadess.net]
> > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable
> > LID switch exported by ACPI
> > 
> > On Mon, 2017-06-19 at 01:43 +, Zheng, Lv wrote:
> > > 
> > > > 
> > > > If you implement it in such a way that GNOME settings daemon
> > > > behaves weirdly, you'll get my revert
> > > > request in the mail. Do. Not. Ever. Lie.
> > > 
> > > First, I don't know what should be reverted...
> > > I have 2 solutions here for review, and Benjamin has 1.
> > > And none of them has been upstreamed.
> > > We are just discussing.
> > 
> > The discussion is getting tiring quite frankly. We've been over
> > this
> > for nearly a year now, and with no end in sight.
> 
> We have concerns to introduce too complicated logics to such a
> simple button driver especially the logics are related to platform
> firmware, input ABI and user space behaviors.
> 
> I understand the situation.
> Anyway this shouldn't be a big deal.
> Let's prepare a smarter series to collect all fixes and solutions
> with runtime configurables and get that to the end users.
> So that we can figure out which is the simplest solution.
> 
> But before that, let me ask several questions about gnome-setting-
> deamon.
> 
> > 
> > > However we need to get 1 of them upstreamed in next cycle.
> > > 
> > > I think users won't startup gnome-setting-daemon right after
> > > resume.
> > > It should have already been started.
> > > 
> > > There is only 1 platform may see delayed state update after
> > > resume.
> > > Let's see if there is a practical issue.
> > > 1. Before suspend, the "lid state" is "close", and
> > > 2. After resume, the state might remain "close" for a while
> > >Since libinput won't deliver close to userspace,
> > >and gnome-setting-daemon listens to key switches, there is no
> > > wrong behavior.
> > 
> > It doesn't. It listens to UPower, which tells user-space whether
> > there
> > is a lid switch, and whether it's opened or closed.
> 
> Thanks for the information.
> However I don't see differences here.
> 
> > 
> > > 3. Then after several seconds, "open" arrives.
> > >gnome-setting-daemon re-arrange monitors and screen layouts in
> > > response to the new event.
> > 
> > Just how is anyone supposed to know that there is an event coming?
> 
> Will UPower deliver EV_SW key events to gnome-setting-daemon?
> 
> > 
> > > So there is no problem. IMO, there is no need to improve for
> > > post-
> > > resume case.
> > > 
> > > Users will just startup gnome-setting-daemon once after boot.
> > > And it's likely that when it is started, the state is correct.
> > 
> > You cannot rely on when gnome-settings-daemon will be started to
> > make
> > *any* decision. Certainly not decisions on how the kernel should
> > behave.
> 
> My bad wording, I just meant:
> When gnome-settings-daemon is started is not related to what we are
> discussing.
> 
> Do you want to fix regressions?
> Or you want to fix new issues on recent platforms?
> If you want to fix regressions, I think Benjamin has submitted a
> revision
> to use old method mode, there shouldn't be regressions for
> gnome-settings-daemon.
> 
> What else we want to do is to fix regressions related to systemd when
> we go back to default method mode. Since there is no issue with
> systemd
> 233 and after just applying a small change, systemd 229 can also be
> worked around, I mean dynamically add/remove input node is not
> strictly
> required for achieving our purposes.
> 
> But if you want to fix new issues on new platforms, we can discuss
> further and determine which program should be changed and which
> program
> is the best candidate to stop all problems - the ACPI button driver
> or
> the user space.

I'm happy with Benjamin's patches which don't introduce any
dependencies on new user-space, and don't rely on undocumented
heuristics. What was the API is still the API.


RE: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-19 Thread Zheng, Lv
Hi,

> From: Bastien Nocera [mailto:had...@hadess.net]
> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch 
> exported by ACPI
> 
> On Mon, 2017-06-19 at 01:43 +, Zheng, Lv wrote:
> > 
> > >
> > > If you implement it in such a way that GNOME settings daemon
> > > behaves weirdly, you'll get my revert
> > > request in the mail. Do. Not. Ever. Lie.
> >
> > First, I don't know what should be reverted...
> > I have 2 solutions here for review, and Benjamin has 1.
> > And none of them has been upstreamed.
> > We are just discussing.
> 
> The discussion is getting tiring quite frankly. We've been over this
> for nearly a year now, and with no end in sight.

We have concerns to introduce too complicated logics to such a
simple button driver especially the logics are related to platform
firmware, input ABI and user space behaviors.

I understand the situation.
Anyway this shouldn't be a big deal.
Let's prepare a smarter series to collect all fixes and solutions
with runtime configurables and get that to the end users.
So that we can figure out which is the simplest solution.

But before that, let me ask several questions about gnome-setting-deamon.

> 
> > However we need to get 1 of them upstreamed in next cycle.
> >
> > I think users won't startup gnome-setting-daemon right after resume.
> > It should have already been started.
> >
> > There is only 1 platform may see delayed state update after resume.
> > Let's see if there is a practical issue.
> > 1. Before suspend, the "lid state" is "close", and
> > 2. After resume, the state might remain "close" for a while
> >Since libinput won't deliver close to userspace,
> >and gnome-setting-daemon listens to key switches, there is no
> > wrong behavior.
> 
> It doesn't. It listens to UPower, which tells user-space whether there
> is a lid switch, and whether it's opened or closed.

Thanks for the information.
However I don't see differences here.

> 
> > 3. Then after several seconds, "open" arrives.
> >gnome-setting-daemon re-arrange monitors and screen layouts in
> > response to the new event.
> 
> Just how is anyone supposed to know that there is an event coming?

Will UPower deliver EV_SW key events to gnome-setting-daemon?

> 
> > So there is no problem. IMO, there is no need to improve for post-
> > resume case.
> >
> > Users will just startup gnome-setting-daemon once after boot.
> > And it's likely that when it is started, the state is correct.
> 
> You cannot rely on when gnome-settings-daemon will be started to make
> *any* decision. Certainly not decisions on how the kernel should
> behave.

My bad wording, I just meant:
When gnome-settings-daemon is started is not related to what we are
discussing.

Do you want to fix regressions?
Or you want to fix new issues on recent platforms?
If you want to fix regressions, I think Benjamin has submitted a revision
to use old method mode, there shouldn't be regressions for
gnome-settings-daemon.

What else we want to do is to fix regressions related to systemd when
we go back to default method mode. Since there is no issue with systemd
233 and after just applying a small change, systemd 229 can also be
worked around, I mean dynamically add/remove input node is not strictly
required for achieving our purposes.

But if you want to fix new issues on new platforms, we can discuss
further and determine which program should be changed and which program
is the best candidate to stop all problems - the ACPI button driver or
the user space.

Cheers
Lv


Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-19 Thread Bastien Nocera
On Mon, 2017-06-19 at 01:43 +, Zheng, Lv wrote:
> 
> > 
> > If you implement it in such a way that GNOME settings daemon
> > behaves weirdly, you'll get my revert
> > request in the mail. Do. Not. Ever. Lie.
> 
> First, I don't know what should be reverted...
> I have 2 solutions here for review, and Benjamin has 1.
> And none of them has been upstreamed.
> We are just discussing.

The discussion is getting tiring quite frankly. We've been over this
for nearly a year now, and with no end in sight.

> However we need to get 1 of them upstreamed in next cycle.
> 
> I think users won't startup gnome-setting-daemon right after resume.
> It should have already been started.
> 
> There is only 1 platform may see delayed state update after resume.
> Let's see if there is a practical issue.
> 1. Before suspend, the "lid state" is "close", and
> 2. After resume, the state might remain "close" for a while
>Since libinput won't deliver close to userspace,
>and gnome-setting-daemon listens to key switches, there is no
> wrong behavior.

It doesn't. It listens to UPower, which tells user-space whether there
is a lid switch, and whether it's opened or closed.

> 3. Then after several seconds, "open" arrives.
>gnome-setting-daemon re-arrange monitors and screen layouts in
> response to the new event.

Just how is anyone supposed to know that there is an event coming?

> So there is no problem. IMO, there is no need to improve for post-
> resume case.
> 
> Users will just startup gnome-setting-daemon once after boot.
> And it's likely that when it is started, the state is correct.

You cannot rely on when gnome-settings-daemon will be started to make
*any* decision. Certainly not decisions on how the kernel should
behave.

> IMO, there might be a chance to improve for post-boot case using
> Benjamin's approach.


RE: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-18 Thread Zheng, Lv
Hi, Lennart

> From: Lennart Poettering [mailto:mzxre...@0pointer.de]
> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch 
> exported by ACPI
> 
> On Fri, 16.06.17 11:06, Bastien Nocera (had...@hadess.net) wrote:
> 
> > > Let's consider this case with delay:
> > > After resume, gnome-setting-daemon queries SW_LID and got "close".
> > > Then it lights up the wrong monitors.
> > > Then I believe "open" will be delivered to it several seconds later.
> > > Should gnome-setting-daemon light-up correct monitors this time?
> > > So it just looks like user programs behave with a delay accordingly 
> > > because of the "platform
> turnaround" delay.
> >
> > If you implement it in such a way that GNOME settings daemon behaves 
> > weirdly, you'll get my revert
> request in the mail. Do. Not. Ever. Lie.
> 
> Just to mention this:
> 
> the reason logind applies the timeout and doesn't immediately react to
> lid changes is to be friendly to users, if they quickly close and
> reopen the lid. It's not supposed to be a work-around around broken
> input drivers.

I see, it's same reason for button driver to prepare "lid_report_interval".

I think all old user reports are meaningless to us.
At that time, we found 2 problems in systemd (version below 229):
1. If no "open" event received after resume, systemd suspends the platform.
2. If an "open" event received after a "close" event, the suspend cannot be
   cancelled, systemd still suspends the platform.
It looks the 2 problems are 1 single issue that has already been fixed in
recent systemd (I confirmed that this has been fixed in 233).
It's hard for a kernel driver to work these 2 problems around.

> 
> I am very sure that input drivers shouldn't lie to userspace. If you
> don't know the state of the switch, then you don#t know it, and should
> clarify that to userspace somehow.

Without considering "Surface Pro 1" case which requires a quirk anyway.

For my version 2 solution, for all other platforms, there is no "lie".
There is only a delay, and it's likely there is only 1 platform suffering
from such a delay.

Considering a platform that suffers from such a delay:
Before the platform sends the "open" event, the old cached state is "close".
And input layer automatically filters redundant "close".
Thus systemd won't see any event after resume.
 
Then after several seconds, (can be configured by HoldoffTimeoutSec),
The lid status is turned into correct and systemd can see an "open".
So there won't be a problem.
I can tell you that I tested systemd 229/233, no problem can be seen with
version 2 solution on all those platforms.

Since everything works, I mean why we need to change the ACPI driven
SW_LID into a "fade-in/out" input node.

On the contrary:
1. I feel the delay is common:
If an HID device is built on top of USBIP, there is always delays
(several seconds as network turnaround) for its SW_xxx keys if any.
So do we need to change all HID device drivers to export SW keys into
fade-in/out style just because the underlying transport layer may change?
For the case we are discussing, it's just the underlying transport layer
is the platform hardware/firmware and some of them have a huge delay.
2. I feel the delay is inevitable:
If kernel must ensure to resume userspace after determining the wakeup
reason and after the related wakeup source hardware or firmware driver
has synchronized the states. It then will be a long-time-consuming
suspend/resume cycle and cannot meet the fast-idle-entry/exit
requirements of the modern idle platforms. And even worse thing is,
for most of the hardware/firmware drivers, they don't even know that
the hardware/firmware driven by them are the waking the platform up.

I feel it's too early to say that we need such a big change.
We can wait and see if there are any further use cases requiring us to
handle before making such a big change.

Cheers,
Lv


RE: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-18 Thread Zheng, Lv
Hi,

> From: Bastien Nocera [mailto:had...@hadess.net]
> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch 
> exported by ACPI
> 
> 
> 
> > On 16 Jun 2017, at 10:53, Zheng, Lv  wrote:
> >
> > Hi,
> >
> >> From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> >> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID 
> >> switch exported by ACPI
> >>
> >>> On Jun 16 2017 or thereabouts, Zheng, Lv wrote:
> >>> Hi, Benjamin
> >>>
> >>> Let me just say something one more time.
> >>>
> >>>> From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> >>
> >> [snip]
> >>>>>>>
> >>>>>>> We can see:
> >>>>>>> "logind" has already implemented a timeout, and will not respond lid 
> >>>>>>> state
> >>>>>>> unless it can be stable within this timeout period.
> >>>>>>> I'm not an expert of logind, maybe this is because of 
> >>>>>>> "HoldOffTimeoutSec"?
> >>>>>>>
> >>>>>>> I feel "removing the input node for a period where its state is not 
> >>>>>>> trustful"
> >>>>>>> is technically identical to this mechanism.
> >>>>>>
> >>>>>> but you'd be making kernel policy based on one userspace 
> >>>>>> implementation.
> >>>>>> e.g. libinput doesn't have a timeout period, it assumes the state is
> >>>>>> correct when an input node is present.
> >>>>>
> >>>>> Do you see practical issues?
> >>>>
> >>>> Yes, libinput can't rely on the LID switch information to disable
> >>>> touchpads/touchscreens that are potentially sending false positive.
> >>>
> >>> "potential" doesn't mean "practical", right?
> >>
> >> I was using potential to say that some actual devices are sending
> >> rightful states, while others are not (we already named them a lot in
> >> those countless threads). So potential here is from a user space
> >> perspective where you are not sure if the state is reliable or not
> >> (given we currently don't have this information about reliability).
> >>
> >>> After applying my last version.
> >>> There are no false-positives IMO.
> >>> There are only delays for the reliable key events.
> >>>   ^^
> >>> While the "delay" is very common in computing world.
> >>
> >> No, if there is a delay, there is a false positive, because the initial
> >> state is wrong with respect to the device physical state.
> >>
> >>>
> >>>>> After resume, SW_LID state could remain unreliable "close" for a while.
> >>>>
> >>>> This is not an option. It is not part of the protocol, having an
> >>>> unreliable state.
> >>>>
> >>>>> But that's just a kind of delay happens in all computing programs.
> >>>>> I suppose all power managing programs have already handled that.
> >>>>> I confirmed no breakage for systemd 233.
> >>>>> For systemd 229, it cannot handle it well due to bugs.
> >>>>> But my latest patch series has worked the bug around.
> >>>>> So I don't see any breakage related to post-resume incorrect state 
> >>>>> period.
> >>>>> Do you see problems that my tests haven't covered?
> >>>>
> >>>> The problems are that you are not following the protocol. And if systemd
> >>>> 233 works around it, that's good, but systemd is not the only listener
> >>>> of the LID switch input node, and you are still breaking those by
> >>>> refusing to follow the specification of the evdev protocol.
> >>>
> >>> As you are talking about protocol, let me just ask once.
> >>>
> >>> In computing world,
> >>> 1. delay is very common
> >>>   There are bus turnaround, network turnaround, ...
> >>>   Even measurement itself has delay described by Shannon sampling.
> >>>   Should the delay be a part of the protocol?
> >>
> >> Please, you are either trolling or just kid

Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-16 Thread Lennart Poettering
On Fri, 16.06.17 11:06, Bastien Nocera (had...@hadess.net) wrote:

> > Let's consider this case with delay:
> > After resume, gnome-setting-daemon queries SW_LID and got "close".
> > Then it lights up the wrong monitors.
> > Then I believe "open" will be delivered to it several seconds later.
> > Should gnome-setting-daemon light-up correct monitors this time?
> > So it just looks like user programs behave with a delay accordingly because 
> > of the "platform turnaround" delay.
> 
> If you implement it in such a way that GNOME settings daemon behaves weirdly, 
> you'll get my revert request in the mail. Do. Not. Ever. Lie.

Just to mention this:

the reason logind applies the timeout and doesn't immediately react to
lid changes is to be friendly to users, if they quickly close and
reopen the lid. It's not supposed to be a work-around around broken
input drivers.

I am very sure that input drivers shouldn't lie to userspace. If you
don't know the state of the switch, then you don#t know it, and should
clarify that to userspace somehow.

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-16 Thread Bastien Nocera


> On 16 Jun 2017, at 10:53, Zheng, Lv  wrote:
> 
> Hi,
> 
>> From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
>> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID 
>> switch exported by ACPI
>> 
>>> On Jun 16 2017 or thereabouts, Zheng, Lv wrote:
>>> Hi, Benjamin
>>> 
>>> Let me just say something one more time.
>>> 
>>>> From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
>> 
>> [snip]
>>>>>>> 
>>>>>>> We can see:
>>>>>>> "logind" has already implemented a timeout, and will not respond lid 
>>>>>>> state
>>>>>>> unless it can be stable within this timeout period.
>>>>>>> I'm not an expert of logind, maybe this is because of 
>>>>>>> "HoldOffTimeoutSec"?
>>>>>>> 
>>>>>>> I feel "removing the input node for a period where its state is not 
>>>>>>> trustful"
>>>>>>> is technically identical to this mechanism.
>>>>>> 
>>>>>> but you'd be making kernel policy based on one userspace implementation.
>>>>>> e.g. libinput doesn't have a timeout period, it assumes the state is
>>>>>> correct when an input node is present.
>>>>> 
>>>>> Do you see practical issues?
>>>> 
>>>> Yes, libinput can't rely on the LID switch information to disable
>>>> touchpads/touchscreens that are potentially sending false positive.
>>> 
>>> "potential" doesn't mean "practical", right?
>> 
>> I was using potential to say that some actual devices are sending
>> rightful states, while others are not (we already named them a lot in
>> those countless threads). So potential here is from a user space
>> perspective where you are not sure if the state is reliable or not
>> (given we currently don't have this information about reliability).
>> 
>>> After applying my last version.
>>> There are no false-positives IMO.
>>> There are only delays for the reliable key events.
>>>   ^^
>>> While the "delay" is very common in computing world.
>> 
>> No, if there is a delay, there is a false positive, because the initial
>> state is wrong with respect to the device physical state.
>> 
>>> 
>>>>> After resume, SW_LID state could remain unreliable "close" for a while.
>>>> 
>>>> This is not an option. It is not part of the protocol, having an
>>>> unreliable state.
>>>> 
>>>>> But that's just a kind of delay happens in all computing programs.
>>>>> I suppose all power managing programs have already handled that.
>>>>> I confirmed no breakage for systemd 233.
>>>>> For systemd 229, it cannot handle it well due to bugs.
>>>>> But my latest patch series has worked the bug around.
>>>>> So I don't see any breakage related to post-resume incorrect state period.
>>>>> Do you see problems that my tests haven't covered?
>>>> 
>>>> The problems are that you are not following the protocol. And if systemd
>>>> 233 works around it, that's good, but systemd is not the only listener
>>>> of the LID switch input node, and you are still breaking those by
>>>> refusing to follow the specification of the evdev protocol.
>>> 
>>> As you are talking about protocol, let me just ask once.
>>> 
>>> In computing world,
>>> 1. delay is very common
>>>   There are bus turnaround, network turnaround, ...
>>>   Even measurement itself has delay described by Shannon sampling.
>>>   Should the delay be a part of the protocol?
>> 
>> Please, you are either trolling or just kidding. If there are delays in
>> the "computing world", these has to be handled by the kernel, and not
>> exported to the user space if the kernel protocol says that the state is
>> reliable.
>> 
>>> 2. programs are acting according to rules (we call state machines)
>>>   States are only determined after measurement (like "quantum states")
>>>   I have Schroedinger's cat in my mind.
>>>   Events are determined as they always occur after measurement to trigger 
>>> "quantum jumps".
>>>   So for EV_SW protocol,

RE: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-16 Thread Zheng, Lv
Hi,

> From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch 
> exported by ACPI
> 
> On Jun 16 2017 or thereabouts, Zheng, Lv wrote:
> > Hi, Benjamin
> >
> > Let me just say something one more time.
> >
> > > From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> 
> [snip]
> > > > > >
> > > > > > We can see:
> > > > > > "logind" has already implemented a timeout, and will not respond 
> > > > > > lid state
> > > > > > unless it can be stable within this timeout period.
> > > > > > I'm not an expert of logind, maybe this is because of 
> > > > > > "HoldOffTimeoutSec"?
> > > > > >
> > > > > > I feel "removing the input node for a period where its state is not 
> > > > > > trustful"
> > > > > > is technically identical to this mechanism.
> > > > >
> > > > > but you'd be making kernel policy based on one userspace 
> > > > > implementation.
> > > > > e.g. libinput doesn't have a timeout period, it assumes the state is
> > > > > correct when an input node is present.
> > > >
> > > > Do you see practical issues?
> > >
> > > Yes, libinput can't rely on the LID switch information to disable
> > > touchpads/touchscreens that are potentially sending false positive.
> >
> > "potential" doesn't mean "practical", right?
> 
> I was using potential to say that some actual devices are sending
> rightful states, while others are not (we already named them a lot in
> those countless threads). So potential here is from a user space
> perspective where you are not sure if the state is reliable or not
> (given we currently don't have this information about reliability).
> 
> > After applying my last version.
> > There are no false-positives IMO.
> > There are only delays for the reliable key events.
> >^^
> > While the "delay" is very common in computing world.
> 
> No, if there is a delay, there is a false positive, because the initial
> state is wrong with respect to the device physical state.
> 
> >
> > > > After resume, SW_LID state could remain unreliable "close" for a while.
> > >
> > > This is not an option. It is not part of the protocol, having an
> > > unreliable state.
> > >
> > > > But that's just a kind of delay happens in all computing programs.
> > > > I suppose all power managing programs have already handled that.
> > > > I confirmed no breakage for systemd 233.
> > > > For systemd 229, it cannot handle it well due to bugs.
> > > > But my latest patch series has worked the bug around.
> > > > So I don't see any breakage related to post-resume incorrect state 
> > > > period.
> > > > Do you see problems that my tests haven't covered?
> > >
> > > The problems are that you are not following the protocol. And if systemd
> > > 233 works around it, that's good, but systemd is not the only listener
> > > of the LID switch input node, and you are still breaking those by
> > > refusing to follow the specification of the evdev protocol.
> >
> > As you are talking about protocol, let me just ask once.
> >
> > In computing world,
> > 1. delay is very common
> >There are bus turnaround, network turnaround, ...
> >Even measurement itself has delay described by Shannon sampling.
> >Should the delay be a part of the protocol?
> 
> Please, you are either trolling or just kidding. If there are delays in
> the "computing world", these has to be handled by the kernel, and not
> exported to the user space if the kernel protocol says that the state is
> reliable.
> 
> > 2. programs are acting according to rules (we call state machines)
> >States are only determined after measurement (like "quantum states")
> >I have Schroedinger's cat in my mind.
> >Events are determined as they always occur after measurement to trigger 
> > "quantum jumps".
> >So for EV_SW protocol,
> >Should programs rely on the reliable "quantum jumps",
> >Or should programs rely on the unreliable "quantum states"?
> 
> No comments, this won't get us anywhere.
> 
> >

Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-16 Thread Benjamin Tissoires
On Jun 16 2017 or thereabouts, Zheng, Lv wrote:
> Hi, Benjamin
> 
> Let me just say something one more time.
> 
> > From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]

[snip]
> > > > >
> > > > > We can see:
> > > > > "logind" has already implemented a timeout, and will not respond lid 
> > > > > state
> > > > > unless it can be stable within this timeout period.
> > > > > I'm not an expert of logind, maybe this is because of 
> > > > > "HoldOffTimeoutSec"?
> > > > >
> > > > > I feel "removing the input node for a period where its state is not 
> > > > > trustful"
> > > > > is technically identical to this mechanism.
> > > >
> > > > but you'd be making kernel policy based on one userspace implementation.
> > > > e.g. libinput doesn't have a timeout period, it assumes the state is
> > > > correct when an input node is present.
> > >
> > > Do you see practical issues?
> > 
> > Yes, libinput can't rely on the LID switch information to disable
> > touchpads/touchscreens that are potentially sending false positive.
> 
> "potential" doesn't mean "practical", right?

I was using potential to say that some actual devices are sending
rightful states, while others are not (we already named them a lot in
those countless threads). So potential here is from a user space
perspective where you are not sure if the state is reliable or not
(given we currently don't have this information about reliability).

> After applying my last version.
> There are no false-positives IMO.
> There are only delays for the reliable key events.
>^^
> While the "delay" is very common in computing world.

No, if there is a delay, there is a false positive, because the initial
state is wrong with respect to the device physical state.

> 
> > > After resume, SW_LID state could remain unreliable "close" for a while.
> > 
> > This is not an option. It is not part of the protocol, having an
> > unreliable state.
> > 
> > > But that's just a kind of delay happens in all computing programs.
> > > I suppose all power managing programs have already handled that.
> > > I confirmed no breakage for systemd 233.
> > > For systemd 229, it cannot handle it well due to bugs.
> > > But my latest patch series has worked the bug around.
> > > So I don't see any breakage related to post-resume incorrect state period.
> > > Do you see problems that my tests haven't covered?
> > 
> > The problems are that you are not following the protocol. And if systemd
> > 233 works around it, that's good, but systemd is not the only listener
> > of the LID switch input node, and you are still breaking those by
> > refusing to follow the specification of the evdev protocol.
> 
> As you are talking about protocol, let me just ask once.
> 
> In computing world,
> 1. delay is very common
>There are bus turnaround, network turnaround, ...
>Even measurement itself has delay described by Shannon sampling.
>Should the delay be a part of the protocol?

Please, you are either trolling or just kidding. If there are delays in
the "computing world", these has to be handled by the kernel, and not
exported to the user space if the kernel protocol says that the state is
reliable.

> 2. programs are acting according to rules (we call state machines)
>States are only determined after measurement (like "quantum states")
>I have Schroedinger's cat in my mind.
>Events are determined as they always occur after measurement to trigger 
> "quantum jumps".
>So for EV_SW protocol,
>Should programs rely on the reliable "quantum jumps",
>Or should programs rely on the unreliable "quantum states"?

No comments, this won't get us anywhere.

> 
> I think most UI programs care no about state stored in the input node,
> they only receive events raised from the input node.

Bullshit. When you launch such a program, you need to query the state
because you won't receive the event that happened way before the launch.

> Why should the kernel export a fade-in/fade-out input node to the UI programs 
> and ask them to change?
> 
> The only program that cares about the state stored in the input node is 
> libinput.
> So why should every user program be changed to make libinput easier?

No, all program that listen to LID switches input nodes care about the
state. We already told you that, you just don't listen:

- systemd cares about the state as it does polling on the input node in
  case it misses an event
- libinput cares about the state as previously mentioned
- gnome-setting-daemons cares about the state given it decides whether
  or not lighting up the monitors depending on the state. And if you
  relaunch the daemon, it'll query the state to decide what is the best
  arrangement for the screens
- KDE should do the same (as it is not a daemon that can catch any
  events)

And I really don't know why I am telling you that. The rule is simple:
there is a kernel protocol, which is a contract between the kernel and
the user space. With such "delays"

RE: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-16 Thread Zheng, Lv
Hi, Benjamin

Let me just say something one more time.

> From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch 
> exported by ACPI
> 
> On Jun 16 2017 or thereabouts, Zheng, Lv wrote:
> > Hi,
> >
> > > From: linux-acpi-ow...@vger.kernel.org 
> > > [mailto:linux-acpi-ow...@vger.kernel.org] On Behalf Of
> Peter
> > > Hutterer
> > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID 
> > > switch exported by ACPI
> > >
> > > On Thu, Jun 15, 2017 at 07:33:58AM +, Zheng, Lv wrote:
> > > > Hi, Peter
> > > >
> > > > > From: Peter Hutterer [mailto:peter.hutte...@who-t.net]
> > > > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable 
> > > > > LID switch exported by ACPI
> > > > >
> > > > > On Thu, Jun 15, 2017 at 02:52:57AM +0000, Zheng, Lv wrote:
> > > > > > Hi, Benjamin
> > > > > >
> > > > > > > From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> > > > > > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the 
> > > > > > > unreliable LID switch exported by
> ACPI
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > [Sorry for the delay, I have been sidetracked from this]
> > > > > > >
> > > > > > > On Jun 07 2017 or thereabouts, Lennart Poettering wrote:
> > > > > > > > On Thu, 01.06.17 20:46, Benjamin Tissoires 
> > > > > > > > (benjamin.tissoi...@redhat.com) wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > Sending this as a WIP as it still need a few changes, but it 
> > > > > > > > > mostly works as
> > > > > > > > > expected (still not fully compliant yet).
> > > > > > > > >
> > > > > > > > > So this is based on Lennart's comment in [1]: if the LID 
> > > > > > > > > state is not reliable,
> > > > > > > > > the kernel should not export the LID switch device as long as 
> > > > > > > > > we are not sure
> > > > > > > > > about its state.
> > > > > > > >
> > > > > > > > Ah nice! I (obviously) like this approach.
> > > > > > >
> > > > > > > Heh. Now I just need to convince Lv that it's the right approach.
> > > > > >
> > > > > > I feel we don't have big conflicts.
> > > > > > And I already took part of your idea into this patchset:
> > > > > > https://patchwork.kernel.org/patch/9771121/
> > > > > > https://patchwork.kernel.org/patch/9771119/
> > > > > > I tested my surface pros with Ubuntu, they are working as expected.
> > > > > >
> > > > > > > > > Note that systemd currently doesn't sync the state when the 
> > > > > > > > > input node just
> > > > > > > > > appears. This is a systemd bug, and it should not be handled 
> > > > > > > > > by the kernel
> > > > > > > > > community.
> > > > > > > >
> > > > > > > > Uh if this is borked, we should indeed fix this in systemd. Is 
> > > > > > > > there
> > > > > > > > already a systemd github bug about this? If not, please create 
> > > > > > > > one,
> > > > > > > > and we'll look into it!
> > > > > > >
> > > > > > > I don't think there is. I haven't raised it yet because I am not 
> > > > > > > so sure
> > > > > > > this will not break again those worthless unreliable LID, and if 
> > > > > > > we play
> > > > > > > whack a mole between the kernel and user space, things are going 
> > > > > > > to be
> > > > > > > nasty. So I'd rather have this fixed in systemd along with the
> > > > > > > unreliable LID switch knowledge, so we are sure that the kernel 
> > > > > > > beha

Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-16 Thread Benjamin Tissoires
On Jun 16 2017 or thereabouts, Zheng, Lv wrote:
> Hi,
> 
> > From: linux-acpi-ow...@vger.kernel.org 
> > [mailto:linux-acpi-ow...@vger.kernel.org] On Behalf Of Peter
> > Hutterer
> > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID 
> > switch exported by ACPI
> > 
> > On Thu, Jun 15, 2017 at 07:33:58AM +, Zheng, Lv wrote:
> > > Hi, Peter
> > >
> > > > From: Peter Hutterer [mailto:peter.hutte...@who-t.net]
> > > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID 
> > > > switch exported by ACPI
> > > >
> > > > On Thu, Jun 15, 2017 at 02:52:57AM +, Zheng, Lv wrote:
> > > > > Hi, Benjamin
> > > > >
> > > > > > From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> > > > > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable 
> > > > > > LID switch exported by ACPI
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > [Sorry for the delay, I have been sidetracked from this]
> > > > > >
> > > > > > On Jun 07 2017 or thereabouts, Lennart Poettering wrote:
> > > > > > > On Thu, 01.06.17 20:46, Benjamin Tissoires 
> > > > > > > (benjamin.tissoi...@redhat.com) wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > Sending this as a WIP as it still need a few changes, but it 
> > > > > > > > mostly works as
> > > > > > > > expected (still not fully compliant yet).
> > > > > > > >
> > > > > > > > So this is based on Lennart's comment in [1]: if the LID state 
> > > > > > > > is not reliable,
> > > > > > > > the kernel should not export the LID switch device as long as 
> > > > > > > > we are not sure
> > > > > > > > about its state.
> > > > > > >
> > > > > > > Ah nice! I (obviously) like this approach.
> > > > > >
> > > > > > Heh. Now I just need to convince Lv that it's the right approach.
> > > > >
> > > > > I feel we don't have big conflicts.
> > > > > And I already took part of your idea into this patchset:
> > > > > https://patchwork.kernel.org/patch/9771121/
> > > > > https://patchwork.kernel.org/patch/9771119/
> > > > > I tested my surface pros with Ubuntu, they are working as expected.
> > > > >
> > > > > > > > Note that systemd currently doesn't sync the state when the 
> > > > > > > > input node just
> > > > > > > > appears. This is a systemd bug, and it should not be handled by 
> > > > > > > > the kernel
> > > > > > > > community.
> > > > > > >
> > > > > > > Uh if this is borked, we should indeed fix this in systemd. Is 
> > > > > > > there
> > > > > > > already a systemd github bug about this? If not, please create 
> > > > > > > one,
> > > > > > > and we'll look into it!
> > > > > >
> > > > > > I don't think there is. I haven't raised it yet because I am not so 
> > > > > > sure
> > > > > > this will not break again those worthless unreliable LID, and if we 
> > > > > > play
> > > > > > whack a mole between the kernel and user space, things are going to 
> > > > > > be
> > > > > > nasty. So I'd rather have this fixed in systemd along with the
> > > > > > unreliable LID switch knowledge, so we are sure that the kernel 
> > > > > > behaves
> > > > > > the way we expect it to be.
> > > > >
> > > > > This is my feeling:
> > > > > We needn't go that far.
> > > > > We can interpret "input node appears" into "default input node state".
> > > >
> > > > Sorry, can you clarify this bit please? I'm not sure what you mean here.
> > > > Note that there's an unknown amount of time between "device node 
> > > > appearing
> > > > in the system" and when a userspace proc

RE: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-15 Thread Zheng, Lv
Hi,

> From: linux-acpi-ow...@vger.kernel.org 
> [mailto:linux-acpi-ow...@vger.kernel.org] On Behalf Of Peter
> Hutterer
> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch 
> exported by ACPI
> 
> On Thu, Jun 15, 2017 at 07:33:58AM +, Zheng, Lv wrote:
> > Hi, Peter
> >
> > > From: Peter Hutterer [mailto:peter.hutte...@who-t.net]
> > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID 
> > > switch exported by ACPI
> > >
> > > On Thu, Jun 15, 2017 at 02:52:57AM +, Zheng, Lv wrote:
> > > > Hi, Benjamin
> > > >
> > > > > From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> > > > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable 
> > > > > LID switch exported by ACPI
> > > > >
> > > > > Hi,
> > > > >
> > > > > [Sorry for the delay, I have been sidetracked from this]
> > > > >
> > > > > On Jun 07 2017 or thereabouts, Lennart Poettering wrote:
> > > > > > On Thu, 01.06.17 20:46, Benjamin Tissoires 
> > > > > > (benjamin.tissoi...@redhat.com) wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Sending this as a WIP as it still need a few changes, but it 
> > > > > > > mostly works as
> > > > > > > expected (still not fully compliant yet).
> > > > > > >
> > > > > > > So this is based on Lennart's comment in [1]: if the LID state is 
> > > > > > > not reliable,
> > > > > > > the kernel should not export the LID switch device as long as we 
> > > > > > > are not sure
> > > > > > > about its state.
> > > > > >
> > > > > > Ah nice! I (obviously) like this approach.
> > > > >
> > > > > Heh. Now I just need to convince Lv that it's the right approach.
> > > >
> > > > I feel we don't have big conflicts.
> > > > And I already took part of your idea into this patchset:
> > > > https://patchwork.kernel.org/patch/9771121/
> > > > https://patchwork.kernel.org/patch/9771119/
> > > > I tested my surface pros with Ubuntu, they are working as expected.
> > > >
> > > > > > > Note that systemd currently doesn't sync the state when the input 
> > > > > > > node just
> > > > > > > appears. This is a systemd bug, and it should not be handled by 
> > > > > > > the kernel
> > > > > > > community.
> > > > > >
> > > > > > Uh if this is borked, we should indeed fix this in systemd. Is there
> > > > > > already a systemd github bug about this? If not, please create one,
> > > > > > and we'll look into it!
> > > > >
> > > > > I don't think there is. I haven't raised it yet because I am not so 
> > > > > sure
> > > > > this will not break again those worthless unreliable LID, and if we 
> > > > > play
> > > > > whack a mole between the kernel and user space, things are going to be
> > > > > nasty. So I'd rather have this fixed in systemd along with the
> > > > > unreliable LID switch knowledge, so we are sure that the kernel 
> > > > > behaves
> > > > > the way we expect it to be.
> > > >
> > > > This is my feeling:
> > > > We needn't go that far.
> > > > We can interpret "input node appears" into "default input node state".
> > >
> > > Sorry, can you clarify this bit please? I'm not sure what you mean here.
> > > Note that there's an unknown amount of time between "device node appearing
> > > in the system" and when a userspace process actually opens it and looks at
> > > its state. By then, the node may have changed state again.
> >
> > We can see:
> > "logind" has already implemented a timeout, and will not respond lid state
> > unless it can be stable within this timeout period.
> > I'm not an expert of logind, maybe this is because of "HoldOffTimeoutSec"?
> >
> > I feel "removing the input node for a period where its state is not 
> > trustful"
> > is technically identical to this mechanism.
> 
> but you'd be making kernel policy based on one userspace implementation.
> e.g. libinput doesn't have a timeout period, it assumes the state is
> correct when an input node is present.

Do you see practical issues?
If not, should we avoid over-engineering at this moment?

After resume, SW_LID state could remain unreliable "close" for a while.
But that's just a kind of delay happens in all computing programs.
I suppose all power managing programs have already handled that.
I confirmed no breakage for systemd 233.
For systemd 229, it cannot handle it well due to bugs.
But my latest patch series has worked the bug around.
So I don't see any breakage related to post-resume incorrect state period.
Do you see problems that my tests haven't covered?

So I wonder if you mean:
After boot, button driver should create input node right before sending first 
input report.
Is this exactly what you want me to improve?
If so, please also let me know if you have seen real issues related to this?

Cheers,
Lv


Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-15 Thread Peter Hutterer
On Thu, Jun 15, 2017 at 07:33:58AM +, Zheng, Lv wrote:
> Hi, Peter
> 
> > From: Peter Hutterer [mailto:peter.hutte...@who-t.net]
> > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID 
> > switch exported by ACPI
> > 
> > On Thu, Jun 15, 2017 at 02:52:57AM +, Zheng, Lv wrote:
> > > Hi, Benjamin
> > >
> > > > From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> > > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID 
> > > > switch exported by ACPI
> > > >
> > > > Hi,
> > > >
> > > > [Sorry for the delay, I have been sidetracked from this]
> > > >
> > > > On Jun 07 2017 or thereabouts, Lennart Poettering wrote:
> > > > > On Thu, 01.06.17 20:46, Benjamin Tissoires 
> > > > > (benjamin.tissoi...@redhat.com) wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Sending this as a WIP as it still need a few changes, but it mostly 
> > > > > > works as
> > > > > > expected (still not fully compliant yet).
> > > > > >
> > > > > > So this is based on Lennart's comment in [1]: if the LID state is 
> > > > > > not reliable,
> > > > > > the kernel should not export the LID switch device as long as we 
> > > > > > are not sure
> > > > > > about its state.
> > > > >
> > > > > Ah nice! I (obviously) like this approach.
> > > >
> > > > Heh. Now I just need to convince Lv that it's the right approach.
> > >
> > > I feel we don't have big conflicts.
> > > And I already took part of your idea into this patchset:
> > > https://patchwork.kernel.org/patch/9771121/
> > > https://patchwork.kernel.org/patch/9771119/
> > > I tested my surface pros with Ubuntu, they are working as expected.
> > >
> > > > > > Note that systemd currently doesn't sync the state when the input 
> > > > > > node just
> > > > > > appears. This is a systemd bug, and it should not be handled by the 
> > > > > > kernel
> > > > > > community.
> > > > >
> > > > > Uh if this is borked, we should indeed fix this in systemd. Is there
> > > > > already a systemd github bug about this? If not, please create one,
> > > > > and we'll look into it!
> > > >
> > > > I don't think there is. I haven't raised it yet because I am not so sure
> > > > this will not break again those worthless unreliable LID, and if we play
> > > > whack a mole between the kernel and user space, things are going to be
> > > > nasty. So I'd rather have this fixed in systemd along with the
> > > > unreliable LID switch knowledge, so we are sure that the kernel behaves
> > > > the way we expect it to be.
> > >
> > > This is my feeling:
> > > We needn't go that far.
> > > We can interpret "input node appears" into "default input node state".
> > 
> > Sorry, can you clarify this bit please? I'm not sure what you mean here.
> > Note that there's an unknown amount of time between "device node appearing
> > in the system" and when a userspace process actually opens it and looks at
> > its state. By then, the node may have changed state again.
> 
> We can see:
> "logind" has already implemented a timeout, and will not respond lid state
> unless it can be stable within this timeout period.
> I'm not an expert of logind, maybe this is because of "HoldOffTimeoutSec"?
> 
> I feel "removing the input node for a period where its state is not trustful"
> is technically identical to this mechanism.

but you'd be making kernel policy based on one userspace implementation.
e.g. libinput doesn't have a timeout period, it assumes the state is
correct when an input node is present. 

Cheers,
   Peter



RE: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-15 Thread Zheng, Lv
Hi, Peter

> From: Peter Hutterer [mailto:peter.hutte...@who-t.net]
> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch 
> exported by ACPI
> 
> On Thu, Jun 15, 2017 at 02:52:57AM +, Zheng, Lv wrote:
> > Hi, Benjamin
> >
> > > From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> > > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID 
> > > switch exported by ACPI
> > >
> > > Hi,
> > >
> > > [Sorry for the delay, I have been sidetracked from this]
> > >
> > > On Jun 07 2017 or thereabouts, Lennart Poettering wrote:
> > > > On Thu, 01.06.17 20:46, Benjamin Tissoires 
> > > > (benjamin.tissoi...@redhat.com) wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Sending this as a WIP as it still need a few changes, but it mostly 
> > > > > works as
> > > > > expected (still not fully compliant yet).
> > > > >
> > > > > So this is based on Lennart's comment in [1]: if the LID state is not 
> > > > > reliable,
> > > > > the kernel should not export the LID switch device as long as we are 
> > > > > not sure
> > > > > about its state.
> > > >
> > > > Ah nice! I (obviously) like this approach.
> > >
> > > Heh. Now I just need to convince Lv that it's the right approach.
> >
> > I feel we don't have big conflicts.
> > And I already took part of your idea into this patchset:
> > https://patchwork.kernel.org/patch/9771121/
> > https://patchwork.kernel.org/patch/9771119/
> > I tested my surface pros with Ubuntu, they are working as expected.
> >
> > > > > Note that systemd currently doesn't sync the state when the input 
> > > > > node just
> > > > > appears. This is a systemd bug, and it should not be handled by the 
> > > > > kernel
> > > > > community.
> > > >
> > > > Uh if this is borked, we should indeed fix this in systemd. Is there
> > > > already a systemd github bug about this? If not, please create one,
> > > > and we'll look into it!
> > >
> > > I don't think there is. I haven't raised it yet because I am not so sure
> > > this will not break again those worthless unreliable LID, and if we play
> > > whack a mole between the kernel and user space, things are going to be
> > > nasty. So I'd rather have this fixed in systemd along with the
> > > unreliable LID switch knowledge, so we are sure that the kernel behaves
> > > the way we expect it to be.
> >
> > This is my feeling:
> > We needn't go that far.
> > We can interpret "input node appears" into "default input node state".
> 
> Sorry, can you clarify this bit please? I'm not sure what you mean here.
> Note that there's an unknown amount of time between "device node appearing
> in the system" and when a userspace process actually opens it and looks at
> its state. By then, the node may have changed state again.

We can see:
"logind" has already implemented a timeout, and will not respond lid state
unless it can be stable within this timeout period.
I'm not an expert of logind, maybe this is because of "HoldOffTimeoutSec"?

I feel "removing the input node for a period where its state is not trustful"
is technically identical to this mechanism.

Cheers,
Lv

> 
> Cheers,
>Peter
> 
> > That's what you want for acpi button driver - we now defaults to "method" 
> > mode.
> >
> > What's your opinion?
> >
> > Thanks
> > Lv
> >


Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-14 Thread Peter Hutterer
On Thu, Jun 15, 2017 at 02:52:57AM +, Zheng, Lv wrote:
> Hi, Benjamin
> 
> > From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> > Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID 
> > switch exported by ACPI
> > 
> > Hi,
> > 
> > [Sorry for the delay, I have been sidetracked from this]
> > 
> > On Jun 07 2017 or thereabouts, Lennart Poettering wrote:
> > > On Thu, 01.06.17 20:46, Benjamin Tissoires 
> > > (benjamin.tissoi...@redhat.com) wrote:
> > >
> > > > Hi,
> > > >
> > > > Sending this as a WIP as it still need a few changes, but it mostly 
> > > > works as
> > > > expected (still not fully compliant yet).
> > > >
> > > > So this is based on Lennart's comment in [1]: if the LID state is not 
> > > > reliable,
> > > > the kernel should not export the LID switch device as long as we are 
> > > > not sure
> > > > about its state.
> > >
> > > Ah nice! I (obviously) like this approach.
> > 
> > Heh. Now I just need to convince Lv that it's the right approach.
> 
> I feel we don't have big conflicts.
> And I already took part of your idea into this patchset:
> https://patchwork.kernel.org/patch/9771121/
> https://patchwork.kernel.org/patch/9771119/
> I tested my surface pros with Ubuntu, they are working as expected.
> 
> > > > Note that systemd currently doesn't sync the state when the input node 
> > > > just
> > > > appears. This is a systemd bug, and it should not be handled by the 
> > > > kernel
> > > > community.
> > >
> > > Uh if this is borked, we should indeed fix this in systemd. Is there
> > > already a systemd github bug about this? If not, please create one,
> > > and we'll look into it!
> > 
> > I don't think there is. I haven't raised it yet because I am not so sure
> > this will not break again those worthless unreliable LID, and if we play
> > whack a mole between the kernel and user space, things are going to be
> > nasty. So I'd rather have this fixed in systemd along with the
> > unreliable LID switch knowledge, so we are sure that the kernel behaves
> > the way we expect it to be.
> 
> This is my feeling:
> We needn't go that far.
> We can interpret "input node appears" into "default input node state".

Sorry, can you clarify this bit please? I'm not sure what you mean here.
Note that there's an unknown amount of time between "device node appearing
in the system" and when a userspace process actually opens it and looks at
its state. By then, the node may have changed state again.

Cheers,
   Peter

> That's what you want for acpi button driver - we now defaults to "method" 
> mode.
> 
> What's your opinion?
> 
> Thanks
> Lv
> 


RE: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-14 Thread Zheng, Lv
Hi, Benjamin

> From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch 
> exported by ACPI
> 
> Hi,
> 
> [Sorry for the delay, I have been sidetracked from this]
> 
> On Jun 07 2017 or thereabouts, Lennart Poettering wrote:
> > On Thu, 01.06.17 20:46, Benjamin Tissoires (benjamin.tissoi...@redhat.com) 
> > wrote:
> >
> > > Hi,
> > >
> > > Sending this as a WIP as it still need a few changes, but it mostly works 
> > > as
> > > expected (still not fully compliant yet).
> > >
> > > So this is based on Lennart's comment in [1]: if the LID state is not 
> > > reliable,
> > > the kernel should not export the LID switch device as long as we are not 
> > > sure
> > > about its state.
> >
> > Ah nice! I (obviously) like this approach.
> 
> Heh. Now I just need to convince Lv that it's the right approach.

I feel we don't have big conflicts.
And I already took part of your idea into this patchset:
https://patchwork.kernel.org/patch/9771121/
https://patchwork.kernel.org/patch/9771119/
I tested my surface pros with Ubuntu, they are working as expected.

> > > Note that systemd currently doesn't sync the state when the input node 
> > > just
> > > appears. This is a systemd bug, and it should not be handled by the kernel
> > > community.
> >
> > Uh if this is borked, we should indeed fix this in systemd. Is there
> > already a systemd github bug about this? If not, please create one,
> > and we'll look into it!
> 
> I don't think there is. I haven't raised it yet because I am not so sure
> this will not break again those worthless unreliable LID, and if we play
> whack a mole between the kernel and user space, things are going to be
> nasty. So I'd rather have this fixed in systemd along with the
> unreliable LID switch knowledge, so we are sure that the kernel behaves
> the way we expect it to be.

This is my feeling:
We needn't go that far.
We can interpret "input node appears" into "default input node state".
That's what you want for acpi button driver - we now defaults to "method" mode.

What's your opinion?

Thanks
Lv


RE: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-14 Thread Zheng, Lv
Hi,

> From: Lennart Poettering [mailto:mzxre...@0pointer.de]
> Subject: Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch 
> exported by ACPI
> 
> On Thu, 01.06.17 20:46, Benjamin Tissoires (benjamin.tissoi...@redhat.com) 
> wrote:
> 
> > Hi,
> >
> > Sending this as a WIP as it still need a few changes, but it mostly works as
> > expected (still not fully compliant yet).
> >
> > So this is based on Lennart's comment in [1]: if the LID state is not 
> > reliable,
> > the kernel should not export the LID switch device as long as we are not 
> > sure
> > about its state.
> 
> Ah nice! I (obviously) like this approach.
> 
> > Note that systemd currently doesn't sync the state when the input node just
> > appears. This is a systemd bug, and it should not be handled by the kernel
> > community.
> 
> Uh if this is borked, we should indeed fix this in systemd. Is there
> already a systemd github bug about this? If not, please create one,
> and we'll look into it!

This is not my opinion.
My opinion is as follows.

We confirmed Ubuntu shipped systemd (version 229) with "reliable|unreliable" 
platforms.
We can see 2 problems:
1. LID_OPEN cannot cancel an on-going suspend sequence
   After boot, if user space receives "LID_CLOSE" key event,
   systemd may not suspend the platform right after seeing the event,
   it may suspend the platform several seconds later.
   This is not a problem.

   The problem is, if "LID_OPEN" is sent within this deferring period,
   Systemd doesn't cancel previously scheduled "suspend".
   And the platform may be suspended with lid opened.
   Then users need to close and re-open the lid to wake the system up.
   Causing another "LID_CLOSE/LID_OPEN" sequence delivered to the user space 
after resume.
   Users then can see a suspend/resume loop.
   This problem can even be seen on "reliable" platforms.
   It can be easily triggered by user actions.
2. Need explicit LID_OPEN to stay woken-up
   After boot, systemd seems to be wait for a significant "LID_OPEN".
   If it cannot see a "LID_OPEN" within several seconds,
   it suspends the platform.
   So if a platform doesn't send "LID_OPEN" or fails to send "LID_OPEN" within 
this period.
   Users then can see a suspend/resume loop.

However we've tested with github cloned systemd (version 233).
The 2 problems seem to have been fixed.
It works well with current ACPI button driver,
but you need to boot linux kernel with button.lid_init_state=ignore.
I don't know the story of the improvement.
Systemd developers should know that better than me.

So IMO, systemd needn't do any further improvement.
^^
^^

But the kernel button driver implements several "lid_init_state" modes.
It appears "method" mode is determined to be the default mode.
Thus we need to do:
1. improve button driver "method" mode to make systemd 233 work well with it.
2. determine if we need to improve button driver to make it work well with 
systemd 229.

Thanks and best regards
Lv 



Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-13 Thread Benjamin Tissoires
Hi,

[Sorry for the delay, I have been sidetracked from this]

On Jun 07 2017 or thereabouts, Lennart Poettering wrote:
> On Thu, 01.06.17 20:46, Benjamin Tissoires (benjamin.tissoi...@redhat.com) 
> wrote:
> 
> > Hi,
> > 
> > Sending this as a WIP as it still need a few changes, but it mostly works as
> > expected (still not fully compliant yet).
> > 
> > So this is based on Lennart's comment in [1]: if the LID state is not 
> > reliable,
> > the kernel should not export the LID switch device as long as we are not 
> > sure
> > about its state.
> 
> Ah nice! I (obviously) like this approach.

Heh. Now I just need to convince Lv that it's the right approach.

> 
> > Note that systemd currently doesn't sync the state when the input node just
> > appears. This is a systemd bug, and it should not be handled by the kernel
> > community.
> 
> Uh if this is borked, we should indeed fix this in systemd. Is there
> already a systemd github bug about this? If not, please create one,
> and we'll look into it!

I don't think there is. I haven't raised it yet because I am not so sure
this will not break again those worthless unreliable LID, and if we play
whack a mole between the kernel and user space, things are going to be
nasty. So I'd rather have this fixed in systemd along with the
unreliable LID switch knowledge, so we are sure that the kernel behaves
the way we expect it to be.

> 
> Thanks for working on this,
> 

No worries.

Cheers,
Benjamin



Re: [systemd-devel] [WIP PATCH 0/4] Rework the unreliable LID switch exported by ACPI

2017-06-07 Thread Lennart Poettering
On Thu, 01.06.17 20:46, Benjamin Tissoires (benjamin.tissoi...@redhat.com) 
wrote:

> Hi,
> 
> Sending this as a WIP as it still need a few changes, but it mostly works as
> expected (still not fully compliant yet).
> 
> So this is based on Lennart's comment in [1]: if the LID state is not 
> reliable,
> the kernel should not export the LID switch device as long as we are not sure
> about its state.

Ah nice! I (obviously) like this approach.

> Note that systemd currently doesn't sync the state when the input node just
> appears. This is a systemd bug, and it should not be handled by the kernel
> community.

Uh if this is borked, we should indeed fix this in systemd. Is there
already a systemd github bug about this? If not, please create one,
and we'll look into it!

Thanks for working on this,

Lennart

-- 
Lennart Poettering, Red Hat