subject:"Re\: \[PATH\] core\: implement a safe wl_signal

Re: [PATH] core: implement a safe wl_signal_emit

2018-02-22 Thread Derek Foreman


On 2018-02-22 01:36 PM, Markus Ongyerth wrote:

On 2018/2月/22 12:31, Derek Foreman wrote:

On 2018-02-22 10:48 AM, Markus Ongyerth wrote:

On 2018/2月/22 09:34, Derek Foreman wrote:

On 2018-02-22 08:58 AM, Daniel Stone wrote:

Hi,

On 22 February 2018 at 14:14, Markus Ongyerth  wrote:

It seems that this patch makes that assumption invalid, and we would
need patches to weston, enlightenment, and mutter to prevent a
use-after-free during the signal emit?  Now I'm seeing valgrind errors
on E and weston during buffer destroy.

Personally, I don't think we should change this assumption and declare
the existing code that's worked for years suddenly buggy. :/


The code was buggy the whole time. Just because it was never triggered, does
not imply it's not a bug.
free()ing these struct wl_list without removing them from the signal list
leaves other struct wl_list that are outside the control of the current code
in an invalid, prone to use-after-free, state.


There's a difference between something being 'buggy' and a design with
non-obvious details you might not like. If destroy handlers not
removing their list elements were buggy, we would be seeing bugs from
that. But instead it's part of the API contract: when a destroy signal
is invoked, you are guaranteed that this will be the first and only
access to your list member. This implies that anyone trying to remove
their link from the list (accessing other listeners in the list) is
buggy.


Suddenly allowing this is a breaking API change (*some* struct wl_list inside
a wl_listener) can suddenly become invalid for reasons outside the users
control.


I don't know if I've quite parsed this right, but as above, not
removing elements of a destroy listener list, when the listener is
invoked, is our current API.


Related to this entire thing:
In [1] you added tests for this and promote something, that is in essence, a
breaking change.


It's not a breaking change though: it's the API we've pushed on everyone so far.


Also, it doesn't prevent external libraries from doing whichever they want
if they have complete control of the destroy listener list contents.


So you suggest we break a now mandated api and expose ourselves to funny
implementation detail changes that are now justified, because *we break API*?


I'm sorry, I'm have a hard time parsing this.

The suggested mandate is that libwayland internals won't touch the listener
after you receive the notification.

Libwayland internals?
That would be fine. Then effectivly nobody can rely on this either way, if
they ever want to have code that can be integrated with other consumers of
libwayland.

The problem is, that a single library that relies on this will force anyone
that uses it to adhere to it.
If we now have a listener that does things properly it will use-after-free if
it ever shares a signal list with that library.


I've just sent an RFC patch to the list that resolves that issue.



That on receipt of a destroy notification you can free your stuff without
removing your listener from the list.


Which is exactly what we don't want. Since that implies whenever we share a
destroy signal list with a listener from somewhere that's not inherently our
code, we can't rely on us being allowed to remove ourselves form the list
(that's everything from libwayland btw.).
And if we do, we'd have to take the blame for any integration that fails,
since *WE* break API now.


See above.


doing whichever they want

So if you suggest that we jsut break api here, you actually suggest we do
something that will break as soon as someone wants to integrate a library that
also works with another codebase that relies on this mandate.


See above.





What is prevented is libwayland's destroy notifier list walk accessing an
element again after it is potentially freed by external code.


Which could be fixed by said node removing itself from a list, instead of
leaving a list in invalid states for asumed behaviour.


And, of course, break the whole external world in the process.

No.

These users break interop with qt. If we ever expect a library that uses qt
(maybe in a plugin for weston?) to be used together with the codebase then
things break.


See above.



We're not going to break years of working code built on what seems to have
been a quite reasonable assumption.


And the asumption that memory that's pointed into (by pointers that are
*really* easy to fix) sounds pretty reasonable as well.


The assumption that this is really easy to fix is incorrect.

Yeah, it's a one liner in every destroy handler everywhere in code we 
don't follow.  That part is trivial.


Making sure every distro that packages both wayland and something that 
calls wayland updates all those callers to new versions before they 
update wayland to something that violates these assumptions is less trivial.


While I generally despise the word "ecosystem" when used to describe 
anything that doesn't contain fish, we'd really be doing ours harm if we 
brok

Re: [PATH] core: implement a safe wl_signal_emit

2018-02-22 Thread Markus Ongyerth

On 2018/2月/22 12:31, Derek Foreman wrote:
> On 2018-02-22 10:48 AM, Markus Ongyerth wrote:
> > On 2018/2月/22 09:34, Derek Foreman wrote:
> > > On 2018-02-22 08:58 AM, Daniel Stone wrote:
> > > > Hi,
> > > > 
> > > > On 22 February 2018 at 14:14, Markus Ongyerth  wrote:
> > > > > > It seems that this patch makes that assumption invalid, and we would
> > > > > > need patches to weston, enlightenment, and mutter to prevent a
> > > > > > use-after-free during the signal emit?  Now I'm seeing valgrind 
> > > > > > errors
> > > > > > on E and weston during buffer destroy.
> > > > > > 
> > > > > > Personally, I don't think we should change this assumption and 
> > > > > > declare
> > > > > > the existing code that's worked for years suddenly buggy. :/
> > > > > 
> > > > > The code was buggy the whole time. Just because it was never 
> > > > > triggered, does
> > > > > not imply it's not a bug.
> > > > > free()ing these struct wl_list without removing them from the signal 
> > > > > list
> > > > > leaves other struct wl_list that are outside the control of the 
> > > > > current code
> > > > > in an invalid, prone to use-after-free, state.
> > > > 
> > > > There's a difference between something being 'buggy' and a design with
> > > > non-obvious details you might not like. If destroy handlers not
> > > > removing their list elements were buggy, we would be seeing bugs from
> > > > that. But instead it's part of the API contract: when a destroy signal
> > > > is invoked, you are guaranteed that this will be the first and only
> > > > access to your list member. This implies that anyone trying to remove
> > > > their link from the list (accessing other listeners in the list) is
> > > > buggy.
> > > > 
> > > > > Suddenly allowing this is a breaking API change (*some* struct 
> > > > > wl_list inside
> > > > > a wl_listener) can suddenly become invalid for reasons outside the 
> > > > > users
> > > > > control.
> > > > 
> > > > I don't know if I've quite parsed this right, but as above, not
> > > > removing elements of a destroy listener list, when the listener is
> > > > invoked, is our current API.
> > > > 
> > > > > Related to this entire thing:
> > > > > In [1] you added tests for this and promote something, that is in 
> > > > > essence, a
> > > > > breaking change.
> > > > 
> > > > It's not a breaking change though: it's the API we've pushed on 
> > > > everyone so far.
> > > 
> > > Also, it doesn't prevent external libraries from doing whichever they want
> > > if they have complete control of the destroy listener list contents.
> > 
> > So you suggest we break a now mandated api and expose ourselves to funny
> > implementation detail changes that are now justified, because *we break 
> > API*?
> 
> I'm sorry, I'm have a hard time parsing this.
> 
> The suggested mandate is that libwayland internals won't touch the listener
> after you receive the notification.
Libwayland internals?
That would be fine. Then effectivly nobody can rely on this either way, if 
they ever want to have code that can be integrated with other consumers of 
libwayland.

The problem is, that a single library that relies on this will force anyone 
that uses it to adhere to it.
If we now have a listener that does things properly it will use-after-free if 
it ever shares a signal list with that library.

> 
> That on receipt of a destroy notification you can free your stuff without
> removing your listener from the list.

Which is exactly what we don't want. Since that implies whenever we share a 
destroy signal list with a listener from somewhere that's not inherently our 
code, we can't rely on us being allowed to remove ourselves form the list 
(that's everything from libwayland btw.).
And if we do, we'd have to take the blame for any integration that fails, 
since *WE* break API now.

> doing whichever they want
So if you suggest that we jsut break api here, you actually suggest we do 
something that will break as soon as someone wants to integrate a library that 
also works with another codebase that relies on this mandate.

> 
> > > 
> > > What is prevented is libwayland's destroy notifier list walk accessing an
> > > element again after it is potentially freed by external code.
> > 
> > Which could be fixed by said node removing itself from a list, instead of
> > leaving a list in invalid states for asumed behaviour.
> 
> And, of course, break the whole external world in the process.
> 
> No.
These users break interop with qt. If we ever expect a library that uses qt 
(maybe in a plugin for weston?) to be used together with the codebase then 
things break.
> 
> We're not going to break years of working code built on what seems to have
> been a quite reasonable assumption.
> 
And the asumption that memory that's pointed into (by pointers that are 
*really* easy to fix) sounds pretty reasonable as well.

I point out qt here (which afaik implies KDE) to also get a few years, though 
I don't see a reason why 5 year old code that behaves

Re: [PATH] core: implement a safe wl_signal_emit

2018-02-22 Thread Markus Ongyerth

On 2018/2月/22 04:53, Daniel Stone wrote:
> Hi ongy,
> 
> On 22 February 2018 at 16:03, Markus Ongyerth  wrote:
> > On 2018/2月/22 02:58, Daniel Stone wrote:
> >> On 22 February 2018 at 14:14, Markus Ongyerth  wrote:
> >> > The code was buggy the whole time. Just because it was never triggered, 
> >> > does
> >> > not imply it's not a bug.
> >> > free()ing these struct wl_list without removing them from the signal list
> >> > leaves other struct wl_list that are outside the control of the current 
> >> > code
> >> > in an invalid, prone to use-after-free, state.
> >>
> >> There's a difference between something being 'buggy' and a design with
> >> non-obvious details you might not like. If destroy handlers not
> >> removing their list elements were buggy, we would be seeing bugs from
> >> that. But instead it's part of the API contract: when a destroy signal
> >> is invoked, you are guaranteed that this will be the first and only
> >> access to your list member.
> >
> > Could you kindly point me towards the point that states this?
> > Also the point that properly distinguishes two kinds of wl_signal_emit
> > semantics for wl_listener?
> >
> > I don't want to say it can't be made part of the API, I'm saying it is not.
> > And at the current point it's a bug. If the decision is to fix these bugs, 
> > by
> > changing the API or the code, is another issue.
> 
> I can't point you to a document where it is written in stone, no. But
> it would break Weston, WebKit, Clutter, and some others. Changing that
> would be a break of our effective API: we've exposed behaviour which
> people have come to rely on, regardless of whether or not it's
> documented.

I don't mind putting up bug reports with all of them for their broken code.
Also I don't really conside Weston an outside consumer, it's developed over 
the same channels as libwayland. If you want the fixes to weston attached in 
the patch series, I'd be glad to whip them up.

Qt relies on this not being the case (they remove listeners from the list in 
destructors), we rely on it not being the case. If I found a python or java 
wrapper, I would expect them to use detructors as well. It's not just a 
feature of nieche languages like Haskell.
IF we want to guarantee any kind of interop between different libraries, we 
need one (and only one) way to do this. Remove ourselves from the list, or 
free and leave the list broken.

I do agree, as long as we keep to ourselves, we could have the policy to just 
do it as we did so far. But what if we have a library that can be used 
standalone? Then we should keep to the official specification on what to do.
Thus all code has to be adjusted. There is no room for private breaking of 
this.

How do you determine which projects to piss off here?
One side has implemented measures to keep code correct and prevent accidently 
leaking memory or produce use-after-free by forcing a removal before memory is 
freed.
The other side broke basic C sanity by freeing memory that's still pointed 
into.

(Side note: I'm of course only slightly biased here, and this is not written 
in any suggestive way at all)

> 
> Equally, you could say that wl_display_get_fd() is 'buggy' because it
> should follow the established convention of duplicating FDs returned
> by a library, so users can be in control of lifetime. You could 'fix'
> that by having wl_display_get_fd() duplicate the FDs it returns, and
> when every Wayland client using an event loop complains that we're now
> causing them to leak FDs, tell them it's their own fault for relying
> on behaviour which was not formally documented as API/ABI. But that's
> really not a helpful answer, and it's a very good way to get people to
> stop using your software.
The doc of that one actually says "the", not "a" or "a copy of", so I'd 
actually say you are breaking things that you shouldn't :)

I have to admit, I'm making asumptions as well. Based on basic sanity (IMO).
It's convention that explicitly state when something returned by a call has to 
befreed by the caller, so not stating that implicitly states things.

I of course also asume memory I have pointers into doesn't get free()'d at 
arbitrary times without any way for me to detect it.
If this is an asumption that should not be may around libwayland I can 
guarantee, you will be free from my complaints.

And see the section above as to why I think we have to have a certain amount 
of breakage in either direction.
We can't just say "do whatever you want" if we expect code sharing instead of 
monolithic applications.

> > We could have a listener on a client destroy signal to clean ourself up, 
> > and
> > one on a seats.
> > If such a seat is created by a virtual keyboard (I'm going on a limb here, 
> > but
> > whatever), it would naturally be destroyed with the client.
> > Now if we are in the seat destroy signal, how do we determine what to do 
> > with
> > our client listener?
> > If this is caused by a client destruction, we can't remove ours

Re: [PATH] core: implement a safe wl_signal_emit

2018-02-22 Thread Derek Foreman


On 2018-02-22 10:48 AM, Markus Ongyerth wrote:

On 2018/2月/22 09:34, Derek Foreman wrote:

On 2018-02-22 08:58 AM, Daniel Stone wrote:

Hi,

On 22 February 2018 at 14:14, Markus Ongyerth  wrote:

It seems that this patch makes that assumption invalid, and we would
need patches to weston, enlightenment, and mutter to prevent a
use-after-free during the signal emit?  Now I'm seeing valgrind errors
on E and weston during buffer destroy.

Personally, I don't think we should change this assumption and declare
the existing code that's worked for years suddenly buggy. :/


The code was buggy the whole time. Just because it was never triggered, does
not imply it's not a bug.
free()ing these struct wl_list without removing them from the signal list
leaves other struct wl_list that are outside the control of the current code
in an invalid, prone to use-after-free, state.


There's a difference between something being 'buggy' and a design with
non-obvious details you might not like. If destroy handlers not
removing their list elements were buggy, we would be seeing bugs from
that. But instead it's part of the API contract: when a destroy signal
is invoked, you are guaranteed that this will be the first and only
access to your list member. This implies that anyone trying to remove
their link from the list (accessing other listeners in the list) is
buggy.


Suddenly allowing this is a breaking API change (*some* struct wl_list inside
a wl_listener) can suddenly become invalid for reasons outside the users
control.


I don't know if I've quite parsed this right, but as above, not
removing elements of a destroy listener list, when the listener is
invoked, is our current API.


Related to this entire thing:
In [1] you added tests for this and promote something, that is in essence, a
breaking change.


It's not a breaking change though: it's the API we've pushed on everyone so far.


Also, it doesn't prevent external libraries from doing whichever they want
if they have complete control of the destroy listener list contents.


So you suggest we break a now mandated api and expose ourselves to funny
implementation detail changes that are now justified, because *we break API*?


I'm sorry, I'm have a hard time parsing this.

The suggested mandate is that libwayland internals won't touch the 
listener after you receive the notification.


That on receipt of a destroy notification you can free your stuff 
without removing your listener from the list.




What is prevented is libwayland's destroy notifier list walk accessing an
element again after it is potentially freed by external code.


Which could be fixed by said node removing itself from a list, instead of
leaving a list in invalid states for asumed behaviour.


And, of course, break the whole external world in the process.

No.

We're not going to break years of working code built on what seems to 
have been a quite reasonable assumption.




We can completely replace the internal data structures in libwayland with
whatever we want, but we must preserve that behaviour.


Why can we change one implementation detail, but have to keep another one?


One is API, the other is implementation detail, so the question is 
irrelevant?


I understand what you're saying, I really do, but it's not pragmatic. 
Again, we can't break all external users of our library for very little 
real benefit.




It does not mean we can never rework the destroy signal emit path in
libwayland to allow some items to be removed by the notification handlers
and others just freed, or to allow a destroy notifier to touch the list.A


There is no destroy signal emit path.


Sure there is.  It's currently an implementation detail that it happens 
to be the exact same path as other signal emitters.



There is a signal emit path, which may be called on destroy signals, and
suddenly has to follow different semantics because of what exactly?


Because a large amount of software will break if we change the currently 
expected behavior.


I realize this leaves me open to all manner of ridiculous slippery slope 
arguments, such as "if my software depended on a bug in an 
authentication system that forgot to ask for a password...", so all 
snark aside can we back away from that ledge now?


As library authors we have to be pragmatic.  We have to avoid surprising 
our callers.  While this API constraint is annoying, it is harmless, and 
demanding all old code suddenly conform to a new, different constraint 
that was never enforced before is too onerous.



And how exactly do we expose that to our listeners? By the name of the signal
in the struct?


Why do we need to?  A notification callback written to be used as a 
destroy notifier will be written differently than one intended to be 
used for other things.


Most notably, a non-destruction notifier will be capable of being called 
more than once, or a crash could occur (right now).




It just makes it easy to verify that attempts to do that don't break
guaran

Re: [PATH] core: implement a safe wl_signal_emit

2018-02-22 Thread Daniel Stone

Hi ongy,

On 22 February 2018 at 16:03, Markus Ongyerth  wrote:
> On 2018/2月/22 02:58, Daniel Stone wrote:
>> On 22 February 2018 at 14:14, Markus Ongyerth  wrote:
>> > The code was buggy the whole time. Just because it was never triggered, 
>> > does
>> > not imply it's not a bug.
>> > free()ing these struct wl_list without removing them from the signal list
>> > leaves other struct wl_list that are outside the control of the current 
>> > code
>> > in an invalid, prone to use-after-free, state.
>>
>> There's a difference between something being 'buggy' and a design with
>> non-obvious details you might not like. If destroy handlers not
>> removing their list elements were buggy, we would be seeing bugs from
>> that. But instead it's part of the API contract: when a destroy signal
>> is invoked, you are guaranteed that this will be the first and only
>> access to your list member.
>
> Could you kindly point me towards the point that states this?
> Also the point that properly distinguishes two kinds of wl_signal_emit
> semantics for wl_listener?
>
> I don't want to say it can't be made part of the API, I'm saying it is not.
> And at the current point it's a bug. If the decision is to fix these bugs, by
> changing the API or the code, is another issue.

I can't point you to a document where it is written in stone, no. But
it would break Weston, WebKit, Clutter, and some others. Changing that
would be a break of our effective API: we've exposed behaviour which
people have come to rely on, regardless of whether or not it's
documented.

Equally, you could say that wl_display_get_fd() is 'buggy' because it
should follow the established convention of duplicating FDs returned
by a library, so users can be in control of lifetime. You could 'fix'
that by having wl_display_get_fd() duplicate the FDs it returns, and
when every Wayland client using an event loop complains that we're now
causing them to leak FDs, tell them it's their own fault for relying
on behaviour which was not formally documented as API/ABI. But that's
really not a helpful answer, and it's a very good way to get people to
stop using your software.

> (duplicated)
>>when a destroy signal
>> is invoked, you are guaranteed that this will be the first and only
>> access to your list member.
>
> This is wrong btw. This would prevent us from adding to a listener to a
> signal that already has listeners.
> Or use wl_signal_get on the destroy signal as soon as it has at least one
> listener.
>
> The semantics you want to state is, that it's the last time it is accessed.
> And even that gets iffy if someone wants to wl_signal_get a destroy listener
> at any point.

Thanks, your correction is accurate.

>>This implies that anyone trying to remove
>> their link from the list (accessing other listeners in the list) is
>> buggy.
>
> This would be rather intersting.
> It would be a fair point to add this to the emit callback of destroy
> listeners, but do we have any way to determine if a wl_signal_emit is
> currently in flight for a signal?
> What about the case where we have a struct that contains multiple destroy
> listeners to different components? How can we determine, the destroy event
> isn't a transitive event of another listener we have?

Right, generally speaking, that would be something you need to
explicitly communicate between those components.

> We could have a listener on a client destroy signal to clean ourself up, and
> one on a seats.
> If such a seat is created by a virtual keyboard (I'm going on a limb here, but
> whatever), it would naturally be destroyed with the client.
> Now if we are in the seat destroy signal, how do we determine what to do with
> our client listener?
> If this is caused by a client destruction, we can't remove ourselves form the
> client anymore.
> If this is not caused by a client destruction, we have to remove ourselves.

That's arguably somewhat contrived: you're not concerned about the
lifetime of the client per se, but just the lifetime of the object. So
you can just keep your destroy listener on the object, and know that
if the client gets destroyed then the object will as well. The same
goes for related objects, e.g. wl_surface -> xdg_surface. But it is a
good point, and definitely shows reason to be really careful about
what we do here.

>> > Related to this entire thing:
>> > In [1] you added tests for this and promote something, that is in essence, 
>> > a
>> > breaking change.
>>
>> It's not a breaking change though: it's the API we've pushed on everyone so 
>> far.
>
> It is a breaking change in libwayland.
> Or could you kindly point me to what currently forces me to not remove my
> listeners in libwayland?
> Current libwayland allows both versions as implementation detail, but one
> follows the wl_listener semantics, the other makes asumptions.

That's true, though in practice you have to have the entire stack of
listeners for that resource completely locked down so they do the same
thing. Given the pr

Re: [PATH] core: implement a safe wl_signal_emit

2018-02-22 Thread Markus Ongyerth

On 2018/2月/22 09:34, Derek Foreman wrote:
> On 2018-02-22 08:58 AM, Daniel Stone wrote:
> > Hi,
> > 
> > On 22 February 2018 at 14:14, Markus Ongyerth  wrote:
> > > > It seems that this patch makes that assumption invalid, and we would
> > > > need patches to weston, enlightenment, and mutter to prevent a
> > > > use-after-free during the signal emit?  Now I'm seeing valgrind errors
> > > > on E and weston during buffer destroy.
> > > > 
> > > > Personally, I don't think we should change this assumption and declare
> > > > the existing code that's worked for years suddenly buggy. :/
> > > 
> > > The code was buggy the whole time. Just because it was never triggered, 
> > > does
> > > not imply it's not a bug.
> > > free()ing these struct wl_list without removing them from the signal list
> > > leaves other struct wl_list that are outside the control of the current 
> > > code
> > > in an invalid, prone to use-after-free, state.
> > 
> > There's a difference between something being 'buggy' and a design with
> > non-obvious details you might not like. If destroy handlers not
> > removing their list elements were buggy, we would be seeing bugs from
> > that. But instead it's part of the API contract: when a destroy signal
> > is invoked, you are guaranteed that this will be the first and only
> > access to your list member. This implies that anyone trying to remove
> > their link from the list (accessing other listeners in the list) is
> > buggy.
> > 
> > > Suddenly allowing this is a breaking API change (*some* struct wl_list 
> > > inside
> > > a wl_listener) can suddenly become invalid for reasons outside the users
> > > control.
> > 
> > I don't know if I've quite parsed this right, but as above, not
> > removing elements of a destroy listener list, when the listener is
> > invoked, is our current API.
> > 
> > > Related to this entire thing:
> > > In [1] you added tests for this and promote something, that is in 
> > > essence, a
> > > breaking change.
> > 
> > It's not a breaking change though: it's the API we've pushed on everyone so 
> > far.
> 
> Also, it doesn't prevent external libraries from doing whichever they want
> if they have complete control of the destroy listener list contents.

So you suggest we break a now mandated api and expose ourselves to funny 
implementation detail changes that are now justified, because *we break API*?

> 
> What is prevented is libwayland's destroy notifier list walk accessing an
> element again after it is potentially freed by external code.

Which could be fixed by said node removing itself from a list, instead of 
leaving a list in invalid states for asumed behaviour.

> 
> We can completely replace the internal data structures in libwayland with
> whatever we want, but we must preserve that behaviour.

Why can we change one implementation detail, but have to keep another one?

> 
> It does not mean we can never rework the destroy signal emit path in
> libwayland to allow some items to be removed by the notification handlers
> and others just freed, or to allow a destroy notifier to touch the list.A

There is no destroy signal emit path.
There is a signal emit path, which may be called on destroy signals, and 
suddenly has to follow different semantics because of what exactly?
And how exactly do we expose that to our listeners? By the name of the signal 
in the struct?

> 
> It just makes it easy to verify that attempts to do that don't break
> guarantees we've always made.

Again, kindly point me to the point where an implementation detail made it 
towards a guarantee.
And for which signals exactly. Under which circumstances, oh and what 
guarantee exactly.

It would be nice if I had a proper proposal to take apart for this to be 
explictly added to the API.
Otherwise I can provide a (intentionaly snarky) one :)

Part one ammend [1] (wl_listener) with:
"The wl_list inside a wl_listener can be invalid (pointer towards free'd 
memory) at any time the listener notify is called. For further details see 
wl_signal.

Part two ammend [2] (wl_signal) with:
Signals with the name `*_destroy` have special semantics.
If they are currently emitted, any wl_signal_add/wl_signal_get on the signal 
or wl_list_remove on the link of any listener in it is invalid.
This is also the cause for invalid struct wl_list entries in wl_listener.

[1] https://wayland.freedesktop.org/docs/html/apc.html#Server-structwl__listener
[2] https://wayland.freedesktop.org/docs/html/apc.html#Server-structwl__signal

Sure, the exact way they are specified here is a bit funny.
We could also add that to the various `wl_*_add_destroy_listener` functions.
Then we'd have the (from libwayland side breaking) change that e.g.  
`wl_event_loop_get_destroy_listener` can't be called anymore under certain 
circumstances.

> 
> Thanks,
> Derek
> 
> > > It also makes good wrapper implementations into managed languages 
> > > annoying.
> > > For example (admittedly my own) [2] ensures a wl_listener can never be

Re: [PATH] core: implement a safe wl_signal_emit

2018-02-22 Thread Markus Ongyerth

On 2018/2月/22 02:58, Daniel Stone wrote:
> Hi,
> 
> On 22 February 2018 at 14:14, Markus Ongyerth  wrote:
> >> It seems that this patch makes that assumption invalid, and we would
> >> need patches to weston, enlightenment, and mutter to prevent a
> >> use-after-free during the signal emit?  Now I'm seeing valgrind errors
> >> on E and weston during buffer destroy.
> >>
> >> Personally, I don't think we should change this assumption and declare
> >> the existing code that's worked for years suddenly buggy. :/
> >
> > The code was buggy the whole time. Just because it was never triggered, does
> > not imply it's not a bug.
> > free()ing these struct wl_list without removing them from the signal list
> > leaves other struct wl_list that are outside the control of the current code
> > in an invalid, prone to use-after-free, state.
> 
> There's a difference between something being 'buggy' and a design with
> non-obvious details you might not like. If destroy handlers not
> removing their list elements were buggy, we would be seeing bugs from
> that. But instead it's part of the API contract: when a destroy signal
> is invoked, you are guaranteed that this will be the first and only
> access to your list member.

Could you kindly point me towards the point that states this?
Also the point that properly distinguishes two kinds of wl_signal_emit
semantics for wl_listener?

I don't want to say it can't be made part of the API, I'm saying it is not.
And at the current point it's a bug. If the decision is to fix these bugs, by 
changing the API or the code, is another issue.

(duplicated)
>when a destroy signal
> is invoked, you are guaranteed that this will be the first and only
> access to your list member.

This is wrong btw. This would prevent us from adding to a listener to a
signal that already has listeners.
Or use wl_signal_get on the destroy signal as soon as it has at least one 
listener.

The semantics you want to state is, that it's the last time it is accessed.
And even that gets iffy if someone wants to wl_signal_get a destroy listener 
at any point.

>This implies that anyone trying to remove
> their link from the list (accessing other listeners in the list) is
> buggy.

This would be rather intersting.
It would be a fair point to add this to the emit callback of destroy 
listeners, but do we have any way to determine if a wl_signal_emit is 
currently in flight for a signal?
What about the case where we have a struct that contains multiple destroy 
listeners to different components? How can we determine, the destroy event 
isn't a transitive event of another listener we have?
We could have a listener on a client destroy signal to clean ourself up, and 
one on a seats.
If such a seat is created by a virtual keyboard (I'm going on a limb here, but 
whatever), it would naturally be destroyed with the client.

Now if we are in the seat destroy signal, how do we determine what to do with 
our client listener?
If this is caused by a client destruction, we can't remove ourselves form the 
client anymore.
If this is not caused by a client destruction, we have to remove ourselves.

I can also easily point you towards usecases, where this would lead to 
pointless code duplication, since we can remove our destroy listeners on very 
similar events, with identical behaviour.
Except for whether we are allowed to (and in that cased forced to) 
wl_list_remove our listener, or we'd have to just do nothing with it, in the 
other case.

> 
> > Suddenly allowing this is a breaking API change (*some* struct wl_list 
> > inside
> > a wl_listener) can suddenly become invalid for reasons outside the users
> > control.
> 
> I don't know if I've quite parsed this right, but as above, not
> removing elements of a destroy listener list, when the listener is
> invoked, is our current API.
> 
> > Related to this entire thing:
> > In [1] you added tests for this and promote something, that is in essence, a
> > breaking change.
> 
> It's not a breaking change though: it's the API we've pushed on everyone so 
> far.

It is a breaking change in libwayland.
Or could you kindly point me to what currently forces me to not remove my 
listeners in libwayland?
Current libwayland allows both versions as implementation detail, but one 
follows the wl_listener semantics, the other makes asumptions.

> 
> > It also makes good wrapper implementations into managed languages annoying.
> > For example (admittedly my own) [2] ensures a wl_listener can never be lost
> > and leak memory. It is freed when the Handle is GC'd.
> > To prevent any use-after-free into this wl_listener, it removes the listener
> > from the list beforehand.
> > I would very much like to keep this code (since it is perfectly valid on the
> > current ABI) and is good design in managed languages.
> 
> Sure, that is annoying. In hindsight, it probably wasn't a good API
> for particularly the new generation of managed languages. In the
> meantime, probably the easiest way to do th

Re: [PATH] core: implement a safe wl_signal_emit

2018-02-22 Thread Derek Foreman


On 2018-02-22 08:58 AM, Daniel Stone wrote:

Hi,

On 22 February 2018 at 14:14, Markus Ongyerth  wrote:

It seems that this patch makes that assumption invalid, and we would
need patches to weston, enlightenment, and mutter to prevent a
use-after-free during the signal emit?  Now I'm seeing valgrind errors
on E and weston during buffer destroy.

Personally, I don't think we should change this assumption and declare
the existing code that's worked for years suddenly buggy. :/


The code was buggy the whole time. Just because it was never triggered, does
not imply it's not a bug.
free()ing these struct wl_list without removing them from the signal list
leaves other struct wl_list that are outside the control of the current code
in an invalid, prone to use-after-free, state.


There's a difference between something being 'buggy' and a design with
non-obvious details you might not like. If destroy handlers not
removing their list elements were buggy, we would be seeing bugs from
that. But instead it's part of the API contract: when a destroy signal
is invoked, you are guaranteed that this will be the first and only
access to your list member. This implies that anyone trying to remove
their link from the list (accessing other listeners in the list) is
buggy.


Suddenly allowing this is a breaking API change (*some* struct wl_list inside
a wl_listener) can suddenly become invalid for reasons outside the users
control.


I don't know if I've quite parsed this right, but as above, not
removing elements of a destroy listener list, when the listener is
invoked, is our current API.


Related to this entire thing:
In [1] you added tests for this and promote something, that is in essence, a
breaking change.


It's not a breaking change though: it's the API we've pushed on everyone so far.


Also, it doesn't prevent external libraries from doing whichever they 
want if they have complete control of the destroy listener list contents.


What is prevented is libwayland's destroy notifier list walk accessing 
an element again after it is potentially freed by external code.


We can completely replace the internal data structures in libwayland 
with whatever we want, but we must preserve that behaviour.


It does not mean we can never rework the destroy signal emit path in 
libwayland to allow some items to be removed by the notification 
handlers and others just freed, or to allow a destroy notifier to touch 
the list.


It just makes it easy to verify that attempts to do that don't break 
guarantees we've always made.


Thanks,
Derek


It also makes good wrapper implementations into managed languages annoying.
For example (admittedly my own) [2] ensures a wl_listener can never be lost
and leak memory. It is freed when the Handle is GC'd.
To prevent any use-after-free into this wl_listener, it removes the listener
from the list beforehand.
I would very much like to keep this code (since it is perfectly valid on the
current ABI) and is good design in managed languages.


Sure, that is annoying. In hindsight, it probably wasn't a good API
for particularly the new generation of managed languages. In the
meantime, probably the easiest way to do this, and come into line with
all the other users, would be to define a separate destroy-listener
type which intentionally leaks its wl_listener link after being
signaled, rather than removing it.

Cheers,
Daniel
___
wayland-devel mailing list
wayland-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/wayland-devel



___
wayland-devel mailing list
wayland-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/wayland-devel

Re: [PATH] core: implement a safe wl_signal_emit

2018-02-22 Thread Daniel Stone

Hi,

On 22 February 2018 at 14:14, Markus Ongyerth  wrote:
>> It seems that this patch makes that assumption invalid, and we would
>> need patches to weston, enlightenment, and mutter to prevent a
>> use-after-free during the signal emit?  Now I'm seeing valgrind errors
>> on E and weston during buffer destroy.
>>
>> Personally, I don't think we should change this assumption and declare
>> the existing code that's worked for years suddenly buggy. :/
>
> The code was buggy the whole time. Just because it was never triggered, does
> not imply it's not a bug.
> free()ing these struct wl_list without removing them from the signal list
> leaves other struct wl_list that are outside the control of the current code
> in an invalid, prone to use-after-free, state.

There's a difference between something being 'buggy' and a design with
non-obvious details you might not like. If destroy handlers not
removing their list elements were buggy, we would be seeing bugs from
that. But instead it's part of the API contract: when a destroy signal
is invoked, you are guaranteed that this will be the first and only
access to your list member. This implies that anyone trying to remove
their link from the list (accessing other listeners in the list) is
buggy.

> Suddenly allowing this is a breaking API change (*some* struct wl_list inside
> a wl_listener) can suddenly become invalid for reasons outside the users
> control.

I don't know if I've quite parsed this right, but as above, not
removing elements of a destroy listener list, when the listener is
invoked, is our current API.

> Related to this entire thing:
> In [1] you added tests for this and promote something, that is in essence, a
> breaking change.

It's not a breaking change though: it's the API we've pushed on everyone so far.

> It also makes good wrapper implementations into managed languages annoying.
> For example (admittedly my own) [2] ensures a wl_listener can never be lost
> and leak memory. It is freed when the Handle is GC'd.
> To prevent any use-after-free into this wl_listener, it removes the listener
> from the list beforehand.
> I would very much like to keep this code (since it is perfectly valid on the
> current ABI) and is good design in managed languages.

Sure, that is annoying. In hindsight, it probably wasn't a good API
for particularly the new generation of managed languages. In the
meantime, probably the easiest way to do this, and come into line with
all the other users, would be to define a separate destroy-listener
type which intentionally leaks its wl_listener link after being
signaled, rather than removing it.

Cheers,
Daniel
___
wayland-devel mailing list
wayland-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/wayland-devel

Re: [PATH] core: implement a safe wl_signal_emit

Re: [PATH] core: implement a safe wl_signal_emit

Re: [PATH] core: implement a safe wl_signal_emit

Re: [PATH] core: implement a safe wl_signal_emit

Re: [PATH] core: implement a safe wl_signal_emit

Re: [PATH] core: implement a safe wl_signal_emit

Re: [PATH] core: implement a safe wl_signal_emit

Re: [PATH] core: implement a safe wl_signal_emit

Re: [PATH] core: implement a safe wl_signal_emit

9 matches

Site Navigation

Mail list logo

Footer information