Re: epoll exposure

2023-08-16 Thread Mouse
>> It also is a wrong way to build self-configuration; such a test is
>> vulnerable to both false positives and false negatives.  It should
>> be reported upstream as a bug.  Much righter is to test whether
>> epoll, if present, produces the behaviour the program expects in the
>> uses it makes of it.

> As Linux introduced epoll (or so I think) I think it's appropriate --
> absent a SUS specification -- to assume it works as under Linux?

Probably.  But that wasn't what was being suggested, at least not as I
read it.  Here's fuller context:

>>>>> The problem is third-party software assumes epoll == Linux,
>>>> Software that makes stupid assumptions will never go away.
>>>> Is it better to work around it (not ship epoll.h), or to get it
>>>> fixed (report it upstream as the bug it is)?  [...]
>>> I don't really see it as a bug.  You'd have to have all those
>>> programs have configure logic that says

>>>   if we find an epoll implementation, then we have a list of
>>>   operating systems that have implemented an epoll that has
>>>   different semantics and we have to reject it

>>> It seems far more reasonable to say that if an OS implements a
>>> different epoll, then it should call it something else.

>> [...]  It also is a wrong way to build self-configuration; [...]

What I was arguing against is not "NetBSD should have epoll with
different semantics" but "the problematic programs would have to have a
configure test with a blacklist of OS/version pairs".  Blacklisting by
OS/version is the part I object to.

> How would you argue if some other OS was to introduce something
> called kqueue with semantics different from FreeBSD?

I would still say that a configure test that blacklisted them by
OS/version is a broken test.  I say it should either blindly assume the
semantics it expects or it should test for the semantics it cares
about, depending on the philosophical stance its authors prefer.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML    mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: epoll exposure

2023-08-15 Thread Edgar Fuß
> It also is a wrong way to build self-configuration; such a test is
> vulnerable to both false positives and false negatives.  It should be
> reported upstream as a bug.  Much righter is to test whether epoll, if
> present, produces the behaviour the program expects in the uses it
> makes of it.
As Linux introduced epoll (or so I think) I think it's appropriate -- 
absent a SUS specification -- to assume it works as under Linux?

How would you argue if some other OS was to introduce something called 
kqueue with semantics different from FreeBSD?


Re: epoll exposure

2023-08-14 Thread Mouse
>>> The problem is third-party software assumes epoll == Linux,

>> Software that makes stupid assumptions will never go away.

>> Is it better to work around it (not ship epoll.h), or to get it
>> fixed (report it upstream as the bug it is)?  I could argue that
>> either way.

> I don't really see it as a bug.  You'd have to have all those
> programs have configure logic that says

>   if we find an epoll implementation, then we have a list of
>   operating systems that have implemented an epoll that has different
>   semantics and we have to reject it

That's not about assuming "epoll == Linux".  That's about assuming
"epoll, if present, has exactly Linux's epoll semantics".  While it's
possible that either would break on the epoll under discussion, they
are not equivalent assumptions in general.

It also is a wrong way to build self-configuration; such a test is
vulnerable to both false positives and false negatives.  It should be
reported upstream as a bug.  Much righter is to test whether epoll, if
present, produces the behaviour the program expects in the uses it
makes of it.  (Also, "Linux" is not a single thing, so "epoll == Linux"
cannot be a correct thing to assume even conceptually.)
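
For illustration, a behaviour test of the kind described above might look like the following minimal sketch.  It assumes the only property the program relies on is level-triggered read readiness on a pipe; the function name probe_epoll_semantics is hypothetical, and a real test would check whatever semantics the program in question actually depends on.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical configure-time probe (not taken from any real package).
 * Returns 1 if epoll shows the behaviour checked below, 0 if it does
 * not, and -1 if the probe could not be set up at all.
 */
int
probe_epoll_semantics(void)
{
	struct epoll_event ev = { 0 }, got;
	int pfd[2], ep, n, result = -1;

	if (pipe(pfd) == -1)
		return -1;
	if ((ep = epoll_create1(0)) == -1)
		goto out_pipe;

	/* Register the read end of the pipe, level-triggered. */
	ev.events = EPOLLIN;
	ev.data.fd = pfd[0];
	if (epoll_ctl(ep, EPOLL_CTL_ADD, pfd[0], &ev) == -1)
		goto out_ep;

	/* Empty pipe: a zero-timeout wait should report no events. */
	n = epoll_wait(ep, &got, 1, 0);
	if (n != 0) {
		result = (n < 0) ? -1 : 0;
		goto out_ep;
	}

	/* After one byte is written, the read end should be reported. */
	if (write(pfd[1], "x", 1) != 1)
		goto out_ep;
	n = epoll_wait(ep, &got, 1, 1000);
	result = (n == 1 && (got.events & EPOLLIN) && got.data.fd == pfd[0]);

out_ep:
	close(ep);
out_pipe:
	close(pfd[0]);
	close(pfd[1]);
	return result;
}
```

A configure script could compile and run this from a main() that exits 0 only when the probe returns 1, and pick another backend otherwise.  When cross-compiling the test cannot be run, which is one case where assuming the expected semantics is the only option left.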

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML    mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: epoll exposure

2023-08-14 Thread nia
On Mon, Aug 14, 2023 at 07:39:11AM -0400, Greg Troxel wrote:
> Jonathan Perkin  writes:
> 
> > * On 2023-08-13 at 18:10 BST, Tobias Nygren wrote:
> >
> >>On Sat, 12 Aug 2023 19:21:06 -0400
> >>Christos Zoulas  wrote:
> >>
> >>> I really want to understand what's going on here (why do we think that
> >>> our epoll implementation is broken in a way that will affect applications).
> >>
> >>jperkin@ might be able to explain what the issues with Illumos are,
> >>but I guess the problems are of different nature than the NetBSD case.
> >
> > The problem is third-party software assumes epoll == Linux, and once
> 
> I see it that Linux was first with epoll and thus is the specification.
> In an ideal world, it would have gone through POSIX, but whatever.
> Once that exists, I think an epoll implementation in other operating
> systems has to have the exact same semantics, or it's asking for
> trouble, perhaps in hard-to-find ways.
> 
> I don't really find it problematic that the world has assigned the word
> epoll to Linux's implementation.  To object to that one has to object to
> any non-POSIX feature in any OS.

It's about assuming every other Linuxism is available if epoll is,
not exact semantics.


Re: epoll exposure

2023-08-14 Thread Greg Troxel
Mouse  writes:

>> The problem is third-party software assumes epoll == Linux,
>
> Software that makes stupid assumptions will never go away.
>
> Is it better to work around it (not ship epoll.h), or to get it fixed
> (report it upstream as the bug it is)?  I could argue that either way.

I don't really see it as a bug.  You'd have to have all those programs
have configure logic that says

  if we find an epoll implementation, then we have a list of operating
  systems that have implemented an epoll that has different semantics
  and we have to reject it

It seems far more reasonable to say that if an OS implements a different
epoll, then it should call it something else.


Re: epoll exposure

2023-08-14 Thread Greg Troxel
Jonathan Perkin  writes:

> * On 2023-08-13 at 18:10 BST, Tobias Nygren wrote:
>
>>On Sat, 12 Aug 2023 19:21:06 -0400
>>Christos Zoulas  wrote:
>>
>>> I really want to understand what's going on here (why do we think that
>>> our epoll implementation is broken in a way that will affect applications).
>>
>>jperkin@ might be able to explain what the issues with Illumos are,
>>but I guess the problems are of different nature than the NetBSD case.
>
> The problem is third-party software assumes epoll == Linux, and once

I see it that Linux was first with epoll and thus is the specification.
In an ideal world, it would have gone through POSIX, but whatever.
Once that exists, I think an epoll implementation in other operating
systems has to have the exact same semantics, or it's asking for
trouble, perhaps in hard-to-find ways.

I don't really find it problematic that the world has assigned the word
epoll to Linux's implementation.  To object to that one has to object to
any non-POSIX feature in any OS.

> our epoll implementation landed to support LX binaries, a bunch of
> packages that previously built natively broke.

Is that because while it appeared to offer the same API, it did not
exactly?

> In hindsight we would have not shipped sys/epoll.h, it provides no
> benefits, and obviously it's preferable to use native features (event
> ports in our case, kqueue in yours).

Was your epoll intended to be Linux compatible, or something else?  It
would seem that if so, it should be exactly compatible, and if not it
should just have a different name.  It really feels like you have gone
down the path we are on, so your experience seems useful.

> So now I'm stuck with shipping a faked-up epoll.h in my build chroots
> that #error out forever, and still have to patch up some software that
> finds it and continues on regardless thinking we're Linux.
>
> Just Say No.

Do you mean:

  do not implement epoll in NetBSD native at all?

or

  if you implement epoll, it needs to be 100% compatible with Linux
  semantics so that programs that use it will work as one expects?


There are some programs out there that support epoll but not kqueue;
guile-fibers used to be one but grew kqueue support.


Re: epoll exposure

2023-08-14 Thread Mouse
> The problem is third-party software assumes epoll == Linux,

Software that makes stupid assumptions will never go away.

Is it better to work around it (not ship epoll.h), or to get it fixed
(report it upstream as the bug it is)?  I could argue that either way.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML    mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: epoll exposure

2023-08-14 Thread Jonathan Perkin

* On 2023-08-13 at 18:10 BST, Tobias Nygren wrote:

>On Sat, 12 Aug 2023 19:21:06 -0400
>Christos Zoulas  wrote:
>
>> I really want to understand what's going on here (why do we think that
>> our epoll implementation is broken in a way that will affect applications).
>
>jperkin@ might be able to explain what the issues with Illumos are,
>but I guess the problems are of different nature than the NetBSD case.


The problem is third-party software assumes epoll == Linux, and once our 
epoll implementation landed to support LX binaries, a bunch of packages 
that previously built natively broke.


In hindsight we would have not shipped sys/epoll.h, it provides no 
benefits, and obviously it's preferable to use native features (event 
ports in our case, kqueue in yours).


So now I'm stuck with shipping a faked-up epoll.h in my build chroots 
that #error out forever, and still have to patch up some software that 
finds it and continues on regardless thinking we're Linux.


Just Say No.

--
Jonathan Perkin      www.perkin.org.uk
github.com/jperkin   federate.me.uk/@jperkin


Re: epoll exposure

2023-08-13 Thread Tobias Nygren
On Sun, 13 Aug 2023 19:10:32 +0200
Tobias Nygren  wrote:

> A similar situation arises with daemons that reload their configuration
> via the classic SIGHUP + fork(2) mechanism. If that is done from within
> the event loop I guess we lose?

Same goes of course for any trivial server process that forks into the
background if it does the listening socket setup before the fork
happens. It is a common pattern to do so in order to bail out on errors
before dropping privileges and forking. I wouldn't bet on everyone doing
epoll_create1 after the fork.


Re: epoll exposure

2023-08-13 Thread Tobias Nygren
On Sat, 12 Aug 2023 19:21:06 -0400
Christos Zoulas  wrote:

> I really want to understand what's going on here (why do we think that
> our epoll implementation is broken in a way that will affect applications).

jperkin@ might be able to explain what the issues with Illumos are,
but I guess the problems are of different nature than the NetBSD case.

The use case that concerns me as far as NetBSD goes is networking
daemons. Daemons that use worker process pools may pass an epoll
descriptor rather than the listening fd to workers via a fork(2). I
expect this to be an at least somewhat idiomatic pattern as it is
suggested to do it this way in the Linux epoll(7) man page[1].

A similar situation arises with daemons that reload their configuration
via the classic SIGHUP + fork(2) mechanism. If that is done from within
the event loop I guess we lose?

I would prefer the above scenarios are tested and known to work before
we advertise the functionality by default. But it doesn't sound from
the discussion that fixing descriptor inheritance is trivial.
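
A minimal sketch of how the fork scenario could be checked follows.  It is a simplification: a pipe stands in for the listening socket, the name probe_epoll_across_fork is hypothetical, and it covers only the "create epoll descriptor, then fork a worker" case.  On Linux the child is expected to see the event; per this thread, an epoll built on top of kqueue may not.

```c
#include <sys/epoll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical test helper.  Returns 1 if a child created by fork(2)
 * can receive an event through an epoll descriptor that the parent set
 * up before forking, 0 if it cannot, and -1 on setup failure.
 */
int
probe_epoll_across_fork(void)
{
	struct epoll_event ev = { 0 }, got;
	int pfd[2], ep, status, result = -1;
	ssize_t w;
	pid_t pid;

	if (pipe(pfd) == -1)
		return -1;
	if ((ep = epoll_create1(0)) == -1)
		goto out_pipe;
	ev.events = EPOLLIN;
	ev.data.fd = pfd[0];
	if (epoll_ctl(ep, EPOLL_CTL_ADD, pfd[0], &ev) == -1)
		goto out_ep;

	if ((pid = fork()) == -1)
		goto out_ep;
	if (pid == 0) {
		/* Child ("worker"): wait on the inherited descriptor. */
		int n = epoll_wait(ep, &got, 1, 2000);
		_exit(n == 1 && got.data.fd == pfd[0] ? 0 : 1);
	}

	/* Parent: make the pipe readable, then collect the verdict. */
	w = write(pfd[1], "x", 1);
	if (waitpid(pid, &status, 0) == pid && w == 1)
		result = (WIFEXITED(status) && WEXITSTATUS(status) == 0);

out_ep:
	close(ep);
out_pipe:
	close(pfd[0]);
	close(pfd[1]);
	return result;
}
```

Something along these lines could run as an ATF test case before the interface is advertised; the SIGHUP reload case would need a separate check.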

Kind regards,
-Tobias

[1] Last paragraph of the "Level-triggered and edge-triggered" section.


Re: epoll exposure

2023-08-13 Thread Taylor R Campbell
> Date: Sat, 12 Aug 2023 19:21:06 -0400
> From: Christos Zoulas 
> 
> 2. Nobody has given an example of an application that breaks, or answered
> the question if we understand how the Illumos feature is breaking things,
> or even if the Illumos implementation is similar to ours. Theodore mentioned
> that aside from the fork issue, we should be fully compatible. Do we know
> of any applications that open an epoll file descriptor and then fork? Was
> that the reason the Illumos implementation is failing?

I'm not sure and I'd be curious to hear more.  The illumos man page
details some semantic differences: .
But I think the pkgsrc experience should definitely give us pause and
we should try to coordinate it, e.g. by doing bulk builds and testing
applications that we determine use epoll, rather than barge ahead and
assume pkgsrc developers will find and fix the fallout.

Backing out the change later, if we might decide to do that, is very
painful, and it's why we're still stuck with the awful getrandom(2)
API that I made the mistake of adopting back in 2020.  (Although if we
have to rebuild all netbsd-10 packages anyway for openssl3, maybe
now's our chance to ditch it...)

> 3. We discussed adding epoll as a native syscall in tech-kern and there were
> no objections.

It was proposed on 2023-06-21 at
.

The first reply within hours on 2023-06-21 at

asked why it is a good API to adopt, and whether for testing the
compat syscall we could add an emul_syscall so it doesn't get exposed
to applications.

The second reply within a week on 2023-06-25 at

asked that it be done in a way that can be used by ATF but will not be
detected by configure scripts.

I read both of these as objections.  There was no reply answering
either of them, and the commit went in in a way that is exposed to
configure scripts and not limited to the ATF tests.

I think it is fair for nia to find this frustrating: mixed feedback on
adopting the syscall, two objections with specific requests for how to
start, and it goes in in exactly the way that the objections objected
to.

> It is common courtesy to discuss reversions before taking
> action, especially when things do not appear to be broken. It would have
> taken a minute or so for Nia to write an email and one or two more days
> with epoll(2) exposed would not have harmed anyone. I was under the
> impression that the conversation was still going on.

The commit went in on Friday, 2023-07-28, and on Monday, 2023-07-31,
nia re-raised the same objection as before, with more detail:
.

nia suggested a concrete alternative to enable testing (installing it
at sys/epoll_compat.h) on 2023-08-01:

and there was silence in the thread for another ten days, suggesting
the discussion was over and nia's objections were being ignored.

It is true that our rule is for reverts to go through discussion and
then core@ first, which this could be seen to violate.  But:

(a) nia did try to go through discussion by raising objections twice
over the course of a month, with no response;
(b) this wasn't entirely a revert -- all the code is still there, just
not exposed from libc; and
(c) I think a lot of people have gotten the impression that core@ is
anemic and inattentive, and it's on us to dispel that impression
by acting responsively and responsibly in the community, not just
by asserting authority by insisting on the text of rules.

So, moving forward:

How about we prioritize implementing an emul_syscall or something so
we can automatically test compat_linux?

This will be a much bigger win, I think, than adopting epoll(2), and
if epoll(2) is really worth it (and I'm not sure it is; I'd like to
see a more compelling positive argument for adopting it) we can
separately coordinate that with pkgsrc resources.


Re: epoll exposure

2023-08-12 Thread Christos Zoulas
1. This was committed to HEAD and not to a release branch. HEAD
is supposed to contain some experimental features so that we can verify
that things work correctly. Now that it has been backed out, we'll never know.
2. Nobody has given an example of an application that breaks, or answered
the question if we understand how the Illumos feature is breaking things,
or even if the Illumos implementation is similar to ours. Theodore mentioned
that aside from the fork issue, we should be fully compatible. Do we know
of any applications that open an epoll file descriptor and then fork? Was
that the reason the Illumos implementation is failing?
3. We discussed adding epoll as a native syscall in tech-kern and there were
no objections. It is common courtesy to discuss reversions before taking
action, especially when things do not appear to be broken. It would have
taken a minute or so for Nia to write an email and one or two more days
with epoll(2) exposed would not have harmed anyone. I was under the
impression that the conversation was still going on.

I really want to understand what's going on here (why do we think that
our epoll implementation is broken in a way that will affect applications).

Best,

christos

> On Aug 12, 2023, at 6:58 PM, Greg Troxel  wrote:
> 
> nia  writes:
> 
>> On Fri, Aug 11, 2023 at 06:52:41PM -, Christos Zoulas wrote:
>>> I see that you removed it without further discussion which is not the
>>> way we do things on NetBSD. Do you have an example where the epoll emulation
>>> breaks, because either forking matters or the implementation is
>>> incorrect/different?
>> 
>> We maybe need some clarification there - it's come up before
>> that changes should be backed out while the discussion is ongoing.
>> I've generally complied and backed out my changes when someone
>> wants to discuss them.
> 
> As someone who has given others a hard time about this I'd like to
> strongly second what nia@ said.  I agree with and sympathize with the
> idea that we don't randomly revert, but the total situation is more
> subtle.
> 
> First, we have a norm that changes that others object to should be
> proposed and discussed, and only if there is pretty strong rough
> consensus or better that they get committed.  For weaker rough
> consensus, I see it as a call for core or pmc.  A real problem is when
> people just commit anyway, and people object.  The right thing is for
> the committer to say "didn't realize this was so controversial" (often
> it's hard to tell) and revert.  It can easily be an honest mistake to
> not realize that other people will have issues with something.
> 
> I believe it is very important to have the discussion with the
> controversial code not in tree, because that frames the argument as "I
> would like to commit this, is that ok" rather than "someone else wants
> to revert this thing that we already have".  In some sense that's the
> same, but we are talking about people and it's really clear to me that it is
> not the same.
> 
> I view committing controversial things and not reverting them as a
> bigger problem than someone else doing the revert after it has been
> called for.  In my view core should be asking for things to be reverted
> for discussion when they turn out to be controversial.
> 
>> Riastradh raised that having a symbol named "epoll_create" in libc
>> may be enough to change the way build systems behave, so we need
>> to be really careful here. In my opinion it's better to back out
>> ABI changes that may be problematic early before we're stuck with
>> something we might regret later.
> 
> Agreed.  In this case, the complaints were rapid and significant and it
> was obviously tricky.
> 
> And, my read of the discussion was that adding an emulated epoll for
> linux binaries was ok but adding native epoll had nowhere near reached
> consensus.
> 
>>> I would agree on principle that it is better to use kqueue on BSD systems,
>>> but if it is not broken, why not advertise it?
>> 
>> We need to be 100% sure that this code will remain 100% compatible
>> with Linux for the foreseeable future. Exposing epoll involves
>> compiling code that has only ever been tested on Linux on
>> NetBSD for the first time, while avoiding tried and tested code.
>> 
>> This is really scary, and can create all sorts of headaches in
>> porting work, especially if the epoll-enabled source code uses
>> all sorts of other Linuxisms.
> 
> Agreed.  Porting linuxy code is hard enough.  I think we need to talk
> about exact semantics match, and if not to really have the rationale
> super clear and widely accepted.  I think a lot of people are just not
> wanting to bite off that trouble.





Re: epoll exposure

2023-08-12 Thread Greg Troxel
nia  writes:

> On Fri, Aug 11, 2023 at 06:52:41PM -, Christos Zoulas wrote:
>> I see that you removed it without further discussion which is not the
>> way we do things on NetBSD. Do you have an example where the epoll emulation
>> breaks, because either forking matters or the implementation is
>> incorrect/different?
>
> We maybe need some clarification there - it's come up before
> that changes should be backed out while the discussion is ongoing.
> I've generally complied and backed out my changes when someone
> wants to discuss them.

As someone who has given others a hard time about this I'd like to
strongly second what nia@ said.  I agree with and sympathize with the
idea that we don't randomly revert, but the total situation is more
subtle.

First, we have a norm that changes that others object to should be
proposed and discussed, and only if there is pretty strong rough
consensus or better that they get committed.  For weaker rough
consensus, I see it as a call for core or pmc.  A real problem is when
people just commit anyway, and people object.  The right thing is for
the committer to say "didn't realize this was so controversial" (often
it's hard to tell) and revert.  It can easily be an honest mistake to
not realize that other people will have issues with something.

I believe it is very important to have the discussion with the
controversial code not in tree, because that frames the argument as "I
would like to commit this, is that ok" rather than "someone else wants
to revert this thing that we already have".  In some sense that's the
same, but we are talking about people and it's really clear to me that it is
not the same.

I view committing controversial things and not reverting them as a
bigger problem than someone else doing the revert after it has been
called for.  In my view core should be asking for things to be reverted
for discussion when they turn out to be controversial.

> Riastradh raised that having a symbol named "epoll_create" in libc
> may be enough to change the way build systems behave, so we need
> to be really careful here. In my opinion it's better to back out
> ABI changes that may be problematic early before we're stuck with
> something we might regret later.

Agreed.  In this case, the complaints were rapid and significant and it
was obviously tricky.

And, my read of the discussion was that adding an emulated epoll for
linux binaries was ok but adding native epoll had nowhere near reached
consensus.

>> I would agree on principle that it is better to use kqueue on BSD systems,
>> but if it is not broken, why not advertise it?
>
> We need to be 100% sure that this code will remain 100% compatible
> with Linux for the foreseeable future. Exposing epoll involves
> compiling code that has only ever been tested on Linux on
> NetBSD for the first time, while avoiding tried and tested code.
>
> This is really scary, and can create all sorts of headaches in
> porting work, especially if the epoll-enabled source code uses
> all sorts of other Linuxisms.

Agreed.  Porting linuxy code is hard enough.  I think we need to talk
about exact semantics match, and if not to really have the rationale
super clear and widely accepted.  I think a lot of people are just not
wanting to bite off that trouble.


Re: epoll exposure

2023-08-12 Thread nia
On Fri, Aug 11, 2023 at 06:52:41PM -, Christos Zoulas wrote:
> I see that you removed it without further discussion which is not the
> way we do things on NetBSD. Do you have an example where the epoll emulation
> breaks, because either forking matters or the implementation is
> incorrect/different?

We maybe need some clarification there - it's come up before
that changes should be backed out while the discussion is ongoing.
I've generally complied and backed out my changes when someone
wants to discuss them.

Riastradh raised that having a symbol named "epoll_create" in libc
may be enough to change the way build systems behave, so we need
to be really careful here. In my opinion it's better to back out
ABI changes that may be problematic early before we're stuck with
something we might regret later.

> I would agree on principle that it is better to use kqueue on BSD systems,
> but if it is not broken, why not advertise it?

We need to be 100% sure that this code will remain 100% compatible
with Linux for the foreseeable future. Exposing epoll involves
compiling code that has only ever been tested on Linux on
NetBSD for the first time, while avoiding tried and tested code.

This is really scary, and can create all sorts of headaches in
porting work, especially if the epoll-enabled source code uses
all sorts of other Linuxisms.


Re: epoll exposure

2023-08-11 Thread Christos Zoulas


> On Aug 11, 2023, at 3:09 PM, Tobias Nygren  wrote:
> 
> On Fri, 11 Aug 2023 18:52:41 - (UTC)
> chris...@astron.com (Christos Zoulas) wrote:
> 
>> In article ,
>> nia   wrote:
>>> On Mon, Jul 31, 2023 at 07:18:38PM -0700, Jason Thorpe wrote:
 Anyway, like I said, I think the best way forward is to make it
>>> possible for kq descriptors to be inherited? it?s a little tricky
>>> because of some of the wacky stuff kqueue can track, but I think NetBSD
>>> can lead on this and define a set of semantics that makes sense.
>>> 
>>> Can we agree on renaming the header to sys/epoll_compat.h?
>> 
>> I see that you removed it without further discussion which is not the
>> way we do things on NetBSD. Do you have an example where the epoll emulation
>> breaks, because either forking matters or the implementation is
>> incorrect/different?
> 
> Some examples of packages where forking _could_ have implications and
> are not even possible to audit due to undefined number of downstream
> consumers, some possibly outside of pkgsrc:
> 
> python, libevent, apr, libev

Thank you for the explanation. So is the Illumos implementation similar to ours?
Does it have the same fork limitation? I.e. is it built on top of kqueue, or
something else? I think that it is better in the long run to have a more
portable fix (that is not Illumos-specific) that specifies to prefer kqueue
over epoll if both are available.
Did we see any actual breaks with our implementation?
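
As a rough sketch of what "prefer kqueue over epoll if both are available" could mean inside an application, assuming its build system already knows which interfaces exist (the enum, the function name and both parameters are hypothetical, not taken from any existing package):

```c
/* Event notification backends a portable program might support. */
enum event_backend {
	BACKEND_POLL,	/* poll(2): the lowest common denominator */
	BACKEND_EPOLL,	/* Linux-style epoll */
	BACKEND_KQUEUE	/* BSD kqueue */
};

/*
 * Hypothetical selection helper.  have_kqueue and have_epoll stand for
 * the results of whatever feature checks the build system performs.
 * kqueue wins whenever it is present, even if epoll is also offered.
 */
enum event_backend
choose_event_backend(int have_kqueue, int have_epoll)
{
	if (have_kqueue)
		return BACKEND_KQUEUE;
	if (have_epoll)
		return BACKEND_EPOLL;
	return BACKEND_POLL;
}
```

With this ordering, choose_event_backend(1, 1) yields BACKEND_KQUEUE, so a system exposing both interfaces would stay on the native path, while a system with only epoll is unaffected.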

> 
>> I would agree on principle that it is better to use kqueue on BSD systems,
>> but if it is not broken, why not advertise it?
> 
> pkgsrc had a lot of issues with epoll getting picked up on Illumos and
> we don't want to add more patches to deal with exceptions.
> 
> I see your point but I also don't see the point of rushing this
> feature in by default and leaving pkgsrc developers and users to
> deal with potential fallout, knowing there are rough corner cases.
> 
> If it can be agreed that we put back the header under a different name
> that would be great.
> 

Well, then it will not be used. Do you have a list of the packages affected?

christos





Re: epoll exposure

2023-08-11 Thread Tobias Nygren
On Fri, 11 Aug 2023 18:52:41 - (UTC)
chris...@astron.com (Christos Zoulas) wrote:

> In article ,
> nia   wrote:
> >On Mon, Jul 31, 2023 at 07:18:38PM -0700, Jason Thorpe wrote:
> >> Anyway, like I said, I think the best way forward is to make it
> >> possible for kq descriptors to be inherited… it’s a little tricky
> >> because of some of the wacky stuff kqueue can track, but I think NetBSD
> >> can lead on this and define a set of semantics that makes sense.
> >
> >Can we agree on renaming the header to sys/epoll_compat.h?
> 
> I see that you removed it without further discussion which is not the
> way we do things on NetBSD. Do you have an example where the epoll emulation
> breaks, because either forking matters or the implementation is
> incorrect/different?

Some examples of packages where forking _could_ have implications and
are not even possible to audit due to undefined number of downstream
consumers, some possibly outside of pkgsrc:

python, libevent, apr, libev

> I would agree on principle that it is better to use kqueue on BSD systems,
> but if it is not broken, why not advertise it?

pkgsrc had a lot of issues with epoll getting picked up on Illumos and
we don't want to add more patches to deal with exceptions.

I see your point but I also don't see the point of rushing this
feature in by default and leaving pkgsrc developers and users to
deal with potential fallout, knowing there are rough corner cases.

If it can be agreed that we put back the header under a different name
that would be great.

-Tobias


Re: epoll exposure

2023-08-11 Thread Christos Zoulas
In article ,
nia   wrote:
>On Mon, Jul 31, 2023 at 07:18:38PM -0700, Jason Thorpe wrote:
>> Anyway, like I said, I think the best way forward is to make it
>> possible for kq descriptors to be inherited… it’s a little tricky
>> because of some of the wacky stuff kqueue can track, but I think NetBSD
>> can lead on this and define a set of semantics that makes sense.
>
>Can we agree on renaming the header to sys/epoll_compat.h?

I see that you removed it without further discussion which is not the
way we do things on NetBSD. Do you have an example where the epoll emulation
breaks, because either forking matters or the implementation is
incorrect/different?

I would agree on principle that it is better to use kqueue on BSD systems,
but if it is not broken, why not advertise it?

Best,

christos



Re: epoll exposure

2023-08-01 Thread nia
On Mon, Jul 31, 2023 at 07:18:38PM -0700, Jason Thorpe wrote:
> Anyway, like I said, I think the best way forward is to make it possible for 
> kq descriptors to be inherited… it’s a little tricky because of some of the 
> wacky stuff kqueue can track, but I think NetBSD can lead on this and define 
> a set of semantics that makes sense.

Can we agree on renaming the header to sys/epoll_compat.h?


Re: epoll exposure

2023-07-31 Thread Jason Thorpe


> On Jul 31, 2023, at 6:53 PM, Theodore Preduta  wrote:
>> 
>> epoll(2) for COMPAT_LINUX is implemented in a similar manner to
>> NetBSD's. Therefore, it should also have this compatibility issue.
> 
> The epoll implementation is literally a direct port of the one from
> FreeBSD's Linux compatibility layer, and so has the same limitations.
> 
>> Is it still useful for real Linux applications? Or do you have a
>> plan to lift this limitation?
> 
> As for if it is still useful, yes!  I've tested a bunch of Linux
> binaries (admittedly mostly ones written in Go) and they all worked as
> expected.  So from what I can tell, the fork() limitation is not that
> big of a deal in practice.

I want to make clear — I’m not trying to knock on the work that’s been done 
here… in fact, kudos to you, Theo, for doing this!

I only wanted to raise the concern about it as a *native* call… If it’s exposed 
natively, application developers might expect it to behave the same way, 
including the inherit-across-fork thing… I think if there’s a minor glitch in 
emulation, it’s a little easier to shrug off.  When I did timerfd and eventfd, 
I made sure the behavioral differences between the native NetBSD implementation 
and Linux were limited to things that didn’t make sense on NetBSD anyway, and 
would only be noticed by things that are elbow-deep in weird Linux-specific 
sh*t for other reasons :-)

Anyway, like I said, I think the best way forward is to make it possible for kq 
descriptors to be inherited… it’s a little tricky because of some of the wacky 
stuff kqueue can track, but I think NetBSD can lead on this and define a set of 
semantics that makes sense.

-- thorpej



Re: epoll exposure

2023-07-31 Thread Theodore Preduta
On 2023-07-31 21:25, Rin Okuyama wrote:
> (theo and christos added to cc:)
> 
> On 2023/07/31 23:04, Tobias Nygren wrote:
>> On Mon, 31 Jul 2023 12:05:21 +
>> nia  wrote:
>>
>>> On Mon, Jul 31, 2023 at 04:27:32AM -0700, Jason Thorpe wrote:
>>>>
>>>>> On Jul 31, 2023, at 1:38 AM, nia  wrote:
>>>>>
>>>>> Hey, I regret that epoll was committed without further discussion with
>>>>> pkgsrc developers. We have a lot of experience with this already
>>>>> (illumos/SmartOS exposes a compatibility epoll) and the situation is
>>>>> not entirely great and requires lots of workarounds.
>>>>
>>>> What’s even worse … I think our epoll() emulation isn’t 100%
>>>> compatible w/ Linux’s.  It wouldn’t bother me so much if it were,
>>>> but I’m pretty sure it’s not.
>>>>
>>>> (IIRC, epoll descriptors are inherited across fork, and if ours is
>>>> emulated with kqueue under the covers, then that can’t work…)
>>
>> Then I'm in favor of renaming the header until such time we feel the
>> interface is 100% compatible.
> 
> epoll(2) for COMPAT_LINUX is implemented in a similar manner to
> NetBSD's. Therefore, it should also have this compatibility issue.

The epoll implementation is literally a direct port of the one from
FreeBSD's Linux compatibility layer, and so has the same limitations.

> Is it still useful for real Linux applications? Or do you have a
> plan to lift this limitation?

As for if it is still useful, yes!  I've tested a bunch of Linux
binaries (admittedly mostly ones written in Go) and they all worked as
expected.  So from what I can tell, the fork() limitation is not that
big of a deal in practice.

Theo(dore)



Re: epoll exposure

2023-07-31 Thread Jason Thorpe


> On Jul 31, 2023, at 6:25 PM, Rin Okuyama  wrote:
> 
> epoll(2) for COMPAT_LINUX is implemented in a similar manner to
> NetBSD's. Therefore, it should also have this compatibility issue.
> Is it still useful for real Linux applications? Or do you have a
> plan to lift this limitation?

When I first looked at epoll() (back when I wrote eventfd and timerfd), I 
looked at the FreeBSD implementation and specifically chose not to use it 
because it was based on kqueue, and thus has the “kq descriptors not inherited 
across forks” wart.  The extent to which this will impact real Linux 
implementations … honestly I have no idea.

Perhaps the BEST solution is to fix this stupid limitation in kqueue (obviously 
version the system call for backward BSD compatibility).

-- thorpej
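
For readers who want to check the fork behaviour discussed above on a given
system, here is a minimal sketch of a run-time probe. It is not taken from
the NetBSD or FreeBSD sources; the function name epoll_survives_fork() is
hypothetical, and the code assumes the Linux-style epoll_create1(),
epoll_ctl() and epoll_wait() calls are available from <sys/epoll.h>. It
registers a readable pipe with an epoll descriptor, forks, and lets the
child report whether it can still wait on that descriptor.

```c
#include <sys/epoll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical probe: returns 1 if a forked child can still wait on the
 * parent's epoll descriptor, 0 if it cannot, and -1 if the probe itself
 * could not be set up.
 */
int epoll_survives_fork(void)
{
    int pfd[2], ep, status, result = -1;
    struct epoll_event ev = { 0 };
    pid_t pid;

    if (pipe(pfd) == -1)
        return -1;
    ep = epoll_create1(0);
    if (ep == -1)
        goto out_pipe;

    /* Watch the read end of the pipe for input. */
    ev.events = EPOLLIN;
    ev.data.fd = pfd[0];
    if (epoll_ctl(ep, EPOLL_CTL_ADD, pfd[0], &ev) == -1)
        goto out_ep;

    /* Make the pipe readable before forking. */
    if (write(pfd[1], "x", 1) != 1)
        goto out_ep;

    pid = fork();
    if (pid == -1)
        goto out_ep;
    if (pid == 0) {
        /* Child: succeed only if the inherited descriptor reports the pipe. */
        struct epoll_event got;
        int n = epoll_wait(ep, &got, 1, 1000);
        _exit(n == 1 && got.data.fd == pfd[0] ? 0 : 1);
    }

    /* Parent: translate the child's exit status into the probe result. */
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = (WEXITSTATUS(status) == 0) ? 1 : 0;

out_ep:
    close(ep);
out_pipe:
    close(pfd[0]);
    close(pfd[1]);
    return result;
}
```

A configure-time check could call this from a small main() and map the
result to an exit status, so that a program depending on inheritance
across fork() can decide per system whether epoll does what it needs.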



Re: epoll exposure

2023-07-31 Thread Rin Okuyama

(theo and christos added to cc:)

On 2023/07/31 23:04, Tobias Nygren wrote:

> On Mon, 31 Jul 2023 12:05:21 +
> nia  wrote:
>
>> On Mon, Jul 31, 2023 at 04:27:32AM -0700, Jason Thorpe wrote:
>>>
>>>> On Jul 31, 2023, at 1:38 AM, nia  wrote:
>>>>
>>>> Hey, I regret that epoll was committed without further discussion with
>>>> pkgsrc developers. We have a lot of experience with this already
>>>> (illumos/SmartOS exposes a compatibility epoll) and the situation is
>>>> not entirely great and requires lots of workarounds.
>>>
>>> What’s even worse … I think our epoll() emulation isn’t 100% compatible w/
>>> Linux’s.  It wouldn’t bother me so much if it were, but I’m pretty sure it’s
>>> not.
>>>
>>> (IIRC, epoll descriptors are inherited across fork, and if ours is emulated
>>> with kqueue under the covers, then that can’t work…)
>
> Then I'm in favor of renaming the header until such time we feel the
> interface is 100% compatible.


epoll(2) for COMPAT_LINUX is implemented in a similar manner to
NetBSD's. Therefore, it should also have this compatibility issue.
Is it still useful for real Linux applications? Or do you have a
plan to lift this limitation?

Thanks,
rin


Re: epoll exposure

2023-07-31 Thread Tobias Nygren
On Mon, 31 Jul 2023 12:05:21 +
nia  wrote:

> On Mon, Jul 31, 2023 at 04:27:32AM -0700, Jason Thorpe wrote:
> > 
> > > On Jul 31, 2023, at 1:38 AM, nia  wrote:
> > > 
> > > Hey, I regret that epoll was committed without further discussion with
> > > pkgsrc developers. We have a lot of experience with this already
> > > (illumos/SmartOS exposes a compatibility epoll) and the situation is
> > > not entirely great and requires lots of workarounds.
> > 
> > What’s even worse … I think our epoll() emulation isn’t 100% compatible w/
> > Linux’s.  It wouldn’t bother me so much if it were, but I’m pretty sure
> > it’s not.
> > 
> > (IIRC, epoll descriptors are inherited across fork, and if ours is emulated
> > with kqueue under the covers, then that can’t work…)

Then I'm in favor of renaming the header until such time we feel the
interface is 100% compatible.


Re: epoll exposure

2023-07-31 Thread nia
On Mon, Jul 31, 2023 at 04:27:32AM -0700, Jason Thorpe wrote:
> 
> > On Jul 31, 2023, at 1:38 AM, nia  wrote:
> > 
> > Hey, I regret that epoll was committed without further discussion with
> > pkgsrc developers. We have a lot of experience with this already
> > (illumos/SmartOS exposes a compatibility epoll) and the situation is
> > not entirely great and requires lots of workarounds.
> 
> What’s even worse … I think our epoll() emulation isn’t 100% compatible w/ 
> Linux’s.  It wouldn’t bother me so much if it were, but I’m pretty sure it’s 
> not.
> 
> (IIRC, epoll descriptors are inherited across fork, and if ours is emulated 
> with kqueue under the covers, then that can’t work…)
> 
> -- thorpej
> 

Oops. Then it definitely shouldn't be such a public interface...


Re: epoll exposure

2023-07-31 Thread Jason Thorpe


> On Jul 31, 2023, at 1:38 AM, nia  wrote:
> 
> Hey, I regret that epoll was committed without further discussion with
> pkgsrc developers. We have a lot of experience with this already
> (illumos/SmartOS exposes a compatibility epoll) and the situation is
> not entirely great and requires lots of workarounds.

What’s even worse … I think our epoll() emulation isn’t 100% compatible w/ 
Linux’s.  It wouldn’t bother me so much if it were, but I’m pretty sure it’s 
not.

(IIRC, epoll descriptors are inherited across fork, and if ours is emulated 
with kqueue under the covers, then that can’t work…)

-- thorpej



Re: epoll exposure

2023-07-31 Thread nia
On Mon, Jul 31, 2023 at 11:27:37AM +0200, Tobias Nygren wrote:
> On Mon, 31 Jul 2023 08:38:31 +
> nia  wrote:
> 
> > Hey, I regret that epoll was committed without further discussion with
> > pkgsrc developers. We have a lot of experience with this already
> > (illumos/SmartOS exposes a compatibility epoll) and the situation is
> > not entirely great and requires lots of workarounds.
> > 
> > Mostly, the situation of having epoll freely usable in public headers
> > but discouraged in the man pages doesn't really work. Either we have
> > barriers against its use, or we encourage its use.
> > 
> > Software is mostly tested on Linux and FreeBSD, which have either epoll
> > or kqueue. Build systems will detect epoll and use it. Autotools is
> > falling out of favour, so it is getting harder and harder to override
> > the detection of operating system functions.  Build systems need to be
> > patched.
> > 
> > I don't want pkgsrc to be full of "don't use epoll on NetBSD" workarounds.
> > 
> > I think the compatibility support should be opt-in.
> 
> The problem is even if we hide this under a feature test macro, some
> build systems only check for existence of the  header, so
> we will still end up patching some packages unless we rename the header
> also. And the workarounds will be mostly the same as those for Illumos
> so we can't get rid of them either.
> 
> If the software supports kqueue(2) we obviously want to do that,
> and I think we should tell the respective upstreams to default
> to that on NetBSD so that we don't have to patch things.
> 
> But what if the alternatives are select(2) or epoll(2)?
> Is/will the implementation be mature enough to replace select(2) or
> is this only ever intended for software that only supports epoll(2)?
> 
> -Tobias

We can easily introduce a feature to pkgsrc that will cause the header
to be automatically exposed for software that opts in. We already
do this for packages that expect ncurses. If FAKE_NCURSES is set,
an artificial ncurses.h will be created in the build directory.
If ALLOW_COMPAT_EPOLL is set, we could rename epoll_compat.h to
epoll.h.

NetBSD supports _much more_ Linux-centric software than illumos does.


Re: epoll exposure

2023-07-31 Thread Tobias Nygren
On Mon, 31 Jul 2023 08:38:31 +
nia  wrote:

> Hey, I regret that epoll was committed without further discussion with
> pkgsrc developers. We have a lot of experience with this already
> (illumos/SmartOS exposes a compatibility epoll) and the situation is
> not entirely great and requires lots of workarounds.
> 
> Mostly, the situation of having epoll freely usable in public headers
> but discouraged in the man pages doesn't really work. Either we have
> barriers against its use, or we encourage its use.
> 
> Software is mostly tested on Linux and FreeBSD, which have either epoll
> or kqueue. Build systems will detect epoll and use it. Autotools is
> falling out of favour, so it is getting harder and harder to override
> the detection of operating system functions.  Build systems need to be
> patched.
> 
> I don't want pkgsrc to be full of "don't use epoll on NetBSD" workarounds.
> 
> I think the compatibility support should be opt-in.

The problem is even if we hide this under a feature test macro, some
build systems only check for existence of the  header, so
we will still end up patching some packages unless we rename the header
also. And the workarounds will be mostly the same as those for Illumos
so we can't get rid of them either.

If the software supports kqueue(2) we obviously want to do that,
and I think we should tell the respective upstreams to default
to that on NetBSD so that we don't have to patch things.

But what if the alternatives are select(2) or epoll(2)?
Is/will the implementation be mature enough to replace select(2) or
is this only ever intended for software that only supports epoll(2)?

-Tobias
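
As an illustration of the header-existence style of detection described
above, here is a minimal sketch of how a program might pick its event
backend at compile time. The enum, the macro names and pick_backend() are
hypothetical, and the sketch assumes a compiler that provides
__has_include. It prefers kqueue whenever <sys/event.h> is present, which
is the default this message suggests asking upstreams to adopt on NetBSD;
it looks only at header presence and says nothing about semantics.

```c
/* Hypothetical backend identifiers, for illustration only. */
enum event_backend { BACKEND_SELECT, BACKEND_EPOLL, BACKEND_KQUEUE };

/* Header-presence checks; skipped entirely on compilers without __has_include. */
#if defined(__has_include)
#  if __has_include(<sys/event.h>)
#    define HAVE_KQUEUE_HEADER 1
#  endif
#  if __has_include(<sys/epoll.h>)
#    define HAVE_EPOLL_HEADER 1
#  endif
#endif

/* Prefer kqueue, then epoll, and fall back to select(2). */
enum event_backend pick_backend(void)
{
#if defined(HAVE_KQUEUE_HEADER)
    return BACKEND_KQUEUE;
#elif defined(HAVE_EPOLL_HEADER)
    return BACKEND_EPOLL;
#else
    return BACKEND_SELECT;
#endif
}
```

With this ordering, a system that ships both headers ends up on kqueue
without any patching. A check that tests only for the epoll header would
pick epoll on any system that ships one, whatever its behaviour.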