Re: Managing userland data pointers in kqueue/kevent

2013-05-19 Thread Jilles Tjoelker
On Wed, May 15, 2013 at 01:34:58PM +0100, Paul LeoNerd Evans wrote:
> On Wed, 15 May 2013 13:29:59 +0100
> Paul "LeoNerd" Evans  wrote:

> > Is that not the exact thing I suggested?

> > The "extension to register a kevent to catch these events" is
> > that you put the EV_DROPWATCH bit flag in the event at the time you
> > register it.

> > The "returned event [that] could have all the appropriate information
> > for the event being dropped" is that you receive an event with
> > EV_DROPPED set on it. It being a real event includes of course the
> > udata pointer, so you can handle it.

> In fact, to requote the original PR I wrote[1] on the subject:

> ---

>   I propose the addition of a new flag applicable to any kevent watch
>   structure, documented thusly:

> The flags field can contain the following values:
> ..
> EV_DROPWATCH Requests that the kernel will send an EV_DROPPED event
>  on this watch when it has finished watching it for any
>  reason, including EV_DELETE, expiry because of
>  EV_ONESHOT, or because the filehandle was closed by
>  close(2).
> 
> EV_DROPPED   This flag is returned by the kernel if it is now about
>  to drop the watch. After this flag has been received,
>  no further events will occur on this watch.

>   This flag then makes it trivial to build a generic wrapper for kqueue
>   that can always manage its memory correctly.

>   a) at EV_ADD time, simply set flags |= EV_DROPWATCH

>   b) after an event has been processed that included the EV_DROPPED
>   flag, free() the pointer given in the udata field.

An important detail is missing: how do you avoid using up all kernel
memory on knotes if someone keeps adding new file descriptors with
EV_ADD | EV_DROPWATCH and closing the file descriptors again without
ever draining the kqueue?

This problem did not use to exist for file descriptor events before: the
number of such knotes was limited to the number of open file
descriptors. However, it does already exist for most of the other event
types. For example, pwait -v will return the exit status even if it was
suspended (^Z) while the process terminated and the parent reaped the
zombie. For EVFILT_TIMER, the worst effect is a denial of service of
EVFILT_TIMER on all other processes in the system. EVFILT_USER does not
appear to check anything and appears to allow arbitrary kernel memory
consumption.

The EVFILT_TIMER needs to keep its global limit and EVFILT_USER needs
something similar.

For the rest, call an event that is no longer associated to a kernel
object (e.g. EVFILT_READ whose file descriptor is closed, EVFILT_PROC
whose process has terminated and been reaped by the parent or EVFILT_AIO
whose I/O request is completed) "unbound". The number of events that are
not unbound is limited by existing limits on the other kernel objects. A
possible fix is to reject (such as with [ENOMEM]) adding new events when
there are too many unbound events in the queue already. The application
should then allow kevent() to return pending events first before it adds
new ones. If the kernel returns unbound events in preference to other
events, a kevent() call with nevents >= 2 * nchanges cannot result in a
net increase in the number of current and potential unbound events,
since it allows the kernel to return (and forget) as many unbound events
as it may add (nchanges entries are required for EV_ERROR leaving
nchanges for returning other events).
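
To make the counting argument concrete, here is a minimal sketch in C of the
slot arithmetic only. It is not kernel code: both function names are made up
for illustration, and it assumes the worst case described above, in which
every change consumes one output slot for its EV_ERROR report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: output slots left for real events if every one of
 * the nchanges changes fails and consumes a slot for its EV_ERROR report. */
static int slots_left_for_events(int nchanges, int nevents)
{
    assert(nchanges >= 0 && nevents >= 0);
    return nevents > nchanges ? nevents - nchanges : 0;
}

/* Hypothetical helper: true when the call leaves room to return (and let
 * the kernel forget) at least as many unbound events as it may add. */
static bool has_room_to_drain(int nchanges, int nevents)
{
    return slots_left_for_events(nchanges, nevents) >= nchanges;
}
```

With nchanges = 4, an event list of 8 entries satisfies the condition and
one of 7 entries does not.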

>   It is not required that these two flags have distinct values; since
>   one is userland->kernel and the other kernel->userland, they could for
>   neatness reuse the same bit field.

I think it would be consistent with other EV_* to use the same name and
value for both.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Managing userland data pointers in kqueue/kevent

2013-05-18 Thread Dirk Engling
On 13.05.13 11:19, Eugen-Andrei Gavriloaie wrote:

> Regardless, it has all the ingredients for memory leaks and/or, the
> worst one, use of corpse pointers which are bound to crash the app. I

I earlier pointed out other things that prevent me from using kqueue as
a proper storage for user land pointers:

http://lists.freebsd.org/pipermail/freebsd-hackers/2013-March/042181.html

To sum it up, once I pass a pointer via udata to the kqueue system,
kqueue becomes the owner and the expected behaviour is

  (a) never silently swallow the pointer
  (b) provide an interface to retrieve the pointer anytime

Besides the way you pointed out to violate (a), there is another way,
re-adding an existing event. So I propose a flag EV_EXCL that will fail
adding an existing event with EEXIST in data.

As an alternative or in addition to that, re-adding an existing event
without the EV_EXCL flag will cause the old event to be returned with
the EV_DROPPED mechanism proposed.
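
As a rough illustration of the two behaviours proposed above, here is a toy
userland model in C. It does not use the real kevent() interface; the table,
the watch_add() function and its excl and dropped parameters are all
hypothetical names standing in for EV_ADD, EV_EXCL and the EV_DROPPED
hand-back.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_WATCHES 8            /* arbitrary size for this toy table */

struct watch { int used; int ident; int filter; void *udata; };
struct watch_table { struct watch w[MAX_WATCHES]; };

/*
 * Toy model of EV_ADD. Returns 0 on success, EEXIST if excl is set and
 * (ident, filter) is already registered, ENOMEM if the table is full.
 * On a non-exclusive re-add, the replaced udata is handed back through
 * *dropped instead of being silently overwritten.
 */
static int watch_add(struct watch_table *t, int ident, int filter,
                     void *udata, int excl, void **dropped)
{
    struct watch *free_slot = NULL;

    assert(t != NULL && dropped != NULL);
    *dropped = NULL;
    for (size_t i = 0; i < MAX_WATCHES; i++) {
        struct watch *w = &t->w[i];
        if (!w->used) {
            if (free_slot == NULL)
                free_slot = w;      /* remember the first free slot */
            continue;
        }
        if (w->ident == ident && w->filter == filter) {
            if (excl)
                return EEXIST;      /* proposed EV_EXCL behaviour */
            *dropped = w->udata;    /* proposed EV_DROPPED hand-back */
            w->udata = udata;
            return 0;
        }
    }
    if (free_slot == NULL)
        return ENOMEM;
    *free_slot = (struct watch){ 1, ident, filter, udata };
    return 0;
}
```

Either way the caller learns about the old pointer: with excl set the add
fails and nothing is replaced, and without it the old udata comes back so it
can be released.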

This can also be used to fulfill property (b). kqueue is an efficient
store for the per-event-data. In an event-based application, I usually
allocate resources per new session, pass the metadata via udata to
kevent and will see the udata pointer whenever the event triggers.

But in order to clean up resources due to external events (like program
termination), I have no way to ask my kqueue for the kevents (and thus
the udata) I stored in them ... without knowing about them in the first
place.

For the program termination case, it would be enough to extend EV_DELETE
with a flag to drop all filters and catch them by checking for the
EV_DROPPED flag. This means that EV_DELETE could return a list of
filters and can be called several times until no events are returned.

Of course this can be extended to specifically drop events that match a
certain filter, flag or fflag value, but for now you can also use
different kqueues.
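
The shutdown case could look roughly like the loop below. This is only a
model in C: toy_queue and drop_all() are hypothetical stand-ins for a kqueue
and for the proposed "EV_DELETE everything" call, and the sketch assumes
every stored udata pointer came from malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define QUEUE_SLOTS 16           /* arbitrary size for this toy queue */

/* Toy stand-in for the set of registered events; NULL marks a free slot. */
struct toy_queue { void *udata[QUEUE_SLOTS]; };

/*
 * Model of the proposed call: moves up to nout stored udata pointers into
 * out[], forgets them, and returns how many it moved.
 */
static size_t drop_all(struct toy_queue *q, void **out, size_t nout)
{
    size_t n = 0;

    assert(q != NULL && out != NULL);
    for (size_t i = 0; i < QUEUE_SLOTS && n < nout; i++) {
        if (q->udata[i] != NULL) {
            out[n++] = q->udata[i];
            q->udata[i] = NULL;
        }
    }
    return n;
}

/* Shutdown loop: call until nothing comes back, freeing each pointer. */
static size_t drain_and_free(struct toy_queue *q)
{
    void *batch[4];
    size_t total = 0, n;

    while ((n = drop_all(q, batch, 4)) > 0) {
        for (size_t i = 0; i < n; i++)
            free(batch[i]);
        total += n;
    }
    return total;
}
```

The point of the loop is that the application never needs its own list of
registered events; the queue itself hands every pointer back.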

Regards,

  erdgeist


Re: Managing userland data pointers in kqueue/kevent

2013-05-15 Thread LeoNerd
On Wed, 15 May 2013 13:29:59 +0100
Paul "LeoNerd" Evans  wrote:

> Is that not the exact thing I suggested?
> 
> The "extension to register a kevent to catch these events" is
> that you put the EV_DROPWATCH bit flag in the event at the time you
> register it.
> 
> The "returned event [that] could have all the appropriate information
> for the event being dropped" is that you receive an event with
> EV_DROPPED set on it. It being a real event includes of course the
> udata pointer, so you can handle it.

In fact, to requote the original PR I wrote[1] on the subject:

---

  I propose the addition of a new flag applicable to any kevent watch
  structure, documented thusly:

The flags field can contain the following values:
..
EV_DROPWATCH Requests that the kernel will send an EV_DROPPED event
 on this watch when it has finished watching it for any
 reason, including EV_DELETE, expiry because of
 EV_ONESHOT, or because the filehandle was closed by
 close(2).

EV_DROPPED   This flag is returned by the kernel if it is now about
 to drop the watch. After this flag has been received,
 no further events will occur on this watch.

  This flag then makes it trivial to build a generic wrapper for kqueue
  that can always manage its memory correctly.

  a) at EV_ADD time, simply set flags |= EV_DROPWATCH

  b) after an event has been processed that included the EV_DROPPED
  flag, free() the pointer given in the udata field.

  It is not required that these two flags have distinct values; since
  one is userland->kernel and the other kernel->userland, they could for
  neatness reuse the same bit field.

---

[1]: http://www.freebsd.org/cgi/query-pr.cgi?pr=153254

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/




Re: Managing userland data pointers in kqueue/kevent

2013-05-15 Thread LeoNerd
On Wed, 15 May 2013 02:14:55 -0400
Julian Elischer  wrote:

> I would suggest that one answer would be to create an extension to 
> register a
> kevent to catch these events..
> 
> (the knote_drop())
> 
> The returned event could have all the appropriate information for the
> event being dropped..

Is that not the exact thing I suggested?

The "extension to register a kevent to catch these events" is
that you put the EV_DROPWATCH bit flag in the event at the time you
register it.

The "returned event [that] could have all the appropriate information
for the event being dropped" is that you receive an event with
EV_DROPPED set on it. It being a real event includes of course the
udata pointer, so you can handle it.

It's really simple to use:

 When you register,

ev->flags |= EV_DROPWATCH;


 When you receive an event, process it in the normal way, then

if(ev->flags & EV_DROPPED)
  free(ev->udata);

and that is all there is to it.
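
Spelled out a little further, the two halves might look like the sketch
below. Neither flag exists in <sys/event.h>, so this is a self-contained
model: the flag values are placeholders, struct toy_kevent is a cut-down
stand-in for struct kevent, and prepare_add() and finish_event() are
hypothetical helper names. It also assumes udata was allocated with malloc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Placeholder values for the proposed flags; they are not real kqueue flags. */
#define EV_DROPWATCH 0x0400
#define EV_DROPPED   0x0800

/* Cut-down stand-in for struct kevent with only the fields used here. */
struct toy_kevent { unsigned short flags; void *udata; };

/* Registration side: ask to be told when the watch goes away. */
static void prepare_add(struct toy_kevent *ev, void *udata)
{
    assert(ev != NULL);
    ev->udata = udata;
    ev->flags |= EV_DROPWATCH;
}

/* Delivery side: call after the normal handler has run. Returns true if
 * this was the final event for the watch and udata has been freed. */
static bool finish_event(struct toy_kevent *ev)
{
    assert(ev != NULL);
    if (ev->flags & EV_DROPPED) {
        free(ev->udata);
        ev->udata = NULL;
        return true;
    }
    return false;
}
```

An ordinary event leaves udata alone; only the event carrying EV_DROPPED
releases it, which is the whole contract.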

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/




Re: Managing userland data pointers in kqueue/kevent

2013-05-14 Thread Julian Elischer

On 5/13/13 2:44 PM, Paul LeoNerd Evans wrote:

> On Mon, 13 May 2013 11:23:45 -0700
> Adrian Chadd  wrote:
>
>> Just as a data point, I managed 50,000 + connections, at 5,000 + a
>> second, doing a gigabit + of traffic, mid-2000s, with the userland
>> management of all of the socket/disk FD stuff.
>>
>> The biggest overhead at the time was actually the read/write
>> copyin/copyout, NOT the locking overhead of managing this stuff. Why?
>> Because I architected the HTTP side of things to specifically pin FDs
>> to threads, and not allow arbitrary threads to deal with arbitrary
>> FDs. This removed the need for almost all of the state locking that
>> people are concerned about here.
>
> I think then this comes from different experiences.
>
> I'm guessing this application was:
>
>   a) Written in C
>   b) Entirely filled with identically-typed identical-purpose file
>      descriptors
>   c) Didn't really use any EV_ONESHOT events
>   d) Didn't close sockets apart from when it received EOF
> and perhaps most importantly
>   e) Was entirely self-contained - did everything from one unified
>      block of source code.
>
> I.e. a very simple set of semantics. I'll explain the situation that I
> had.
>
> The reason I ran into the problem needing EV_DROPWATCH/EV_DROPPED was
> because I was trying to fix Perl's IO::KQueue.
>
> IO::KQueue tries to wrap kqueue/kevent for Perl, allowing the userland
> Perl code to store an arbitrary Perl data pointer in the udata field.
> This data is reference-counted. Userland might let the kernel store the
> only copy of that data, because it comes back in event notifications
> anyway. Because of this, the reference count has to be artificially
> incremented to account for the extra pointer in the kernel. Without
> knowing when the kernel will decide to drop that pointer, I never know
> when I should decrement the refcount myself.
>
> It has no knowledge of what userland is doing with this. It can't know
> when userland might be EV_ONESHOT'ing. It doesn't really know what
> events will be oneshot anyway (such as the process exit watches).
> Finally, it has no idea what other modules are going to call close() on
> it. This final problem was the real killer - while the first two
> -could- be worked around with more complex code structures, not knowing
> what other CPAN modules will ever call close() makes it impossible to
> handle. Simply asking every CPAN module to "please just call fd_close()
> instead of close()" doesn't work here.
>
> As compared: having the kernel tell userland when it calls knote_drop()
> is much simpler. It knows exactly when it is doing this, so simply
> pushing an event up to userland to tell it it did so is simple. If any
> more cases than the three known (EV_ONESHOT or other single-shot events;
> EV_DELETE, close()) are added, userland - and in particular, the
> IO::KQueue module, will not need updating. It will continue to
> decrement refcounts and free data perfectly happily when kernel has
> dropped the watch.
>
> I've used this pattern before in C libraries + higher-level language
> wrappers, and found it to be nicely simple to both implement and use.
> Because it follows the -same- event notification path that userland is
> already using, it manages to avoid quite a number of the
> race-conditions that a secondary, separate data structure and locking
> often runs into; e.g. if userland is trying to add a new thing into it
> just at the time there's a notification "in-flight" from the kernel
> about an old thing that it used to have.
>
> Principally - the fact that kernel tells -userland- about the delete,
> means that it can atomically *guarantee* that this *will* be the last
> event about this particular item. Userland must not delete its own data
> structure about it until this notification happens. If it does this,
> lots of semantics become a lot simpler.


I was responsible for the udata field. It was not in the original
design that was proposed and I suggested it to Jonathan. I was thinking
purely of a simple way for an event to supply added information to its
handler that would obviate the need for the app to keep complicated
tracking structures. I was not thinking in terms of "badly behaved"
(sic) third party high level ops using it through a language binding.

I admit that I did not think about the close issue at that time.

Your suggested changes are not unreasonable; however, we could do with
more discussion. The point about tracking objects that may be
arbitrarily destroyed without the framework being notified is valid and
aligns well with general robustness principles.

I would suggest that one answer would be to create an extension to
register a kevent to catch these events..

(the knote_drop())

The returned event could have all the appropriate information for the
event being dropped..







Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread LeoNerd
On Mon, 13 May 2013 11:23:45 -0700
Adrian Chadd  wrote:

> Just as a data point, I managed 50,000 + connections, at 5,000 + a
> second, doing a gigabit + of traffic, mid-2000s, with the userland
> management of all of the socket/disk FD stuff.
> 
> The biggest overhead at the time was actually the read/write
> copyin/copyout, NOT the locking overhead of managing this stuff. Why?
> Because I architected the HTTP side of things to specifically pin FDs
> to threads, and not allow arbitrary threads to deal with arbitrary
> FDs. This removed the need for almost all of the state locking that
> people are concerned about here.

I think then this comes from different experiences.

I'm guessing this application was:

  a) Written in C
  b) Entirely filled with identically-typed identical-purpose file
 descriptors
  c) Didn't really use any EV_ONESHOT events
  d) Didn't close sockets apart from when it received EOF
and perhaps most importantly
  e) Was entirely self-contained - did everything from one unified
 block of source code.

I.e. a very simple set of semantics. I'll explain the situation that I
had.

The reason I ran into the problem needing EV_DROPWATCH/EV_DROPPED was
because I was trying to fix Perl's IO::KQueue.

IO::KQueue tries to wrap kqueue/kevent for Perl, allowing the userland
Perl code to store an arbitrary Perl data pointer in the udata field.
This data is reference-counted. Userland might let the kernel store the
only copy of that data, because it comes back in event notifications
anyway. Because of this, the reference count has to be artificially
incremented to account for the extra pointer in the kernel. Without
knowing when the kernel will decide to drop that pointer, I never know
when I should decrement the refcount myself.
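
The bookkeeping described here can be sketched in plain C. This is a model
only, not IO::KQueue's actual code: struct payload stands in for a
reference-counted Perl value, and payload_new(), payload_lend_to_kernel()
and payload_unref() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted payload, standing in for a Perl value. */
struct payload { int refs; };

static struct payload *payload_new(void)
{
    struct payload *p = malloc(sizeof *p);
    if (p != NULL)
        p->refs = 1;            /* the caller's own reference */
    return p;
}

/* Called before handing the pointer to the kernel as udata. */
static void *payload_lend_to_kernel(struct payload *p)
{
    assert(p != NULL && p->refs > 0);
    p->refs++;                  /* account for the copy the kernel holds */
    return p;
}

/* Drops one reference; frees on the last one. Returns the remaining count. */
static int payload_unref(struct payload *p)
{
    assert(p != NULL && p->refs > 0);
    int left = --p->refs;
    if (left == 0)
        free(p);
    return left;
}
```

The increment is easy to place. The missing piece is a reliable moment at
which to make the matching payload_unref() call for the kernel's copy.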

It has no knowledge of what userland is doing with this. It can't know
when userland might be EV_ONESHOT'ing. It doesn't really know what
events will be oneshot anyway (such as the process exit watches).
Finally, it has no idea what other modules are going to call close() on
it. This final problem was the real killer - while the first two
-could- be worked around with more complex code structures, not knowing
what other CPAN modules will ever call close() makes it impossible to
handle. Simply asking every CPAN module to "please just call fd_close()
instead of close()" doesn't work here.

As compared: having the kernel tell userland when it calls knote_drop()
is much simpler. It knows exactly when it is doing this, so simply
pushing an event up to userland to tell it it did so is simple. If any
more cases than the three known (EV_ONESHOT or other single-shot events;
EV_DELETE, close()) are added, userland - and in particular, the
IO::KQueue module, will not need updating. It will continue to
decrement refcounts and free data perfectly happily when kernel has
dropped the watch.

I've used this pattern before in C libraries + higher-level language
wrappers, and found it to be nicely simple to both implement and use.
Because it follows the -same- event notification path that userland is
already using, it manages to avoid quite a number of the
race-conditions that a secondary, separate data structure and locking
often runs into; e.g. if userland is trying to add a new thing into it
just at the time there's a notification "in-flight" from the kernel
about an old thing that it used to have.

Principally - the fact that kernel tells -userland- about the delete,
means that it can atomically *guarantee* that this *will* be the last
event about this particular item. Userland must not delete its own data
structure about it until this notification happens. If it does this,
lots of semantics become a lot simpler.

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/




Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread LeoNerd
On Mon, 13 May 2013 11:10:44 -0700
Adrian Chadd  wrote:

> ... also, want to code up a test implementation?
> 
> And some stress testing cases to throw in the regression tree?

I already mostly fixed Perl's IO::KQueue wrapper to use this
hypothetical feature, I can easily provide that somewhere for someone
to test it against. I actually wrote that bit first, before I found
such a feature did not exist.

That would allow some highly-parallel Perl code to use it. All the main
Perl event systems can use IO::KQueue so that easily provides a lot of
good test cases.

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/




Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread Adrian Chadd
Just as a data point, I managed 50,000 + connections, at 5,000 + a
second, doing a gigabit + of traffic, mid-2000s, with the userland
management of all of the socket/disk FD stuff.

The biggest overhead at the time was actually the read/write
copyin/copyout, NOT the locking overhead of managing this stuff. Why?
Because I architected the HTTP side of things to specifically pin FDs
to threads, and not allow arbitrary threads to deal with arbitrary
FDs. This removed the need for almost all of the state locking that
people are concerned about here.
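
One way to picture pinning FDs to threads is the small C sketch below. The
message above does not say how descriptors were assigned, so the modulo
policy and the names owner_of() and may_touch() are assumptions made purely
for illustration.

```c
#include <assert.h>

/* Assumed policy for illustration: the owning worker is fixed by the
 * descriptor number, so no shared table or lock is needed to look it up. */
static unsigned owner_of(int fd, unsigned nworkers)
{
    assert(fd >= 0 && nworkers > 0);
    return (unsigned)fd % nworkers;
}

/* A worker only touches descriptors it owns. */
static int may_touch(unsigned worker, int fd, unsigned nworkers)
{
    return owner_of(fd, nworkers) == worker;
}
```

Under this policy a reused descriptor number always lands on the same
worker, so per-FD state never needs a cross-thread lock.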



Adrian


Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread Eugen-Andrei Gavriloaie
I also have a large project (crtmpserver) which makes heavy use of socket FDs 
(with my little Token workaround) and timers. Currently, it can handle 2k 
streaming connections simultaneously, all of them full duplex. 

I would gladly patch it to use this new feature!!!

--
Eugen-Andrei Gavriloaie
Web: http://www.rtmpd.com

On May 13, 2013, at 9:15 PM, Paul "LeoNerd" Evans  
wrote:

> On Mon, 13 May 2013 11:10:44 -0700
> Adrian Chadd  wrote:
> 
>> ... also, want to code up a test implementation?
>> 
>> And some stress testing cases to throw in the regression tree?
> 
> I already mostly fixed Perl's IO::KQueue wrapper to use this
> hypothetical feature, I can easily provide that somewhere for someone
> to test it against. I actually wrote that bit first, before I found
> such a feature did not exist.
> 
> That would allow some highly-parallel Perl code to use it. All the main
> Perl event systems can use IO::KQueue so that easily provides a lot of
> good test cases.
> 
> -- 
> Paul "LeoNerd" Evans
> 
> leon...@leonerd.org.uk
> ICQ# 4135350   |  Registered Linux# 179460
> http://www.leonerd.org.uk/





Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread Eugen-Andrei Gavriloaie

--
Eugen-Andrei Gavriloaie
Web: http://www.rtmpd.com

On May 13, 2013, at 9:02 PM, Adrian Chadd  wrote:

> Hi,
> 
> The reason I tend to suggest this is for portability and debugging
> reasons. (Before and even since libevent came into existence.)
> 
> If you do it right, you can stub / inline out all of the wrapper
> functions in userland and translate them to straight system or library
> calls.
> 
> Anyway. I'm all for making kqueue better. I just worry that adding
> little hacks here and there isn't the right way to do it. If you want
> to guarantee specific behaviours with kqueue, you should likely define
> how it should work in its entirety and see if it will cause
> architectural difficulties down the track.
And it caused some so far. We have workarounds for it, no problem.

> Until that is done, I think
> you have no excuse to get your code working as needed.
Yes. I agree. But when I look at the user space code without that feature, and 
when thinking how it would have been with that feature, it kinda makes me cry. 
A little more pain and I will make that patch myself. I'm just hoping that kq 
kernel side code will be handled by more capable hands before me. Ideally, by 
the creators.

> 
> Don't blame kqueue because what (iirc) is not defined behaviour isn't
> defined in a way that makes you happy :)
Nobody blamed kqueue. I'm just saying that I (and I'm not the only one) could 
use a little more help from it. It was born from needs, it evolved because of 
needs, why stop now? I dare to say it will become a standard on Linux and 
other OSs very soon. It is the best fd reactor. Hands down!

Best regards,
Andrei

> 
> 
> 
> Adrian
> 
> On 13 May 2013 09:36, Eugen-Andrei Gavriloaie  wrote:
>> Hi Adrian,
>> 
>> All the tricks, workarounds, paradigms suggested/implemented by us, the kq 
>> users, are greatly simplified by simply adding that thing that Paul is 
>> suggesting. What you are saying here is to basically do not-so-natural 
>> things to overcome a real problem which can be very easily and 
>> non-intrusively solved at lower levels. Seriously, if you truly believe 
>> that you can put the equal sign between the complexity of the user space 
>> code and the wanted patch in kqueue kernel side, then I simply shut up.
>> 
>> Besides, one of the important points in kq philosophy is simplifying things. 
>> I underline the "one of". It is not the goal, of course. Complex things are 
>> complex things no matter how hard you try to simplify them. But this 
>> definitely does not (and should not) fall into that category.
>> 
>> --
>> Eugen-Andrei Gavriloaie
>> Web: http://www.rtmpd.com
>> 
>> On May 13, 2013, at 6:47 PM, Adrian Chadd  wrote:
>> 
>>> ... holy crap.
>>> 
>>> On 13 May 2013 08:37, Eugen-Andrei Gavriloaie  wrote:
>>>> Hi,
>>>> 
>>>> Well, Paul already asked this question like 3-4 times now. Even insisting 
>>>> on it. I will also ask it again:
>>>> If user code is responsible for tracking down the data associated with the 
>>>> signalled entity, what is the point of having user data?
>>>> It is rendered completely useless…
>>> 
>>> .. why does everything have to have a well defined purpose that is
>>> also suited for use in _all_ situations?
>> That is called perfection. I know we can't achieve it, but I like to walk in 
>> that direction at least.
>> 
>>> 
>>>> Not to mention, that your suggestion with FD index is a definite no-go. 
>>>> The FD values are re-used. Especially in MT environments. Imagine one 
>>>> kqueue call taking place in thread A and another one in thread B. Both 
>>>> threads waiting for events.
>>> 
>>> .. so don't do that. I mean, you're already having to write your code
>>> to _not_ touch FDs in other threads. I've done this before, it isn't
>>> that hard and it doesn't hurt performance.
>> Why not? This is how you achieve natural load balancing for multiple 
>> kevent() calls from multiple threads over the same kq fd. Otherwise, again, 
>> you have to write complex code to manually balance the threads. That brings 
>> locking again….
>> Why do people always think that locking is cheap? Excessive locking hurts. A 
>> lot!
>> 
>>> 
>>>> When A does his magic, because of internal business rules, it decides to 
>>>> close FD number 123. It closes it and it connects somewhere else by 
>>>> opening a new one. Surprise, we MAY get the value 123 again as a new 
>>>> socket, we put it on our index, etc. Now, thread B comes in and it has 
>>>> stale/old events for the old 123 FD. Something bad like EOF for the OLD 
>>>> version of FD number 123 (the one we just closed anyway). Guess what… 
>>>> thread B will deallocate the perfectly good thingy inside the index 
>>>> associated with 123.
>>> 
>>> So you just ensure that nothing at all calls a close(123); but calls
>>> fd_close(123) which will in turn close(123) and free all the state
>>> associated with it.
>> Once threads A and B returned from their kevent() calls, all bets are off. 
>> In between, yo

Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread Adrian Chadd
... also, want to code up a test implementation?

And some stress testing cases to throw in the regression tree?

I'll help shepherd this in if this all works out.

thanks,



Adrian


Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread Adrian Chadd
On 13 May 2013 10:53, Paul LeoNerd  wrote:
> [I'm not currently on the list so please forgive the manually-crafted
> reply]
>
>> I'm confused as to why this is still an issue. Sure, fix the kqueue
>> semantics and do it in a way that doesn't break backwards
>> compatibility.
>
> I suggested that. Add a user->kernel flag
>
>   EV_DROPWATCH
>
> which, if present, causes kernel to send back to userland events with
> the kernel->user flag
>
>   EV_DROPPING
>
> any time it drops the pointer. Then trivially userland just has to set
> that flag on all its events to the kernel, and the kernel has to remember
> to send those events back to userland when it does in fact drop them.

Cool!

Ok. I'll go bring this up at bsdcan and see what people think. I
haven't been knee deep in this stuff for a few years (but am about to
again, damned HTTP proxies!)  and I would love to have better
semantics here.

I just want to make sure it doesn't cause weird things for the
non-socket case - ie, files (local, NFS) and signals.




Adrian


Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread Adrian Chadd
Hi,

The reason I tend to suggest this is for portability and debugging
reasons. (Before and even since libevent came into existence.)

If you do it right, you can stub / inline out all of the wrapper
functions in userland and translate them to straight system or library
calls.

Anyway. I'm all for making kqueue better. I just worry that adding
little hacks here and there isn't the right way to do it. If you want
to guarantee specific behaviours with kqueue, you should likely define
how it should work in its entirety and see if it will cause
architectural difficulties down the track. Until that is done, I think
you have no excuse to get your code working as needed.

Don't blame kqueue because what (iirc) is not defined behaviour isn't
defined in a way that makes you happy :)



Adrian

On 13 May 2013 09:36, Eugen-Andrei Gavriloaie  wrote:
> Hi Adrian,
>
> All the tricks, workarounds, paradigms suggested/implemented by us, the kq 
> users, are greatly simplified by simply adding that thing that Paul is 
> suggesting. What you are saying here is to basically do not-so-natural things 
> to overcome a real problem which can be very easily and non-intrusively solved 
> at lower levels. Seriously, if you truly believe that you can put the equal 
> sign between the complexity of the user space code and the wanted patch in 
> kqueue kernel side, then I simply shut up.
>
> Besides, one of the important points in kq philosophy is simplifying things. 
> I underline the "one of". It is not the goal, of course. Complex things are 
> complex things no matter how hard you try to simplify them. But this 
> definitely does not (and should not) fall into that category.
>
> --
> Eugen-Andrei Gavriloaie
> Web: http://www.rtmpd.com
>
> On May 13, 2013, at 6:47 PM, Adrian Chadd  wrote:
>
>> ... holy crap.
>>
>> On 13 May 2013 08:37, Eugen-Andrei Gavriloaie  wrote:
>>> Hi,
>>>
>>> Well, Paul already asked this question like 3-4 times now. Even insisting 
>>> on it. I will also ask it again:
>>> If user code is responsible for tracking down the data associated with the 
>>> signalled entity, what is the point of having user data?
>>> It is rendered completely useless…
>>
>> .. why does everything have to have a well defined purpose that is
>> also suited for use in _all_ situations?
> That is called perfection. I know we can't achieve it, but I like to walk in 
> that direction at least.
>
>>
>>> Not to mention, that your suggestion with FD index is a definite no-go. The 
>>> FD values are re-used. Especially in MT environments. Imagine one kqueue 
>>> call taking place in thread A and another one in thread B. Both threads 
>>> waiting for events.
>>
>> .. so don't do that. I mean, you're already having to write your code
>> to _not_ touch FDs in other threads. I've done this before, it isn't
>> that hard and it doesn't hurt performance.
> Why not? This is how you achieve natural load balancing for multiple kevent() 
> calls from multiple threads over the same kq fd. Otherwise, again, you have 
> to write complex code to manually balance the threads. That brings locking 
> again….
> Why do people always think that locking is cheap? Excessive locking hurts. A lot!
>
>>
>>> When A does his magic, because of internal business rules, it decides to 
>>> close FD number 123. It closes it and it connects somewhere else by opening 
>>> a new one. Surprise, we MAY get the value 123 again as a new socket, we 
>>> put it on our index, etc. Now, thread B comes in and it has stale/old 
>>> events for the old 123 FD. Something bad like EOF for the OLD version of 
>>> FD number 123 (the one we just closed anyway). Guess what… thread B will 
>>> deallocate the perfectly good thingy inside the index associated with 123.
>>
>> So you just ensure that nothing at all calls a close(123); but calls
>> fd_close(123) which will in turn close(123) and free all the state
>> associated with it.
> Once threads A and B returned from their kevent() calls, all bets are off. In 
> between, you get the behaviour I just described from threads A and B 
> racing towards FD123 to either close it or create a new one. How is wrapping 
> close() going to help? It is not like you have any control over what the 
> socket() function is going to return. (That gave me another token idea btw… I 
> will explain in another email, perhaps you care to comment)
> Mathematically speaking, the fd-to-data association is not bijective.
>
>
>>
>> You have fd_close() either grab a lock, or you ensure that only the
>> owning thread can call fd_close(123) and if any other thread calls it,
>> the behaviour is undefined.
> As I said, that adds up to the user-space code complexity. Just don't forget 
> that Paul's suggestion solves all this problems in a ridiculously simple 
> manner. All our ideas of keeping track who is owning who and indexes are 
> going to be put to rest. kq will notify us when the udata is out of scope 
> from kq perspective. That is all we ask.
>
>>
>>> And regarding the "thre

Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread LeoNerd
[I'm not currently on the list so please forgive the manually-crafted
reply]

> I'm confused as to why this is still an issue. Sure, fix the kqueue
> semantics and do it in a way that doesn't break backwards
> compatibility.

I suggested that. Add a user->kernel flag

  EV_DROPWATCH

which, if present, causes kernel to send back to userland events with
the kernel->user flag

  EV_DROPPING

any time it drops the pointer. Then trivially userland just has to set
that flag on all its events to the kernel, and the kernel has to remember
to send those events back to userland when it does in fact drop them.

These events can be trivially created from the knote_drop() function:

  http://fxr.watson.org/fxr/source/kern/kern_event.c#L2127

because that's called everywhere in the kernel that actually drops
the watch.

Which brings me onto the main reason why: It becomes a lot simpler to
write userland code. When I wrote the original idea 3 years ago, it
was after some research into what reasons would drop these watches in
the kernel. By having the kernel tell userland, that future-proofs it
a lot better.

To further answer the threading questions: Having the locking point
decided by the kernel and the event reflected back up to userland
still with the pointer that kernel had simplifies all this locking.
Now, userland doesn't have to contend on a Big Structure Lock around
whatever data structure it uses to store all this information. It
allows less userland contention.

It's fully back-compatible, because all it does is adds a new
user->kernel flag, that if the userland didn't know about, wouldn't
set, and no behaviour is changed. If the flag -is- set then userland
simply starts receiving a few extra events, or has another bit flag set
on the events it was already receiving.


Finally: I feel quite sure this feature is implementable in ballpark-50
lines of kernel-side code. I'd half-bet the documentation would be
longer than that. It is truly a tiny addition of behaviour to export
information the kernel already knows (namely: that it is calling
knote_drop()). I can't see any objection to it. I'm quite sure more
words and objection have been spent arguing it back and forth than it
would have taken just to implement it initially.

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/


signature.asc
Description: PGP signature


Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread LeoNerd
On Mon, 13 May 2013 18:19:43 +0300
Eugen-Andrei Gavriloaie  wrote:

> I'm pretty sure this user data pointer is also breaking a well known
> pointer management paradigm, but I just can't remember which.
> Regardless, it has all the ingredients for memory leaks and/or, the
> worst one, use of corpse pointers which are bound to crash the app. I
> agree, C/C++ is not for the faint of heart, but with little or close
> to no efforts, his EV_FREEWATCH can be put to very good use, and user
> space code not only becomes less prone to mem issues, but also
> cleaner.
> 
> To summarise, +1 for the EV_FREEWATCH. I simply love the idea! Clean
> and very very efficient.

I actually developed the idea a little further and put some notes on
implementation/etc here in this PR:

  http://www.freebsd.org/cgi/query-pr.cgi?pr=153254

I don't think anyone has looked at it though.

If anyone were to just say "yes" and explain how to start developing a
kernel feature, I'm sure I'd be happy to look into it.

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/




Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread Eugen-Andrei Gavriloaie
Hi Adrian,

All the tricks, workarounds and paradigms suggested or implemented by us, the kq 
users, are greatly simplified by simply adding the thing that Paul is 
suggesting. What you are saying here is basically to do not-so-natural things 
to overcome a real problem which can be very easily and non-intrusively solved at 
lower levels. Seriously, if you truly believe that you can put an equals sign 
between the complexity of the user space code and the wanted patch on the kqueue 
kernel side, then I will simply shut up.

Besides, one of the important points in kq philosophy is simplifying things. I 
underline the "one of". It is not the goal, of course. Complex things are 
complex things no matter how hard you try to simplify them. But this 
definitely does not (or should not) fall into that category.

--
Eugen-Andrei Gavriloaie
Web: http://www.rtmpd.com

On May 13, 2013, at 6:47 PM, Adrian Chadd  wrote:

> ... holy crap.
> 
> On 13 May 2013 08:37, Eugen-Andrei Gavriloaie  wrote:
>> Hi,
>> 
>> Well, Paul already asked this question like 3-4 times now. Even insisting on 
>> it. I will also ask it again:
>> If user code is responsible of tracking down the data associated with the 
>> signalled entity, what is the point of having user data?
>> Is rendered completely useless…
> 
> .. why does everything have to have a well defined purpose that is
> also suited for use in _all_ situations?
That is called perfection. I know we can't achieve it, but I like to walk in 
that direction at least.

> 
>> Not to mention, that your suggestion with FD index is a definite no-go. The 
>> FD values are re-used. Especially in MT environments. Imagine one kqueue 
>> call taking place in thread A and another one in thread B. Both threads 
>> waiting for events.
> 
> .. so don't do that. I mean, you're already having to write your code
> to _not_ touch FDs in other threads. I've done this before, it isn't
> that hard and it doesn't hurt performance.
Why not? This is how you achieve natural load balancing for multiple kevent() 
calls from multiple threads over the same kq fd. Otherwise, again, you have to 
write complex code to manually balance the threads. That brings locking again….
Why people always think that locking is cheap? Excessive locking hurts. A lot!

> 
>> When A does his magic, because of internal business rules, it decides to 
>> close FD number 123. It closes it and it connects somewhere else by opening 
>> a new one. Surprise, we MAY  get the value 123 again as a new socket, we put 
>> it on our index, etc. Now, thread B comes in and it has stale/old events for 
>> the old 123 FD. Somethings bad like EOF for the OLD version of FD number 123 
>> (the one we just closed anyway). Guess what… thread B will deallocate the 
>> perfectly good thingy inside the index associated with 123.
> 
> So you just ensure that nothing at all calls a close(123); but calls
> fd_close(123) which will in turn close(123) and free all the state
> associated with it.
Once threads A and B returned from their kevent() calls, all bets are off. In 
between, you get the behaviour I just described from threads A and B racing 
towards FD123 to either close it or create a new one. How is wrapping close() 
going to help? Is not like you have any control over what the socket() function 
is going to return. (That gave me another token idea btw… I will explain in 
another email, perhaps you care to comment)
Mathematically speaking, the fd-to-data association is not bijective. 


> 
> You have fd_close() either grab a lock, or you ensure that only the
> owning thread can call fd_close(123) and if any other thread calls it,
> the behaviour is undefined.
As I said, that adds up to the user-space code complexity. Just don't forget 
that Paul's suggestion solves all this problems in a ridiculously simple 
manner. All our ideas of keeping track who is owning who and indexes are going 
to be put to rest. kq will notify us when the udata is out of scope from kq 
perspective. That is all we ask.

> 
>> And regarding the "thread happiness", that is not happiness at all IMHO…
> 
> Unless you're writing a high connection throughput web server, the
> overhead of grabbing a lock in userland during the fd shutdown process
> is trivial. Yes, I've written those. It doesn't hurt you that much.
That "that much" is subjective. And a streaming server is a few orders of 
magnitude more complex than a web server. Remember, a web server is bound to 
request/response paradigm. While a streaming server is a full duplex (not 
request/response based) animal for most of connections. I strongly believe that 
becomes a real problem. (I would love to be wrong on this one!)

> 
> I'm confused as to why this is still an issue. Sure, fix the kqueue
> semantics and do it in a way that doesn't break backwards
> compatibility.
Then, if someone has the time and inclination, it would be nice to have it. It is a neat 
solution. It is one thing to say, hey, we don't have time, do it yourself. And 
another thing in

Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread Adrian Chadd
... holy crap.

On 13 May 2013 08:37, Eugen-Andrei Gavriloaie  wrote:
> Hi,
>
> Well, Paul already asked this question like 3-4 times now. Even insisting on 
> it. I will also ask it again:
> If user code is responsible of tracking down the data associated with the 
> signalled entity, what is the point of having user data?
> Is rendered completely useless…

.. why does everything have to have a well defined purpose that is
also suited for use in _all_ situations?

> Not to mention, that your suggestion with FD index is a definite no-go. The 
> FD values are re-used. Especially in MT environments. Imagine one kqueue call 
> taking place in thread A and another one in thread B. Both threads waiting 
> for events.

.. so don't do that. I mean, you're already having to write your code
to _not_ touch FDs in other threads. I've done this before, it isn't
that hard and it doesn't hurt performance.

> When A does his magic, because of internal business rules, it decides to 
> close FD number 123. It closes it and it connects somewhere else by opening a 
> new one. Surprise, we MAY  get the value 123 again as a new socket, we put it 
> on our index, etc. Now, thread B comes in and it has stale/old events for the 
> old 123 FD. Somethings bad like EOF for the OLD version of FD number 123 (the 
> one we just closed anyway). Guess what… thread B will deallocate the 
> perfectly good thingy inside the index associated with 123.

So you just ensure that nothing at all calls a close(123); but calls
fd_close(123) which will in turn close(123) and free all the state
associated with it.

You have fd_close() either grab a lock, or you ensure that only the
owning thread can call fd_close(123) and if any other thread calls it,
the behaviour is undefined.
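
A minimal sketch of the fd_close() wrapper described above, assuming a
fixed-size table indexed by descriptor number and one global mutex. The names
fd_state, fd_table, fd_attach, fd_lookup and MAX_FDS are hypothetical and only
illustrate the idea; close(2) and the pthread mutex calls are the only real
APIs used.

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_FDS 1024                    /* hypothetical table size */

struct fd_state { long bytes_seen; };   /* hypothetical per-descriptor state */

static struct fd_state *fd_table[MAX_FDS];
static pthread_mutex_t fd_lock = PTHREAD_MUTEX_INITIALIZER;

/* Attach fresh state to fd; returns NULL if fd is out of range or on OOM. */
struct fd_state *fd_attach(int fd)
{
    struct fd_state *st;

    if (fd < 0 || fd >= MAX_FDS)
        return NULL;
    st = calloc(1, sizeof *st);
    if (st == NULL)
        return NULL;
    pthread_mutex_lock(&fd_lock);
    free(fd_table[fd]);                 /* drop leftover state for this number */
    fd_table[fd] = st;
    pthread_mutex_unlock(&fd_lock);
    return st;
}

/* Return the state currently attached to fd, or NULL if there is none. */
struct fd_state *fd_lookup(int fd)
{
    struct fd_state *st = NULL;

    if (fd >= 0 && fd < MAX_FDS) {
        pthread_mutex_lock(&fd_lock);
        st = fd_table[fd];
        pthread_mutex_unlock(&fd_lock);
    }
    return st;
}

/* Clear the table slot and close the descriptor under the same lock, so the
 * number cannot be re-attached before its old state has been detached. */
int fd_close(int fd)
{
    struct fd_state *st;
    int rc;

    if (fd < 0 || fd >= MAX_FDS)
        return close(fd);
    pthread_mutex_lock(&fd_lock);
    st = fd_table[fd];
    fd_table[fd] = NULL;
    rc = close(fd);
    pthread_mutex_unlock(&fd_lock);
    free(st);
    return rc;
}
```

The sketch only serialises the table against close(); it does nothing about
events for the old descriptor that another thread has already pulled out of
kevent(), and a pointer obtained from fd_lookup() is not protected once the
lock is released.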

> And regarding the "thread happiness", that is not happiness at all IMHO…

Unless you're writing a high connection throughput web server, the
overhead of grabbing a lock in userland during the fd shutdown process
is trivial. Yes, I've written those. It doesn't hurt you that much.

I'm confused as to why this is still an issue. Sure, fix the kqueue
semantics and do it in a way that doesn't break backwards
compatibility. But please don't claim that it's stopping you from
getting real work done. I've written network apps with kqueue that
scale to 8+ cores and (back in the mid-2000s) gigabit+ of small HTTP
transactions. This stuff isn't at all problematic.


Adrian
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread Eugen-Andrei Gavriloaie
Hi,

Well, Paul already asked this question like 3-4 times now. Even insisting on 
it. I will also ask it again:
If user code is responsible of tracking down the data associated with the 
signalled entity, what is the point of having user data?
Is rendered completely useless…

Not to mention, that your suggestion with FD index is a definite no-go. The FD 
values are re-used. Especially in MT environments. Imagine one kqueue call 
taking place in thread A and another one in thread B. Both threads waiting for 
events.

When A does his magic, because of internal business rules, it decides to close 
FD number 123. It closes it and it connects somewhere else by opening a new 
one. Surprise, we MAY  get the value 123 again as a new socket, we put it on 
our index, etc. Now, thread B comes in and it has stale/old events for the old 
123 FD. Somethings bad like EOF for the OLD version of FD number 123 (the one 
we just closed anyway). Guess what… thread B will deallocate the perfectly good 
thingy inside the index associated with 123.

And regarding the "thread happiness", that is not happiness at all IMHO…

Best regards,
Andrei

--
Eugen-Andrei Gavriloaie
Web: http://www.rtmpd.com

On May 13, 2013, at 6:25 PM, Adrian Chadd  wrote:

> ... or you could just track the per-descriptor / per-object stuff in
> userland, and use the FD/signal as an index into the state you need.
> 
> adding thread happiness on top of that is trivial.
> 
> Done/done.
> 
> 
> 
> 
> Adrian
> 
> On 13 May 2013 08:19, Eugen-Andrei Gavriloaie  wrote:
>> Hello to all,
>> 
>> I'm trying to reply to this thread:
>> http://lists.freebsd.org/pipermail/freebsd-hackers/2010-November/033565.html
>> 
>> I also faced this very difficult task of tracking down the user data 
>> registered into kq.
>> I end up having some "Tokens" instances which I never deallocate but always 
>> re-use them or create new ones if necessary. This tokens are used as user 
>> data for kq. They are keeping the actual pointers inside them, and the 
>> pointer itself has a reference to the Token. When the pointer dies, I reset 
>> the guts of the token. When the time comes to use the token, I have the 
>> guarantee is not the corpse of a token (never deallocate them, remember?) 
>> and I can see that the actual pointer was gone, everyone is happy. At the 
>> application shutdown, I cleanup the mess (the tokens). However, I just want 
>> to say that Paul has a valid point when he is wondering why EV_FREEWATCH was 
>> not provisioned/implemented.
>> 
>> The moment we throw multi-threading into equation, this becomes a extremely 
>> hard thing to manage (close to impossible), including my "proven-to-work" 
>> Token trick.  It renders the user data pointer completely useless because in 
>> the end we need to keep an association map inside user space. That is 
>> forcing user space code to do a lookup before use, instead of 
>> straightforward use. Not to mention the fact that we need to perform a lock 
>> before searching it. That brings havoc on kernel side on 1000+ active 
>> connections (a multi-threaded streaming server for example).
>> 
>> I'm pretty sure this user data pointer is also breaking a well known pointer 
>> management paradigm, but I just can't remember which. Regardless, it has all 
>> the ingredients for memory leaks and/or, the worst one, use of corpse 
>> pointers which are bound to crash the app. I agree, C/C++ is not for the 
>> faint of heart, but with little or close to no efforts, his EV_FREEWATCH can 
>> be put to very good use, and user space code not only becomes less prone to 
>> mem issues, but also cleaner.
>> 
>> To summarise, +1 for the EV_FREEWATCH. I simply love the idea! Clean and 
>> very very efficient.
>> 
>> Best regards,
>> Andrei
>> 
>> --
>> Eugen-Andrei Gavriloaie
>> Web: http://www.rtmpd.com
>> 



smime.p7s
Description: S/MIME cryptographic signature


Re: Managing userland data pointers in kqueue/kevent

2013-05-13 Thread Adrian Chadd
... or you could just track the per-descriptor / per-object stuff in
userland, and use the FD/signal as an index into the state you need.

adding thread happiness on top of that is trivial.

Done/done.




Adrian

On 13 May 2013 08:19, Eugen-Andrei Gavriloaie  wrote:
> Hello to all,
>
> I'm trying to reply to this thread:
> http://lists.freebsd.org/pipermail/freebsd-hackers/2010-November/033565.html
>
> I also faced this very difficult task of tracking down the user data 
> registered into kq.
> I end up having some "Tokens" instances which I never deallocate but always 
> re-use them or create new ones if necessary. This tokens are used as user 
> data for kq. They are keeping the actual pointers inside them, and the 
> pointer itself has a reference to the Token. When the pointer dies, I reset 
> the guts of the token. When the time comes to use the token, I have the 
> guarantee is not the corpse of a token (never deallocate them, remember?) and 
> I can see that the actual pointer was gone, everyone is happy. At the 
> application shutdown, I cleanup the mess (the tokens). However, I just want 
> to say that Paul has a valid point when he is wondering why EV_FREEWATCH was 
> not provisioned/implemented.
>
> The moment we throw multi-threading into equation, this becomes a extremely 
> hard thing to manage (close to impossible), including my "proven-to-work" 
> Token trick.  It renders the user data pointer completely useless because in 
> the end we need to keep an association map inside user space. That is forcing 
> user space code to do a lookup before use, instead of straightforward use. 
> Not to mention the fact that we need to perform a lock before searching it. 
> That brings havoc on kernel side on 1000+ active connections (a 
> multi-threaded streaming server for example).
>
> I'm pretty sure this user data pointer is also breaking a well known pointer 
> management paradigm, but I just can't remember which. Regardless, it has all 
> the ingredients for memory leaks and/or, the worst one, use of corpse 
> pointers which are bound to crash the app. I agree, C/C++ is not for the 
> faint of heart, but with little or close to no efforts, his EV_FREEWATCH can 
> be put to very good use, and user space code not only becomes less prone to 
> mem issues, but also cleaner.
>
> To summarise, +1 for the EV_FREEWATCH. I simply love the idea! Clean and very 
> very efficient.
>
> Best regards,
> Andrei
>
> --
> Eugen-Andrei Gavriloaie
> Web: http://www.rtmpd.com
>


Managing userland data pointers in kqueue/kevent

2013-05-13 Thread Eugen-Andrei Gavriloaie
Hello to all,

I'm trying to reply to this thread:
http://lists.freebsd.org/pipermail/freebsd-hackers/2010-November/033565.html

I also faced this very difficult task of tracking down the user data registered 
into kq.
I ended up having some "Token" instances which I never deallocate but always 
re-use, creating new ones if necessary. These tokens are used as user data 
for kq. They keep the actual pointers inside them, and the pointer 
itself has a reference to the Token. When the pointer dies, I reset the guts of 
the token. When the time comes to use the token, I have the guarantee that it is 
not the corpse of a token (never deallocate them, remember?) and I can see that 
the actual pointer is gone, so everyone is happy. At application shutdown, I 
clean up the mess (the tokens). However, I just want to say that Paul has a 
valid point when he wonders why EV_FREEWATCH was not 
provisioned/implemented. 
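
A rough single-threaded sketch of the Token trick described above. The
struct layout and the function names (token_acquire, token_invalidate,
token_recycle, token_target) are assumptions made for illustration, not the
actual implementation; the token pointer is what would be passed as udata.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical token: allocated once, recycled, never freed before shutdown. */
struct token {
    void *target;         /* the live object, or NULL once the object has died */
    struct token *next;   /* free-list link while the token is unused */
};

static struct token *free_list;

/* Take a recycled token if one exists, otherwise allocate a new one. */
struct token *token_acquire(void *target)
{
    struct token *t = free_list;

    if (t != NULL)
        free_list = t->next;
    else if ((t = malloc(sizeof *t)) == NULL)
        return NULL;
    t->target = target;
    t->next = NULL;
    return t;
}

/* The object died: reset the guts but keep the token memory valid. */
void token_invalidate(struct token *t)
{
    t->target = NULL;
}

/* Put the token back for re-use. The caller must only do this once no pending
 * event can still carry the token, which is the hard part of the scheme. */
void token_recycle(struct token *t)
{
    t->target = NULL;
    t->next = free_list;
    free_list = t;
}

/* Event-handler side: NULL means the event is stale and should be ignored. */
void *token_target(const struct token *t)
{
    return t->target;
}
```

Because tokens are never handed back to free(), a stale event can always
dereference its udata safely and discover that the target is gone.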

The moment we throw multi-threading into the equation, this becomes an extremely 
hard thing to manage (close to impossible), including my "proven-to-work" Token 
trick.  It renders the user data pointer completely useless because in the end 
we need to keep an association map inside user space. That forces user 
space code to do a lookup before use, instead of straightforward use. Not to 
mention the fact that we need to take a lock before searching it. That 
brings havoc on kernel side on 1000+ active connections (a multi-threaded 
streaming server for example). 

I'm pretty sure this user data pointer is also breaking a well known pointer 
management paradigm, but I just can't remember which. Regardless, it has all 
the ingredients for memory leaks and/or, the worst one, use of corpse pointers 
which are bound to crash the app. I agree, C/C++ is not for the faint of heart, 
but with little or close to no efforts, his EV_FREEWATCH can be put to very 
good use, and user space code not only becomes less prone to mem issues, but 
also cleaner.

To summarise, +1 for the EV_FREEWATCH. I simply love the idea! Clean and very 
very efficient.

Best regards,
Andrei

--
Eugen-Andrei Gavriloaie
Web: http://www.rtmpd.com





Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread Paul LeoNerd Evans
On Mon, Nov 15, 2010 at 12:51:57PM -0800, Julian Elischer wrote:
> "keep more information associated with each kevent and use the user
> cookie to match them"  this is what it was for.
> it's a tool, not an answer. Given this tool you should be able to
> get what you want.
> how you do it is your job.

OK. Then I am not seeing it. I would love to see an example, if you
or anyone else could provide me one, of how I am supposed to use this
feature. That would be great... but please read below first.

> It's not the kernel's job to keep application specific data for
> you.. but it gives you a way to do it yourself and keep track of it
> trivially.
> It's expected that for every event the user gives to the kernel, he
> has some matching information about that event in userland.

Sure. The information I keep in userland is in the structure at the end
of that  udata  pointer.


Since you claim it to be so trivial then, I would like to ask you to
explain it. It should be quite a simple task:

---
  Show me a program that, on receipt of -any- event out of the
  kqueue file descriptor, can print the word "FREE\n" when the kernel
  has now dropped its side of the watcher for this event.

  Specifically, it has to print "FREE\n" in any of the following four
  conditions:

1. After a final event, such as EVFILT_PROC,NOTE_EXIT

2. After any event that had been registered with EV_ONESHOT

3. After the user has called EV_SET(..., EV_DELETE,...) on it

4. After calling close(fd) on a filehandle that has been registered
   under EVFILT_READ or EVFILT_WRITE
---

I am claiming that such a program cannot be written, using the current
kqueue interface, and simply allowing the user code to call EV_SET
however they like and put their own pointers in it. If I read your
assertion of triviality correctly, then you are claiming that such a
program is indeed possible. I would therefore invite you to demonstrate
for me such a program.


If perhaps this does indeed prove to be impossible, I would instead like
you to demonstrate a program having all the above properties, but
allowing you to arbitrarily wrap the kqueue API; store extra data in my
structures, or hook extra information around EV_SET calls.

I have already demonstrated -a- way to solve this, by storing data in
the event udata structure to answer 2, and storing a full mapping from
ident+filter to udata pointer, to answer 3. I declare 1 trivial by
inspection of the results in the returned kevent. I declare 4 to be
impossible short of such hackery as LD_PRELOAD around the actual close()
libc function.

In short, I claim that a solution to all parts 1-4 is impossible. It is
possible to solve 1-3 only, by storing a full mapping from ident+filter
to udata pointer, in userland. But then by doing that why bother giving
the pointer to the kernel in the first place?
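
A small sketch of the userland ident+filter to udata mapping discussed above,
which covers the EV_DELETE case. The table layout, the capacity and the names
watch_add and watch_remove are hypothetical; a real wrapper would call them
next to its EV_ADD and EV_DELETE changes and would probably use a hash table
instead of a linear scan.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_WATCHES 64          /* hypothetical capacity */

struct watch {
    uintptr_t ident;            /* same meaning as kevent.ident */
    short filter;               /* same meaning as kevent.filter */
    void *udata;                /* pointer that was handed to the kernel */
    int used;
};

static struct watch watches[MAX_WATCHES];

static struct watch *watch_find(uintptr_t ident, short filter)
{
    for (size_t i = 0; i < MAX_WATCHES; i++)
        if (watches[i].used && watches[i].ident == ident &&
            watches[i].filter == filter)
            return &watches[i];
    return NULL;
}

/* Record a registration at EV_ADD time.
 * Returns 0 on success, -1 if the pair is already present or the table is full. */
int watch_add(uintptr_t ident, short filter, void *udata)
{
    if (watch_find(ident, filter) != NULL)
        return -1;
    for (size_t i = 0; i < MAX_WATCHES; i++) {
        if (!watches[i].used) {
            watches[i] = (struct watch){ ident, filter, udata, 1 };
            return 0;
        }
    }
    return -1;
}

/* Forget a registration at EV_DELETE time and hand back the pointer so the
 * caller can free() it. Returns NULL if the pair was not registered. */
void *watch_remove(uintptr_t ident, short filter)
{
    struct watch *w = watch_find(ident, filter);
    void *udata;

    if (w == NULL)
        return NULL;
    udata = w->udata;
    w->used = 0;
    return udata;
}
```

Note that this is exactly the duplication complained about: once the table
exists, the pointer stored in the kernel adds nothing the table does not
already hold.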


There comes a further complication for a wrapping library that tries to
provide a generic interface around kqueue, for problem 1 however. Right
now, the following function could be said to implement problem 1:

 #include <sys/types.h>
 #include <sys/event.h>

 /* Returns nonzero if this event is the last one for its watch. */
 int is_final(struct kevent *ev)
 {
   switch(ev->filter) {
     case EVFILT_READ:
     case EVFILT_WRITE:
       return (ev->flags & EV_EOF) != 0;
     case EVFILT_VNODE:
       return (ev->fflags & (NOTE_DELETE|NOTE_REVOKE)) != 0;
       /* I'm only guessing on this one from reading the docs, I'm not
        * 100% sure */
     case EVFILT_PROC:
       return (ev->fflags & NOTE_EXIT) != 0;
     default:
       return 0;
   }
 }

And in fact even this code isn't perfect, because the kqueue(2) manpage
does also point out that EV_EOF on a pipe/fifo isn't final, because you
can EV_CLEAR to reset the EOF condition and wait again. So maybe this
code ought to read:

     case EVFILT_READ:
     case EVFILT_WRITE:
       {
         struct stat st;
         fstat(ev->ident, &st);
         return (ev->flags & EV_EOF) && !(S_ISFIFO(st.st_mode));
       }

And so now we suddenly have to make an fstat() call -every- time we
receive an event on a read/write filter?

OK well clearly not, we'd in fact do that once at EV_ADD time, and store
whether it's a FIFO in our extended  udata  structure, so as to know if
EV_EOF is final. But then we're having to use that udata structure to
store data internal to the purposes of this kqueue interface, and not
the overall user data.
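
A sketch of the "do the fstat() once at EV_ADD time" variant described above.
The wrapper struct watch_data and the helpers watch_data_new and eof_is_final
are hypothetical names; only fstat(2) and S_ISFIFO are real interfaces. The
wrapper pointer is what would go into udata, with the caller's own pointer
tucked inside it.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical extended udata: the caller's pointer plus wrapper bookkeeping. */
struct watch_data {
    void *user;     /* the pointer the application actually cares about */
    int is_fifo;    /* cached once at registration time */
};

/* Build the wrapper for fd before EV_ADD; returns NULL if fstat() or
 * malloc() fails. */
struct watch_data *watch_data_new(int fd, void *user)
{
    struct stat st;
    struct watch_data *wd;

    if (fstat(fd, &st) != 0)
        return NULL;
    wd = malloc(sizeof *wd);
    if (wd == NULL)
        return NULL;
    wd->user = user;
    wd->is_fifo = S_ISFIFO(st.st_mode) ? 1 : 0;
    return wd;
}

/* eof_flag_set is nonzero when the returned event had EV_EOF set.
 * On a pipe/fifo EOF is not treated as final, as discussed above. */
int eof_is_final(const struct watch_data *wd, int eof_flag_set)
{
    return eof_flag_set && !wd->is_fifo;
}
```

This avoids the per-event fstat() call, at the cost of the udata structure now
carrying state that exists only for the benefit of the wrapper.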


Are you still now going to claim to me this is trivial?


Please compare this solution to:

   if(ev->flags & EV_FREEWATCH)
 free(ev->udata);

I would call that solution "trivial". And I claim it fairly easy to
implement. 
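
To make the comparison concrete, here is how a dispatch step might look if
the proposed flag existed. EV_FREEWATCH is only a proposal and is not defined
in <sys/event.h>, so this sketch uses a mock event struct and an arbitrary
mock flag value (mock_kevent and MOCK_EV_FREEWATCH are invented for this
example) rather than the real kqueue types.

```c
#include <stdint.h>
#include <stdlib.h>

#define MOCK_EV_FREEWATCH 0x1000   /* arbitrary value, NOT a real kqueue flag */

/* Stand-in for the fields of struct kevent that matter here. */
struct mock_kevent {
    uintptr_t ident;
    unsigned short flags;
    void *udata;
};

/* Run the user's handler, then release udata if the (hypothetical) kernel
 * reported that it has dropped the watch. Returns 1 if udata was freed. */
int dispatch(struct mock_kevent *ev, void (*handler)(struct mock_kevent *))
{
    if (handler != NULL)
        handler(ev);
    if (ev->flags & MOCK_EV_FREEWATCH) {
        free(ev->udata);
        ev->udata = NULL;
        return 1;
    }
    return 0;
}
```

The wrapper needs no table, no fstat() and no knowledge of which filters have
"final" events; it frees the pointer exactly when it is told the watch is gone.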

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/




Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread Julian Elischer

On 11/15/10 12:14 PM, Paul LeoNerd Evans wrote:
> On Mon, Nov 15, 2010 at 11:37:23AM -0800, Julian Elischer wrote:
>> I don't think it was thought of in the context of reference counted items.
>
> This problem has nothing to do with reference counting, and all to do
> with resource ownership.
>
> Consider in the totally C-based world, no refcounts, just malloc() and
> free(). You malloc() some event structure, put it in the udata field to
> the kernel, then return to the main run loop. You've dropped every
> reference to this malloc()ed memory, because hey, kernel has it. Later,
> event fires, kernel gives you back that pointer. Great. Let's use it. Was
> it a PROC|EXIT event? Let's free() the data.
>
> Was it an event that had been registered as EV_ONESHOT? Oops. We can't
> remember because kernel didn't tell us.
>
> Want to EV_DELETE it now? Can't, lost the pointer, can't ask kernel for
> it back.
>
> Want to close() the filehandle associated? Can't, because kernel has
> pointer that it'll drop.
>
> In all these cases we'll memory leak the malloc()ed data.
>
> The only solution here is to keep another copy somewhere up in userland.
> This copy has to be associated with the original filter specification,
> so that on EV_DELETE we know the pointer so can free() it. But, as I
> said, if we're going to keep that mapping, why does the kernel even give
> us this udata ability in the first place? We might as well not bother
> and use the mapping all the time.
>
> Maybe this just is what people do? That was the thrust of my first
> question - _is_ this what people do? I'm not experienced enough with
> kqueue to know what is best practice here, and the documentation gives
> no guidance. Can someone advise me?
>
> ---
>
> Totally separate to that, if nobody has really thought of a solution to
> this before, what are anyone's thoughts on my suggestion of the
> EV_FREEWATCH flag? Get the kernel always to tell userland that it has
> dropped a watch, and return the pointer back, so userland can do
> whatever it wants by way of resource reclamation.
>
>> you could use an ever increasing number that you hash on a hash table.
>> if the kernel returns a number that is out of date you won't find it
>> and you can ignore it. If the kernel returns a number you are currently
>> tracking, then you use the item associated with that entry.
>
> I'm really not sure I understand where this is going, or how it helps
> me...

"keep more information associated with each kevent and use the user
cookie to match them"  this is what it was for.
it's a tool, not an answer. Given this tool you should be able to get
what you want.
how you do it is your job.
It's not the kernel's job to keep application specific data for you..
but it gives you a way to do it yourself and keep track of it trivially.
It's expected that for every event the user gives to the kernel, he
has some matching information about that event in userland.






Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread Paul LeoNerd Evans
On Mon, Nov 15, 2010 at 11:37:23AM -0800, Julian Elischer wrote:
> I don't think it was thought of in the context of reference counted items.

This problem has nothing to do with reference counting, and all to do
with resource ownership.

Consider in the totally C-based world, no refcounts, just malloc() and
free(). You malloc() some event structure, put it in the udata field to
the kernel, then return to the main run loop. You've dropped every
reference to this malloc()ed memory, because hey, kernel has it. Later,
event fires, kernel gives you back that pointer. Great. Let's use it. Was
it a PROC|EXIT event? Let's free() the data.

Was it an event that had been registered as EV_ONESHOT? Oops. We can't
remember because kernel didn't tell us.

Want to EV_DELETE it now? Can't, lost the pointer, can't ask kernel for
it back.

Want to close() the filehandle associated? Can't, because kernel has
pointer that it'll drop.

In all these cases we'll memory leak the malloc()ed data.

The only solution here is to keep another copy somewhere up in userland.
This copy has to be associated with the original filter specification,
so that on EV_DELETE we know the pointer so can free() it. But, as I
said, if we're going to keep that mapping, why does the kernel even give
us this udata ability in the first place? We might as well not bother
and use the mapping all the time.

Maybe this just is what people do? That was the thrust of my first
question - _is_ this what people do? I'm not experienced enough with
kqueue to know what is best practice here, and the documentation gives
no guidance. Can someone advise me?

---

Totally separate to that, if nobody has really thought of a solution to
this before, what are anyone's thoughts on my suggestion of the
EV_FREEWATCH flag? Get the kernel always to tell userland that it has
dropped a watch, and return the pointer back, so userland can do
whatever it wants by way of resource reclamation.



> you could use an ever increasing number that you hash on a hash table.
> if the kernel returns a number that is out of date you won't find it
> and you can ignore it. If the kernel returns a number you are currently
> tracking, then you use the item associated with that entry.

I'm really not sure I understand where this is going, or how it helps
me...

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/




Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread Julian Elischer

On 11/15/10 10:38 AM, Paul LeoNerd Evans wrote:
> On Mon, Nov 15, 2010 at 10:33:25AM -0800, Julian Elischer wrote:
> > it was provided for pretty much what you are using it for, so that
> > the userland caller could easily associate the returning event with
> > some private information about the event.
>
> This was indeed the impression I got. With reference to my original
> questions regarding its use, perhaps you could suggest some way to
> actually use this API then, in order to solve my problem?
>
> Unless there's some subtle detail or trick I have misunderstood, it
> doesn't appear to be easily possible in this manner.
>
> How would you suggest I manage these pointers and data structures?


I don't think it was thought of in the context of reference counted items.

you could use an ever increasing number that you hash on a hash table.
if the kernel returns a number that is out of date you won't find it
and you can ignore it. If the kernel returns a number you are currently
tracking, then you use the item associated with that entry.
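
One way to read this suggestion, as a rough C sketch: instead of a raw pointer, udata carries an ever-increasing token cast to void *, and a userland table maps live tokens to items. The names (token_table, token_register, token_lookup, token_release) and the fixed-size open-addressing layout are assumptions made for illustration only; nothing here comes from kqueue itself.

```c
#include <stddef.h>
#include <stdint.h>

#define TOKEN_SLOTS 128                 /* illustrative fixed capacity */
#define TOKEN_EMPTY ((uintptr_t)0)      /* slot never used */
#define TOKEN_DEAD  UINTPTR_MAX         /* slot released (tombstone) */

struct token_slot {
    uintptr_t token;
    void *item;
};

struct token_table {
    uintptr_t next_token;               /* ever increasing, never reused */
    struct token_slot slots[TOKEN_SLOTS];
};

/* Linear probe for a live token; stops at the first never-used slot. */
static struct token_slot *token_find(struct token_table *t, uintptr_t token)
{
    if (token == TOKEN_EMPTY || token == TOKEN_DEAD)
        return NULL;
    for (size_t i = 0; i < TOKEN_SLOTS; i++) {
        struct token_slot *s =
            &t->slots[(token % TOKEN_SLOTS + i) % TOKEN_SLOTS];
        if (s->token == TOKEN_EMPTY)
            return NULL;
        if (s->token == token)
            return s;
    }
    return NULL;
}

/* Track item; returns the token to store in udata, or 0 if the table is
 * full. (Wrap-around of next_token is ignored in this sketch.) */
uintptr_t token_register(struct token_table *t, void *item)
{
    uintptr_t token = ++t->next_token;
    for (size_t i = 0; i < TOKEN_SLOTS; i++) {
        struct token_slot *s =
            &t->slots[(token % TOKEN_SLOTS + i) % TOKEN_SLOTS];
        if (s->token == TOKEN_EMPTY || s->token == TOKEN_DEAD) {
            s->token = token;
            s->item = item;
            return token;
        }
    }
    return 0;
}

/* Item for a token the kernel handed back, or NULL if it is out of date. */
void *token_lookup(struct token_table *t, uintptr_t token)
{
    struct token_slot *s = token_find(t, token);
    return s != NULL ? s->item : NULL;
}

/* Stop tracking a token; returns its item so the caller can free it. */
void *token_release(struct token_table *t, uintptr_t token)
{
    struct token_slot *s = token_find(t, token);
    if (s == NULL)
        return NULL;
    void *item = s->item;
    s->token = TOKEN_DEAD;
    s->item = NULL;
    return item;
}
```

With this scheme userland can release an item at any time (for example just before close()); a late event carrying the old token simply fails token_lookup() and is ignored. It does not tell userland when the kernel dropped a watch, so it is still a userland mapping rather than a use of udata as the sole reference.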



___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread Paul LeoNerd Evans
On Mon, Nov 15, 2010 at 02:10:45PM -0500, John Baldwin wrote:
> On Monday, November 15, 2010 1:12:11 pm Paul LeoNerd Evans wrote:
> > On Mon, Nov 15, 2010 at 11:25:42AM -0500, John Baldwin wrote:
> > > I think the assumption is that userland actually maintains a reference
> > > on the specified object (e.g. a file descriptor) and will know to drop
> > > the associated data when the file descriptor is closed.  That is, think
> > > of the kevent as a member of an eventable object rather than a separate
> > > object that has a reference to the eventable object.  When the eventable
> > > object's reference count drops to zero in userland, then the kevent
> > > should be deleted, either via EV_DELETE, or implicitly (e.g. by closing
> > > the associated file descriptor).
> > 
> > Ah. Well, that could be considered a bit more awkward for the use case I
> > wanted to apply. The idea was that the  udata  would refer effectively
> > to a closure, to invoke when the event happens. The idea being you can
> > just add an event watcher by, say:
> > 
> >   $ev->EV_SET( $pid, EVFILT_PROC, 0, NOTE_EXIT, 0, sub {
> >  print STDERR "The child process $pid has now exited\n";
> >   } );
> > 
> > So, the kernel's udata pointer effectively holds the only reference to
> > this anonymous closure. It's much more flexible this way, especially for
> > oneshot events like that.
> > 
> > The beauty is also that the kevents() loop can simply know that the
> > udata is always a code reference so just has to invoke it to do whatever
> > the original caller wanted to do.
> > 
> > Keep in mind my use-case here; I'm not trying to be one specific
> > application, it's a general-purpose kevent-wrapping library.
> 
> So is GCD (Apple's libdispatch).  It also implements closures on top of
> kevent.  However, the way it works is that it doesn't expose kevent()
> directly; instead it uses kevent to implement asynchronous I/O on a
> socket for example, and since it is logically managing the life cycle
> of a socket, it knows when the socket is closed and cleans up then.

Well, the principal item of work here is a direct API reimplementation
in IO::KQueue on CPAN. I'm trying to simply expose the API of storing an
arbitrary Perl scalar in the  udata  field. It -could- be a closure, but
of course it doesn't have to. Maybe the using code wants to keep a HASH
ref of some pseudo-structure, or whatever...

> For the above case, if you know an event is one shot, you should either
> use EV_ONESHOT, or use a wrapper around the closure that clears the event
> after the closure runs (or possibly before the closure runs?)

I don't see how passing EV_ONESHOT at all helps here. If it's oneshot by
nature (child process exit), it'll be oneshot whatever happens. I've
already observed that the EV_ONESHOT flag does not get re-emitted by the
kernel anyway, so I'd have to track that one separately somehow.

> Your use case is rare.  Almost all consumers of kevent() that I've seen
> use kevent() as one part of a system that maintains the lifecycle of objects.
> Those objects are only accessed within the system, so the system knows when
> an object is closed and can release the resources at the same time.

OK. I'm prepared to accept this. It may be that nobody's really tried to
provide a simple kqueue/kevent API wrapping for a high-level language,
that could nicely take advantage of memory management in the language,
rather than at the low C level of explicit malloc+free.


Could we perhaps address the second part of my question for a moment? If
there really isn't a general solution here, could my EV_FREEWATCH flag
be added? It's a single extra flag to export in the API .h file,
completely backward-compatible. Surely quite simple to implement in the
kernel too, because it already knows how to free() its own internal
structures anyway; so just before it does that when it deletes an event
for whatever reason, it could just fire that event back up to userland
to say "I've dropped this, and here have your pointer back". Userland
catches it, SvREFCNT_dec()s or whatever, problem solved.

Is that an API extension anyone would consider accepting? I for one
would use it.

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/




Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread John Baldwin
On Monday, November 15, 2010 1:12:11 pm Paul LeoNerd Evans wrote:
> On Mon, Nov 15, 2010 at 11:25:42AM -0500, John Baldwin wrote:
> > I think the assumption is that userland actually maintains a reference
> > on the specified object (e.g. a file descriptor) and will know to drop
> > the associated data when the file descriptor is closed.  That is, think
> > of the kevent as a member of an eventable object rather than a separate
> > object that has a reference to the eventable object.  When the eventable
> > object's reference count drops to zero in userland, then the kevent
> > should be deleted, either via EV_DELETE, or implicitly (e.g. by closing
> > the associated file descriptor).
> 
> Ah. Well, that could be considered a bit more awkward for the use case I
> wanted to apply. The idea was that the  udata  would refer effectively
> to a closure, to invoke when the event happens. The idea being you can
> just add an event watcher by, say:
> 
>   $ev->EV_SET( $pid, EVFILT_PROC, 0, NOTE_EXIT, 0, sub {
>  print STDERR "The child process $pid has now exited\n";
>   } );
> 
> So, the kernel's udata pointer effectively holds the only reference to
> this anonymous closure. It's much more flexible this way, especially for
> oneshot events like that.
> 
> The beauty is also that the kevents() loop can simply know that the
> udata is always a code reference so just has to invoke it to do whatever
> the original caller wanted to do.
> 
> Keep in mind my use-case here; I'm not trying to be one specific
> application, it's a general-purpose kevent-wrapping library.

So is GCD (Apple's libdispatch).  It also implements closures on top of
kevent.  However, the way it works is that it doesn't expose kevent()
directly; instead it uses kevent to implement asynchronous I/O on a
socket for example, and since it is logically managing the life cycle
of a socket, it knows when the socket is closed and cleans up then.

> > I think in your case you should not give the kevent a reference to your 
> > object, but instead remove the associated event for a given object when an 
> > object's refcount drops to zero.
> 
> Well that's certainly doable in longrunning watches, but I don't think
> it sounds very convenient for a oneshot event; see the above example for
> justification.

For the above case, if you know an event is one shot, you should either
use EV_ONESHOT, or use a wrapper around the closure that clears the event
after the closure runs (or possibly before the closure runs?)

> Also it again begs my question, worth repeating here:
> 
> On Friday, November 12, 2010 1:40:00 pm Paul LeoNerd Evans wrote:
> > I had
> > thought the point of kqueue/kevent is the O(1) nature of it, which is
> > among why the kernel is storing that  void *udata  pointer in the first
> > place. If I have to store a mapping from every filter+identity back to
> > my data pointer, why does the kernel store one at all? I could just
> > ignore the udata field and use my mapping for my own purposes.
> 
> If you're saying that in my not-so-rare use case, I don't want to be
> using udata, and instead keeping my own mapping, why does the kernel
> provide this udata field at all?

Your use case is rare.  Almost all consumers of kevent() that I've seen
use kevent() as one part of a system that maintains the lifecycle of objects.
Those objects are only accessed within the system, so the system knows when
an object is closed and can release the resources at the same time.

-- 
John Baldwin


Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread Paul LeoNerd Evans
On Mon, Nov 15, 2010 at 10:33:25AM -0800, Julian Elischer wrote:
> it was provided for pretty much what you are using it for, so that
> the userland caller could easily associate the returning event with
> some private information about the event.

This was indeed the impression I got. With reference to my original
questions regarding its use, perhaps you could suggest some way to
actually use this API then, in order to solve my problem?

Unless there's some subtle detail or trick I have misunderstood, it
doesn't appear to be easily possible in this manner.

How would you suggest I manage these pointers and data structures?

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/




Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread Julian Elischer

On 11/15/10 10:12 AM, Paul LeoNerd Evans wrote:
> On Mon, Nov 15, 2010 at 11:25:42AM -0500, John Baldwin wrote:
> > I think the assumption is that userland actually maintains a reference
> > on the specified object (e.g. a file descriptor) and will know to drop
> > the associated data when the file descriptor is closed.  That is, think
> > of the kevent as a member of an eventable object rather than a separate
> > object that has a reference to the eventable object.  When the eventable
> > object's reference count drops to zero in userland, then the kevent
> > should be deleted, either via EV_DELETE, or implicitly (e.g. by closing
> > the associated file descriptor).
>
> Ah. Well, that could be considered a bit more awkward for the use case I
> wanted to apply. The idea was that the  udata  would refer effectively
> to a closure, to invoke when the event happens. The idea being you can
> just add an event watcher by, say:
>
>   $ev->EV_SET( $pid, EVFILT_PROC, 0, NOTE_EXIT, 0, sub {
>      print STDERR "The child process $pid has now exited\n";
>   } );
>
> So, the kernel's udata pointer effectively holds the only reference to
> this anonymous closure. It's much more flexible this way, especially for
> oneshot events like that.
>
> The beauty is also that the kevents() loop can simply know that the
> udata is always a code reference so just has to invoke it to do whatever
> the original caller wanted to do.
>
> Keep in mind my use-case here; I'm not trying to be one specific
> application, it's a general-purpose kevent-wrapping library.
>
> > I think in your case you should not give the kevent a reference to your
> > object, but instead remove the associated event for a given object when
> > an object's refcount drops to zero.
>
> Well that's certainly doable in longrunning watches, but I don't think
> it sounds very convenient for a oneshot event; see the above example for
> justification.
>
> Also it again begs my question, worth repeating here:
>
> On Friday, November 12, 2010 1:40:00 pm Paul LeoNerd Evans wrote:
> > I had
> > thought the point of kqueue/kevent is the O(1) nature of it, which is
> > among why the kernel is storing that  void *udata  pointer in the first
> > place. If I have to store a mapping from every filter+identity back to
> > my data pointer, why does the kernel store one at all? I could just
> > ignore the udata field and use my mapping for my own purposes.
>
> If you're saying that in my not-so-rare use case, I don't want to be
> using udata, and instead keeping my own mapping, why does the kernel
> provide this udata field at all?

it was provided for pretty much what you are using it for, so that the
userland caller could easily associate the returning event with some
private information about the event.






Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread Paul LeoNerd Evans
On Mon, Nov 15, 2010 at 11:25:42AM -0500, John Baldwin wrote:
> I think the assumption is that userland actually maintains a reference on
> the specified object (e.g. a file descriptor) and will know to drop the
> associated data when the file descriptor is closed.  That is, think of the
> kevent as a member of an eventable object rather than a separate object
> that has a reference to the eventable object.  When the eventable object's
> reference count drops to zero in userland, then the kevent should be
> deleted, either via EV_DELETE, or implicitly (e.g. by closing the
> associated file descriptor).

Ah. Well, that could be considered a bit more awkward for the use case I
wanted to apply. The idea was that the  udata  would refer effectively
to a closure, to invoke when the event happens. The idea being you can
just add an event watcher by, say:

  $ev->EV_SET( $pid, EVFILT_PROC, 0, NOTE_EXIT, 0, sub {
      print STDERR "The child process $pid has now exited\n";
  } );

So, the kernel's udata pointer effectively holds the only reference to
this anonymous closure. It's much more flexible this way, especially for
oneshot events like that.

The beauty is also that the kevents() loop can simply know that the
udata is always a code reference so just has to invoke it to do whatever
the original caller wanted to do.

Keep in mind my use-case here; I'm not trying to be one specific
application, it's a general-purpose kevent-wrapping library.

> I think in your case you should not give the kevent a reference to your 
> object, but instead remove the associated event for a given object when an 
> object's refcount drops to zero.

Well that's certainly doable in longrunning watches, but I don't think
it sounds very convenient for a oneshot event; see the above example for
justification.

Also it again begs my question, worth repeating here:

On Friday, November 12, 2010 1:40:00 pm Paul LeoNerd Evans wrote:
> I had
> thought the point of kqueue/kevent is the O(1) nature of it, which is
> among why the kernel is storing that  void *udata  pointer in the first
> place. If I have to store a mapping from every filter+identity back to
> my data pointer, why does the kernel store one at all? I could just
> ignore the udata field and use my mapping for my own purposes.

If you're saying that in my not-so-rare use case, I don't want to be
using udata, and instead keeping my own mapping, why does the kernel
provide this udata field at all?

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/




Re: Managing userland data pointers in kqueue/kevent

2010-11-15 Thread John Baldwin
On Friday, November 12, 2010 1:40:00 pm Paul LeoNerd Evans wrote:
> I'm trying to build a high-level language wrapper around kqueue/kevent,
> specifically, a Perl wrapper.
> 
> (In fact I am trying to fix this bug:
>   http://rt.cpan.org/Public/Bug/Display.html?id=61481
> )
> 
> My plan is to use the  void *udata  field of a kevent watcher to store a
> pointer to some user-provided Perl data structure (an SV*), to associate
> with the event. Typically this could be a code reference for an event
> callback or similar, but the exact nature doesn't matter. It's a pointer
> to a reference-counted data structure. SvREFCNT_dec(sv) is the function
> used to decrement the reference counter.
> 
> To account for the fact that the kernel stores a pointer here, I'm
> artificially increasing the reference count on the object, so that it
> still remains alive even if the rest of the Perl code drops it, to rely
> on getting it back out of the kernel in an individual kevent. At some
> point when the kernel has finished looking after the event, this count
> needs to be decreased again, so the structure can be freed.
> 
> I am having trouble trying to work out how to do this, or rather, when.
> I have the following problems:
> 
>  * If the event was registered using EV_ONESHOT, when it gets fired the
>    flags that come back in the event structure do not include EV_ONESHOT.
> 
>  * Some events can only happen once, such as watching for EVFILT_PROC
>    NOTE_EXIT events.
> 
>  * The kernel can silently drop watches, such as when the process calls
>    close() on a filehandle with an EVFILT_READ or EVFILT_WRITE watch.
> 
>  * There doesn't seem to be a way to query that pointer back out of the
>    kernel, in case the user code wants to EV_DELETE the watch.
> 
> These problems all mean that I never quite know when I ought to call
> SvREFCNT_dec() on that pointer.
> 
> My current best-attack plan looks like the following:
> 
>  a) Store a structure in the  void *udata  that contains the actual SV*
>     pointer and a flag to remember if the event had been installed as
>     EV_ONESHOT (or remember if it was one of the event types that is
>     oneshot anyway)
> 
>  b) Store an entire mapping in userland from filter+identity to pointer,
>     so that if userland wants to EV_DELETE the watch early, it has the
>     pointer to be able to drop it.
> 
> I can't think of a solution to the close() problem at all, though.
> 
> Part a of my solution seems OK (though I'd wonder why the flags back
> from the kernel don't contain EV_ONESHOT), but part b confuses me. I had
> thought the point of kqueue/kevent is the O(1) nature of it, which is
> among why the kernel is storing that  void *udata  pointer in the first
> place. If I have to store a mapping from every filter+identity back to
> my data pointer, why does the kernel store one at all? I could just
> ignore the udata field and use my mapping for my own purposes.
> 
> Have I missed something here, then? I was hoping there'd be a nice way
> for kernel to give me back those pointers so I can just decrement a
> refcount on it, and have it reclaimed. 

I think the assumption is that userland actually maintains a reference on the 
specified object (e.g. a file descriptor) and will know to drop the associated 
data when the file descriptor is closed.  That is, think of the kevent as a 
member of an eventable object rather than a separate object that has a 
reference to the eventable object.  When the eventable object's reference 
count drops to zero in userland, then the kevent should be deleted, either via 
EV_DELETE, or implicitly (e.g. by closing the associated file descriptor).

I think in your case you should not give the kevent a reference to your 
object, but instead remove the associated event for a given object when an 
object's refcount drops to zero.
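
As a rough illustration of the "eventable object" model described above, here is a small C sketch in which the object owns both the file descriptor and the private data, so both are torn down together when the last userland reference goes away. The struct and function names are hypothetical; the only system call used is close(2), which on FreeBSD also removes any kevents that reference the descriptor.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical eventable object: owns the fd and the per-event data. */
struct eventable {
    int fd;                     /* descriptor the kevents are attached to */
    int refs;                   /* userland reference count */
    void *udata;                /* private data associated with the events */
    void (*release)(void *);    /* how to free udata; may be NULL */
};

/* Create an object holding one reference. Returns NULL if malloc fails. */
struct eventable *eventable_new(int fd, void *udata, void (*release)(void *))
{
    struct eventable *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->fd = fd;
    o->refs = 1;
    o->udata = udata;
    o->release = release;
    return o;
}

void eventable_ref(struct eventable *o)
{
    o->refs++;
}

/* Drop one reference. When the count reaches zero the fd is closed (which
 * implicitly deletes its kevents), the data is released and the object is
 * freed. Returns 1 if the object was destroyed, 0 otherwise. */
int eventable_unref(struct eventable *o)
{
    if (--o->refs > 0)
        return 0;
    close(o->fd);
    if (o->release != NULL)
        o->release(o->udata);
    free(o);
    return 1;
}
```

In this model the kernel's udata would typically point at the struct eventable itself and is treated as a borrowed pointer, never as an owning reference, which is why no notification from the kernel is needed.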

-- 
John Baldwin


Managing userland data pointers in kqueue/kevent

2010-11-12 Thread Paul LeoNerd Evans
I'm trying to build a high-level language wrapper around kqueue/kevent,
specifically, a Perl wrapper.

(In fact I am trying to fix this bug:
  http://rt.cpan.org/Public/Bug/Display.html?id=61481
)

My plan is to use the  void *udata  field of a kevent watcher to store a
pointer to some user-provided Perl data structure (an SV*), to associate
with the event. Typically this could be a code reference for an event
callback or similar, but the exact nature doesn't matter. It's a pointer
to a reference-counted data structure. SvREFCNT_dec(sv) is the function
used to decrement the reference counter.

To account for the fact that the kernel stores a pointer here, I'm
artificially increasing the reference count on the object, so that it
still remains alive even if the rest of the Perl code drops it, to rely
on getting it back out of the kernel in an individual kevent. At some
point when the kernel has finished looking after the event, this count
needs to be decreased again, so the structure can be freed.

I am having trouble trying to work out how to do this, or rather, when.
I have the following problems:

 * If the event was registered using EV_ONESHOT, when it gets fired the
   flags that come back in the event structure do not include EV_ONESHOT.

 * Some events can only happen once, such as watching for EVFILT_PROC
   NOTE_EXIT events.

 * The kernel can silently drop watches, such as when the process calls
   close() on a filehandle with an EVFILT_READ or EVFILT_WRITE watch.

 * There doesn't seem to be a way to query that pointer back out of the
   kernel, in case the user code wants to EV_DELETE the watch.

These problems all mean that I never quite know when I ought to call
SvREFCNT_dec() on that pointer.

My current best-attack plan looks like the following:

 a) Store a structure in the  void *udata  that contains the actual SV*
    pointer and a flag to remember if the event had been installed as
    EV_ONESHOT (or remember if it was one of the event types that is
    oneshot anyway)

 b) Store an entire mapping in userland from filter+identity to pointer,
    so that if userland wants to EV_DELETE the watch early, it has the
    pointer to be able to drop it.

I can't think of a solution to the close() problem at all, though.
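
Part (a) of the plan could look roughly like the following C sketch. The wrapper struct and both function names are hypothetical; the caller is assumed to work out fires_once at registration time, for example from (flags & EV_ONESHOT) or from the filter being EVFILT_PROC with NOTE_EXIT.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical wrapper stored in udata instead of the bare payload. */
struct watch_data {
    void *payload;      /* e.g. the SV* whose refcount was bumped */
    bool fires_once;    /* registered as EV_ONESHOT, or one-shot by nature */
};

/* Allocate a wrapper at EV_ADD time. Returns NULL if malloc fails. */
struct watch_data *watch_data_new(void *payload, bool fires_once)
{
    struct watch_data *w = malloc(sizeof *w);
    if (w == NULL)
        return NULL;
    w->payload = payload;
    w->fires_once = fires_once;
    return w;
}

/* Call after an event carrying this wrapper has been handled.
 * For a one-shot watch the wrapper is freed and the payload is returned so
 * the caller can release it (e.g. with SvREFCNT_dec); for a persistent
 * watch nothing is freed and NULL is returned. */
void *watch_data_after_event(struct watch_data *w)
{
    if (!w->fires_once)
        return NULL;
    void *payload = w->payload;
    free(w);
    return payload;
}
```

This only covers watches that end by firing. It does nothing for a watch removed with EV_DELETE or dropped by close(), which is where part (b) and the unsolved close() case come in.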

Part a of my solution seems OK (though I'd wonder why the flags back
from the kernel don't contain EV_ONESHOT), but part b confuses me. I had
thought the point of kqueue/kevent is the O(1) nature of it, which is
among why the kernel is storing that  void *udata  pointer in the first
place. If I have to store a mapping from every filter+identity back to
my data pointer, why does the kernel store one at all? I could just
ignore the udata field and use my mapping for my own purposes.

Have I missed something here, then? I was hoping there'd be a nice way
for kernel to give me back those pointers so I can just decrement a
refcount on it, and have it reclaimed. 

---

I have an idea on a small addition to the kernel API that would make
this issue much simpler to manage, if there is nothing else.

By the addition of a new event flag, called something like
EV_FREEWATCH, the kernel can be told "tell userland whenever I am about
to drop this event watcher". So now, after an EV_ONESHOT or any of the
single-shot events has fired, or when the watch gets EV_DELETEd, or when
the kernel itself drops it because of a close() on a filehandle, it can
fire an event back up to userland with this flag, passing up the pointer.

Now, all userland has to do to correctly manage the memory is to always
set that flag on EV_ADD, and if the flag ever comes back in an event out
of the kernel, it can  SvREFCNT_dec(ev->udata);
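
To make the proposal concrete, here is a sketch of what a userland dispatch loop might look like if such a flag existed. EV_FREEWATCH is only the proposal made in this message and is not an existing kernel flag, so the sketch uses a placeholder constant and a stand-in struct rather than the real struct kevent; every name below is hypothetical.

```c
#include <stddef.h>

/* Placeholder value for the proposed flag; not a real kernel constant. */
#define SKETCH_EV_FREEWATCH 0x0100

/* Stand-in for the flags and udata fields of struct kevent. */
struct sketch_event {
    unsigned short flags;
    void *udata;
};

typedef void (*sketch_fn)(void *udata);

/* Walk a batch of returned events. Ordinary events go to handle();
 * events carrying the proposed flag mean the kernel has dropped the
 * watch, so udata is passed to release() instead.
 * Returns the number of udata pointers released. */
size_t sketch_dispatch(const struct sketch_event *evs, size_t n,
                       sketch_fn handle, sketch_fn release)
{
    size_t released = 0;
    for (size_t i = 0; i < n; i++) {
        if (evs[i].flags & SKETCH_EV_FREEWATCH) {
            release(evs[i].udata);
            released++;
        } else {
            handle(evs[i].udata);
        }
    }
    return released;
}

/* Example callback for trying the loop out: udata points at a counter. */
void sketch_count(void *udata)
{
    ++*(int *)udata;
}
```

In the Perl wrapper, release() would be the place to call SvREFCNT_dec() on the stored SV*, and no userland mapping would be needed.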

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/

