Re: Managing userland data pointers in kqueue/kevent
On Wed, May 15, 2013 at 01:34:58PM +0100, Paul LeoNerd Evans wrote:
> On Wed, 15 May 2013 13:29:59 +0100
> Paul "LeoNerd" Evans wrote:

> > Is that not the exact thing I suggested?

> > The "extension to register a kevent to catch these events" is
> > that you put the EV_DROPWATCH bit flag in the event at the time you
> > register it.

> > The "returned event [that] could have all the appropriate information
> > for the event being dropped" is that you receive an event with
> > EV_DROPPED set on it. It being a real event includes of course the
> > udata pointer, so you can handle it.

> In fact, to requote the original PR I wrote[1] on the subject:

> ---

> I propose the addition of a new flag applicable to any kevent watch
> structure, documented thusly:

>   The flags field can contain the following values:
>   ..
>   EV_DROPWATCH   Requests that the kernel will send an EV_DROPPED event
>                  on this watch when it has finished watching it for any
>                  reason, including EV_DELETE, expiry because of
>                  EV_ONESHOT, or because the filehandle was closed by
>                  close(2).

>   EV_DROPPED     This flag is returned by the kernel if it is now about
>                  to drop the watch. After this flag has been received,
>                  no further events will occur on this watch.

> This flag then makes it trivial to build a generic wrapper for kqueue
> that can always manage its memory correctly.

>  a) at EV_ADD time, simply set  flags |= EV_DROPWATCH
>  b) after an event has been processed that included the EV_DROPPED
>     flag, free() the pointer given in the udata field.

An important detail is missing: how do you avoid using up all kernel
memory on knotes if someone keeps adding new file descriptors with
EV_ADD | EV_DROPWATCH and closing the file descriptors again without
ever draining the kqueue?

This problem did not previously exist for file descriptor events: the
number of such knotes was limited by the number of open file
descriptors. However, it does already exist for most of the other event
types.
For example, pwait -v will return the exit status even if it was
suspended (^Z) while the process terminated and the parent reaped the
zombie. For EVFILT_TIMER, the worst effect is a denial of service of
EVFILT_TIMER on all other processes in the system. EVFILT_USER does not
appear to check anything and appears to allow arbitrary kernel memory
consumption.

The EVFILT_TIMER needs to keep its global limit and EVFILT_USER needs
something similar.

For the rest, call an event that is no longer associated with a kernel
object (e.g. EVFILT_READ whose file descriptor is closed, EVFILT_PROC
whose process has terminated and been reaped by the parent, or
EVFILT_AIO whose I/O request is completed) "unbound". The number of
events that are not unbound is limited by existing limits on the other
kernel objects. A possible fix is to reject (such as with [ENOMEM])
adding new events when there are too many unbound events in the queue
already. The application should then allow kevent() to return pending
events first before it adds new ones.

If the kernel returns unbound events in preference to other events, a
kevent() call with nevents >= 2 * nchanges cannot result in a net
increase in the number of current and potential unbound events, since it
allows the kernel to return (and forget) as many unbound events as it
may add (nchanges entries are required for EV_ERROR, leaving nchanges
for returning other events).

> It is not required that these two flags have distinct values; since
> one is userland->kernel and the other kernel->userland, they could for
> neatness reuse the same bit field.

I think it would be consistent with other EV_* to use the same name and
value for both.

--
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
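The nevents >= 2 * nchanges argument in the message above can be checked with a few lines of arithmetic. What follows is only a sketch of that reasoning, not kernel code: the function names are made up for illustration, and it assumes the worst case described in the message (every change may need an EV_ERROR slot, every change may add one potential unbound event, and the kernel fills all remaining slots with unbound events).

```c
#include <assert.h>

/*
 * Conservative upper bound on the net growth in unbound events caused by
 * one kevent() call, following the reasoning in the message above.
 * Hypothetical helper; nothing like it exists in the kqueue API.
 */
int ub_net_growth_bound(int nchanges, int nevents)
{
    assert(nchanges >= 0 && nevents >= 0);

    /* Worst case: every change needs an output slot for an EV_ERROR report. */
    int reserved = nchanges < nevents ? nchanges : nevents;

    /* Slots left over can carry unbound events out of the queue. */
    int drainable = nevents - reserved;

    /* Each change may add at most one current or potential unbound event. */
    return nchanges - drainable;
}

/* Smallest eventlist size for which the bound above is not positive. */
int ub_min_nevents(int nchanges)
{
    return 2 * nchanges;
}
```

With nchanges = 4 the bound is 4 when nevents = 4 (no room left to drain) and 0 when nevents = 8, which is the point at which a call can no longer grow the backlog under these assumptions.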
Re: Managing userland data pointers in kqueue/kevent
On 13.05.13 11:19, Eugen-Andrei Gavriloaie wrote:

> Regardless, it has all the ingredients for memory leaks and/or, the
> worst one, use of corpse pointers which are bound to crash the app. I

I earlier pointed out other things that prevent me from using kqueue as
a proper storage for user land pointers:

  http://lists.freebsd.org/pipermail/freebsd-hackers/2013-March/042181.html

To sum it up, once I pass a pointer via udata to the kqueue system,
kqueue becomes the owner and the expected behaviour is to

  (a) never silently swallow the pointer
  (b) provide an interface to retrieve the pointer anytime

Besides the way you pointed out to violate (a), there is another way:
re-adding an existing event. So I propose a flag EV_EXCL that will fail
adding an existing event, with EEXIST in data. As an alternative or in
addition to that, re-adding an existing event without the EV_EXCL flag
will cause the old event to be returned with the proposed EV_DROPPED
mechanism.

This can also be used to fulfill property (b). kqueue is an efficient
store for the per-event data. In an event-based application, I usually
allocate resources per new session, pass the metadata via udata to
kevent and will see the udata pointer whenever the event triggers. But
in order to clean up resources due to external events (like program
termination), I have no way to ask my kqueue for the kevents (and thus
the udata) I stored in them ... without knowing about them in the first
place.

For the program termination case, it would be enough to extend EV_DELETE
with a flag to drop all filters and catch them by checking for the
EV_DROPPED flag. This means that EV_DELETE could return a list of
filters and can be called several times until no events are returned.

Of course this can be extended to specifically drop events that match a
certain filter, flag or fflag value, but for now you can also use
different kqueues.
Regards, erdgeist
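To make the EV_EXCL idea and the "drain everything at shutdown" idea from the message above concrete, here is a small userland model. It is a sketch under stated assumptions, not a kqueue implementation: the table, the xs_ names and the XS_EV_EXCL value are all invented for illustration, and the error is returned directly rather than through an EV_ERROR event with EEXIST in data as the proposal describes.

```c
#include <errno.h>
#include <stddef.h>

#define XS_MAX     8
#define XS_EV_EXCL 0x1u   /* hypothetical flag proposed in this thread */

struct xs_entry { int used; int ident; int filter; void *udata; };
struct xs_table { struct xs_entry e[XS_MAX]; };

/*
 * Add a watch keyed by (ident, filter).
 * Returns 0 on success, EEXIST if the watch exists and XS_EV_EXCL was given,
 * ENOMEM if the table is full. When an existing watch is replaced, the old
 * udata is handed back through *dropped instead of being silently lost.
 */
int xs_add(struct xs_table *t, int ident, int filter, unsigned flags,
           void *udata, void **dropped)
{
    struct xs_entry *slot = NULL;

    *dropped = NULL;
    for (size_t i = 0; i < XS_MAX; i++) {
        struct xs_entry *e = &t->e[i];
        if (!e->used) {
            if (slot == NULL)
                slot = e;          /* remember the first free slot */
            continue;
        }
        if (e->ident == ident && e->filter == filter) {
            if (flags & XS_EV_EXCL)
                return EEXIST;     /* refuse to overwrite */
            *dropped = e->udata;   /* report the pointer being replaced */
            e->udata = udata;
            return 0;
        }
    }
    if (slot == NULL)
        return ENOMEM;
    slot->used = 1;
    slot->ident = ident;
    slot->filter = filter;
    slot->udata = udata;
    return 0;
}

/*
 * Remove one remaining watch and return its udata through *udata.
 * Returns 1 if a watch was removed, 0 when the table is empty, so a caller
 * can loop until 0 to recover every stored pointer.
 */
int xs_drain_one(struct xs_table *t, void **udata)
{
    for (size_t i = 0; i < XS_MAX; i++) {
        if (t->e[i].used) {
            t->e[i].used = 0;
            *udata = t->e[i].udata;
            return 1;
        }
    }
    return 0;
}
```

The design point is that every path that removes or replaces a stored pointer gives it back to the caller, which is what properties (a) and (b) ask for.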
Re: Managing userland data pointers in kqueue/kevent
On Wed, 15 May 2013 13:29:59 +0100
Paul "LeoNerd" Evans wrote:

> Is that not the exact thing I suggested?
>
> The "extension to register a kevent to catch these events" is
> that you put the EV_DROPWATCH bit flag in the event at the time you
> register it.
>
> The "returned event [that] could have all the appropriate information
> for the event being dropped" is that you receive an event with
> EV_DROPPED set on it. It being a real event includes of course the
> udata pointer, so you can handle it.

In fact, to requote the original PR I wrote[1] on the subject:

---

I propose the addition of a new flag applicable to any kevent watch
structure, documented thusly:

  The flags field can contain the following values:
  ..
  EV_DROPWATCH   Requests that the kernel will send an EV_DROPPED event
                 on this watch when it has finished watching it for any
                 reason, including EV_DELETE, expiry because of
                 EV_ONESHOT, or because the filehandle was closed by
                 close(2).

  EV_DROPPED     This flag is returned by the kernel if it is now about
                 to drop the watch. After this flag has been received,
                 no further events will occur on this watch.

This flag then makes it trivial to build a generic wrapper for kqueue
that can always manage its memory correctly.

 a) at EV_ADD time, simply set  flags |= EV_DROPWATCH
 b) after an event has been processed that included the EV_DROPPED
    flag, free() the pointer given in the udata field.

It is not required that these two flags have distinct values; since
one is userland->kernel and the other kernel->userland, they could for
neatness reuse the same bit field.

---

[1]: http://www.freebsd.org/cgi/query-pr.cgi?pr=153254

--
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
Re: Managing userland data pointers in kqueue/kevent
On Wed, 15 May 2013 02:14:55 -0400
Julian Elischer wrote:

> I would suggest that one answer would be to create an extension to
> register a kevent to catch these events..
>
> (the knote_drop())
>
> The returned event could have all the appropriate information for the
> event being dropped..

Is that not the exact thing I suggested?

The "extension to register a kevent to catch these events" is that you
put the EV_DROPWATCH bit flag in the event at the time you register it.

The "returned event [that] could have all the appropriate information
for the event being dropped" is that you receive an event with
EV_DROPPED set on it. It being a real event includes of course the
udata pointer, so you can handle it.

It's really simple to use:

When you register,

  ev->flags |= EV_DROPWATCH

When you receive an event, process it in the normal way, then

  if(ev->flags & EV_DROPPED)
    free(ev->udata);

and that is all there is to it.

--
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
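The two-step usage described in the message above can be written out as a pair of small helpers. This is a sketch only: EV_DROPWATCH and EV_DROPPED are the flags proposed in this thread and are not in any released <sys/event.h>, so the code uses a stand-in struct and stand-in SK_ constants with arbitrary bit values instead of the real struct kevent.

```c
#include <stdlib.h>

/* Stand-ins for illustration; names and bit values are assumptions. */
#define SK_EV_ADD       0x0001u
#define SK_EV_DROPWATCH 0x0400u   /* hypothetical: userland -> kernel */
#define SK_EV_DROPPED   0x0800u   /* hypothetical: kernel -> userland */

struct sk_event {
    unsigned flags;
    void *udata;
};

/* Registration side: ask to be told when the kernel lets go of udata. */
void sk_prepare_add(struct sk_event *ev, void *udata)
{
    ev->flags = SK_EV_ADD | SK_EV_DROPWATCH;
    ev->udata = udata;
}

/*
 * Receiving side: process the event in the normal way, then release udata
 * if this was the final event for the watch. Returns 1 if udata was freed.
 * 'process' is the caller's ordinary handler and may be NULL.
 */
int sk_handle(struct sk_event *ev, void (*process)(void *udata))
{
    if (process != NULL)
        process(ev->udata);

    if (ev->flags & SK_EV_DROPPED) {
        free(ev->udata);
        ev->udata = NULL;
        return 1;
    }
    return 0;
}
```

The handler never has to know why the watch ended (EV_DELETE, EV_ONESHOT expiry or close()); it only reacts to the final event.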
Re: Managing userland data pointers in kqueue/kevent
On 5/13/13 2:44 PM, Paul LeoNerd Evans wrote:
> On Mon, 13 May 2013 11:23:45 -0700
> Adrian Chadd wrote:
>
> > Just as a data point, I managed 50,000 + connections, at 5,000 + a
> > second, doing a gigabit + of traffic, mid-2000s, with the userland
> > management of all of the socket/disk FD stuff.
> >
> > The biggest overhead at the time was actually the read/write
> > copyin/copyout, NOT the locking overhead of managing this stuff. Why?
> > Because I architected the HTTP side of things to specifically pin FDs
> > to threads, and not allow arbitrary threads to deal with arbitrary
> > FDs. This removed the need for almost all of the state locking that
> > people are concerned about here.
>
> I think then this comes from different experiences. I'm guessing this
> application was:
>
>  a) Written in C
>  b) Entirely filled with identically-typed identical-purpose file
>     descriptors
>  c) Didn't really use any EV_ONESHOT events
>  d) Didn't close sockets apart from when it received EOF
>
> and perhaps most importantly
>
>  e) Was entirely self-contained - did everything from one unified
>     block of source code.
>
> I.e. a very simple set of semantics.
>
> I'll explain the situation that I had. The reason I ran into the
> problem needing EV_DROPWATCH/EV_DROPPED was because I was trying to fix
> Perl's IO::KQueue.
>
> IO::KQueue tries to wrap kqueue/kevent for Perl, allowing the userland
> Perl code to store an arbitrary Perl data pointer in the udata field.
> This data is reference-counted. Userland might let the kernel store the
> only copy of that data, because it comes back in event notifications
> anyway. Because of this, the reference count has to be artificially
> incremented to account for the extra pointer in the kernel. Without
> knowing when the kernel will decide to drop that pointer, I never know
> when I should decrement the refcount myself.
>
> It has no knowledge of what userland is doing with this. It can't know
> when userland might be EV_ONESHOT'ing. It doesn't really know what
> events will be oneshot anyway (such as the process exit watches).
> Finally, it has no idea what other modules are going to call close() on
> it. This final problem was the real killer - while the first two
> -could- be worked around with more complex code structures, not knowing
> what other CPAN modules will ever call close() makes it impossible to
> handle. Simply asking every CPAN module to "please just call fd_close()
> instead of close()" doesn't work here.
>
> As compared: having the kernel tell userland when it calls knote_drop()
> is much simpler. It knows exactly when it is doing this, so simply
> pushing an event up to userland to tell it it did so is simple. If any
> more cases than the three known (EV_ONESHOT or other single-shot
> events; EV_DELETE, close()) are added, userland - and in particular,
> the IO::KQueue module, will not need updating. It will continue to
> decrement refcounts and free data perfectly happily when kernel has
> dropped the watch.
>
> I've used this pattern before in C libraries + higher-level language
> wrappers, and found it to be nicely simple to both implement and use.
> Because it follows the -same- event notification path that userland is
> already using, it manages to avoid quite a number of the
> race-conditions that a secondary, separate data structure and locking
> often runs into; e.g. if userland is trying to add a new thing into it
> just at the time there's a notification "in-flight" from the kernel
> about an old thing that it used to have.
>
> Principally - the fact that kernel tells -userland- about the delete,
> means that it can atomically *guarantee* that this *will* be the last
> event about this particular item. Userland must not delete its own data
> structure about it until this notification happens. If it does this,
> lots of semantics become a lot simpler.

I was responsible for the u_data field. It was not in the original
design that was proposed and I suggested it to Jonathan. I was thinking
purely of a simple way for an event to supply added information to its
handler that would obviate the need for the app to keep complicated
tracking structures.

I was not thinking in terms of "badly behaved" (sic) third party high
level ops using it through a language binding. I admit that I did not
think about the close issue at that time.

Your suggested changes are not unreasonable; however, we could do with
more discussion. The point about tracking objects that may be
arbitrarily destroyed without the framework being notified is valid and
aligns well with general robustness principles.

I would suggest that one answer would be to create an extension to
register a kevent to catch these events..

(the knote_drop())

The returned event could have all the appropriate information for the
event being dropped..
Re: Managing userland data pointers in kqueue/kevent
On Mon, 13 May 2013 11:23:45 -0700 Adrian Chadd wrote: > Just as a data point, I managed 50,000 + connections, at 5,000 + a > second, doing a gigabit + of traffic, mid-2000s, with the userland > management of all of the socket/disk FD stuff. > > The biggest overhead at the time was actually the read/write > copyin/copyout, NOT the locking overhead of managing this stuff. Why? > Because I architected the HTTP side of things to specifically pin FDs > to threads, and not allow arbitrary threads to deal with arbitrary > FDs. This removed the need for almost all of the state locking that > people are concerned about here. I think then this comes from different experiences. I'm guessing this application was: a) Written in C b) Entirely filled with identically-typed identical-purpose file descriptors c) Didn't really use any EV_ONESHOT events d) Didn't close sockets apart from when it received EOF and perhaps most importantly e) Was entirely self-contained - did everything from one unified block of source code. I.e. a very simple set of semantics. I'll explain the situation that I had. The reason I ran into the problem needing EV_DROPWATCH/EV_DROPPED was because I was trying to fix Perl's IO::KQueue. IO::KQueue tries to wrap kqueue/kevent for Perl, allowing the userland Perl code to store an arbitrary Perl data pointer in the udata field. This data is reference-counted. Userland might let the kernel store the only copy of that data, because it comes back in event notifications anyway. Because of this, the reference count has to be artificially incremented to account for the extra pointer in the kernel. Without knowing when the kernel will decide to drop that pointer, I never know when I should decrement the refcount myself. It has no knowledge of what userland is doing with this. It can't know when userland might be EV_ONESHOT'ing. It doesn't really know what events will be oneshot anyway (such as the process exit watches). 
Finally, it has no idea what other modules are going to call close() on
it. This final problem was the real killer - while the first two
-could- be worked around with more complex code structures, not knowing
what other CPAN modules will ever call close() makes it impossible to
handle. Simply asking every CPAN module to "please just call fd_close()
instead of close()" doesn't work here.

As compared: having the kernel tell userland when it calls knote_drop()
is much simpler. It knows exactly when it is doing this, so simply
pushing an event up to userland to tell it it did so is simple. If any
more cases than the three known (EV_ONESHOT or other single-shot
events; EV_DELETE, close()) are added, userland - and in particular,
the IO::KQueue module, will not need updating. It will continue to
decrement refcounts and free data perfectly happily when kernel has
dropped the watch.

I've used this pattern before in C libraries + higher-level language
wrappers, and found it to be nicely simple to both implement and use.
Because it follows the -same- event notification path that userland is
already using, it manages to avoid quite a number of the
race-conditions that a secondary, separate data structure and locking
often runs into; e.g. if userland is trying to add a new thing into it
just at the time there's a notification "in-flight" from the kernel
about an old thing that it used to have.

Principally - the fact that kernel tells -userland- about the delete,
means that it can atomically *guarantee* that this *will* be the last
event about this particular item. Userland must not delete its own data
structure about it until this notification happens. If it does this,
lots of semantics become a lot simpler.

--
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
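The reference-counting problem described in the message above can be modelled in a few lines. This is a sketch, not the IO::KQueue source: the rc_ names, the stand-in RC_EV_DROPPED value and the 'destroyed' marker (used here instead of actually freeing, so the state can be inspected) are all assumptions made for illustration.

```c
#include <assert.h>

#define RC_EV_DROPPED 0x0800u   /* hypothetical kernel -> userland flag */

struct rc_obj {
    int refs;        /* number of holders, including the kernel's copy */
    int destroyed;   /* set when the last reference goes away */
};

void rc_incref(struct rc_obj *o)
{
    o->refs++;
}

void rc_decref(struct rc_obj *o)
{
    assert(o->refs > 0);
    if (--o->refs == 0)
        o->destroyed = 1;   /* a real wrapper would free the object here */
}

/*
 * Called when the object is handed to the kernel as udata: the kernel now
 * holds a pointer, so the wrapper takes a reference on its behalf.
 */
void *rc_register(struct rc_obj *o)
{
    rc_incref(o);
    return o;
}

/*
 * Called for every delivered event. Only the final, "dropped" event
 * releases the reference that was taken in rc_register().
 */
void rc_on_event(unsigned flags, void *udata)
{
    struct rc_obj *o = udata;

    if (flags & RC_EV_DROPPED)
        rc_decref(o);
}
```

Note that userland can drop its own reference at any time; the object survives until the kernel's reference is released by the final event, whatever caused the watch to end.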
Re: Managing userland data pointers in kqueue/kevent
On Mon, 13 May 2013 11:10:44 -0700
Adrian Chadd wrote:

> ... also, want to code up a test implementation?
>
> And some stress testing cases to throw in the regression tree?

I already mostly fixed Perl's IO::KQueue wrapper to use this
hypothetical feature; I can easily provide that somewhere for someone
to test it against. I actually wrote that bit first, before I found
such a feature did not exist.

That would allow some highly-parallel Perl code to use it. All the main
Perl event systems can use IO::KQueue so that easily provides a lot of
good test cases.

--
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
Re: Managing userland data pointers in kqueue/kevent
Just as a data point, I managed 50,000 + connections, at 5,000 + a second, doing a gigabit + of traffic, mid-2000s, with the userland management of all of the socket/disk FD stuff. The biggest overhead at the time was actually the read/write copyin/copyout, NOT the locking overhead of managing this stuff. Why? Because I architected the HTTP side of things to specifically pin FDs to threads, and not allow arbitrary threads to deal with arbitrary FDs. This removed the need for almost all of the state locking that people are concerned about here. Adrian
Re: Managing userland data pointers in kqueue/kevent
I also have a large project (crtmpserver) which makes heavy use of
socket FDs (with my little Token workaround) and timers. Currently it
can handle 2k streaming connections simultaneously, all of them full
duplex.

I would gladly patch it to use this new feature!!!

--
Eugen-Andrei Gavriloaie
Web: http://www.rtmpd.com

On May 13, 2013, at 9:15 PM, Paul "LeoNerd" Evans wrote:

> On Mon, 13 May 2013 11:10:44 -0700
> Adrian Chadd wrote:
>
>> ... also, want to code up a test implementation?
>>
>> And some stress testing cases to throw in the regression tree?
>
> I already mostly fixed Perl's IO::KQueue wrapper to use this
> hypothetical feature, I can easily provide that somewhere for someone
> to test it against. I actually wrote that bit first, before I found
> such a feature did not exist.
>
> That would allow some highly-parallel Perl code to use it. All the main
> Perl event systems can use IO::KQueue so that easily provides a lot of
> good test cases.
>
> --
> Paul "LeoNerd" Evans
>
> leon...@leonerd.org.uk
> ICQ# 4135350 | Registered Linux# 179460
> http://www.leonerd.org.uk/
Re: Managing userland data pointers in kqueue/kevent
--
Eugen-Andrei Gavriloaie
Web: http://www.rtmpd.com

On May 13, 2013, at 9:02 PM, Adrian Chadd wrote:

> Hi,
>
> The reason I tend to suggest this is for portability and debugging
> reasons. (Before and even since libevent came into existence.)
>
> If you do it right, you can stub / inline out all of the wrapper
> functions in userland and translate them to straight system or library
> calls.
>
> Anyway. I'm all for making kqueue better. I just worry that adding
> little hacks here and there isn't the right way to do it. If you want
> to guarantee specific behaviours with kqueue, you should likely define
> how it should work in its entirety and see if it will cause
> architectural difficulties down the track.

And it has caused some so far. We have workarounds for it, no problem.

> Until that is done, I think
> you have no excuse to get your code working as needed.

Yes. I agree. But when I look at the user space code without that
feature, and when thinking how it would have been with that feature, it
kinda makes me cry. A little more pain and I will make that patch
myself. I'm just hoping that the kq kernel side code will be handled by
more capable hands before me. Ideally, by the creators.

> Don't blame kqueue because what (iirc) is not defined behaviour isn't
> defined in a way that makes you happy :)

Nobody blamed kqueue. I'm just saying that it would be better for me
(and I'm not the only one who could use a little more help from it). It
was born from needs, it evolved because of needs, why stop now? I dare
to say it will become a standard on Linux and other OSs very soon. It is
the best fd reactor. Hands down!

Best regards,
Andrei

> Adrian
>
> On 13 May 2013 09:36, Eugen-Andrei Gavriloaie wrote:
>> Hi Adrian,
>>
>> All the tricks, work arounds, paradigms suggested/implemented by us, the kq
>> users, are greatly simplified by simply adding that thing that Paul is
>> suggesting.
What you are saying here is to basically do not-so-natural >> things to overcome a real problem which can be very easy and non-intrusivly >> solved at lower levels. Seriously, if you truly believe that you can put the >> equal sign between the complexity of the user space code and the wanted >> patch in kqueue kernel side, than I simply shut up. >> >> Besides, one of the important points in kq philosophy is simplifying things. >> I underline the "one of". It is not the goal, of course. Complex things are >> complex things no matter how hard you try to simplify them. But this is >> definitely (should) not falling into that category. >> >> -- >> Eugen-Andrei Gavriloaie >> Web: http://www.rtmpd.com >> >> On May 13, 2013, at 6:47 PM, Adrian Chadd wrote: >> >>> ... holy crap. >>> >>> On 13 May 2013 08:37, Eugen-Andrei Gavriloaie wrote: Hi, Well, Paul already asked this question like 3-4 times now. Even insisting on it. I will also ask it again: If user code is responsible of tracking down the data associated with the signalled entity, what is the point of having user data? Is rendered completely useless… >>> >>> .. why does everything have to have a well defined purpose that is >>> also suited for use in _all_ situations? >> That is called perfection. I know we can't achieve it, but I like to walk in >> that direction at least. >> >>> Not to mention, that your suggestion with FD index is a definite no-go. The FD values are re-used. Especially in MT environments. Imagine one kqueue call taking place in thread A and another one in thread B. Both threads waiting for events. >>> >>> .. so don't do that. I mean, you're already having to write your code >>> to _not_ touch FDs in other threads. I've done this before, it isn't >>> that hard and it doesn't hurt performance. >> Why not? This is how you achieve natural load balancing for multiple >> kevent() calls from multiple threads over the same kq fd. 
Otherwise, again, >> you have to write complex code to manually balance the threads. That brings >> locking again…. >> Why people always think that locking is cheap? Excessive locking hurts. A >> lot! >> >>> When A does his magic, because of internal business rules, it decides to close FD number 123. It closes it and it connects somewhere else by opening a new one. Surprise, we MAY get the value 123 again as a new socket, we put it on our index, etc. Now, thread B comes in and it has stale/old events for the old 123 FD. Somethings bad like EOF for the OLD version of FD number 123 (the one we just closed anyway). Guess what… thread B will deallocate the perfectly good thingy inside the index associated with 123. >>> >>> So you just ensure that nothing at all calls a close(123); but calls >>> fd_close(123) which will in turn close(123) and free all the state >>> associated with it. >> Once threads A and B returned from their kevent() calls, all bets are off. >> In between, yo
Re: Managing userland data pointers in kqueue/kevent
... also, want to code up a test implementation? And some stress testing cases to throw in the regression tree? I'll help shepherd this in if this all works out. thanks, Adrian
Re: Managing userland data pointers in kqueue/kevent
On 13 May 2013 10:53, Paul LeoNerd wrote:
> [I'm not currently on the list so please forgive the manually-crafted
> reply]
>
>> I'm confused as to why this is still an issue. Sure, fix the kqueue
>> semantics and do it in a way that doesn't break backwards
>> compatibility.
>
> I suggested that. Add a user->kernel flag
>
>   EV_DROPWATCH
>
> which, if present, causes kernel to send back to userland events with
> the kernel->user flag
>
>   EV_DROPPING
>
> any time it drops the pointer. Then trivially userland just has to set
> that flag on all its events to the kernel, and remember to send those
> events back to userland when it does in fact drop them.

Cool! Ok. I'll go bring this up at BSDCan and see what people think. I
haven't been knee deep in this stuff for a few years (but am about to
again, damned HTTP proxies!) and I would love to have better semantics
here. I just want to make sure it doesn't cause weird things for the
non-socket case - i.e., files (local, NFS) and signals.

Adrian
Re: Managing userland data pointers in kqueue/kevent
Hi, The reason I tend to suggest this is for portability and debugging reasons. (Before and even since libevent came into existence.) If you do it right, you can stub / inline out all of the wrapper functions in userland and translate them to straight system or library calls. Anyway. I'm all for making kqueue better. I just worry that adding little hacks here and there isn't the right way to do it. If you want to guarantee specific behaviours with kqueue, you should likely define how it should work in its entirety and see if it will cause architectural difficulties down the track. Until that is done, I think you have no excuse to get your code working as needed. Don't blame kqueue because what (iirc) is not defined behaviour isn't defined in a way that makes you happy :) Adrian On 13 May 2013 09:36, Eugen-Andrei Gavriloaie wrote: > Hi Adrian, > > All the tricks, work arounds, paradigms suggested/implemented by us, the kq > users, are greatly simplified by simply adding that thing that Paul is > suggesting. What you are saying here is to basically do not-so-natural things > to overcome a real problem which can be very easy and non-intrusivly solved > at lower levels. Seriously, if you truly believe that you can put the equal > sign between the complexity of the user space code and the wanted patch in > kqueue kernel side, than I simply shut up. > > Besides, one of the important points in kq philosophy is simplifying things. > I underline the "one of". It is not the goal, of course. Complex things are > complex things no matter how hard you try to simplify them. But this is > definitely (should) not falling into that category. > > -- > Eugen-Andrei Gavriloaie > Web: http://www.rtmpd.com > > On May 13, 2013, at 6:47 PM, Adrian Chadd wrote: > >> ... holy crap. >> >> On 13 May 2013 08:37, Eugen-Andrei Gavriloaie wrote: >>> Hi, >>> >>> Well, Paul already asked this question like 3-4 times now. Even insisting >>> on it. 
I will also ask it again: >>> If user code is responsible of tracking down the data associated with the >>> signalled entity, what is the point of having user data? >>> Is rendered completely useless… >> >> .. why does everything have to have a well defined purpose that is >> also suited for use in _all_ situations? > That is called perfection. I know we can't achieve it, but I like to walk in > that direction at least. > >> >>> Not to mention, that your suggestion with FD index is a definite no-go. The >>> FD values are re-used. Especially in MT environments. Imagine one kqueue >>> call taking place in thread A and another one in thread B. Both threads >>> waiting for events. >> >> .. so don't do that. I mean, you're already having to write your code >> to _not_ touch FDs in other threads. I've done this before, it isn't >> that hard and it doesn't hurt performance. > Why not? This is how you achieve natural load balancing for multiple kevent() > calls from multiple threads over the same kq fd. Otherwise, again, you have > to write complex code to manually balance the threads. That brings locking > again…. > Why people always think that locking is cheap? Excessive locking hurts. A lot! > >> >>> When A does his magic, because of internal business rules, it decides to >>> close FD number 123. It closes it and it connects somewhere else by opening >>> a new one. Surprise, we MAY get the value 123 again as a new socket, we >>> put it on our index, etc. Now, thread B comes in and it has stale/old >>> events for the old 123 FD. Somethings bad like EOF for the OLD version of >>> FD number 123 (the one we just closed anyway). Guess what… thread B will >>> deallocate the perfectly good thingy inside the index associated with 123. >> >> So you just ensure that nothing at all calls a close(123); but calls >> fd_close(123) which will in turn close(123) and free all the state >> associated with it. > Once threads A and B returned from their kevent() calls, all bets are off. 
In > between, you get the the behaviour I just described from threads A and B > racing towards FD123 to either close it or create a new one. How is wrapping > close() going to help? Is not like you have any control over what the > socket() function is going to return. (That gave me another token idea btw… I > will explain in another email, perhaps you care to comment) > Mathematically speaking, the fd-to-data association is not bijective. > > >> >> You have fd_close() either grab a lock, or you ensure that only the >> owning thread can call fd_close(123) and if any other thread calls it, >> the behaviour is undefined. > As I said, that adds up to the user-space code complexity. Just don't forget > that Paul's suggestion solves all this problems in a ridiculously simple > manner. All our ideas of keeping track who is owning who and indexes are > going to be put to rest. kq will notify us when the udata is out of scope > from kq perspective. That is all we ask. > >> >>> And regarding the "thre
Re: Managing userland data pointers in kqueue/kevent
[I'm not currently on the list so please forgive the manually-crafted reply]

> I'm confused as to why this is still an issue. Sure, fix the kqueue
> semantics and do it in a way that doesn't break backwards
> compatibility.

I suggested that. Add a user->kernel flag EV_DROPWATCH which, if present, causes the kernel to send back to userland events with the kernel->user flag EV_DROPPING any time it drops the pointer. Then trivially userland just has to set that flag on all its events to the kernel, and the kernel has to remember to send those events back to userland when it does in fact drop them.

These events can be trivially created from the knote_drop() function:

http://fxr.watson.org/fxr/source/kern/kern_event.c#L2127

because that's called everywhere in the kernel that actually drops the watch.

Which brings me onto the main reason why: it becomes a lot simpler to write userland code. When I wrote the original idea 3 years ago, it was after some research into what reasons would drop these watches in the kernel. Having the kernel tell userland future-proofs it a lot better.

To further answer the threading questions: having the locking point decided by the kernel, and the event reflected back up to userland still carrying the pointer that the kernel had, simplifies all this locking. Now userland doesn't have to contend on a Big Structure Lock around whatever data structure it uses to store all this information. It allows less userland contention.

It's fully back-compatible, because all it does is add a new user->kernel flag. Userland that doesn't know about the flag won't set it, and then no behaviour is changed. If the flag -is- set then userland simply starts receiving a few extra events, or has another bit flag set on the events it was already receiving.

Finally: I feel quite sure this feature is implementable in ballpark-50 lines of kernel-side code. I'd half-bet the documentation would be longer than that. It is truly a tiny addition of behaviour to export information the kernel already knows (namely: that it is calling knote_drop()). I can't see any objection to it. I'm quite sure more words and objections have been spent arguing it back and forth than it would have taken just to implement it initially.

--
Paul "LeoNerd" Evans
leon...@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
Re: Managing userland data pointers in kqueue/kevent
On Mon, 13 May 2013 18:19:43 +0300 Eugen-Andrei Gavriloaie wrote:
> I'm pretty sure this user data pointer is also breaking a well known
> pointer management paradigm, but I just can't remember which.
> Regardless, it has all the ingredients for memory leaks and/or, the
> worst one, use of corpse pointers which are bound to crash the app. I
> agree, C/C++ is not for the faint of heart, but with little or close
> to no efforts, his EV_FREEWATCH can be put to very good use, and user
> space code not only becomes less prone to mem issues, but also
> cleaner.
>
> To summarise, +1 for the EV_FREEWATCH. I simply love the idea! Clean
> and very very efficient.

I actually developed the idea a little further and put some notes on implementation/etc here in this PR:

http://www.freebsd.org/cgi/query-pr.cgi?pr=153254

I don't think anyone has looked at it though. If anyone were to just say "yes" and explain how to start developing a kernel feature, I'm sure I'd be happy to look into it.

--
Paul "LeoNerd" Evans
leon...@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
Re: Managing userland data pointers in kqueue/kevent
Hi Adrian, All the tricks, work arounds, paradigms suggested/implemented by us, the kq users, are greatly simplified by simply adding that thing that Paul is suggesting. What you are saying here is to basically do not-so-natural things to overcome a real problem which can be very easy and non-intrusivly solved at lower levels. Seriously, if you truly believe that you can put the equal sign between the complexity of the user space code and the wanted patch in kqueue kernel side, than I simply shut up. Besides, one of the important points in kq philosophy is simplifying things. I underline the "one of". It is not the goal, of course. Complex things are complex things no matter how hard you try to simplify them. But this is definitely (should) not falling into that category. -- Eugen-Andrei Gavriloaie Web: http://www.rtmpd.com On May 13, 2013, at 6:47 PM, Adrian Chadd wrote: > ... holy crap. > > On 13 May 2013 08:37, Eugen-Andrei Gavriloaie wrote: >> Hi, >> >> Well, Paul already asked this question like 3-4 times now. Even insisting on >> it. I will also ask it again: >> If user code is responsible of tracking down the data associated with the >> signalled entity, what is the point of having user data? >> Is rendered completely useless… > > .. why does everything have to have a well defined purpose that is > also suited for use in _all_ situations? That is called perfection. I know we can't achieve it, but I like to walk in that direction at least. > >> Not to mention, that your suggestion with FD index is a definite no-go. The >> FD values are re-used. Especially in MT environments. Imagine one kqueue >> call taking place in thread A and another one in thread B. Both threads >> waiting for events. > > .. so don't do that. I mean, you're already having to write your code > to _not_ touch FDs in other threads. I've done this before, it isn't > that hard and it doesn't hurt performance. Why not? 
This is how you achieve natural load balancing for multiple kevent() calls from multiple threads over the same kq fd. Otherwise, again, you have to write complex code to manually balance the threads. That brings locking again…. Why people always think that locking is cheap? Excessive locking hurts. A lot! > >> When A does his magic, because of internal business rules, it decides to >> close FD number 123. It closes it and it connects somewhere else by opening >> a new one. Surprise, we MAY get the value 123 again as a new socket, we put >> it on our index, etc. Now, thread B comes in and it has stale/old events for >> the old 123 FD. Somethings bad like EOF for the OLD version of FD number 123 >> (the one we just closed anyway). Guess what… thread B will deallocate the >> perfectly good thingy inside the index associated with 123. > > So you just ensure that nothing at all calls a close(123); but calls > fd_close(123) which will in turn close(123) and free all the state > associated with it. Once threads A and B returned from their kevent() calls, all bets are off. In between, you get the the behaviour I just described from threads A and B racing towards FD123 to either close it or create a new one. How is wrapping close() going to help? Is not like you have any control over what the socket() function is going to return. (That gave me another token idea btw… I will explain in another email, perhaps you care to comment) Mathematically speaking, the fd-to-data association is not bijective. > > You have fd_close() either grab a lock, or you ensure that only the > owning thread can call fd_close(123) and if any other thread calls it, > the behaviour is undefined. As I said, that adds up to the user-space code complexity. Just don't forget that Paul's suggestion solves all this problems in a ridiculously simple manner. All our ideas of keeping track who is owning who and indexes are going to be put to rest. 
kq will notify us when the udata is out of scope from kq perspective. That is all we ask. > >> And regarding the "thread happiness", that is not happiness at all IMHO… > > Unless you're writing a high connection throughput web server, the > overhead of grabbing a lock in userland during the fd shutdown process > is trivial. Yes, I've written those. It doesn't hurt you that much. That "that much" is subjective. And a streaming server is a few orders of magnitude more complex than a web server. Remember, a web server is bound to request/response paradigm. While a streaming server is a full duplex (not request/response based) animal for most of connections. I strongly believe that becomes a real problem. (I would love to be wrong on this one!) > > I'm confused as to why this is still an issue. Sure, fix the kqueue > semantics and do it in a way that doesn't break backwards > compatibility. Than, if someone has time and pleasure, it would be nice to have it. Is a neat solution. Is one thing saying, hey, we don't have time, do it yourself. And another thing in
Re: Managing userland data pointers in kqueue/kevent
... holy crap. On 13 May 2013 08:37, Eugen-Andrei Gavriloaie wrote: > Hi, > > Well, Paul already asked this question like 3-4 times now. Even insisting on > it. I will also ask it again: > If user code is responsible of tracking down the data associated with the > signalled entity, what is the point of having user data? > Is rendered completely useless… .. why does everything have to have a well defined purpose that is also suited for use in _all_ situations? > Not to mention, that your suggestion with FD index is a definite no-go. The > FD values are re-used. Especially in MT environments. Imagine one kqueue call > taking place in thread A and another one in thread B. Both threads waiting > for events. .. so don't do that. I mean, you're already having to write your code to _not_ touch FDs in other threads. I've done this before, it isn't that hard and it doesn't hurt performance. > When A does his magic, because of internal business rules, it decides to > close FD number 123. It closes it and it connects somewhere else by opening a > new one. Surprise, we MAY get the value 123 again as a new socket, we put it > on our index, etc. Now, thread B comes in and it has stale/old events for the > old 123 FD. Somethings bad like EOF for the OLD version of FD number 123 (the > one we just closed anyway). Guess what… thread B will deallocate the > perfectly good thingy inside the index associated with 123. So you just ensure that nothing at all calls a close(123); but calls fd_close(123) which will in turn close(123) and free all the state associated with it. You have fd_close() either grab a lock, or you ensure that only the owning thread can call fd_close(123) and if any other thread calls it, the behaviour is undefined. > And regarding the "thread happiness", that is not happiness at all IMHO… Unless you're writing a high connection throughput web server, the overhead of grabbing a lock in userland during the fd shutdown process is trivial. Yes, I've written those. 
It doesn't hurt you that much.

I'm confused as to why this is still an issue. Sure, fix the kqueue semantics and do it in a way that doesn't break backwards compatibility. But please don't claim that it's stopping you from getting real work done.

I've written network apps with kqueue that scale to 8+ cores and (back in the mid-2000s) gigabit+ of small HTTP transactions. This stuff isn't at all problematic.

Adrian
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
Re: Managing userland data pointers in kqueue/kevent
Hi, Well, Paul already asked this question like 3-4 times now. Even insisting on it. I will also ask it again: If user code is responsible of tracking down the data associated with the signalled entity, what is the point of having user data? Is rendered completely useless… Not to mention, that your suggestion with FD index is a definite no-go. The FD values are re-used. Especially in MT environments. Imagine one kqueue call taking place in thread A and another one in thread B. Both threads waiting for events. When A does his magic, because of internal business rules, it decides to close FD number 123. It closes it and it connects somewhere else by opening a new one. Surprise, we MAY get the value 123 again as a new socket, we put it on our index, etc. Now, thread B comes in and it has stale/old events for the old 123 FD. Somethings bad like EOF for the OLD version of FD number 123 (the one we just closed anyway). Guess what… thread B will deallocate the perfectly good thingy inside the index associated with 123. And regarding the "thread happiness", that is not happiness at all IMHO… Best regards, Andrei -- Eugen-Andrei Gavriloaie Web: http://www.rtmpd.com On May 13, 2013, at 6:25 PM, Adrian Chadd wrote: > ... or you could just track the per-descriptor / per-object stuff in > userland, and use the FD/signal as an index into the state you need. > > adding thread happiness on top of that is trivial. > > Done/done. > > > > > Adrian > > On 13 May 2013 08:19, Eugen-Andrei Gavriloaie wrote: >> Hello to all, >> >> I'm trying to reply to this thread: >> http://lists.freebsd.org/pipermail/freebsd-hackers/2010-November/033565.html >> >> I also faced this very difficult task of tracking down the user data >> registered into kq. >> I end up having some "Tokens" instances which I never deallocate but always >> re-use them or create new ones if necessary. This tokens are used as user >> data for kq. 
They are keeping the actual pointers inside them, and the >> pointer itself has a reference to the Token. When the pointer dies, I reset >> the guts of the token. When the time comes to use the token, I have the >> guarantee is not the corpse of a token (never deallocate them, remember?) >> and I can see that the actual pointer was gone, everyone is happy. At the >> application shutdown, I cleanup the mess (the tokens). However, I just want >> to say that Paul has a valid point when he is wondering why EV_FREEWATCH was >> not provisioned/implemented. >> >> The moment we throw multi-threading into equation, this becomes a extremely >> hard thing to manage (close to impossible), including my "proven-to-work" >> Token trick. It renders the user data pointer completely useless because in >> the end we need to keep an association map inside user space. That is >> forcing user space code to do a lookup before use, instead of >> straightforward use. Not to mention the fact that we need to perform a lock >> before searching it. That brings havoc on kernel side on 1000+ active >> connections (a multi-threaded streaming server for example). >> >> I'm pretty sure this user data pointer is also breaking a well known pointer >> management paradigm, but I just can't remember which. Regardless, it has all >> the ingredients for memory leaks and/or, the worst one, use of corpse >> pointers which are bound to crash the app. I agree, C/C++ is not for the >> faint of heart, but with little or close to no efforts, his EV_FREEWATCH can >> be put to very good use, and user space code not only becomes less prone to >> mem issues, but also cleaner. >> >> To summarise, +1 for the EV_FREEWATCH. I simply love the idea! Clean and >> very very efficient. >> >> Best regards, >> Andrei >> >> -- >> Eugen-Andrei Gavriloaie >> Web: http://www.rtmpd.com >> smime.p7s Description: S/MIME cryptographic signature
Re: Managing userland data pointers in kqueue/kevent
... or you could just track the per-descriptor / per-object stuff in userland, and use the FD/signal as an index into the state you need. adding thread happiness on top of that is trivial. Done/done. Adrian On 13 May 2013 08:19, Eugen-Andrei Gavriloaie wrote: > Hello to all, > > I'm trying to reply to this thread: > http://lists.freebsd.org/pipermail/freebsd-hackers/2010-November/033565.html > > I also faced this very difficult task of tracking down the user data > registered into kq. > I end up having some "Tokens" instances which I never deallocate but always > re-use them or create new ones if necessary. This tokens are used as user > data for kq. They are keeping the actual pointers inside them, and the > pointer itself has a reference to the Token. When the pointer dies, I reset > the guts of the token. When the time comes to use the token, I have the > guarantee is not the corpse of a token (never deallocate them, remember?) and > I can see that the actual pointer was gone, everyone is happy. At the > application shutdown, I cleanup the mess (the tokens). However, I just want > to say that Paul has a valid point when he is wondering why EV_FREEWATCH was > not provisioned/implemented. > > The moment we throw multi-threading into equation, this becomes a extremely > hard thing to manage (close to impossible), including my "proven-to-work" > Token trick. It renders the user data pointer completely useless because in > the end we need to keep an association map inside user space. That is forcing > user space code to do a lookup before use, instead of straightforward use. > Not to mention the fact that we need to perform a lock before searching it. > That brings havoc on kernel side on 1000+ active connections (a > multi-threaded streaming server for example). > > I'm pretty sure this user data pointer is also breaking a well known pointer > management paradigm, but I just can't remember which. 
Regardless, it has all > the ingredients for memory leaks and/or, the worst one, use of corpse > pointers which are bound to crash the app. I agree, C/C++ is not for the > faint of heart, but with little or close to no efforts, his EV_FREEWATCH can > be put to very good use, and user space code not only becomes less prone to > mem issues, but also cleaner. > > To summarise, +1 for the EV_FREEWATCH. I simply love the idea! Clean and very > very efficient. > > Best regards, > Andrei > > -- > Eugen-Andrei Gavriloaie > Web: http://www.rtmpd.com > ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
Managing userland data pointers in kqueue/kevent
Hello to all,

I'm trying to reply to this thread:
http://lists.freebsd.org/pipermail/freebsd-hackers/2010-November/033565.html

I also faced this very difficult task of tracking down the user data registered into kq. I ended up with some "Token" instances which I never deallocate; I always re-use them, or create new ones if necessary. These tokens are used as user data for kq. They keep the actual pointers inside them, and the pointed-to object itself has a reference to the Token. When the pointer dies, I reset the guts of the token. When the time comes to use the token, I have the guarantee that it is not the corpse of a token (never deallocate them, remember?) and I can see that the actual pointer is gone, so everyone is happy. At application shutdown, I clean up the mess (the tokens). However, I just want to say that Paul has a valid point when he wonders why EV_FREEWATCH was not provisioned/implemented.

The moment we throw multi-threading into the equation, this becomes an extremely hard thing to manage (close to impossible), including my "proven-to-work" Token trick. It renders the user data pointer completely useless, because in the end we need to keep an association map inside user space. That forces user space code to do a lookup before use, instead of straightforward use. Not to mention the fact that we need to take a lock before searching it. That brings havoc on kernel side on 1000+ active connections (a multi-threaded streaming server for example).

I'm pretty sure this user data pointer is also breaking a well known pointer management paradigm, but I just can't remember which. Regardless, it has all the ingredients for memory leaks and/or, the worst one, use of corpse pointers which are bound to crash the app. I agree, C/C++ is not for the faint of heart, but with little to no effort, his EV_FREEWATCH can be put to very good use, and user space code not only becomes less prone to mem issues, but also cleaner.

To summarise, +1 for the EV_FREEWATCH. I simply love the idea! Clean and very very efficient.

Best regards,
Andrei

--
Eugen-Andrei Gavriloaie
Web: http://www.rtmpd.com
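For readers trying to picture the Token trick described above, here is a minimal single-threaded sketch in C. Everything in it is an assumption made for illustration: the names token_acquire(), token_reset() and token_target() are hypothetical, and the post does not say how the tokens are actually stored or recycled. The sketch never frees a token, so a token pointer handed to kq as udata always points at valid memory.

```c
#include <assert.h>
#include <stdlib.h>

/* A token outlives the object it stands for, so a stale udata
 * pointer returned by kevent() never refers to freed memory. */
struct token {
    void         *target;     /* NULL once the real object has died */
    struct token *next_free;  /* link in the recycle list */
};

/* Tokens waiting to be re-used; they are never passed to free(). */
static struct token *free_list;

/* Hypothetical helper: get a token for target, recycling if possible. */
struct token *token_acquire(void *target)
{
    struct token *t = free_list;
    if (t != NULL) {
        free_list = t->next_free;
    } else {
        t = malloc(sizeof *t);
        if (t == NULL)
            return NULL;
    }
    t->target = target;
    t->next_free = NULL;
    return t;
}

/* Hypothetical helper: the object died, so reset the token's guts
 * and put the token back on the recycle list. */
void token_reset(struct token *t)
{
    assert(t != NULL);
    t->target = NULL;
    t->next_free = free_list;
    free_list = t;
}

/* Hypothetical helper: what an event handler would call on udata.
 * Returns NULL if the object behind the token is gone. */
void *token_target(const struct token *t)
{
    return t->target;
}
```

Note that this sketch recycles a token as soon as it is reset, so a stale event could observe a newer object's pointer through the same token. Deciding when recycling is safe, and adding the locking needed for more than one thread, is left open here, which is the part the post calls close to impossible.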
Re: Managing userland data pointers in kqueue/kevent
On Mon, Nov 15, 2010 at 12:51:57PM -0800, Julian Elischer wrote:
> "keep more information associated with each kevent and use the user
> cookie to match them" this is what it was for.
> it's a tool, not an answer. Given this tool you should be able to
> get what you want.
> how you do it is your job.

OK. Then I am not seeing it. I would love to see an example, if you or anyone else could provide me one, of how I am supposed to use this feature. That would be great... but please read below first.

> It's not the kernel's job to keep application specific data for
> you.. but it gives you a way
> to do it yourself and keep track of it trivially.
> It's expected that for every event the user gives to the kernel, he
> has some matching
> information about that event in userland.

Sure. The information I keep in userland is in the structure at the end of that udata pointer.

Since you claim it to be so trivial then, I would like to ask you to explain it. It should be quite a simple task:

---

Show me a program that, on receipt of -any- event out of the kqueue file descriptor, can print the word "FREE\n" when the kernel has now dropped its side of the watcher for this event. Specifically, it has to print "FREE\n" in any of the following four conditions:

 1. After a final event, such as EVFILT_PROC,NOTE_EXIT

 2. After any event that had been registered with EV_ONESHOT

 3. After the user has called EV_SET(..., EV_DELETE,...) on it

 4. After calling close(fd) on a filehandle that has been registered
    under EVFILT_READ or EVFILT_WRITE

---

I am claiming that such a program cannot be written using the current kqueue interface, while simply allowing the user code to call EV_SET however they like and put their own pointers in it. If I read your assertion of triviality correctly, then you are claiming that such a program is indeed possible. I would therefore invite you to demonstrate for me such a program.

If perhaps this does indeed prove to be impossible, I would like you instead to demonstrate a program having all the above properties, but allowing you to arbitrarily wrap the kqueue API; store extra data in my structures, or hook extra information around EV_SET calls.

I have already demonstrated -a- way to solve this, by storing data in the event udata structure to answer 2, and storing a full mapping from ident+filter to udata pointer, to answer 3. I declare 1 trivial by inspection of the results in the returned kevent. I declare 4 to be impossible short of such hackery as LD_PRELOAD around the actual close() libc function.

In short, I claim that a solution to all parts 1-4 is impossible. It is possible to solve 1-3 only, by storing a full mapping from ident+filter to udata pointer, in userland. But then by doing that why bother giving the pointer to the kernel in the first place?

There comes a further complication with problem 1, however, for a wrapping library that tries to provide a generic interface around kqueue. Right now, the following function could be said to implement problem 1:

  int is_final(struct kevent *ev)
  {
    switch(ev->filter) {
      case EVFILT_READ:
      case EVFILT_WRITE:
        return ev->flags & EV_EOF;
      case EVFILT_VNODE:
        return ev->fflags & (NOTE_DELETE|NOTE_REVOKE);
        /* I'm only guessing on this one from reading the docs, I'm not
         * 100% sure */
      case EVFILT_PROC:
        return ev->fflags & NOTE_EXIT;
      default:
        return 0;
    }
  }

And in fact even this code isn't perfect, because the kqueue(2) manpage does also point out that EV_EOF on a pipe/fifo isn't final, because you can EV_CLEAR to reset the EOF condition and wait again. So maybe this code ought to read:

      case EVFILT_READ:
      case EVFILT_WRITE:
        {
          struct stat st;
          fstat(ev->ident, &st);
          return (ev->flags & EV_EOF) && !(S_ISFIFO(st.st_mode));
        }

And so now we suddenly have to make an fstat() call -every- time we receive an event on a read/write filter? OK well clearly not, we'd in fact do that once at EV_ADD time, and store whether it's a FIFO in our extended udata structure, so as to know if EV_EOF is final. But then we're having to use that udata structure to store data internal to the purposes of this kqueue interface, and not the overall user data.

Are you still now going to claim to me this is trivial?

Please compare this solution to:

  if(ev->flags & EV_FREEWATCH)
    free(ev->udata);

I would call that solution "trivial". And I claim it fairly easy to implement.

--
Paul "LeoNerd" Evans
leon...@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
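To make the comparison above concrete, here is a rough sketch of what a generic dispatch loop might look like if the proposed flag existed. It is only an illustration under loud assumptions: EV_FREEWATCH is the flag proposed in this thread and is not in any <sys/event.h>, so the sketch uses a stand-in struct mock_kevent and an invented MOCK_EV_FREEWATCH value instead of the real kevent types. The names struct watch and dispatch() are likewise hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for struct kevent, reduced to the fields used here.
 * ASSUMPTION: not the real definition from <sys/event.h>. */
struct mock_kevent {
    uintptr_t      ident;
    short          filter;
    unsigned short flags;
    void          *udata;
};

/* HYPOTHETICAL: the flag proposed in this thread. The bit value is
 * invented for this sketch and means nothing to a real kernel. */
#define MOCK_EV_FREEWATCH 0x0800

/* Per-watch user data, allocated with malloc() at EV_ADD time. */
struct watch {
    int hits;  /* stands in for whatever work the user does per event */
};

/* Walk a batch of returned events. The wrapper needs no side table:
 * it frees udata exactly when an event says the watch was dropped.
 * Returns how many watches were freed. */
size_t dispatch(struct mock_kevent *evs, size_t n)
{
    size_t freed = 0;
    for (size_t i = 0; i < n; i++) {
        struct watch *w = evs[i].udata;
        assert(w != NULL);
        w->hits++;
        if (evs[i].flags & MOCK_EV_FREEWATCH) {
            free(w);  /* no further events will carry this pointer */
            freed++;
        }
    }
    return freed;
}
```

The point of the sketch is only that the cleanup decision becomes a single flag test per event, with no ident+filter map and no per-filter knowledge of which events are final.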
Re: Managing userland data pointers in kqueue/kevent
On 11/15/10 12:14 PM, Paul LeoNerd Evans wrote: On Mon, Nov 15, 2010 at 11:37:23AM -0800, Julian Elischer wrote: I don't think it was thought of in the context of reference counted items. This problem has nothing to do with reference counting, and all to do with resource ownership. Consider in the totally C-based world, no refcounts, just malloc() and free(). You malloc() some event structure, put it in the udata field to the kernel, then return to the main run loop. You've dropped every reference to this malloc()ed memory, because hey, kernel has it. Later, event fires, kernel gives you back that pointer. Great. Lets use it. Was it a PROC|EXIT event? Lets free() the data. Was it an event that had been registered as EV_ONESHOT? Oops. We can't remember because kernel didn't tell us. Want to EV_DELETE it now? Can't, lost the pointer, can't ask kernel for it back. Want to close() the filehandle associated? Can't, because kernel has pointer that it'll drop. In all these cases we'll memory leak the malloc()ed data. The only solution here is to keep another copy somewhere up in userland. This copy has to be associated with the original filter specification, so that on EV_DELETE we know the pointer so can free() it. But, as I said, if we're going to keep that mapping, why does the kernel even give us this udata ability in the first place? We might as well not bother and use the mapping all the time. Maybe this just is what people do? That was the thrust of my first question - _is_ this what people do? I'm not experienced enough with kqueue to know what is best practice here, and the documentation gives no guidance. Can someone advise me? --- Totally separate to that, if nobody has really thought of a solution to this before, what are anyone's thoughts on my suggestion of the EV_FREEWATCH flag? Get the kernel always to tell userland that it has dropped a watch, and return the pointer back, so userland can do whatever it wants by way of resource reclaimation. 
> > you could use an ever increasing number that you hash on a hash table.
> > if the kernel returns a number that is out of date you won't find it
> > and you can ignore it. If the kernel returns a number you are
> > currently tracking. then you use the item associated with that entry.
>
> I'm really not sure I understand where this is going, or how it helps
> me...

"keep more information associated with each kevent and use the user cookie to match them" this is what it was for.
it's a tool, not an answer. Given this tool you should be able to get what you want.
how you do it is your job.

It's not the kernel's job to keep application specific data for you.. but it gives you a way to do it yourself and keep track of it trivially.
It's expected that for every event the user gives to the kernel, he has some matching information about that event in userland.
Re: Managing userland data pointers in kqueue/kevent
On Mon, Nov 15, 2010 at 11:37:23AM -0800, Julian Elischer wrote:
> I don't think it was thought of in the context of reference counted items.

This problem has nothing to do with reference counting, and all to do with resource ownership.

Consider the totally C-based world: no refcounts, just malloc() and free().

You malloc() some event structure, put it in the udata field to the kernel, then return to the main run loop. You've dropped every reference to this malloc()ed memory, because hey, kernel has it.

Later, event fires, kernel gives you back that pointer. Great. Let's use it.

Was it a PROC|EXIT event? Let's free() the data.

Was it an event that had been registered as EV_ONESHOT? Oops. We can't remember because kernel didn't tell us.

Want to EV_DELETE it now? Can't, lost the pointer, can't ask kernel for it back.

Want to close() the filehandle associated? Can't, because kernel has pointer that it'll drop.

In all these cases we'll memory leak the malloc()ed data.

The only solution here is to keep another copy somewhere up in userland. This copy has to be associated with the original filter specification, so that on EV_DELETE we know the pointer so can free() it. But, as I said, if we're going to keep that mapping, why does the kernel even give us this udata ability in the first place? We might as well not bother and use the mapping all the time.

Maybe this just is what people do? That was the thrust of my first question - _is_ this what people do? I'm not experienced enough with kqueue to know what is best practice here, and the documentation gives no guidance. Can someone advise me?

---

Totally separate to that, if nobody has really thought of a solution to this before, what are anyone's thoughts on my suggestion of the EV_FREEWATCH flag? Get the kernel always to tell userland that it has dropped a watch, and return the pointer back, so userland can do whatever it wants by way of resource reclamation.

> you could use an ever increasing number that you hash on a hash table.
> if the kernel returns a number that is out of date you won't find it
> and you
> can ignore it. If the kernel returns a number you are currently tracking.
> then you use the item associated with that entry.

I'm really not sure I understand where this is going, or how it helps me...

--
Paul "LeoNerd" Evans
leon...@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
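The userland copy described in this message (a second record of each udata pointer, associated with the original filter specification so that EV_DELETE can find it again) might be sketched as below. This is a minimal illustration, not anyone's actual library code: the names watch_remember() and watch_forget() are made up, the (ident, filter) pair is assumed to be a sufficient key, and the sketch deliberately makes no kevent() calls.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One userland record per registered watch, keyed by the filter
 * specification that was handed to the kernel. */
struct watch_entry {
    uintptr_t           ident;
    short               filter;
    void               *udata;
    struct watch_entry *next;
};

static struct watch_entry *watches;  /* head of a simple linked list */

/* Hypothetical helper, called next to EV_ADD: remember which udata
 * pointer went with (ident, filter). Returns 0, or -1 if out of memory. */
int watch_remember(uintptr_t ident, short filter, void *udata)
{
    assert(udata != NULL);
    struct watch_entry *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->ident = ident;
    e->filter = filter;
    e->udata = udata;
    e->next = watches;
    watches = e;
    return 0;
}

/* Hypothetical helper, called next to EV_DELETE: drop the record and
 * hand back the udata pointer so the caller can free() it.
 * Returns NULL if no such watch was remembered. */
void *watch_forget(uintptr_t ident, short filter)
{
    for (struct watch_entry **pp = &watches; *pp != NULL; pp = &(*pp)->next) {
        struct watch_entry *e = *pp;
        if (e->ident == ident && e->filter == filter) {
            void *udata = e->udata;
            *pp = e->next;
            free(e);
            return udata;
        }
    }
    return NULL;
}
```

A linear list is used only to keep the sketch short. As the message points out, once this map exists the kernel-held udata pointer adds little, and the map still cannot see a plain close() on the descriptor.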
Re: Managing userland data pointers in kqueue/kevent
On 11/15/10 10:38 AM, Paul LeoNerd Evans wrote:
> On Mon, Nov 15, 2010 at 10:33:25AM -0800, Julian Elischer wrote:
>> it was provided for pretty much what you are using it for, so that
>> the userland caller could easily associate the returning event with
>> some private information about the event.
> This was indeed the impression I got.
>
> With reference to my original questions regarding its use, perhaps you
> could suggest some way to actually use this API then, in order to solve
> my problem? Unless there's some subtle detail or trick I have
> misunderstood, it doesn't appear to be easily possible in this manner.
>
> How would you suggest I manage these pointers and data structures?

I don't think it was thought of in the context of reference counted items.

you could use an ever increasing number that you hash on a hash table.
if the kernel returns a number that is out of date you won't find it and you can ignore it. If the kernel returns a number you are currently tracking. then you use the item associated with that entry.
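One possible reading of the suggestion above, sketched in C: an ever-increasing cookie is stored in udata in place of a pointer, and a small table maps live cookies back to their items. The names cookie_register(), cookie_lookup() and cookie_release(), the fixed table size, and the direct-mapped layout (slot = cookie modulo table size) are all assumptions of this sketch; the message does not specify any of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT 64u  /* hypothetical fixed capacity */

struct slot {
    uintptr_t cookie;   /* 0 means the slot is empty */
    void     *item;
};

static struct slot table[SLOT_COUNT];
static uintptr_t next_cookie = 1;  /* only ever increases; assumed not to wrap */

/* Store item and return the cookie to place in udata, or 0 if no free
 * slot was found. Cookies whose slot is occupied are simply skipped. */
uintptr_t cookie_register(void *item)
{
    assert(item != NULL);
    for (unsigned tries = 0; tries < SLOT_COUNT; tries++) {
        uintptr_t c = next_cookie++;
        if (c == 0)
            continue;  /* 0 is reserved for "empty" */
        struct slot *s = &table[c % SLOT_COUNT];
        if (s->cookie == 0) {
            s->cookie = c;
            s->item = item;
            return c;
        }
    }
    return 0;
}

/* Map a cookie from a returned event back to its item. An out-of-date
 * cookie is not found and yields NULL, so the event can be ignored. */
void *cookie_lookup(uintptr_t cookie)
{
    struct slot *s = &table[cookie % SLOT_COUNT];
    return (cookie != 0 && s->cookie == cookie) ? s->item : NULL;
}

/* Stop tracking a cookie and return its item so the caller can free it. */
void *cookie_release(uintptr_t cookie)
{
    void *item = cookie_lookup(cookie);
    if (item != NULL) {
        struct slot *s = &table[cookie % SLOT_COUNT];
        s->cookie = 0;
        s->item = NULL;
    }
    return item;
}
```

Because a released cookie is never handed out again, a late event carrying it cannot be mistaken for a newer registration. The caller still has to decide when to call cookie_release(), which is the question the rest of the thread is about.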
Re: Managing userland data pointers in kqueue/kevent
On Mon, Nov 15, 2010 at 02:10:45PM -0500, John Baldwin wrote:
> On Monday, November 15, 2010 1:12:11 pm Paul LeoNerd Evans wrote:
> > On Mon, Nov 15, 2010 at 11:25:42AM -0500, John Baldwin wrote:
> > > I think the assumption is that userland actually maintains a
> > > reference on the specified object (e.g. a file descriptor) and
> > > will know to drop the associated data when the file descriptor is
> > > closed. That is, think of the kevent as a member of an eventable
> > > object rather than a separate object that has a reference to the
> > > eventable object. When the eventable object's reference count
> > > drops to zero in userland, then the kevent should be deleted,
> > > either via EV_DELETE, or implicitly (e.g. by closing the
> > > associated file descriptor).
> >
> > Ah. Well, that could be considered a bit more awkward for the use
> > case I wanted to apply. The idea was that the udata would refer
> > effectively to a closure, to invoke when the event happens. The idea
> > being you can just add an event watcher by, say:
> >
> >   $ev->EV_SET( $pid, EVFILT_PROC, 0, NOTE_EXIT, 0, sub {
> >      print STDERR "The child process $pid has now exited\n";
> >   } );
> >
> > So, the kernel's udata pointer effectively holds the only reference
> > to this anonymous closure. It's much more flexible this way,
> > especially for oneshot events like that.
> >
> > The beauty is also that the kevents() loop can simply know that the
> > udata is always a code reference so just has to invoke it to do
> > whatever the original caller wanted to do.
> >
> > Keep in mind my use-case here; I'm not trying to be one specific
> > application, it's a general-purpose kevent-wrapping library.
>
> So is GCD (Apple's libdispatch). It also implements closures on top of
> kevent. However, the way it works is that it doesn't expose kevent()
> directly, instead it uses kevent to implement asynchronous I/O on a
> socket for example, and since it is logically managing the life cycle
> of a socket, it knows when the socket is closed and cleans up then.

Well, the principal item of work here is a direct API reimplementation
in IO::KQueue on CPAN. I'm trying to simply expose the API of storing an
arbitrary Perl scalar in the udata field. It -could- be a closure, but
of course it doesn't have to. Maybe the using code wants to keep a HASH
ref of some pseudo-structure, or whatever...

> For the above case, if you know an event is one shot, you should either
> use EV_ONESHOT, or use a wrapper around the closure that clears the
> event after the closure runs (or possibly before the closure runs?)

I don't see how passing EV_ONESHOT helps here at all. If it's oneshot by
nature (child process exit), it'll be oneshot whatever happens. I've
already observed that the EV_ONESHOT flag does not get re-emitted by the
kernel anyway, so I'd have to track that one separately somehow.

> Your use case is rare. Almost all consumers of kevent() that I've seen
> use kevent() as one part of a system that maintains the lifecycle of
> objects. Those objects are only accessed within the system, so the
> system knows when an object is closed and can release the resources at
> the same time.

OK. I'm prepared to accept this. It may be that nobody's really tried to
provide a simple kqueue/kevent API wrapping for a high-level language
that could nicely take advantage of memory management in the language,
rather than at the low C level of explicit malloc+free.

Could we perhaps address the second part of my question for a moment? If
there really isn't a general solution here, could my EV_FREEWATCH flag
be added? It's a single extra flag to export in the API .h file,
completely backward-compatible. Surely quite simple to implement in the
kernel too, because it already knows how to free() its own internal
structures anyway; so just before it does that when it deletes an event
for whatever reason, it could just fire that event back up to userland
to say "I've dropped this, and here have your pointer back". Userland
catches it, SvREFCNT_dec()s or whatever, problem solved.

Is that an API extension anyone would consider accepting? I for one
would use it.

-- 
Paul "LeoNerd" Evans
leon...@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
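To make the EV_FREEWATCH proposal above concrete, here is a minimal sketch of the userland side in C. Everything in it is hypothetical: EV_FREEWATCH does not exist in any kernel, so the flag value, the cut-down event struct and the stand-in for a reference-counted SV are all invented for illustration. It also assumes the drop notification arrives as its own event rather than riding on the last real one, which the proposal does not pin down.

```c
#include <stddef.h>

/* HYPOTHETICAL flag: not part of any real kqueue API; value is arbitrary. */
#define SKETCH_EV_FREEWATCH 0x4000

/* Stand-in for struct kevent, reduced to the two fields the sketch needs. */
struct sketch_event {
    unsigned short flags;
    void *udata;
};

/* Stand-in for a reference-counted Perl SV holding a callback. */
struct sketch_ref {
    int refcount;   /* includes the reference held on behalf of the kernel */
    int calls;      /* how many times the "callback" ran */
    int released;   /* set where SvREFCNT_dec() would have freed the SV */
};

/* Drop one reference; mark the object released when none remain. */
static void sketch_ref_dec(struct sketch_ref *r)
{
    if (--r->refcount == 0)
        r->released = 1;
}

/* Process a batch of events the way a wrapper's event loop might. */
static void sketch_dispatch(struct sketch_event *evs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct sketch_ref *r = evs[i].udata;
        if (evs[i].flags & SKETCH_EV_FREEWATCH) {
            /* Kernel reports the watch is gone: give back our reference. */
            sketch_ref_dec(r);
        } else {
            r->calls++;   /* stands in for invoking the stored closure */
        }
    }
}
```

With such a flag the wrapper would need no per-filter knowledge and no userland map: one branch in the loop covers EV_ONESHOT, EV_DELETE, naturally single-shot filters and close().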
Re: Managing userland data pointers in kqueue/kevent
On Monday, November 15, 2010 1:12:11 pm Paul LeoNerd Evans wrote:
> On Mon, Nov 15, 2010 at 11:25:42AM -0500, John Baldwin wrote:
> > I think the assumption is that userland actually maintains a
> > reference on the specified object (e.g. a file descriptor) and will
> > know to drop the associated data when the file descriptor is closed.
> > That is, think of the kevent as a member of an eventable object
> > rather than a separate object that has a reference to the eventable
> > object. When the eventable object's reference count drops to zero in
> > userland, then the kevent should be deleted, either via EV_DELETE,
> > or implicitly (e.g. by closing the associated file descriptor).
>
> Ah. Well, that could be considered a bit more awkward for the use case
> I wanted to apply. The idea was that the udata would refer effectively
> to a closure, to invoke when the event happens. The idea being you can
> just add an event watcher by, say:
>
>   $ev->EV_SET( $pid, EVFILT_PROC, 0, NOTE_EXIT, 0, sub {
>      print STDERR "The child process $pid has now exited\n";
>   } );
>
> So, the kernel's udata pointer effectively holds the only reference to
> this anonymous closure. It's much more flexible this way, especially
> for oneshot events like that.
>
> The beauty is also that the kevents() loop can simply know that the
> udata is always a code reference so just has to invoke it to do
> whatever the original caller wanted to do.
>
> Keep in mind my use-case here; I'm not trying to be one specific
> application, it's a general-purpose kevent-wrapping library.

So is GCD (Apple's libdispatch). It also implements closures on top of
kevent. However, the way it works is that it doesn't expose kevent()
directly, instead it uses kevent to implement asynchronous I/O on a
socket for example, and since it is logically managing the life cycle of
a socket, it knows when the socket is closed and cleans up then.

> > I think in your case you should not give the kevent a reference to
> > your object, but instead remove the associated event for a given
> > object when an object's refcount drops to zero.
>
> Well that's certainly doable in longrunning watches, but I don't think
> it sounds very convenient for a oneshot event; see the above example
> for justification.

For the above case, if you know an event is one shot, you should either
use EV_ONESHOT, or use a wrapper around the closure that clears the
event after the closure runs (or possibly before the closure runs?)

> Also it again begs my question, worth repeating here:
>
> On Friday, November 12, 2010 1:40:00 pm Paul LeoNerd Evans wrote:
> > I had thought the point of kqueue/kevent is the O(1) nature of it,
> > which is among the reasons why the kernel is storing that void
> > *udata pointer in the first place. If I have to store a mapping from
> > every filter+identity back to my data pointer, why does the kernel
> > store one at all? I could just ignore the udata field and use my
> > mapping for my own purposes.
>
> If you're saying that in my not-so-rare use case, I don't want to be
> using udata, and instead keeping my own mapping, why does the kernel
> provide this udata field at all?

Your use case is rare. Almost all consumers of kevent() that I've seen
use kevent() as one part of a system that maintains the lifecycle of
objects. Those objects are only accessed within the system, so the
system knows when an object is closed and can release the resources at
the same time.

-- 
John Baldwin
Re: Managing userland data pointers in kqueue/kevent
On Mon, Nov 15, 2010 at 10:33:25AM -0800, Julian Elischer wrote:
> it was provided for pretty much what you are using it for, so that
> the userland caller could easily associate the returning event with
> some private information about the event.

This was indeed the impression I got.

With reference to my original questions regarding its use, perhaps you
could suggest some way to actually use this API then, in order to solve
my problem? Unless there's some subtle detail or trick I have
misunderstood, it doesn't appear to be easily possible in this manner.

How would you suggest I manage these pointers and data structures?

-- 
Paul "LeoNerd" Evans
leon...@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
Re: Managing userland data pointers in kqueue/kevent
On 11/15/10 10:12 AM, Paul LeoNerd Evans wrote:
> On Mon, Nov 15, 2010 at 11:25:42AM -0500, John Baldwin wrote:
> > I think the assumption is that userland actually maintains a
> > reference on the specified object (e.g. a file descriptor) and will
> > know to drop the associated data when the file descriptor is closed.
> > That is, think of the kevent as a member of an eventable object
> > rather than a separate object that has a reference to the eventable
> > object. When the eventable object's reference count drops to zero in
> > userland, then the kevent should be deleted, either via EV_DELETE,
> > or implicitly (e.g. by closing the associated file descriptor).
>
> Ah. Well, that could be considered a bit more awkward for the use case
> I wanted to apply. The idea was that the udata would refer effectively
> to a closure, to invoke when the event happens. The idea being you can
> just add an event watcher by, say:
>
>   $ev->EV_SET( $pid, EVFILT_PROC, 0, NOTE_EXIT, 0, sub {
>      print STDERR "The child process $pid has now exited\n";
>   } );
>
> So, the kernel's udata pointer effectively holds the only reference to
> this anonymous closure. It's much more flexible this way, especially
> for oneshot events like that.
>
> The beauty is also that the kevents() loop can simply know that the
> udata is always a code reference so just has to invoke it to do
> whatever the original caller wanted to do.
>
> Keep in mind my use-case here; I'm not trying to be one specific
> application, it's a general-purpose kevent-wrapping library.
>
> > I think in your case you should not give the kevent a reference to
> > your object, but instead remove the associated event for a given
> > object when an object's refcount drops to zero.
>
> Well that's certainly doable in longrunning watches, but I don't think
> it sounds very convenient for a oneshot event; see the above example
> for justification.
>
> Also it again begs my question, worth repeating here:
>
> On Friday, November 12, 2010 1:40:00 pm Paul LeoNerd Evans wrote:
> > I had thought the point of kqueue/kevent is the O(1) nature of it,
> > which is among the reasons why the kernel is storing that void
> > *udata pointer in the first place. If I have to store a mapping from
> > every filter+identity back to my data pointer, why does the kernel
> > store one at all? I could just ignore the udata field and use my
> > mapping for my own purposes.
>
> If you're saying that in my not-so-rare use case, I don't want to be
> using udata, and instead keeping my own mapping, why does the kernel
> provide this udata field at all?

It was provided for pretty much what you are using it for, so that the
userland caller could easily associate the returning event with some
private information about the event.
Re: Managing userland data pointers in kqueue/kevent
On Mon, Nov 15, 2010 at 11:25:42AM -0500, John Baldwin wrote:
> I think the assumption is that userland actually maintains a reference
> on the specified object (e.g. a file descriptor) and will know to drop
> the associated data when the file descriptor is closed. That is, think
> of the kevent as a member of an eventable object rather than a
> separate object that has a reference to the eventable object. When the
> eventable object's reference count drops to zero in userland, then the
> kevent should be deleted, either via EV_DELETE, or implicitly (e.g. by
> closing the associated file descriptor).

Ah. Well, that could be considered a bit more awkward for the use case I
wanted to apply. The idea was that the udata would refer effectively to
a closure, to invoke when the event happens. The idea being you can just
add an event watcher by, say:

  $ev->EV_SET( $pid, EVFILT_PROC, 0, NOTE_EXIT, 0, sub {
     print STDERR "The child process $pid has now exited\n";
  } );

So, the kernel's udata pointer effectively holds the only reference to
this anonymous closure. It's much more flexible this way, especially for
oneshot events like that.

The beauty is also that the kevents() loop can simply know that the
udata is always a code reference so just has to invoke it to do whatever
the original caller wanted to do.

Keep in mind my use-case here; I'm not trying to be one specific
application, it's a general-purpose kevent-wrapping library.

> I think in your case you should not give the kevent a reference to
> your object, but instead remove the associated event for a given
> object when an object's refcount drops to zero.

Well that's certainly doable in longrunning watches, but I don't think
it sounds very convenient for a oneshot event; see the above example for
justification.

Also it again begs my question, worth repeating here:

On Friday, November 12, 2010 1:40:00 pm Paul LeoNerd Evans wrote:
> I had thought the point of kqueue/kevent is the O(1) nature of it,
> which is among the reasons why the kernel is storing that void *udata
> pointer in the first place. If I have to store a mapping from every
> filter+identity back to my data pointer, why does the kernel store one
> at all? I could just ignore the udata field and use my mapping for my
> own purposes.

If you're saying that in my not-so-rare use case, I don't want to be
using udata, and instead keeping my own mapping, why does the kernel
provide this udata field at all?

-- 
Paul "LeoNerd" Evans
leon...@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
Re: Managing userland data pointers in kqueue/kevent
On Friday, November 12, 2010 1:40:00 pm Paul LeoNerd Evans wrote:
> I'm trying to build a high-level language wrapper around
> kqueue/kevent, specifically, a Perl wrapper.
>
> (In fact I am trying to fix this bug:
>   http://rt.cpan.org/Public/Bug/Display.html?id=61481
> )
>
> My plan is to use the void *udata field of a kevent watcher to store a
> pointer to some user-provided Perl data structure (an SV*), to
> associate with the event. Typically this could be a code reference for
> an event callback or similar, but the exact nature doesn't matter.
> It's a pointer to a reference-counted data structure.
> SvREFCNT_dec(sv) is the function used to decrement the reference
> counter.
>
> To account for the fact that the kernel stores a pointer here, I'm
> artificially increasing the reference count on the object, so that it
> still remains alive even if the rest of the Perl code drops it, to
> rely on getting it back out of the kernel in an individual kevent. At
> some point when the kernel has finished looking after the event, this
> count needs to be decreased again, so the structure can be freed.
>
> I am having trouble trying to work out how to do this, or rather,
> when. I have the following problems:
>
>  * If the event was registered using EV_ONESHOT, when it gets fired
>    the flags that come back in the event structure do not include
>    EV_ONESHOT.
>
>  * Some events can only happen once, such as watching for EVFILT_PROC
>    NOTE_EXIT events.
>
>  * The kernel can silently drop watches, such as when the process
>    calls close() on a filehandle with an EVFILT_READ or EVFILT_WRITE
>    watch.
>
>  * There doesn't seem to be a way to query that pointer back out of
>    the kernel, in case the user code wants to EV_DELETE the watch.
>
> These problems all mean that I never quite know when I ought to call
> SvREFCNT_dec() on that pointer.
>
> My current best-attack plan looks like the following:
>
>  a) Store a structure in the void *udata that contains the actual SV*
>     pointer and a flag to remember if the event had been installed as
>     EV_ONESHOT (or remember if it was one of the event types that is
>     oneshot anyway)
>
>  b) Store an entire mapping in userland from filter+identity to
>     pointer, so that if userland wants to EV_DELETE the watch early,
>     it has the pointer to be able to drop it.
>
> I can't think of a solution to the close() problem at all, though.
>
> Part a of my solution seems OK (though I'd wonder why the flags back
> from the kernel don't contain EV_ONESHOT), but part b confuses me. I
> had thought the point of kqueue/kevent is the O(1) nature of it, which
> is among the reasons why the kernel is storing that void *udata
> pointer in the first place. If I have to store a mapping from every
> filter+identity back to my data pointer, why does the kernel store one
> at all? I could just ignore the udata field and use my mapping for my
> own purposes.
>
> Have I missed something here, then? I was hoping there'd be a nice way
> for kernel to give me back those pointers so I can just decrement a
> refcount on it, and have it reclaimed.

I think the assumption is that userland actually maintains a reference
on the specified object (e.g. a file descriptor) and will know to drop
the associated data when the file descriptor is closed. That is, think
of the kevent as a member of an eventable object rather than a separate
object that has a reference to the eventable object. When the eventable
object's reference count drops to zero in userland, then the kevent
should be deleted, either via EV_DELETE, or implicitly (e.g. by closing
the associated file descriptor).

I think in your case you should not give the kevent a reference to your
object, but instead remove the associated event for a given object when
an object's refcount drops to zero.

-- 
John Baldwin
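The "kevent as a member of an eventable object" model described above could be sketched as follows. This is an illustration under stated assumptions, not code from any real library: struct eventable and its functions are hypothetical, and sketch_unwatch() only stands in for the place where a real implementation would call kevent() with EV_DELETE (or simply close() the descriptor).

```c
#include <stdbool.h>

/* An object that owns its watch; udata would point at it without
 * counting as a reference. All names here are hypothetical. */
struct eventable {
    int fd;          /* identity the watch is registered on */
    int refcount;    /* userland references only */
    bool watching;   /* true while a kevent is registered for fd */
};

static int sketch_deleted;   /* counts simulated EV_DELETE calls */

/* Stand-in for deregistering the watch with kevent(..., EV_DELETE, ...). */
static void sketch_unwatch(struct eventable *obj)
{
    if (obj->watching) {
        obj->watching = false;
        sketch_deleted++;
    }
}

/* Take an additional userland reference. */
static void eventable_ref(struct eventable *obj)
{
    obj->refcount++;
}

/* Drop a reference; returns true if this tore the object down. */
static bool eventable_unref(struct eventable *obj)
{
    if (--obj->refcount > 0)
        return false;
    /* Last userland reference is gone, so the watch goes with it. */
    sketch_unwatch(obj);
    /* Real code would close(obj->fd) and free(obj) at this point. */
    return true;
}
```

The design choice is that the kernel never holds the last reference, so there is nothing to hand back. The price, as the original poster notes, is that a fire-and-forget closure needs some other owner to keep it alive until the event arrives.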
Managing userland data pointers in kqueue/kevent
I'm trying to build a high-level language wrapper around kqueue/kevent,
specifically, a Perl wrapper.

(In fact I am trying to fix this bug:
  http://rt.cpan.org/Public/Bug/Display.html?id=61481
)

My plan is to use the void *udata field of a kevent watcher to store a
pointer to some user-provided Perl data structure (an SV*), to associate
with the event. Typically this could be a code reference for an event
callback or similar, but the exact nature doesn't matter. It's a pointer
to a reference-counted data structure. SvREFCNT_dec(sv) is the function
used to decrement the reference counter.

To account for the fact that the kernel stores a pointer here, I'm
artificially increasing the reference count on the object, so that it
still remains alive even if the rest of the Perl code drops it, to rely
on getting it back out of the kernel in an individual kevent. At some
point when the kernel has finished looking after the event, this count
needs to be decreased again, so the structure can be freed.

I am having trouble trying to work out how to do this, or rather, when.
I have the following problems:

 * If the event was registered using EV_ONESHOT, when it gets fired the
   flags that come back in the event structure do not include
   EV_ONESHOT.

 * Some events can only happen once, such as watching for EVFILT_PROC
   NOTE_EXIT events.

 * The kernel can silently drop watches, such as when the process calls
   close() on a filehandle with an EVFILT_READ or EVFILT_WRITE watch.

 * There doesn't seem to be a way to query that pointer back out of the
   kernel, in case the user code wants to EV_DELETE the watch.

These problems all mean that I never quite know when I ought to call
SvREFCNT_dec() on that pointer.

My current best-attack plan looks like the following:

 a) Store a structure in the void *udata that contains the actual SV*
    pointer and a flag to remember if the event had been installed as
    EV_ONESHOT (or remember if it was one of the event types that is
    oneshot anyway)

 b) Store an entire mapping in userland from filter+identity to pointer,
    so that if userland wants to EV_DELETE the watch early, it has the
    pointer to be able to drop it.

I can't think of a solution to the close() problem at all, though.

Part a of my solution seems OK (though I'd wonder why the flags back
from the kernel don't contain EV_ONESHOT), but part b confuses me. I had
thought the point of kqueue/kevent is the O(1) nature of it, which is
among the reasons why the kernel is storing that void *udata pointer in
the first place. If I have to store a mapping from every filter+identity
back to my data pointer, why does the kernel store one at all? I could
just ignore the udata field and use my mapping for my own purposes.

Have I missed something here, then? I was hoping there'd be a nice way
for kernel to give me back those pointers so I can just decrement a
refcount on it, and have it reclaimed.

---

I have an idea on a small addition to the kernel API that would make
this issue much simpler to manage, if there is nothing else.

By the addition of a new event flag, called something like EV_FREEWATCH,
the kernel can be told "tell userland whenever I am about to drop this
event watcher". So now, after an EV_ONESHOT or any of the single events
are fired, or when it gets EV_DELETEed, or when the kernel itself drops
because of a close() on a filehandle, it can fire an event back up to
userland with this flag, passing up the pointer.

Now, all userland has to do to correctly manage the memory is to always
set that flag on EV_ADD, and if the flag ever comes back in an event out
of the kernel, it can SvREFCNT_dec(ev->udata);

-- 
Paul "LeoNerd" Evans
leon...@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
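For comparison with the proposed flag, plan (a) and plan (b) from the message above can be sketched together in plain C. The names struct watch, watch_add, watch_remove and watch_fired are hypothetical, the payload pointer stands in for the SV*, and a linked list is used only to keep the sketch short where a real wrapper would want a hash keyed on filter+identity. The filter numbers passed in are opaque to this code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* One record per registered watch; a pointer to it would go in udata. */
struct watch {
    uintptr_t ident;      /* kevent identity (fd, pid, ...) */
    short filter;         /* kevent filter, treated as an opaque number */
    bool oneshot;         /* plan (a): the kernel does not echo EV_ONESHOT */
    void *payload;        /* stands in for the SV* */
    struct watch *next;
};

static struct watch *watches;   /* plan (b): userland filter+ident mapping */

/* Record a new watch; the returned pointer is what udata would hold. */
static struct watch *watch_add(uintptr_t ident, short filter, bool oneshot,
                               void *payload)
{
    struct watch *w = malloc(sizeof *w);
    if (w == NULL)
        return NULL;
    w->ident = ident;
    w->filter = filter;
    w->oneshot = oneshot;
    w->payload = payload;
    w->next = watches;
    watches = w;
    return w;
}

/* Forget a watch (e.g. before EV_DELETE); returns the payload to release. */
static void *watch_remove(uintptr_t ident, short filter)
{
    for (struct watch **pp = &watches; *pp != NULL; pp = &(*pp)->next) {
        struct watch *w = *pp;
        if (w->ident == ident && w->filter == filter) {
            void *payload = w->payload;
            *pp = w->next;
            free(w);
            return payload;
        }
    }
    return NULL;
}

/* Call after dispatching an event: returns the payload to release if the
 * watch was oneshot and is therefore finished, otherwise NULL. */
static void *watch_fired(struct watch *w)
{
    return w->oneshot ? watch_remove(w->ident, w->filter) : NULL;
}
```

As the message itself points out, nothing here notices a close() on a watched descriptor: the record and its payload would simply stay in the list, which is the gap the EV_FREEWATCH idea is meant to fill.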