Re: hacking - aio_sendfile()
On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:
> Hiya,
>
> I've started writing an aio_sendfile() syscall.
>
> http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff
>
> Yes, the diff is against -HEAD and not stable/9.
>
> It's totally horrible, hackish and likely bad. I've only done some very, very basic testing to ensure it actually works; I haven't at all stress tested it out yet. It's also very naive - I'm not at all doing any checks to see whether I can short-cut to do the aio there and then; I'm always queuing the sendfile() op through the worker threads. That's likely stupid and inefficient in a lot of cases, but it at least gets the syscall up and working.

Yes, it is naive, but for a different reason. kern_sendfile() is a synchronous function; it only completes after the other end of the network communication allows it. This means that calling kern_sendfile() from an aio thread blocks that thread indefinitely in an unbounded sleep.

Your implementation easily exhausts the aio thread pool, blocking the whole aio subsystem. It is known that our aio does not work for sockets for the same reason. I object to adding more code with the same defect.

The proper route seems to be rewriting aio for sockets using upcalls. The same should be done for sendfile, if sendfile is given an aio flavor.

> I'd like some feedback and possibly some help in stress testing it to make sure it's functioning well.
>
> Thanks,
>
> -adrian
> ___
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: hacking - aio_sendfile()
Hiya,

I'm more interested in the API than the implementation at the moment.

Yes, you're right - it should eventually be driven using disk IO completion upcalls which trigger the push of data into the socket buffer. I totally agree.

I'm hacking up some libevent-ish looking thing that uses kqueue and wraps aio, read, write, and other event types into something I can easily shoehorn this stuff into. I'll then thoroughly test it (and other options) out.

You're right, it's likely going to end up with a whole lot of aio threads sitting there waiting for disk IO to complete - and at that point, I'll start hacking at sendfile() to split it into two halves and have it driven by a completion call from g_up or wherever, triggering the socket write side of things.

There are some other questions too - like whether the IO completion should just queue socket IO (and have it potentially block in the TCP code) or whether it should funnel completions into a per-CPU aio completion thread which does the socket write bit. That way disk IO completion isn't going to be blocked by longer-held locks in the networking stack.

Thanks,

-adrian
Re: hacking - aio_sendfile()
On Thu, Jul 11, 2013 at 01:37:19AM -0700, Adrian Chadd wrote:
> Hiya,
>
> I'm more interested in the API than the implementation at the moment.
>
> Yes, you're right - it should eventually be driven using disk IO completion upcalls which trigger the push of data into the socket buffer. I totally agree.
>
> I'm hacking up some libevent-ish looking thing that uses kqueue and wraps aio, read, write, and other event types into something I can easily shoehorn this stuff into. I'll then thoroughly test it (and other options) out.
>
> You're right, it's likely going to end up with a whole lot of aio threads sitting there waiting for disk IO to complete - and at that point, I'll start hacking at sendfile() to split it into two halves and have it driven by a completion call from g_up or wherever, triggering the socket write side of things.
>
> There are some other questions too - like whether the IO completion should just queue socket IO (and have it potentially block in the TCP code) or whether it should funnel completions into a per-CPU aio completion thread which does the socket write bit. That way disk IO completion isn't going to be blocked by longer-held locks in the networking stack.

No, it is not disk I/O which is problematic there. It is socket I/O, e.g. the wait for the socket buffer's low watermark in kern_sendfile(), which causes the unbounded sleep. Look for the sbwait() call, both in kern_sendfile() itself and in the pru_send methods of the protocols, e.g. in sosend_generic(). The duration of the wait is controlled by the other side of the connection, which allows it to completely block the aio subsystem.

Disk I/O is supposed to finish in finite time.
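To make the concern in this message concrete, here is a toy model in C of a fixed-size worker pool. It is only a sketch under stated assumptions, not the kernel's aio code: `POOL_SIZE`, `struct pool`, `pool_dispatch()` and `pool_peer_drained()` are hypothetical names invented for illustration, and a worker marked busy stands in for an aio thread asleep in sbwait() on a peer that is not reading.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 4   /* hypothetical number of aio worker threads */

/* Toy model: each worker is either idle or parked waiting on a slow peer. */
struct pool {
    bool busy[POOL_SIZE];
};

/* Hand a job to an idle worker; return its index, or -1 if none is free. */
static int pool_dispatch(struct pool *p)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!p->busy[i]) {
            p->busy[i] = true;
            return (int)i;
        }
    }
    return -1;   /* every worker is asleep; the job stays queued */
}

/* The remote peer finally drains its socket, so worker i wakes and finishes. */
static void pool_peer_drained(struct pool *p, int i)
{
    assert(i >= 0 && i < POOL_SIZE);
    p->busy[i] = false;
}
```

Once POOL_SIZE jobs are parked on peers that do not read, pool_dispatch() returns -1 for every later request, including ones that would only touch the disk, until some peer chooses to drain its socket. That is the sense in which the wait is controlled by the other end.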
Re: hacking - aio_sendfile()
On 11 July 2013 02:36, Konstantin Belousov kostik...@gmail.com wrote:
> No, it is not disk I/O which is problematic there. It is socket I/O, e.g. the wait for the socket buffer's low watermark in kern_sendfile(), which causes the unbounded sleep. Look for the sbwait() call, both in kern_sendfile() itself and in the pru_send methods of the protocols, e.g. in sosend_generic(). The duration of the wait is controlled by the other side of the connection, which allows it to completely block the aio subsystem.
>
> Disk I/O is supposed to finish in finite time.

Even if the destination socket is marked as NONBLOCK?

-adrian
Re: hacking - aio_sendfile()
On Thu, Jul 11, 2013 at 02:39:00AM -0700, Adrian Chadd wrote:
> On 11 July 2013 02:36, Konstantin Belousov kostik...@gmail.com wrote:
>> No, it is not disk I/O which is problematic there. It is socket I/O, e.g. the wait for the socket buffer's low watermark in kern_sendfile(), which causes the unbounded sleep. Look for the sbwait() call, both in kern_sendfile() itself and in the pru_send methods of the protocols, e.g. in sosend_generic(). The duration of the wait is controlled by the other side of the connection, which allows it to completely block the aio subsystem.
>>
>> Disk I/O is supposed to finish in finite time.
>
> Even if the destination socket is marked as NONBLOCK?

You mean, would a sleep for socket buffer space block the aio thread if the socket is put in non-blocking mode? Or something else?

No, it would not block the thread. But I cannot consider an aio_sendfile(2) implementation useful if it requires a non-blocking socket. Also, what about another thread changing the socket to blocking mode while the sendfile is in flight?
Re: hacking - aio_sendfile()
Adrian,

On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:
> I've started writing an aio_sendfile() syscall.
>
> http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff
>
> Yes, the diff is against -HEAD and not stable/9.
>
> It's totally horrible, hackish and likely bad. I've only done some very, very basic testing to ensure it actually works; I haven't at all stress tested it out yet. It's also very naive - I'm not at all doing any checks to see whether I can short-cut to do the aio there and then; I'm always queuing the sendfile() op through the worker threads. That's likely stupid and inefficient in a lot of cases, but it at least gets the syscall up and working.
>
> I'd like some feedback and possibly some help in stress testing it to make sure it's functioning well.

Apart from the problem pointed out by Kostik, there is a race between the aio thread starting aio_process_sendfile() and the file descriptor (or socket descriptor) going away.

Thus, kern_sendfile() needs to be split into two parts: kern_sendfile_pre(), and a kern_sendfile() that contains only the sending cycle. kern_sendfile_pre() should contain:

    fgetvp_read(uap->fd, vp)
    vm_object_reference_locked(vp->v_object)

Referencing the socket is probably also required. The current synchronous code doesn't do it.

The do_sendfile() function should call kern_sendfile_pre() and then kern_sendfile(). The aio code should perform kern_sendfile_pre() in the new syscall itself, in the context of the calling process, and kern_sendfile() in async context.

P.S. Some time ago I started hacking on the above.

--
Totus tuus, Glebius.
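The two-phase split proposed in this message can be sketched in userland C. This is an analogy under loud assumptions, not the proposed kernel code: `struct send_job`, `send_job_prepare()` and `send_job_run()` are hypothetical names, dup(2) stands in for taking a file reference, and a read/write copy loop stands in for the sendfile sending cycle.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical job record: it owns private references, not fd numbers
 * that the caller could close or reuse before the worker runs. */
struct send_job {
    int src_ref;   /* private dup of the source descriptor */
    int dst_ref;   /* private dup of the destination descriptor */
};

/* Phase 1, caller context: take the references. Returns 0 or -1. */
static int send_job_prepare(struct send_job *job, int src_fd, int dst_fd)
{
    assert(job != NULL);
    job->src_ref = dup(src_fd);
    if (job->src_ref < 0)
        return -1;
    job->dst_ref = dup(dst_fd);
    if (job->dst_ref < 0) {
        close(job->src_ref);
        return -1;
    }
    return 0;
}

/* Phase 2, async context: copy up to len bytes through the held
 * references only, then drop them. Returns bytes copied, or -1. */
static ssize_t send_job_run(struct send_job *job, size_t len)
{
    char buf[4096];
    size_t done = 0;
    int failed = 0;

    while (done < len && !failed) {
        size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
        ssize_t n = read(job->src_ref, buf, want);
        if (n < 0)
            failed = 1;
        if (n <= 0)
            break;   /* error or end of input */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(job->dst_ref, buf + off, (size_t)(n - off));
            if (w < 0) {
                failed = 1;
                break;
            }
            off += w;
        }
        if (!failed)
            done += (size_t)n;
    }
    close(job->src_ref);
    close(job->dst_ref);
    return failed ? -1 : (ssize_t)done;
}
```

Because the job holds its own references from the moment it is queued, the caller can close its descriptors, or have their numbers reused by a later open, without changing what the worker operates on. That is the property the race described above is missing.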
Re: hacking - aio_sendfile()
I reference the source/dest FDs in the queue method. Is that not good enough?

-adrian
Re: hacking - aio_sendfile()
On Thu, Jul 11, 2013 at 07:45:19AM -0700, Adrian Chadd wrote:
> I reference the source/dest FDs in the queue method. Is that not good enough?

I see. Should probably work, but needs testing.

--
Totus tuus, Glebius.
Re: hacking - aio_sendfile()
On 11 July 2013 07:51, Gleb Smirnoff gleb...@freebsd.org wrote:
> On Thu, Jul 11, 2013 at 07:45:19AM -0700, Adrian Chadd wrote:
>> I reference the source/dest FDs in the queue method. Is that not good enough?
>
> I see. Should probably work, but needs testing.

It's terrible - I think I should pass the file refs into kern_sendfile() so I'm sure that the process hasn't closed/dup'ed an FD in its place in the meantime.

Is that better?

-adrian
Re: hacking - aio_sendfile()
On Jul 10, 2013, at 11:17 PM, Konstantin Belousov kostik...@gmail.com wrote:
> On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:
>> Hiya,
>>
>> I've started writing an aio_sendfile() syscall.
>>
>> http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff
>>
>> Yes, the diff is against -HEAD and not stable/9.
>>
>> It's totally horrible, hackish and likely bad. I've only done some very, very basic testing to ensure it actually works; I haven't at all stress tested it out yet. It's also very naive - I'm not at all doing any checks to see whether I can short-cut to do the aio there and then; I'm always queuing the sendfile() op through the worker threads. That's likely stupid and inefficient in a lot of cases, but it at least gets the syscall up and working.
>
> Yes, it is naive, but for a different reason. kern_sendfile() is a synchronous function; it only completes after the other end of the network communication allows it. This means that calling kern_sendfile() from an aio thread blocks that thread indefinitely in an unbounded sleep.

No, kern_sendfile is async unless you specify the SF_SYNC hack flag. Otherwise, it'll fill the socket buffer and then return immediately, unless the socket buffer is full and the socket is set to blocking mode. That's outside the scope, as I said in my previous email.

Scott
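The behaviour described in this reply, where the kernel accepts data until the socket buffer is full and then returns instead of sleeping, can be observed from userland. The sketch below is an analogy only: it uses plain write(2) on a non-blocking AF_UNIX socket pair rather than sendfile(2) or any kernel-internal function, and `make_nonblocking_pair()` and `fill_send_buffer()` are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a connected stream pair and make sv[0] non-blocking.
 * Returns 0 on success, -1 on error. */
static int make_nonblocking_pair(int sv[2])
{
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    int flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    return 0;
}

/* Write until the kernel refuses more data. Returns the number of bytes
 * accepted and stores the errno of the refusing write in *err. */
static long fill_send_buffer(int fd, int *err)
{
    char chunk[1024] = {0};
    long total = 0;

    assert(err != NULL);
    for (;;) {
        ssize_t n = write(fd, chunk, sizeof(chunk));
        if (n < 0) {
            *err = errno;   /* EAGAIN here means "buffer full", not failure */
            return total;
        }
        total += n;
    }
}
```

With nobody reading from the peer end, fill_send_buffer() stops with EAGAIN (or EWOULDBLOCK) after the buffer fills. On a blocking socket the same write would sleep until the peer reads, which is the case the two sides of this exchange disagree about.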
Re: hacking - aio_sendfile()
On Thu, Jul 11, 2013 at 11:44:32AM -0700, Scott Long wrote:
> On Jul 10, 2013, at 11:17 PM, Konstantin Belousov kostik...@gmail.com wrote:
>> On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:
>>> Hiya,
>>>
>>> I've started writing an aio_sendfile() syscall.
>>>
>>> http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff
>>>
>>> Yes, the diff is against -HEAD and not stable/9.
>>>
>>> It's totally horrible, hackish and likely bad. I've only done some very, very basic testing to ensure it actually works; I haven't at all stress tested it out yet. It's also very naive - I'm not at all doing any checks to see whether I can short-cut to do the aio there and then; I'm always queuing the sendfile() op through the worker threads. That's likely stupid and inefficient in a lot of cases, but it at least gets the syscall up and working.
>>
>> Yes, it is naive, but for a different reason. kern_sendfile() is a synchronous function; it only completes after the other end of the network communication allows it. This means that calling kern_sendfile() from an aio thread blocks that thread indefinitely in an unbounded sleep.
>
> No, kern_sendfile is async unless you specify the SF_SYNC hack flag. Otherwise, it'll fill the socket buffer and then return immediately, unless the socket buffer is full and the socket is set to blocking mode. That's outside the scope, as I said in my previous email.

You do not understand what I said; please re-read both my mail and the code before replying.

Implementing aio_sendfile() as proposed would create yet another way to indefinitely block all processes using aio.
Re: hacking - aio_sendfile()
On Jul 11, 2013, at 2:56 AM, Konstantin Belousov kostik...@gmail.com wrote:
> On Thu, Jul 11, 2013 at 02:39:00AM -0700, Adrian Chadd wrote:
>> On 11 July 2013 02:36, Konstantin Belousov kostik...@gmail.com wrote:
>>> No, it is not disk I/O which is problematic there. It is socket I/O, e.g. the wait for the socket buffer's low watermark in kern_sendfile(), which causes the unbounded sleep. Look for the sbwait() call, both in kern_sendfile() itself and in the pru_send methods of the protocols, e.g. in sosend_generic(). The duration of the wait is controlled by the other side of the connection, which allows it to completely block the aio subsystem.
>>>
>>> Disk I/O is supposed to finish in finite time.
>>
>> Even if the destination socket is marked as NONBLOCK?
>
> You mean, would a sleep for socket buffer space block the aio thread if the socket is put in non-blocking mode? Or something else?
>
> No, it would not block the thread. But I cannot consider an aio_sendfile(2) implementation useful if it requires a non-blocking socket. Also, what about another thread changing the socket to blocking mode while the sendfile is in flight?

Just as with other aspects of sendfile, it's up to the caller to protect this kind of state. Objecting to aio_sendfile() simply for the reason you state is absurd and against the design goals of sendfile.

Scott
Re: hacking - aio_sendfile()
On Jul 11, 2013, at 11:48 AM, Konstantin Belousov kostik...@gmail.com wrote:
> On Thu, Jul 11, 2013 at 11:44:32AM -0700, Scott Long wrote:
>> On Jul 10, 2013, at 11:17 PM, Konstantin Belousov kostik...@gmail.com wrote:
>>> On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:
>>>> Hiya,
>>>>
>>>> I've started writing an aio_sendfile() syscall.
>>>>
>>>> http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff
>>>>
>>>> Yes, the diff is against -HEAD and not stable/9.
>>>>
>>>> It's totally horrible, hackish and likely bad. I've only done some very, very basic testing to ensure it actually works; I haven't at all stress tested it out yet. It's also very naive - I'm not at all doing any checks to see whether I can short-cut to do the aio there and then; I'm always queuing the sendfile() op through the worker threads. That's likely stupid and inefficient in a lot of cases, but it at least gets the syscall up and working.
>>>
>>> Yes, it is naive, but for a different reason. kern_sendfile() is a synchronous function; it only completes after the other end of the network communication allows it. This means that calling kern_sendfile() from an aio thread blocks that thread indefinitely in an unbounded sleep.
>>
>> No, kern_sendfile is async unless you specify the SF_SYNC hack flag. Otherwise, it'll fill the socket buffer and then return immediately, unless the socket buffer is full and the socket is set to blocking mode. That's outside the scope, as I said in my previous email.
>
> You do not understand what I said; please re-read both my mail and the code before replying.
>
> Implementing aio_sendfile() as proposed would create yet another way to indefinitely block all processes using aio.

I'm lost, maybe I missed some emails? I see a set of emails where you incorrectly state that kern_sendfile() will always call sbwait(), and then you backtrack on that and claim that it's unacceptable to enforce that SS_NBIO be used for aio operations. I apologize if I'm missing something here.

Scott
Re: hacking - aio_sendfile()
On Thu, Jul 11, 2013 at 12:04:57PM -0700, Scott Long wrote:
> On Jul 11, 2013, at 11:48 AM, Konstantin Belousov kostik...@gmail.com wrote:
>> On Thu, Jul 11, 2013 at 11:44:32AM -0700, Scott Long wrote:
>>> On Jul 10, 2013, at 11:17 PM, Konstantin Belousov kostik...@gmail.com wrote:
>>>> On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:
>>>>> Hiya,
>>>>>
>>>>> I've started writing an aio_sendfile() syscall.
>>>>>
>>>>> http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff
>>>>>
>>>>> Yes, the diff is against -HEAD and not stable/9.
>>>>>
>>>>> It's totally horrible, hackish and likely bad. I've only done some very, very basic testing to ensure it actually works; I haven't at all stress tested it out yet. It's also very naive - I'm not at all doing any checks to see whether I can short-cut to do the aio there and then; I'm always queuing the sendfile() op through the worker threads. That's likely stupid and inefficient in a lot of cases, but it at least gets the syscall up and working.
>>>>
>>>> Yes, it is naive, but for a different reason. kern_sendfile() is a synchronous function; it only completes after the other end of the network communication allows it. This means that calling kern_sendfile() from an aio thread blocks that thread indefinitely in an unbounded sleep.
>>>
>>> No, kern_sendfile is async unless you specify the SF_SYNC hack flag. Otherwise, it'll fill the socket buffer and then return immediately, unless the socket buffer is full and the socket is set to blocking mode. That's outside the scope, as I said in my previous email.
>>
>> You do not understand what I said; please re-read both my mail and the code before replying.
>>
>> Implementing aio_sendfile() as proposed would create yet another way to indefinitely block all processes using aio.
>
> I'm lost, maybe I missed some emails? I see a set of emails where you incorrectly state that kern_sendfile() will always call sbwait(), and then you backtrack on that and claim that it's unacceptable to enforce that SS_NBIO be used for aio operations. I apologize if I'm missing something here.

Can you cite my exact text where I claimed that kern_sendfile() always calls sbwait()?

I wrote about this explicitly, stating that it is very easy to make kern_sendfile() sleep for socket buffer space, and that the duration of the sleep is user-controllable. As a result, it is possible to hang all processes doing aio calls, since the aio thread pool is finite. I am sorry for retyping this and taking your time by repeating it.

Making kern_sendfile() behave from the aio context as if SS_NBIO were set on the socket contradicts the behaviour of the other aio operations. E.g. aio_read and aio_write do not perform short reads and writes to avoid blocking the aio daemon threads (which is the cause of the buggy behaviour of the existing aio syscalls on sockets).

Moreover, I do not think that setting SS_NBIO is enough to prevent aio threads from blocking in kern_sendfile(). The send socket buffer is locked exclusively by kern_sendfile(). Another thread which entered sendfile(2) and was deliberately put to sleep on the low watermark still owns the so_snd sx. This means that aio threads trying to do kern_sendfile() on this socket would also be blocked, for a duration controlled by the other end.

That said, even assuming SS_NBIO is always enforced and the other sleep points are identified and worked around, the only benefit of such an implementation compared with a direct sendfile(2) call would be to avoid using the calling thread's context for disk I/O. FreeBSD recently gained aio_mlock(2), which gives the same result in a non-hackish way.
Re: hacking - aio_sendfile()
On 2013/07/11 14:17, Konstantin Belousov wrote:
> On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:
>> Hiya,
>>
>> I've started writing an aio_sendfile() syscall.
>>
>> http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff
>>
>> Yes, the diff is against -HEAD and not stable/9.
>>
>> It's totally horrible, hackish and likely bad. I've only done some very, very basic testing to ensure it actually works; I haven't at all stress tested it out yet. It's also very naive - I'm not at all doing any checks to see whether I can short-cut to do the aio there and then; I'm always queuing the sendfile() op through the worker threads. That's likely stupid and inefficient in a lot of cases, but it at least gets the syscall up and working.
>
> Yes, it is naive, but for a different reason. kern_sendfile() is a synchronous function; it only completes after the other end of the network communication allows it. This means that calling kern_sendfile() from an aio thread blocks that thread indefinitely in an unbounded sleep.
>
> Your implementation easily exhausts the aio thread pool, blocking the whole aio subsystem. It is known that our aio does not work for sockets for the same reason. I object to adding more code with the same defect.
>
> The proper route seems to be rewriting aio for sockets using upcalls. The same should be done for sendfile, if sendfile is given an aio flavor.

Yes, the current aio implementation is horrible. It only works for fast disk I/O, where I think the thread pool is large enough to saturate the disks. For socket or pipe I/O it does not work well, because the thread pool is too easily exhausted. I even think the support for sockets and pipes in the aio code should be cut and thrown away, because you can always use kqueue + non-blocking I/O.
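The "readiness notification plus non-blocking I/O" pattern recommended in this reply looks roughly like the sketch below. Note the substitution: it uses portable poll(2) where the message suggests kqueue, purely so the example builds outside FreeBSD, and `send_all_evented()` is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Send len bytes on a descriptor that is expected to be non-blocking,
 * waiting for writability only when the kernel reports the buffer is
 * full. Returns 0 on success, -1 on error. */
static int send_all_evented(int fd, const char *buf, size_t len)
{
    size_t off = 0;

    assert(buf != NULL);
    while (off < len) {
        ssize_t n = write(fd, buf + off, len - off);
        if (n > 0) {
            off += (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;
        /* Buffer full: wait until the descriptor is writable again. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
            return -1;
    }
    return 0;
}
```

This helper waits on a single descriptor to stay short. In a real event loop the wait would cover every registered descriptor, so one slow peer delays only its own transfer rather than occupying a worker thread.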
hacking - aio_sendfile()
Hiya,

I've started writing an aio_sendfile() syscall.

http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff

Yes, the diff is against -HEAD and not stable/9.

It's totally horrible, hackish and likely bad. I've only done some very, very basic testing to ensure it actually works; I haven't at all stress tested it out yet. It's also very naive - I'm not at all doing any checks to see whether I can short-cut to do the aio there and then; I'm always queuing the sendfile() op through the worker threads. That's likely stupid and inefficient in a lot of cases, but it at least gets the syscall up and working.

I'd like some feedback and possibly some help in stress testing it to make sure it's functioning well.

Thanks,

-adrian