Re: hacking - aio_sendfile()

2013-07-11 Thread Konstantin Belousov
On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:
 Hiya,
 
 I've started writing an aio_sendfile() syscall.
 
 http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff
 
 Yes, the diff is against -HEAD and not stable/9.
 
 It's totally horrible, hackish and likely bad. I've only done some
 very, very basic testing to ensure it actually works; i haven't at all
 stress tested it out yet. It's also very naive - I'm not at all doing
 any checks to see whether I can short-cut to do the aio there and
 then; I'm always queuing the sendfile() op through the worker threads.
 That's likely stupid and inefficient in a lot of cases, but it at
 least gets the syscall up and working.
Yes, it is naive, but for a different reason.

kern_sendfile() is a synchronous function; it only completes after
the other end of the network connection allows it. This means
that calling kern_sendfile() from the aio thread blocks the thread
indefinitely in an unbounded sleep.

Your implementation easily causes exhaustion of the aio thread pool,
blocking the whole aio subsystem. It is known that our aio does not work
for sockets for the same reason. I object to adding more code with
the same defect.

The proper route seems to be to rewrite aio for sockets using upcalls.
The same should be done for sendfile, if sendfile is given an aio flavor.
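
To make the exhaustion argument concrete, here is a minimal userland
model of a fixed-size worker pool. It is only a sketch: struct aio_pool
and pool_dispatch() are made-up names for illustration and do not
correspond to anything in the kernel's aio code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a pool with a fixed number of worker threads. */
struct aio_pool {
	size_t nworkers;	/* total workers in the pool */
	size_t nblocked;	/* workers stuck in an unbounded sleep */
};

/*
 * Try to hand a job to a free worker.  Returns true if a worker took it.
 * A job that sleeps forever (for example, waiting on a peer that never
 * drains its receive buffer) never gives its worker back to the pool.
 */
bool
pool_dispatch(struct aio_pool *p, bool job_sleeps_forever)
{
	assert(p->nblocked <= p->nworkers);
	if (p->nblocked == p->nworkers)
		return (false);		/* every worker is asleep */
	if (job_sleeps_forever)
		p->nblocked++;		/* worker lost until the peer acts */
	return (true);
}
```

With a pool of four workers, four jobs against stalled peers are enough
to make a fifth, unrelated job fail to find a worker.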

 
 I'd like some feedback and possibly some help in stress testing it to
 make sure it's functioning well.
 
 Thanks,
 
 
 -adrian
 ___
 freebsd-current@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org




Re: hacking - aio_sendfile()

2013-07-11 Thread Adrian Chadd
Hiya,

I'm more interested in the API than the implementation at the moment.

Yes, you're right - it should eventually be driven using disk I/O
completion upcalls which trigger the push of data into the socket
buffer. I totally agree.

I'm hacking up some libevent-ish looking thing that uses kqueue and
wraps aio, read, write, and other event types into something I can
easily shoehorn this stuff into. I'll then thoroughly test it (and
other options) out. You're right, it's likely going to end up with a
whole lot of aio threads sitting there waiting for disk I/O to complete
- and at that point, I'll start hacking at sendfile() to split it into
two halves and have it driven by a completion call from g_up or
wherever, triggering the socket write side of things.

There are some other questions too - like whether the IO completion
should just queue socket IO (and have it potentially block in the TCP
code) or whether it should funnel completions into a per-CPU aio
completion thread which does the socket write bit. That way disk IO
completion isn't going to be blocked by longer-held locks in the
networking stack.

Thanks,


-adrian


Re: hacking - aio_sendfile()

2013-07-11 Thread Konstantin Belousov
On Thu, Jul 11, 2013 at 01:37:19AM -0700, Adrian Chadd wrote:
 Hiya,
 
 I'm more interested in the API than the implementation at the moment.
 
 Yes, you're right - it should eventually be driven using disk io
 completion upcalls which triggers the push of data into the socket
 buffer. I totally agree.
 
 I'm hacking up some libevent-ish looking thing that uses kqueue and
 wraps aio, read, write, and other event types into something I can
 easily shoehorn this stuff into. I'll then throughly test it (and
 other options) out. You're right, it's likely going to end up with a
 whole lot of aio threads sitting there waiting for disk IO to complete
 - and at that point, I'll start hacking at sendfile() to split it into
 two halves and have it driven by a completion call from g_up or
 wherever, triggering the socket write side of things.
 
 There are some other questions too - like whether the IO completion
 should just queue socket IO (and have it potentially block in the TCP
 code) or whether it should funnel completions into a per-CPU aio
 completion thread which does the socket write bit. That way disk IO
 completion isn't going to be blocked by longer-held locks in the
 networking stack.

No, it is not disk I/O which is problematic there. It is socket I/O,
e.g. the wait for the socket buffer's low watermark in kern_sendfile(),
which causes an unbounded sleep. Look for the sbwait() call, both in
kern_sendfile() itself and in the pru_send methods of the protocols,
e.g. in sosend_generic(). The duration of the wait is controlled by the
other side of the connection, which allows it to completely block the
aio subsystem.

Disk I/O is supposed to finish in finite time.
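
As a rough illustration of the sleep point being discussed, the sketch
below models the decision a sender faces when the send buffer is short
on space. It is a simplified userland model, not code from the tree:
struct sndbuf_model, send_decide() and the exact shape of the condition
are assumptions made for illustration; check kern_sendfile() for the
real logic.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the send-buffer state that matters here. */
struct sndbuf_model {
	long space;	/* free bytes in the send buffer */
	long lowat;	/* low watermark */
	bool nonblock;	/* models SS_NBIO being set on the socket */
};

enum send_decision { SEND_PROCEED, SEND_EAGAIN, SEND_SLEEP };

/*
 * Decide what happens when `rem` bytes are still to be queued.
 * SEND_SLEEP models the sbwait() case: the caller sleeps until the
 * peer frees buffer space, for as long as the peer likes.
 */
enum send_decision
send_decide(const struct sndbuf_model *sb, long rem)
{
	/* Room for the rest, or at least lowat bytes free: keep going. */
	if (sb->space >= rem || (sb->space > 0 && sb->space >= sb->lowat))
		return (SEND_PROCEED);
	if (sb->nonblock)
		return (SEND_EAGAIN);	/* fail at once instead of sleeping */
	return (SEND_SLEEP);		/* unbounded, peer-controlled sleep */
}
```

In this model only the SEND_SLEEP outcome ties up the calling thread,
and it is reached whenever the socket is in blocking mode and the peer
has stopped draining data.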




Re: hacking - aio_sendfile()

2013-07-11 Thread Adrian Chadd
On 11 July 2013 02:36, Konstantin Belousov kostik...@gmail.com wrote:

 No, it is not disk I/O which is problematic there. It is socket I/O
 e.g. wait for the socket buffers lomark in the kern_sendfile() which
 causes unbounded sleep. Look for the sbwait() call, both in the
 kern_sendfile() itself, and in the pru_send methods of the protocols,
 e.g. in sosend_generic(). The wait scope controlled by the other side of
 connection and allow it to completely block the aio subsystem.

 Disk I/O is supposed to finish in the finite time.

Even if the destination socket is marked as NONBLOCK?


-adrian


Re: hacking - aio_sendfile()

2013-07-11 Thread Konstantin Belousov
On Thu, Jul 11, 2013 at 02:39:00AM -0700, Adrian Chadd wrote:
 On 11 July 2013 02:36, Konstantin Belousov kostik...@gmail.com wrote:
 
  No, it is not disk I/O which is problematic there. It is socket I/O
  e.g. wait for the socket buffers lomark in the kern_sendfile() which
  causes unbounded sleep. Look for the sbwait() call, both in the
  kern_sendfile() itself, and in the pru_send methods of the protocols,
  e.g. in sosend_generic(). The wait scope controlled by the other side of
  connection and allow it to completely block the aio subsystem.
 
  Disk I/O is supposed to finish in the finite time.
 
 Even if the destination socket is marked as NONBLOCK?

You mean, would a sleep for socket buffer space cause the aio thread
to block if the socket is put in nonblocking mode? Or something else?

No, it would not block the thread. But I cannot consider an
aio_sendfile(2) implementation useful if it requires a non-blocking
socket. Also, what about another thread changing the socket to blocking
mode while the sendfile is in flight?




Re: hacking - aio_sendfile()

2013-07-11 Thread Gleb Smirnoff
  Adrian,

On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:
A> I've started writing an aio_sendfile() syscall.
A> 
A> http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff
A> 
A> Yes, the diff is against -HEAD and not stable/9.
A> 
A> It's totally horrible, hackish and likely bad. I've only done some
A> very, very basic testing to ensure it actually works; i haven't at all
A> stress tested it out yet. It's also very naive - I'm not at all doing
A> any checks to see whether I can short-cut to do the aio there and
A> then; I'm always queuing the sendfile() op through the worker threads.
A> That's likely stupid and inefficient in a lot of cases, but it at
A> least gets the syscall up and working.
A> 
A> I'd like some feedback and possibly some help in stress testing it to
A> make sure it's functioning well.

Apart from the problem pointed out by Kostik, there is a race between
the aio thread starting aio_process_sendfile() and the file descriptor
(or socket descriptor) going away.

Thus, kern_sendfile() needs to be split into two parts: kern_sendfile_pre(),
and kern_sendfile(), which should contain only the sending loop.

The kern_sendfile_pre() should contain:

fgetvp_read(uap->fd, vp)
vm_object_reference_locked(vp->v_object)

Referencing the socket is probably also required. The current synchronous
code doesn't do it.

The do_sendfile() function should call kern_sendfile_pre() and then
kern_sendfile(). The aio code should perform kern_sendfile_pre() in the
new syscall itself, in the context of the calling process, and
kern_sendfile() in the async context.
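
The split described above can be sketched in userland terms as a
two-phase request. This is only an illustration of the idea, assuming a
simple reference count: struct obj, struct sendfile_req, sendfile_pre()
and sendfile_run() are hypothetical names, not the proposed kernel
functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted object standing in for a vnode or socket. */
struct obj {
	int refs;
};

/* Hypothetical request record carried from the syscall to the worker. */
struct sendfile_req {
	struct obj *src;	/* file being sent */
	struct obj *dst;	/* destination socket */
};

/*
 * Phase 1, run in the calling process's context: take references so
 * that neither object can go away before the worker gets to run.
 */
int
sendfile_pre(struct sendfile_req *req, struct obj *src, struct obj *dst)
{
	if (src == NULL || dst == NULL)
		return (-1);
	src->refs++;
	dst->refs++;
	req->src = src;
	req->dst = dst;
	return (0);
}

/*
 * Phase 2, run later in the async context: only the sending loop would
 * live here.  It relies on the references taken in phase 1 and drops
 * them when it is done.
 */
void
sendfile_run(struct sendfile_req *req)
{
	assert(req->src->refs > 0 && req->dst->refs > 0);
	/* ... the sending loop would go here ... */
	req->src->refs--;
	req->dst->refs--;
	req->src = req->dst = NULL;
}
```

The point of the split is that the caller may close its descriptors
between the two phases without the objects disappearing underneath the
worker.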

P.S. Some time ago I started hacking on the above.

-- 
Totus tuus, Glebius.


Re: hacking - aio_sendfile()

2013-07-11 Thread Adrian Chadd
I reference the source/dest FDs in the queue method. Is that not good enough?


-adrian


Re: hacking - aio_sendfile()

2013-07-11 Thread Gleb Smirnoff
On Thu, Jul 11, 2013 at 07:45:19AM -0700, Adrian Chadd wrote:
A> I reference the source/dest FDs in the queue method. Is that not good enough?

I see. Should probably work, but needs testing.

-- 
Totus tuus, Glebius.


Re: hacking - aio_sendfile()

2013-07-11 Thread Adrian Chadd
On 11 July 2013 07:51, Gleb Smirnoff gleb...@freebsd.org wrote:
 On Thu, Jul 11, 2013 at 07:45:19AM -0700, Adrian Chadd wrote:
 A> I reference the source/dest FDs in the queue method. Is that not good enough?

 I see. Should probably work, but needs testing.

It's terrible - I think I should pass the file refs into
kern_sendfile() so I'm sure that the process hasn't closed an FD and
dup'ed another one into its place in the meantime.

Is that better?
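
The hazard with carrying bare descriptor numbers can be seen from
userland. The sketch below assumes a single-threaded process and uses a
pipe purely as a convenient source of descriptors; fd_number_reused()
is a made-up helper name.

```c
#include <unistd.h>

/*
 * Close one end of a fresh pipe and duplicate the other end.
 * Returns 1 if the new descriptor received the number that was just
 * closed, 0 if it received a different number, and -1 on error.
 */
int
fd_number_reused(void)
{
	int p[2], old, fresh, same;

	if (pipe(p) == -1)
		return (-1);
	old = p[0];
	close(old);		/* the number `old` is free again ... */
	fresh = dup(p[1]);	/* ... and dup(2) hands out the lowest free one */
	if (fresh == -1) {
		close(p[1]);
		return (-1);
	}
	same = (fresh == old);
	close(fresh);
	close(p[1]);
	return (same);
}
```

A queued request that remembered only the number `old` would now be
operating on a different object, which is why holding the file
references themselves is the safer choice.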



-adrian


Re: hacking - aio_sendfile()

2013-07-11 Thread Scott Long

On Jul 10, 2013, at 11:17 PM, Konstantin Belousov kostik...@gmail.com wrote:

 On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:
 Hiya,
 
 I've started writing an aio_sendfile() syscall.
 
 http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff
 
 Yes, the diff is against -HEAD and not stable/9.
 
 It's totally horrible, hackish and likely bad. I've only done some
 very, very basic testing to ensure it actually works; i haven't at all
 stress tested it out yet. It's also very naive - I'm not at all doing
 any checks to see whether I can short-cut to do the aio there and
 then; I'm always queuing the sendfile() op through the worker threads.
 That's likely stupid and inefficient in a lot of cases, but it at
 least gets the syscall up and working.
 Yes, it is naive, but for different reason.
 
 The kern_sendfile() is synchronous function, it only completes after
 the other end of the network communication allows it. This means
 that calling kern_sendfile() from the aio thread blocks the thread
 indefinitely by unbounded sleep.


No, kern_sendfile is async unless you specify the SF_SYNC hack flag.
Otherwise, it'll fill the socket buffer and then return immediately, unless
the socket buffer is full and the socket is set to blocking mode.  That's
outside the scope, as I said in my previous email.
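
The "fill the buffer and return" behaviour of a non-blocking descriptor
can be observed from userland. The sketch below is an illustration
only: it uses a pipe and write(2) as a portable stand-in for a socket
and sendfile(2), and set_nonblocking() and fill_until_eagain() are
hypothetical helper names.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; 0 on success, -1 on error. */
int
set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return (-1);
	return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0);
}

/*
 * Write zero-filled chunks until the kernel buffer is full.  Returns
 * the number of bytes queued once write(2) reports EAGAIN, or -1 on
 * any other error.
 */
long
fill_until_eagain(int fd)
{
	char chunk[4096];
	long total = 0;

	memset(chunk, 0, sizeof(chunk));
	for (;;) {
		ssize_t n = write(fd, chunk, sizeof(chunk));

		if (n > 0) {
			total += n;
			continue;
		}
		if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
			return (total);	/* buffer full: no sleep, just return */
		if (n == -1 && errno == EINTR)
			continue;
		return (-1);
	}
}
```

If the descriptor were left in blocking mode, the same loop would sleep
in write(2) once the buffer filled, until the reader drained it.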

Scott



Re: hacking - aio_sendfile()

2013-07-11 Thread Konstantin Belousov
On Thu, Jul 11, 2013 at 11:44:32AM -0700, Scott Long wrote:
 
 On Jul 10, 2013, at 11:17 PM, Konstantin Belousov kostik...@gmail.com wrote:
 
  On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:
  Hiya,
  
  I've started writing an aio_sendfile() syscall.
  
  http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff
  
  Yes, the diff is against -HEAD and not stable/9.
  
  It's totally horrible, hackish and likely bad. I've only done some
  very, very basic testing to ensure it actually works; i haven't at all
  stress tested it out yet. It's also very naive - I'm not at all doing
  any checks to see whether I can short-cut to do the aio there and
  then; I'm always queuing the sendfile() op through the worker threads.
  That's likely stupid and inefficient in a lot of cases, but it at
  least gets the syscall up and working.
  Yes, it is naive, but for different reason.
  
  The kern_sendfile() is synchronous function, it only completes after
  the other end of the network communication allows it. This means
  that calling kern_sendfile() from the aio thread blocks the thread
  indefinitely by unbounded sleep.
 
 
 No, kern_sendfile is async unless you specify the SF_SYNC hack flag.
 Otherwise, it'll fill the socket buffer and then return immediately, unless
 the socket buffer is full and the socket is set to blocking mode.  That's
 outside the scope, as I said in my previous email.

You do not understand what I said; please re-read both my mail and the
code before replying.  Implementing aio_sendfile() as proposed would create
yet another way to indefinitely block all processes using aio.




Re: hacking - aio_sendfile()

2013-07-11 Thread Scott Long

On Jul 11, 2013, at 2:56 AM, Konstantin Belousov kostik...@gmail.com wrote:

 On Thu, Jul 11, 2013 at 02:39:00AM -0700, Adrian Chadd wrote:
 On 11 July 2013 02:36, Konstantin Belousov kostik...@gmail.com wrote:
 
 No, it is not disk I/O which is problematic there. It is socket I/O
 e.g. wait for the socket buffers lomark in the kern_sendfile() which
 causes unbounded sleep. Look for the sbwait() call, both in the
 kern_sendfile() itself, and in the pru_send methods of the protocols,
 e.g. in sosend_generic(). The wait scope controlled by the other side of
 connection and allow it to completely block the aio subsystem.
 
 Disk I/O is supposed to finish in the finite time.
 
 Even if the destination socket is marked as NONBLOCK?
 
 You mean, would a sleep for the socket buffer space cause aio thread
 block is the socket is put in nonblocking mode ?  Or something else ?
 
 No, it would not block the thread. But I cannot consider the
 aio_sendfile(2) implementation useful if it requires non-blocking
 socket. Also, what about other thread changing the socket to blocking
 mode while sendfile is in flight ?

Just as with other aspects of sendfile, it's up to the caller to protect
this kind of state.  Objecting to aio_sendfile() simply for the reason you
state is absurd and against the design goals of sendfile.

Scott



Re: hacking - aio_sendfile()

2013-07-11 Thread Scott Long

On Jul 11, 2013, at 11:48 AM, Konstantin Belousov kostik...@gmail.com wrote:

 On Thu, Jul 11, 2013 at 11:44:32AM -0700, Scott Long wrote:
 
 On Jul 10, 2013, at 11:17 PM, Konstantin Belousov kostik...@gmail.com wrote:
 
 On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:
 Hiya,
 
 I've started writing an aio_sendfile() syscall.
 
 http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff
 
 Yes, the diff is against -HEAD and not stable/9.
 
 It's totally horrible, hackish and likely bad. I've only done some
 very, very basic testing to ensure it actually works; i haven't at all
 stress tested it out yet. It's also very naive - I'm not at all doing
 any checks to see whether I can short-cut to do the aio there and
 then; I'm always queuing the sendfile() op through the worker threads.
 That's likely stupid and inefficient in a lot of cases, but it at
 least gets the syscall up and working.
 Yes, it is naive, but for different reason.
 
 The kern_sendfile() is synchronous function, it only completes after
 the other end of the network communication allows it. This means
 that calling kern_sendfile() from the aio thread blocks the thread
 indefinitely by unbounded sleep.
 
 
 No, kern_sendfile is async unless you specify the SF_SYNC hack flag.
 Otherwise, it'll fill the socket buffer and then return immediately, unless
 the socket buffer is full and the socket is set to blocking mode.  That's
 outside the scope, as I said in my previous email.
 
 You do not understand what I said, please re-read both my mail and code
 before replying.  Implementing aio_sendfile() as proposed would create
 yet another possibility of indefinitely block all processes using aio.

I'm lost, maybe I missed some emails?  I see a set of emails where you
incorrectly state that kern_sendfile() will always call sbwait(), and then
you backtrack on that and claim that it's unacceptable to enforce that
SS_NBIO be used for aio operations.
I apologize if I'm missing something here.

Scott



Re: hacking - aio_sendfile()

2013-07-11 Thread Konstantin Belousov
On Thu, Jul 11, 2013 at 12:04:57PM -0700, Scott Long wrote:
 
 On Jul 11, 2013, at 11:48 AM, Konstantin Belousov kostik...@gmail.com wrote:
 
  On Thu, Jul 11, 2013 at 11:44:32AM -0700, Scott Long wrote:
  
  On Jul 10, 2013, at 11:17 PM, Konstantin Belousov kostik...@gmail.com wrote:
  
  On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:
  Hiya,
  
  I've started writing an aio_sendfile() syscall.
  
  http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff
  
  Yes, the diff is against -HEAD and not stable/9.
  
  It's totally horrible, hackish and likely bad. I've only done some
  very, very basic testing to ensure it actually works; i haven't at all
  stress tested it out yet. It's also very naive - I'm not at all doing
  any checks to see whether I can short-cut to do the aio there and
  then; I'm always queuing the sendfile() op through the worker threads.
  That's likely stupid and inefficient in a lot of cases, but it at
  least gets the syscall up and working.
  Yes, it is naive, but for different reason.
  
  The kern_sendfile() is synchronous function, it only completes after
  the other end of the network communication allows it. This means
  that calling kern_sendfile() from the aio thread blocks the thread
  indefinitely by unbounded sleep.
  
  
  No, kern_sendfile is async unless you specify the SF_SYNC hack flag.
  Otherwise, it'll fill the socket buffer and then return immediately, unless
  the socket buffer is full and the socket is set to blocking mode.  That's
  outside the scope, as I said in my previous email.
  
  You do not understand what I said, please re-read both my mail and code
  before replying.  Implementing aio_sendfile() as proposed would create
  yet another possibility of indefinitely block all processes using aio.
 
 I'm lost, maybe I missed some emails?  I see a set of emails where you
 incorrectly state that kern_sendfile() will always call sbwait(), and then
 you backtrack on that and claim that it's unacceptable to enforce that
 SS_NBIO be used for aio operations.
 I apologize if I'm missing something here.

Can you cite my exact text where I claimed that kern_sendfile() always calls
sbwait()?

I wrote about this explicitly, stating that it is very easy to make
kern_sendfile() sleep waiting for socket buffer space, and that the
duration of the sleep is user-controllable.  As a result, it allows
hanging all processes doing aio calls, since the aio thread pool is
finite.  I am sorry for retyping this and taking your time by repeating it.

Making kern_sendfile() behave, when called from the aio context, as if
SS_NBIO were set on the socket contradicts the behaviour of the other aio
operations. E.g. aio_read and aio_write do not perform short reads and
writes in order to avoid blocking the aio daemon threads (which is the
cause of the buggy behaviour of the existing aio syscalls on sockets).

Moreover, I do not think that setting SS_NBIO is enough to prevent the
blocking of aio threads in kern_sendfile(). The send socket buffer
is locked exclusively by kern_sendfile(). Another thread which entered
sendfile(2) and was deliberately put to sleep on the low watermark
still owns the so_snd sx lock. This means that aio threads trying to do
kern_sendfile() on this socket would also be blocked, for a duration
controlled by the other end.

That said, even assuming SS_NBIO is always enforced and the other sleep
points are identified and worked around, the only benefit of such an
implementation compared with a direct sendfile(2) call would be
avoiding the use of the calling thread's context for disk I/O. FreeBSD
recently gained aio_mlock(2), which allows getting the same result in a
non-hackish way.




Re: hacking - aio_sendfile()

2013-07-11 Thread David Xu

On 2013/07/11 14:17, Konstantin Belousov wrote:

On Wed, Jul 10, 2013 at 04:36:23PM -0700, Adrian Chadd wrote:

Hiya,

I've started writing an aio_sendfile() syscall.

http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff

Yes, the diff is against -HEAD and not stable/9.

It's totally horrible, hackish and likely bad. I've only done some
very, very basic testing to ensure it actually works; i haven't at all
stress tested it out yet. It's also very naive - I'm not at all doing
any checks to see whether I can short-cut to do the aio there and
then; I'm always queuing the sendfile() op through the worker threads.
That's likely stupid and inefficient in a lot of cases, but it at
least gets the syscall up and working.

Yes, it is naive, but for different reason.

The kern_sendfile() is synchronous function, it only completes after
the other end of the network communication allows it. This means
that calling kern_sendfile() from the aio thread blocks the thread
indefinitely by unbounded sleep.

Your implementation easily causes exhaustion of the aio thread pool,
blocking the whole aio subsystem. It is known that our aio does not work
for sockets for the same reason. I object against adding more code with
the same defect.

Proper route seems to rewrite aio for sockets using the upcalls.  The same
should be done for sendfile, if sendfile is given aio flavor.



Yes, the current aio implementation is horrible. It only works for fast
disk I/O, where I think the thread pool size is enough to saturate the
disks. For socket or pipe I/O it does not work well; the thread pool is
too easily exhausted.

I even think the support for sockets and pipes in the aio code should be
cut and thrown away, because you can always use kqueue + non-blocking
I/O.



hacking - aio_sendfile()

2013-07-10 Thread Adrian Chadd
Hiya,

I've started writing an aio_sendfile() syscall.

http://people.freebsd.org/~adrian/ath/20130710-aio-sendfile-3.diff

Yes, the diff is against -HEAD and not stable/9.

It's totally horrible, hackish and likely bad. I've only done some
very, very basic testing to ensure it actually works; I haven't stress
tested it at all yet. It's also very naive - I'm not doing any checks
to see whether I can short-cut to do the aio there and then; I'm
always queuing the sendfile() op through the worker threads.
That's likely stupid and inefficient in a lot of cases, but it at
least gets the syscall up and working.

I'd like some feedback and possibly some help in stress testing it to
make sure it's functioning well.

Thanks,


-adrian