On 3/6/19 9:13 AM, Jens Axboe wrote:
> Hi Linus,
>
> 2nd attempt at adding the io_uring interface. Since the first one,
> we've added basic unit testing of the three system calls, which
> resides in liburing like the other unit tests that we have so far.
> It'll take a while to get full coverage of it, but we're working
> towards it. I've also added two basic test programs to tools/io_uring.
> One uses the raw interface and has support for all the various
> features that io_uring supports outside of standard IO, like fixed
> files, fixed IO buffers, and polled IO. The other uses the liburing
> API, and is a simplified version of cp(1).
>
> This pull request adds support for a new IO interface, io_uring.
> io_uring allows an application to communicate with the kernel through
> two rings, the submission queue (SQ) and completion queue (CQ) rings.
> This allows for very efficient handling of IOs; see the v5 posting for
> some basic numbers:
>
> https://lore.kernel.org/linux-block/[email protected]/
>
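As a rough illustration of the ring idea only: the sketch below is a toy
single-producer/single-consumer ring with free-running head and tail
indices and a power-of-two mask. The names (toy_ring, toy_ring_push,
toy_ring_pop) are made up for this example, and it is not the io_uring
ABI; the real rings are shared between the application and the kernel
and need memory barriers, which are only noted in comments here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES 8u /* hypothetical size; must be a power of two */

/* Toy ring, NOT the kernel's SQ/CQ layout. */
struct toy_ring {
	uint32_t head;  /* advanced by the consumer */
	uint32_t tail;  /* advanced by the producer */
	uint32_t mask;  /* RING_ENTRIES - 1, maps an index to a slot */
	uint64_t slots[RING_ENTRIES];
};

static void toy_ring_init(struct toy_ring *r)
{
	/* Masking only works for power-of-two sizes. */
	assert((RING_ENTRIES & (RING_ENTRIES - 1)) == 0);
	r->head = 0;
	r->tail = 0;
	r->mask = RING_ENTRIES - 1;
}

/* Producer side: returns false if the ring is full. */
static bool toy_ring_push(struct toy_ring *r, uint64_t v)
{
	/* Indices run freely; unsigned subtraction gives the fill level. */
	if (r->tail - r->head == RING_ENTRIES)
		return false;
	r->slots[r->tail & r->mask] = v;
	/* A ring shared with another context would need a release
	 * barrier before publishing the new tail. */
	r->tail++;
	return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool toy_ring_pop(struct toy_ring *r, uint64_t *v)
{
	if (r->head == r->tail)
		return false;
	*v = r->slots[r->head & r->mask];
	r->head++;
	return true;
}
```

In this model, the application would be the producer on the SQ side and
the consumer on the CQ side, with the kernel taking the opposite roles.
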
> Outside of just efficiency, the interface is also flexible and
> extendable, and allows for future use cases like the upcoming NVMe
> key-value store API, networked IO, and so on. It also supports async
> buffered IO, something that we've always failed to support in the
> kernel.
>
> Outside of basic IO features, it supports async polled IO as well. This
> particular feature was already tested at Facebook months ago for
> flash storage boxes, with 25-33% improvements. It makes polled IO
> actually useful for real world use cases, where even basic flash sees a
> nice win in terms of efficiency, latency, and performance. These boxes
> were IOPS bound before, now they are not.
>
> This series adds three new system calls. One for setting up an io_uring
> instance (io_uring_setup(2)), one for submitting/completing IO
> (io_uring_enter(2)), and one for aux functions like registering file
> sets, buffers, etc. (io_uring_register(2)). Through the help of Arnd,
> I've coordinated the syscall numbers so the merge on that front should
> be painless.
>
> Jon did a writeup of the interface a while back, which (except for minor
> details that have been tweaked) is still accurate. Find that here:
>
> https://lwn.net/Articles/776703/
>
> Huge thanks to Al Viro for helping get the reference cycle code
> correct, and to Jann Horn for his extensive reviews focused on both
> security and bugs in general.
>
> There's a userspace library that provides basic functionality for
> applications that don't need or want to care about how to fiddle with
> the rings directly. It has helpers to allow applications to easily set
> up an io_uring instance, and submit/complete IO through it without
> knowing about the intricacies of the rings. It also includes man pages
> (thanks to Jeff Moyer), and will continue to grow helper functions
> and features as time progresses. Find it here:
>
> git://git.kernel.dk/liburing
>
> Fio has full support for the raw interface, both in the form of an IO
> engine (io_uring) and as a small test application (t/io_uring) that
> can exercise and benchmark the interface.
>
> Note that this branch sits on top of my for-5.1/block branch, since the
> multi-page bvec changes caused a few conflicts with the pre-mapped
> buffer support. I also moved a few prep patches to that branch today,
> which is why it appears recently rebased (moved the 4 bottom patches
> from io_uring to for-5.1/block).
>
> Please consider this feature for 5.1, so we can finally have something
> that's fast, efficient, and feature rich for IO instead of the sad
> niche case that is aio/libaio.
>
>
> git://git.kernel.dk/linux-block.git tags/io_uring-2019-03-06

Slight mess-up in the stats; here's the corrected one. Note that this
also throws a few more merge conflicts now, due to the syscall merges.
All are trivial, though, and the branch was prepared for it in terms of
numbering.

----------------------------------------------------------------
Christoph Hellwig (1):
      io_uring: add fsync support

Jens Axboe (14):
      Add io_uring IO interface
      io_uring: support for IO polling
      fs: add fget_many() and fput_many()
      io_uring: use fget/fput_many() for file references
      io_uring: batch io_kiocb allocation
      block: implement bio helper to add iter bvec pages to bio
      io_uring: add support for pre-mapped user IO buffers
      net: split out functions related to registering inflight socket files
      io_uring: add file set registration
      io_uring: add submission polling
      io_uring: add io_kiocb ref count
      io_uring: add support for IORING_OP_POLL
      io_uring: allow workqueue item to handle multiple buffered requests
      io_uring: add a few test tools

 arch/x86/entry/syscalls/syscall_32.tbl |    3 +
 arch/x86/entry/syscalls/syscall_64.tbl |    3 +
 block/bio.c                            |   62 +-
 fs/Makefile                            |    1 +
 fs/file.c                              |   15 +-
 fs/file_table.c                        |    9 +-
 fs/io_uring.c                          | 2971 ++++++++++++++++++++++++++++++++
 include/linux/file.h                   |    2 +
 include/linux/fs.h                     |   13 +-
 include/linux/sched/user.h             |    2 +-
 include/linux/syscalls.h               |    8 +
 include/net/af_unix.h                  |    1 +
 include/uapi/asm-generic/unistd.h      |    8 +-
 include/uapi/linux/io_uring.h          |  137 ++
 init/Kconfig                           |    9 +
 kernel/sys_ni.c                        |    3 +
 net/Makefile                           |    2 +-
 net/unix/Kconfig                       |    5 +
 net/unix/Makefile                      |    2 +
 net/unix/af_unix.c                     |   63 +-
 net/unix/garbage.c                     |   68 +-
 net/unix/scm.c                         |  151 ++
 net/unix/scm.h                         |   10 +
 tools/io_uring/Makefile                |   18 +
 tools/io_uring/README                  |   29 +
 tools/io_uring/barrier.h               |   16 +
 tools/io_uring/io_uring-bench.c        |  616 +++++++
 tools/io_uring/io_uring-cp.c           |  251 +++
 tools/io_uring/liburing.h              |  143 ++
 tools/io_uring/queue.c                 |  164 ++
 tools/io_uring/setup.c                 |  103 ++
 tools/io_uring/syscall.c               |   40 +
 32 files changed, 4782 insertions(+), 146 deletions(-)
 create mode 100644 fs/io_uring.c
 create mode 100644 include/uapi/linux/io_uring.h
 create mode 100644 net/unix/scm.c
 create mode 100644 net/unix/scm.h
 create mode 100644 tools/io_uring/Makefile
 create mode 100644 tools/io_uring/README
 create mode 100644 tools/io_uring/barrier.h
 create mode 100644 tools/io_uring/io_uring-bench.c
 create mode 100644 tools/io_uring/io_uring-cp.c
 create mode 100644 tools/io_uring/liburing.h
 create mode 100644 tools/io_uring/queue.c
 create mode 100644 tools/io_uring/setup.c
 create mode 100644 tools/io_uring/syscall.c

--
Jens Axboe