On Tue, Nov 18, 2025 at 12:31 PM Samuel Thibault <[email protected]> wrote:
> On Mon, Nov 17, 2025 at 03:59:30PM +1300, Thomas Munro wrote:
> > . o O { An absurdly far-fetched thought while browsing glibc/hurd glue
> > code: if synchronous I/O is implemented as RPC on Mach ports, could
> > that mean that it's technically possible to submit now and consume
> > results later, for asynchronous I/O?
>
> Yes, it is completely possible.
Neat!

> > Possibly too private/undocumented anyway,
>
> It's not really documented much, but it's completely public. One
> can include <hurd/io_request.h> and call e.g. io_read_request(port,
> reply_port, offset, amount). One then has to run a msgserver loop on the
> reply_port to get the reply messages. An example can be seen in the hurd
> source in trans/streamio.c, for e.g. device_open_request() calls.

OK, to continue the thought experiment... someone could write
io_method=hurd, and it would have to be more efficient than handing the
work off to I/O worker processes (what you get with the default
io_method=worker), since a worker process clearly has to do exactly the
same thing internally in a synchronous wrapper function anyway, just
with extra steps to reach it.

At a guess, it could follow io_method=io_uring's general design and have
a reply port owned by each backend (= process), and backends would
almost always consume replies from their own reply port. They'd need to
be able to consume from each other's reply ports occasionally, but I
assume that's possible with an exclusive lock and a temporary transfer
of receive rights. Every process would have to receive duplicates of
the full set of ports after fork(), but at least that problem would go
away in an in-development multithreaded mode.

I doubt it'd be much good without readv/writev operations, though. It
looks like they aren't in io_request.defs yet? Does that also imply
that preadv() has to loop over the vectors, sending tons of messages
and waiting for replies?

Standard POSIX AIO also lacks vectored I/O.
It lacks many, many other things one might want (though serious
implementations in the old commercial Unixen added various incompatible
extensions negotiated with database vendors, including reply ports),
but scatter/gather seems pretty fundamental for database buffer pool
implementations: we'd have to call aio_read()/aio_write() 16 or 32
times when we could just ask a helper process to call preadv() once
(assuming it's really one operation), to transfer a contiguous range of
blocks to/from discontiguous buffers. Databases want to do that a lot.
When combined with direct I/O, that's actual IOPS out the window, but
even for buffered I/O it's a very high overhead for straight-line I/O.
For that reason we don't currently support pgaio implementations that
lack readv/writev. When we tried it, we had to inhibit I/O combining at
higher levels, and it wasn't good.

(And then, to get more and more pie-in-the-sky: (1) O_DIRECT is highly
desirable for zero-copy DMA to/from a user space buffer pool, (2)
starting more than one I/O with a single context switch, and likewise
for consuming replies, (3) registering/locking memory pages and
descriptors with a port so they don't have to be pinned/unpinned by the
I/O subsystem all the time. And then, if Hurd works the way I think it
might, (4) to avoid chains of pipe-like scheduling overheads when
starting a direct I/O (and maybe also some already-cached buffered
I/O), you'd ideally want ports to have a "fast" send path that behaves
like the old Spring/Solaris doors, where the caller's thread yields
directly to a thread in the receiving server, forming a chain, database
-> file system -> driver -> device, that is sort of synchronous and
then returns control: a kind of dual of a system call that reaches
through a chain of user-space services, with presumably the same sort
of thing on the way back from the interrupt handler on completion.
Idea (4) might well be Hurd/Mach heresy for all I know, being totally out of the loop on this stuff; or perhaps you already have something like that...)
