On Wed 06 Apr 2016 22:46, Andy Wingo <wi...@pobox.com> writes:

> I have been working on a refactor to ports.

The status is that in wip-ports-refactor I have changed the internal
port implementation to always have buffers, and that those buffers are
always bytevectors (internally encapsulated in a scm_t_port_buffer
struct that has cursors into the buffer). In that way we should be able
to access the port buffers from Scheme safely.

The end-game is to allow programs like Scheme's `read' to be
suspendable. That means, whenever they would peek-char/read-char and
input is unavailable, the program would suspend to the scheduler by
aborting to a prompt, and resume the resulting continuation when input
becomes available. Likewise for writing.

To do this, all port functions need to be implemented in Scheme, because
for a delimited continuation to be resumed, it has to only capture
Scheme activations, not C activations. This is obviously a gnarly task.
It still makes sense to have C functions that work on ports -- and
specifically, that C have access to the port buffers. But it would be
fine for C ports to call out to Scheme to fill their read buffers /
flush their write buffers.

So the near-term is to move the read/write/etc ptob methods to be Scheme
functions -- probably gsubr wrappers for now (for the existing port
types).

Then we need to start allowing I/O functions to be implemented in
Scheme -- in (ice-9 ports) or so. But, you don't want Scheme code to
have to import (ice-9 ports). You want existing code that uses
read-char and so on to become suspendable. So, we will replace core I/O
bindings in boot-9 with imported bindings from (ice-9 ports). That will
also allow us to trim the set of bindings defined in boot-9 itself
(before (ice-9 ports) is loaded) to the minimum set that is needed to
boot Guile.

So the plan is:

 1. Create (ice-9 ports) module
    - it will do load-extension to cause ports.c to define I/O routines
    - it exports all i/o routines that are exported by ports.c, and
      perhaps by other files as well
    - bindings from (ice-9 ports) are imported into boot-9, augmenting
      the minimal set of bindings defined in boot-9, and replacing the
      existing minimal bindings via set!

 2. Add Scheme interface to port buffers, make internal to (ice-9 ports)
    - this should allow I/O routines to get a port's read or write
      buffers, grovel in the bytes, update cursors, and call the read
      or write functions to fill or empty them

 3. Start rewriting I/O routines in Scheme

 4. Add/adapt a non-blocking interface
    - Currently port read/write functions are blocking. Probably we
      should change their semantics to be nonblocking. This would
      allow Guile to detect when to suspend a computation.
    - Nonblocking ports need an FD to select on; if they don't have
      one, a write or read that consumes 0 bytes indicates EOF
    - Existing blocking interfaces would be shimmed by "select"-ing on
      the port until it's writable in a loop

 5. Add "current read waiter" / "current write waiter" abstraction from
    the ethreads branch
    - These are parameters (dynamic bindings) that are procedures that
      define what to do when a read or write would block. By default I
      think probably they should select in a loop to emulate blocking
      behavior. They could be parameterized to suspend the computation
      to a scheduler though.

Finally there is a question about speed. I expect that for buffered
ports, I/O from C will have a minimal slowdown. For unbuffered ports,
the slowdown would be more, because the cost of filling and emptying
ports is higher with a call from C to Scheme (and then back, for
read/write functions actually implemented in C.)
But for Scheme, I expect that generally throughput goes up, as we will
be able to build flexible I/O routines that can access the buffer
directly, both because with this branch buffering is uniformly handled
in the generic port code, and also because Scheme avoids the Scheme->C
penalty in common cases. We can provide compiler support for accessing
the port buffer, if needed, but hopefully we can avoid that.

Finally finally, there is still the question about locks. I don't know
the answer here. I think it's likely that we can have concurrent access
to port buffers without locks, but I suspect that anything that accesses
mutable port state should probably be protected by a lock -- but
probably not a re-entrant lock, because the operations called with that
lock wouldn't call out to any user code. That means that read/write
functions from port implementations would have to bake in their own
threadsafety, but probably that's OK; for file ports, for example, the
threadsafety is baked in the kernel. Atomic accessors are also a
possibility if there is still overhead.

I think also we could remove all of the _unlocked functions from our API
and from our internals in that case, and just lock as appropriate,
understanding that the perf impact should be minimal.

Andy
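
To make step 4 of the plan a bit more concrete, here is a rough C sketch
of a nonblocking fill function plus a blocking shim layered on top of
it. Everything named toy_* is hypothetical and exists only for this
illustration; it is not Guile's scm_t_port_buffer or any real ptob
method. It also uses poll(2) where the plan says "select", purely to
keep the example short, and it assumes a POSIX system.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for a port read buffer: bytes plus two cursors. */
struct toy_port_buffer {
  unsigned char bytes[4096];
  size_t cur;                   /* next byte to consume */
  size_t end;                   /* one past the last valid byte */
};

struct toy_port {
  int fd;                       /* kept in O_NONBLOCK mode */
  struct toy_port_buffer read_buf;
};

/* Put FD in nonblocking mode and reset the cursors.  0 on success. */
int toy_port_init(struct toy_port *port, int fd)
{
  int flags = fcntl(fd, F_GETFL);
  if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
    return -1;
  port->fd = fd;
  port->read_buf.cur = port->read_buf.end = 0;
  return 0;
}

/* Nonblocking fill; only valid on a drained buffer.  Returns the number
   of bytes added, 0 on EOF, or -1 with errno set (EAGAIN when the read
   would block). */
ssize_t toy_fill_nonblocking(struct toy_port *port)
{
  struct toy_port_buffer *buf = &port->read_buf;
  assert(buf->cur == buf->end);
  buf->cur = buf->end = 0;      /* rewind the cursors */
  ssize_t n = read(port->fd, buf->bytes, sizeof buf->bytes);
  if (n > 0)
    buf->end = (size_t) n;
  return n;
}

/* Blocking shim: retry the nonblocking fill, waiting on the fd whenever
   it reports that it would block. */
ssize_t toy_fill_blocking(struct toy_port *port)
{
  for (;;) {
    ssize_t n = toy_fill_nonblocking(port);
    if (n >= 0)
      return n;
    if (errno == EINTR)
      continue;
    if (errno != EAGAIN && errno != EWOULDBLOCK)
      return -1;
    struct pollfd pfd = { .fd = port->fd, .events = POLLIN };
    if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
      return -1;
  }
}

/* Returns the next byte (0..255), or -1 on EOF or error. */
int toy_read_byte(struct toy_port *port)
{
  struct toy_port_buffer *buf = &port->read_buf;
  if (buf->cur == buf->end && toy_fill_blocking(port) <= 0)
    return -1;
  return buf->bytes[buf->cur++];
}
```

In this sketch the poll loop plays the role that the default "current
read waiter" from step 5 would play; a scheduler-aware waiter would
suspend the computation at that point instead of blocking the thread.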