Re: wip-ports-refactor

Andy Wingo Tue, 12 Apr 2016 02:34:48 -0700

On Wed 06 Apr 2016 22:46, Andy Wingo <wi...@pobox.com> writes:

> I have been working on a refactor to ports.


The status is that in wip-ports-refactor I have changed the internal
port implementation to always have buffers, and that those buffers are
always bytevectors (internally encapsulated in a scm_t_port_buffer
struct that has cursors into the buffer).  In that way we should be able
to access the port buffers from Scheme safely.
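
To make the shape concrete, here is a minimal C sketch of a buffer with
cursors that is refilled through a port's read method.  It is not the
actual scm_t_port_buffer layout: the names port_buffer, port_read_fn,
port_buffer_fill, port_peek_byte, port_read_byte and string_source are
all hypothetical, and a plain byte array stands in for the bytevector.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for scm_t_port_buffer: a byte array plus two
   cursors.  Unread bytes live in the half-open range [cur, end).  */
struct port_buffer
{
  unsigned char *bytes;   /* backing store (a bytevector in Guile) */
  size_t size;            /* capacity of `bytes' */
  size_t cur;             /* index of the next byte to consume */
  size_t end;             /* one past the last valid byte */
};

/* Assumed shape of a port's read method: copy up to `count' bytes into
   `dst' and return how many were copied; 0 means EOF.  */
typedef size_t (*port_read_fn) (void *port_data, unsigned char *dst,
                                size_t count);

/* If the buffer is exhausted, reset the cursors and call the read
   method to refill it.  Returns the number of unread bytes.  */
static size_t
port_buffer_fill (struct port_buffer *buf, port_read_fn fill, void *data)
{
  assert (buf->cur <= buf->end && buf->end <= buf->size);
  if (buf->cur == buf->end)
    {
      buf->cur = 0;
      buf->end = fill (data, buf->bytes, buf->size);
    }
  return buf->end - buf->cur;
}

/* Return the next byte without consuming it, or -1 at EOF.  */
static int
port_peek_byte (struct port_buffer *buf, port_read_fn fill, void *data)
{
  if (port_buffer_fill (buf, fill, data) == 0)
    return -1;
  return buf->bytes[buf->cur];
}

/* Return the next byte and advance the read cursor, or -1 at EOF.  */
static int
port_read_byte (struct port_buffer *buf, port_read_fn fill, void *data)
{
  int c = port_peek_byte (buf, fill, data);
  if (c >= 0)
    buf->cur++;
  return c;
}

/* Toy read method over an in-memory string, for demonstration only.  */
struct string_source
{
  const char *data;
  size_t len;
  size_t pos;
};

static size_t
string_source_read (void *port_data, unsigned char *dst, size_t count)
{
  struct string_source *src = port_data;
  size_t n = src->len - src->pos;
  if (n > count)
    n = count;
  memcpy (dst, src->data + src->pos, n);
  src->pos += n;
  return n;
}
```

Because peek and read only touch the bytes and the two cursors, the
same operations could in principle be written in Scheme once the buffer
is exposed there; only the refill step has to call the port's read
method.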

The end-game is to allow programs like Scheme's `read' to be
suspendable.  That means, whenever they would peek-char/read-char and
input is unavailable, the program would suspend to the scheduler by
aborting to a prompt, and resume the resulting continuation when input
becomes available.  Likewise for writing.  To do this, all port
functions need to be implemented in Scheme, because for a delimited
continuation to be resumed, it has to only capture Scheme activations,
not C activations.

This is obviously a gnarly task.  It still makes sense to have C
functions that work on ports -- and specifically, that C have access to
the port buffers.  But it would be fine for C ports to call out to
Scheme to fill their read buffers / flush their write buffers.

So the near-term plan is to move the read/write/etc ptob methods to be
Scheme functions -- probably gsubr wrappers for now (for the existing
port types).  Then we need to start allowing I/O functions to be
implemented in Scheme -- in (ice-9 ports) or so.

But, you don't want Scheme code to have to import (ice-9 ports).  You
want existing code that uses read-char and so on to become suspendable.
So, we will replace core I/O bindings in boot-9 with imported bindings
from (ice-9 ports).  That will also allow us to trim the set of bindings
defined in boot-9 itself (before (ice-9 ports) is loaded) to the minimum
set that is needed to boot Guile.

So the plan is:

  1. Create (ice-9 ports) module

     - it will do load-extension to cause ports.c to define I/O routines

     - it exports all I/O routines that are exported by ports.c, and
       perhaps by other files as well

     - bindings from (ice-9 ports) are imported into boot-9, augmenting
       the minimal set of bindings defined in boot-9, and replacing the
       existing minimal bindings via set!

  2. Add Scheme interface to port buffers, make internal to (ice-9
     ports)

     - this should allow I/O routines to get a port's read or write
       buffers, grovel in the bytes, update cursors, and call the read
       or write functions to fill or empty them

  3. Start rewriting I/O routines in Scheme

  4. Add/adapt a non-blocking interface

     - Currently port read/write functions are blocking.  Probably we
       should change their semantics to be nonblocking.  This would
       allow Guile to detect when to suspend a computation.

     - Nonblocking ports need an FD to select on; if they don't have
       one, a write or read that consumes 0 bytes indicates EOF

     - Existing blocking interfaces would be shimmed by "select"-ing on
       the port in a loop until it's readable or writable

  5. Add "current read waiter" / "current write waiter" abstraction from
     the ethreads branch

     - These are parameters (dynamic bindings) that are procedures that
       define what to do when a read or write would block.  By default I
       think probably they should select in a loop to emulate blocking
       behavior.  They could be parameterized to suspend the computation
       to a scheduler though.
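
Steps 4 and 5 can be sketched together in C, for the read side only.
This is a rough illustration under stated assumptions, not code from
the branch: the names read_waiter_fn, current_read_waiter,
wait_by_polling, port_read_nonblocking, port_read_blocking and
demo_waiter are hypothetical, a global function pointer stands in for a
Guile parameter, and poll(2) is used where the list says "select".

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical "current read waiter": what to do when a read would
   block.  Guile would use a parameter (dynamic binding) here.  */
typedef void (*read_waiter_fn) (int fd);

/* Default waiter: emulate blocking by polling until `fd' is readable.  */
static void
wait_by_polling (int fd)
{
  struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
  while (poll (&pfd, 1, -1) < 0 && errno == EINTR)
    ;  /* interrupted by a signal: retry */
}

static read_waiter_fn current_read_waiter = wait_by_polling;

/* Put `fd' into non-blocking mode.  Returns 0 on success, -1 on error.  */
static int
set_nonblocking (int fd)
{
  int flags = fcntl (fd, F_GETFL, 0);
  if (flags < 0)
    return -1;
  return fcntl (fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Non-blocking read method: returns the byte count, 0 at EOF, or -1
   with errno set to EAGAIN/EWOULDBLOCK when no input is available.  */
static ssize_t
port_read_nonblocking (int fd, void *dst, size_t count)
{
  ssize_t n;
  do
    n = read (fd, dst, count);
  while (n < 0 && errno == EINTR);
  return n;
}

/* Blocking shim over the non-blocking method: whenever the read would
   block, call the current waiter and then try again.  */
static ssize_t
port_read_blocking (int fd, void *dst, size_t count)
{
  assert (current_read_waiter != NULL);
  for (;;)
    {
      ssize_t n = port_read_nonblocking (fd, dst, count);
      if (n >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK))
        return n;
      current_read_waiter (fd);
    }
}

/* Toy replacement waiter for demonstration: instead of sleeping, it
   makes input available itself by writing to `demo_write_fd', roughly
   as a scheduler would by running some other task.  */
static int demo_write_fd = -1;
static int demo_waiter_calls = 0;

static void
demo_waiter (int fd)
{
  (void) fd;
  demo_waiter_calls++;
  if (write (demo_write_fd, "ok", 2) != 2)
    abort ();
}
```

Assigning demo_waiter to current_read_waiter changes what "would block"
means for port_read_blocking without touching the read method itself.
In Scheme the replacement waiter would instead abort to a prompt and
let the scheduler resume the continuation later; C cannot express that
part, which is the reason the I/O routines need to move to Scheme.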

Finally there is a question about speed.  I expect that for buffered
ports, I/O from C will have a minimal slowdown.  For unbuffered ports,
the slowdown would be greater, because filling and emptying the port
buffers costs more with a call from C to Scheme (and then back, for
read/write functions actually implemented in C).  But for Scheme, I
expect that generally throughput goes up, as we will be able to build
flexible I/O routines that can access the buffer directly, both because
with this branch buffering is uniformly handled in the generic port
code, and also because Scheme avoids the Scheme->C penalty in common
cases.  We can provide compiler support for accessing the port buffer,
if needed, but hopefully we can avoid that.

Finally finally, there is still the question about locks.  I don't know
the answer here.  I think it's likely that we can have concurrent access
to port buffers without locks, but I suspect that anything that accesses
mutable port state should probably be protected by a lock -- but
probably not a re-entrant lock, because the operations called with that
lock wouldn't call out to any user code.  That means that read/write
functions from port implementations would have to bake in their own
threadsafety, but probably that's OK; for file ports, for example, the
threadsafety is baked in the kernel.  Atomic accessors are also a
possibility if there is still overhead.  I think also we could remove
all of the _unlocked functions from our API and from our internals in
that case, and just lock as appropriate, understanding that the perf
impact should be minimal.

Andy
