On Sun, 19 Jun 2016 11:13:17 +0200
Andy Wingo <wi...@pobox.com> wrote:
> Hi :)
>
> On Sun 12 Jun 2016 10:25, Chris Vine <ch...@cvine.freeserve.co.uk>
> writes:
>
> >>
> >> http://www.gnu.org/software/guile/docs/master/guile.html/Input-and-Output.html
> >>
> >
> > The documentation indicates that with the C ports implementation in
> > guile-2.2, reads will block on non-blocking file descriptors.
>
> Correct.
>
> > This will stop the approach to asynchronicity used in 8sync and
> > guile-a-sync (the latter of which I have written) from working
> > correctly with sockets on linux operating systems, because at
> > present both of these use guile's wrapper for select.
>
> The trouble is that AFAIU there is no way to make non-blocking input
> work reliably with O_NONBLOCK file descriptors in the approach that
> Guile has always used.
>
> As you know, the current behavior for Guile 2.0 is to throw an
> exception when you get EAGAIN / EWOULDBLOCK. If I am understanding
> you correctly, your approach is to only read from a port if you have
> done a select() / poll() / etc on it beforehand indicating that you
> can read at least one byte.
My approach would be to do that, if it worked. And it does work with
pipes and unix domain sockets provided you only read a byte at a time
(see further below), but does not work on linux with TCP sockets
because linux's select() and poll() system calls are not POSIX
compliant (they can report a socket as ready for reading when a
subsequent read would nevertheless block). Therefore with TCP sockets
on linux you have to use non-blocking reads and cater for the
possibility of an EAGAIN/EWOULDBLOCK exception.

> The problem with this is not only spurious wakeups, as you note, but
> also buffering. Throwing an exception when reading in Guile 2.0 will
> discard input buffers in many cases. Likewise when writing, you won't
> be able to know how much you've written.
>
> This goes not only for the explicit buffers attached to ports and
> which you can control with `setvbuf', but also implicit buffers, and
> it's in this case that it's particularly pernicious: if you
> `read-char' on a UTF-8 port, you might end up using local variables
> in the stack as a buffer for reconstructing that codepoint. If you
> throw an exception in the middle, you discard those bytes. Likewise
> for writing.

I recognise this problem. The answer I have adopted when reading from
TCP sockets is to extract individual bytes only from the port into a
bytevector using R6RS's get-u8 procedure and (if the port is textual
rather than binary) to reconstruct characters from that using
bytevector->string at, say, a line end[1]. An EAGAIN/EWOULDBLOCK
exception is then just treated as an invitation to return to the
prompt, and read state is retained in the bytevector. In effect this
is doing by hand what a more complete non-blocking EAGAIN-safe port
implementation might otherwise do for you.

Writing is something else.
To do it effectively the writer to the port must in any event cater
for the fact that when the buffer is full but the underlying file
descriptor is ready for writing, the next write will cause a buffer
flush, and if the size of the buffer is greater than the number of
characters that the file can receive without blocking, blocking might
still occur. You usually need to switch off buffering for writes (but
you quite often may want to do that anyway on output ports for
sockets).

> For suspendable ports, you don't throw an exception: you just assume
> the operation is going to work, but if you get EAGAIN / EWOULDBLOCK,
> you call the current-read-waiter / current-write-waiter and when that
> returns retry the operation. Since it operates on the lowest level of
> bytes, it's reliable. Looping handles the spurious wakeup case.
>
> > However, to cater for other asynchronous implementations of file
> > watches, would it be possible to provide a configurable option
> > either to retain the guile-2.0 behaviour in such cases (which is to
> > throw a system-error with errno set to EAGAIN or EWOULDBLOCK), or
> > to provide a non-blocking alternative whereby the read operation
> > would, instead of blocking, return some special value such as an
> > EAGAIN symbol? Either would enable user code then to resume to its
> > prompt and let other code execute.
>
> Why not just (install-suspendable-ports!) and
>
>   (parameterize ((current-read-waiter my-read-waiter)) ...)
>
> etc? It is entirely possible with Guile 2.1.3 to build an
> asynchronous coroutine-style concurrent system in user-space using
> these primitives. See the wip-ethread branch for an example
> implementation.

I would want to continue using an external event loop implemented with
poll() or select() and delimited continuations.
This makes it relatively trivial to implement asynchronous timeouts
and single-threaded task multiplexing (albeit co-operative rather than
pre-emptive) as well as file operations, and would also enable the
glib event loop to be used for programs which happen to use
guile-gnome (although guile-gnome has other issues at present). I
don't think I have got to grips with how to do that with read-waiter,
because the read-waiter comprises in effect another loop (in which the
main event loop with its own prompts would have to run) until the read
request has been satisfied. I would need to think about it. Since
ethreads use a poll()/epoll() loop, presumably you think it is
straightforward enough to integrate the two, even if at present I
don't.

Writing custom C ports which do what I want with non-blocking
descriptors is another option, but I would hope that could be avoided:
at that point one would probably use some other solution entirely.

On a side issue, I am still trying to understand the point of causing
guile-2.2's read of a non-blocking C port to block. The whole point
of making a descriptor non-blocking is that that shouldn't happen, and
there are circumstances where peeling individual bytes off a
non-blocking port as they become available is what you want to do.
Blocking there makes guile's select wrapper unusable with TCP sockets
on linux. I understand that suspendable-ports work differently, but
that is another matter.

Chris

[1] A sample implementation can be seen at
https://github.com/ChrisVine/guile-a-sync/blob/master/a-sync/event-loop.scm
from line 900 on, and elsewhere in that file.
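
To illustrate the byte-at-a-time reading scheme described above, here
is a minimal sketch. It is a Python analogue and not the guile-a-sync
code referred to in [1]: Python's os.read raises BlockingIOError on
EAGAIN/EWOULDBLOCK much as Guile 2.0 throws a system-error, so the
same structure carries over. LineReader, poll_line and demo are
hypothetical names made up for this example, a pipe stands in for a
socket, and UTF-8 is assumed as the encoding.

```python
import os
import select


class LineReader:
    """Collects raw bytes from a non-blocking descriptor until a line end."""

    def __init__(self, fd):
        os.set_blocking(fd, False)   # equivalent of setting O_NONBLOCK
        self.fd = fd
        self.pending = bytearray()   # read state that survives EAGAIN

    def poll_line(self):
        """Return a decoded line, or None if the read would block.

        Raises EOFError at end of file once no partial line is left.
        """
        while True:
            try:
                byte = os.read(self.fd, 1)   # extract one byte only
            except BlockingIOError:          # EAGAIN / EWOULDBLOCK
                return None                  # go back to the event loop
            if byte == b"\n" or (not byte and self.pending):
                # Decode only at a line end (or a final unterminated
                # line), so a multi-byte character is never split.
                line = self.pending.decode("utf-8")
                self.pending.clear()
                return line
            if not byte:
                raise EOFError("descriptor closed")
            self.pending += byte


def demo():
    """Feed a pipe in awkward chunks and collect the completed lines."""
    r, w = os.pipe()
    reader = LineReader(r)
    lines = []
    # The first chunk ends halfway through the two-byte encoding of 'é'.
    for chunk in (b"caf\xc3", b"\xa9\nsecond ", b"line\n"):
        os.write(w, chunk)
        # Readiness from select() is treated as a hint, not a guarantee.
        readable, _, _ = select.select([r], [], [], 1.0)
        if readable:
            while (line := reader.poll_line()) is not None:
                lines.append(line)
    os.close(w)
    os.close(r)
    return lines
```

Calling demo() returns the two lines "café" and "second line" intact,
even though the writes cut across both a character and a line: the
partial data simply waits in the pending buffer while control goes
back to the loop. A spurious wakeup is harmless for the same reason,
since poll_line just returns None without losing anything.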