Ahn Ki-yung <[EMAIL PROTECTED]> writes: > Document says GHC uses non-blocking IO to be thread friendly. > [...] > How is non-blocking and thread working ? I can't understand.
This is usually implemented using 'select' (see attached man page). In a multithreaded program where you have user-level threads (which is, effectively, what GHC's threads provide), you typically maintain a global set of all file descriptors that a thread is trying to read from. There are then two cases: 1) If there are no runnable threads, call 'select' on all threads using a NULL timeout. 2) If there are runnable threads, you can use 'select' with a timeout of 0 to poll for readable file descriptors every N seconds. The number N should be chosen low enough to have low latency and high enough to have low overhead. An alternative that works on some operating systems is to use asynchronous I/O calls. The problem is that this isn't as portable as using select. Hope this helps. -- Alastair Reid [EMAIL PROTECTED] Reid Consulting (UK) Limited http://www.reid-consulting-uk.ltd.uk/alastair/ SELECT(2) Linux Programmer's Manual SELECT(2) NAME select, pselect, FD_CLR, FD_ISSET, FD_SET, FD_ZERO - synchronous I/O multiplexing SYNOPSIS /* According to POSIX 1003.1-2001 */ #include <sys/select.h> /* According to earlier standards */ #include <sys/time.h> #include <sys/types.h> #include <unistd.h> int select(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout); int pselect(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, const struct timespec *timeout, const sigset_t *sigmask); FD_CLR(int fd, fd_set *set); FD_ISSET(int fd, fd_set *set); FD_SET(int fd, fd_set *set); FD_ZERO(fd_set *set); DESCRIPTION The functions select and pselect wait for a number of file descriptors to change status. Their function is identical, with three differences: (i) The select function uses a timeout that is a struct timeval (with seconds and microseconds), while pselect uses a struct timespec (with seconds and nanoseconds). (ii) The select function may update the timeout parameter to indicate how much time was left. The pselect function does not change this parameter. (iii) The select function has no sigmask parameter, and behaves as pselect called with NULL sigmask. Three independent sets of descriptors are watched. Those listed in readfds will be watched to see if characters become available for read- ing (more precisely, to see if a read will not block - in particular, a file descriptor is also ready on end-of-file), those in writefds will be watched to see if a write will not block, and those in exceptfds will be watched for exceptions. On exit, the sets are modified in place to indicate which descriptors actually changed status. Four macros are provided to manipulate the sets. FD_ZERO will clear a set. FD_SET and FD_CLR add or remove a given descriptor from a set. FD_ISSET tests to see if a descriptor is part of the set; this is use- ful after select returns. n is the highest-numbered descriptor in any of the three sets, plus 1. timeout is an upper bound on the amount of time elapsed before select returns. It may be zero, causing select to return immediately. (This is useful for polling.) If timeout is NULL (no timeout), select can block indefinitely. sigmask is a pointer to a signal mask (see sigprocmask(2)); if it is not NULL, then pselect first replaces the current signal mask by the one pointed to by sigmask, then does the `select' function, and then restores the original signal mask again. The idea of pselect is that if one wants to wait for an event, either a signal or something on a file descriptor, an atomic test is needed to prevent race conditions. (Suppose the signal handler sets a global flag and returns. Then a test of this global flag followed by a call of select() could hang indefinitely if the signal arrived just after the test but just before the call. On the other hand, pselect allows one to first block signals, handle the signals that have come in, then call pselect() with the desired sigmask, avoiding the race.) Since Linux today does not have a pselect() system call, the current glibc2 routine still contains this race. The timeout The time structures involved are defined in <sys/time.h> and look like struct timeval { long tv_sec; /* seconds */ long tv_usec; /* microseconds */ }; and struct timespec { long tv_sec; /* seconds */ long tv_nsec; /* nanoseconds */ }; Some code calls select with all three sets empty, n zero, and a non- null timeout as a fairly portable way to sleep with subsecond preci- sion. On Linux, the function select modifies timeout to reflect the amount of time not slept; most other implementations do not do this. This causes problems both when Linux code which reads timeout is ported to other operating systems, and when code is ported to Linux that reuses a struct timeval for multiple selects in a loop without reinitializing it. Consider timeout to be undefined after select returns. RETURN VALUE On success, select and pselect return the number of descriptors con- tained in the descriptor sets, which may be zero if the timeout expires before anything interesting happens. On error, -1 is returned, and errno is set appropriately; the sets and timeout become undefined, so do not rely on their contents after an error. ERRORS EBADF An invalid file descriptor was given in one of the sets. EINTR A non blocked signal was caught. EINVAL n is negative. ENOMEM select was unable to allocate memory for internal tables. EXAMPLE #include <stdio.h> #include <sys/time.h> #include <sys/types.h> #include <unistd.h> int main(void) { fd_set rfds; struct timeval tv; int retval; /* Watch stdin (fd 0) to see when it has input. */ FD_ZERO(&rfds); FD_SET(0, &rfds); /* Wait up to five seconds. */ tv.tv_sec = 5; tv.tv_usec = 0; retval = select(1, &rfds, NULL, NULL, &tv); /* Don't rely on the value of tv now! */ if (retval) printf("Data is available now.\n"); /* FD_ISSET(0, &rfds) will be true. */ else printf("No data within five seconds.\n"); return 0; } CONFORMING TO 4.4BSD (the select function first appeared in 4.2BSD). Generally portable to/from non-BSD systems supporting clones of the BSD socket layer (including System V variants). However, note that the System V variant typically sets the timeout variable before exit, but the BSD variant does not. The pselect function is defined in IEEE Std 1003.1g-2000 (POSIX.1g), and part of POSIX 1003.1-2001. It is found in glibc2.1 and later. Glibc2.0 has a function with this name, that however does not take a sigmask parameter. NOTES Concerning prototypes, the classical situation is that one should include <time.h> for select. The POSIX 1003.1-2001 situation is that one should include <sys/select.h> for select and pselect. Libc4 and libc5 do not have a <sys/select.h> header; under glibc 2.0 and later this header exists. Under glibc 2.0 it unconditionally gives the wrong prototype for pselect, under glibc 2.1-2.2.1 it gives pselect when _GNU_SOURCE is defined, under glibc 2.2.2-2.2.4 it gives it when _XOPEN_SOURCE is defined and has a value of 600 or larger. No doubt, since POSIX 1003.1-2001, it should give the prototype by default. SEE ALSO For a tutorial with discussion and examples, see select_tut(2). For vaguely related stuff, see accept(2), connect(2), poll(2), read(2), recv(2), send(2), sigprocmask(2), write(2) Linux 2.4 2001-02-09 SELECT(2) _______________________________________________ Glasgow-haskell-users mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/glasgow-haskell-users