Re: Line-oriented blocking input from sockets?
On Fri, Jul 11, 2003 at 03:04:18PM -0700, Damon Hastings wrote:
> This is probably a stupid question, but how do you do line-oriented blocking socket reads in Pth? There's a pth_read() which I assume blocks until a specified number of bytes (or eof) are received -- but I'm looking for something like pth_gets() to block until a newline is received. I don't think you could even write a simple webserver in Pth without such a function, though you could of course implement it yourself via a non-blocking pth_read in a loop. I'll implement it myself if need be, though I would rather trust someone else's code, as I've never used Pth before and there are lots of tricky error conditions in socket programming. The documentation at http://www.gnu.org/software/pth/pth-manual.html includes a simple output-only server as a code example -- is there an input/output example somewhere?

You may want to look at the mmfd.c/h library, which is included in mmftpd and can be found at http://mmondor.gobot.ca/software.html

It does custom buffering on top of caller-supplied I/O functions; in mmftpd's case it uses pth_read()/pth_write(), which is pretty efficient. It is released under a BSD-style license. The library comes with an mdoc manual page.

Matt
__
GNU Portable Threads (Pth)            http://www.gnu.org/software/pth/
Development Site                      http://www.ossp.org/pkg/lib/pth/
Distribution Files                    ftp://ftp.gnu.org/gnu/pth/
Distribution Snapshots                ftp://ftp.ossp.org/pkg/lib/pth/
User Support Mailing List             [EMAIL PROTECTED]
Automated List Manager (Majordomo)    [EMAIL PROTECTED]
Re: Line-oriented blocking input from sockets?
On Mon, Jul 14, 2003, Damon Hastings wrote:
> Can you give me a ballpark idea of the cost per context switch for 150 threads, almost all of which are waiting on I/O at any given time?

The cost depends on the particular method Pth uses for the context implementation, of course. But all available methods Pth uses are very cheap, because they are user-space-only methods. Keep in mind also that because Pth is a non-preemptive threading implementation, a context switch is performed only when the I/O really requires it.

> And does Pth block the entire process when all its threads are blocked? (I would assume so, if you're using a global select under the covers.)

Sure, if _all_ threads are waiting for an event, the whole process is waiting. And yes, internally the whole event management is based on a single select(2) call.

Ralf S. Engelschall
[EMAIL PROTECTED]
www.engelschall.com
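To make the "single select(2) call" idea concrete, here is a minimal sketch in plain POSIX C. It is not Pth's actual event manager; the function name wait_any_readable is made up for illustration. It only shows how one call can put the whole process to sleep until any of several descriptors becomes readable.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper, not part of Pth.
 * Block until at least one of fds[0..n) is readable, using one select(2)
 * call. Returns the index of the first readable descriptor, or -1 on error. */
int wait_any_readable(const int *fds, int n)
{
    fd_set rset;
    int i, maxfd = -1;

    assert(fds != NULL && n > 0);
    FD_ZERO(&rset);
    for (i = 0; i < n; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &rset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    /* NULL timeout: the process sleeps in the kernel until an event occurs. */
    if (select(maxfd + 1, &rset, NULL, NULL, NULL) <= 0)
        return -1;
    for (i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &rset))
            return i;
    return -1;
}
```

A scheduler built on this pattern would wake whichever thread was waiting on the returned descriptor; with nothing ready, the process simply stays blocked in select().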
RE: Line-oriented blocking input from sockets?
> The cost depends on the particular method Pth uses for the context implementation, of course. But all available methods Pth uses are very cheap, because they are user-space-only methods. Keep in mind also that because Pth is a non-preemptive threading implementation, a context switch is performed only when the I/O really requires it.

There is no reason a user-space context switch should be any faster than a kernel-space context switch unless the user-space context switch saves a kernel call. Measurements under Linux (at least) bear this out.

[snip]

I should point out that you didn't actually say that user-space context switches are always faster than kernel context switches (which is false). What you did say is that user-space-only context switches are generally quite fast (which is true). So I'm not disagreeing with you. I'm just pointing out that there's usually kernel overhead associated with these user-space context switches, especially if they're associated with I/O blocking.

DS
RE: Line-oriented blocking input from sockets?
> If the context-switching overhead turns out to be too high, then I will do exactly that. But it will mean maintaining state for each connection, and that state will only grow more complex with future enhancements by myself and others.

Umm, huh?! If you don't maintain state for each connection, how will you know what to do with the data you get?

> The big draw to Pth for me is the simplicity of threaded code with the efficiency of a state machine. At least, I think it will be efficient. I don't actually know at this point exactly how expensive a context-switch in Pth is.

Simplicity is achieved when you precisely understand what the state is.

DS
Re: Line-oriented blocking input from sockets?
On Sun, Jul 13, 2003, Damon Hastings wrote:
> [...]
> So it copies the old stack out and the new stack in during every context switch? I don't know much about thread implementation, but I guess I thought you could just swap out the stack pointer instead of copying the whole stack.
> [...]

Sure, Pth just swaps the stack pointer, of course. Nevertheless, the point made here was that each stack consumes memory. If you have a large number of threads in your system, the allocated stacks alone add up to a rather large memory consumption.

> It's good that you guys tell me these things during the planning stage -- I guess that's an argument for using heap allocation instead of stack allocation, eh?

In Pth only the (original) stack of the main thread is auto-growing (done by the OS), while the other stacks are fixed-size stacks. If you don't have very deep recursion or functions that use large buffers on the stack, this is no problem. But if you really require lots of stack space, you should be careful, or at least spawn the Pth threads with a larger stack size.

Ralf S. Engelschall
[EMAIL PROTECTED]
www.engelschall.com
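For readers who have not seen a user-space context switch, the following sketch shows the general idea with the ucontext calls (getcontext, makecontext, swapcontext) as found in glibc. It is only an illustration under that assumption: it is not Pth code, the names co_body and run_coroutine are invented, and which switching mechanism a given Pth build actually uses depends on how it was configured. The second context runs on a fixed-size, caller-allocated stack, and switching never copies that stack.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;
static int steps;

/* Body of the second context: do one step, yield, then do another step. */
static void co_body(void)
{
    steps++;
    swapcontext(&co_ctx, &main_ctx);   /* yield back to the caller */
    steps++;
    /* Returning from here resumes uc_link, i.e. main_ctx. */
}

/* Hypothetical demo: run co_body on its own fixed-size stack.
 * Returns 10 * (steps seen after the first switch) + (final steps),
 * or -1 if the stack could not be allocated. */
int run_coroutine(size_t stack_size)
{
    int after_first;
    void *stack;

    assert(stack_size >= 16384);
    stack = malloc(stack_size);        /* fixed size, never grows */
    if (stack == NULL)
        return -1;
    steps = 0;
    getcontext(&co_ctx);
    co_ctx.uc_stack.ss_sp = stack;
    co_ctx.uc_stack.ss_size = stack_size;
    co_ctx.uc_link = &main_ctx;
    makecontext(&co_ctx, co_body, 0);

    swapcontext(&main_ctx, &co_ctx);   /* run until the first yield */
    after_first = steps;
    swapcontext(&main_ctx, &co_ctx);   /* resume until co_body returns */
    free(stack);
    return after_first * 10 + steps;
}
```

Each context needs its own stack allocation for as long as it lives, which is the memory cost discussed above: the per-thread stack size multiplied by the number of threads.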
Re: Line-oriented blocking input from sockets?
On Sun, Jul 13, 2003, Damon Hastings wrote:
> [...]
>> See the file test_common.c in the Pth source tree. It provides a pth_readline() function (just a small but sufficient buffered wrapper around pth_read()) which should do exactly what you want.
>
> It looks like pth_readline_ev() reads a fixed number of bytes at a time (i.e. 1024) and will block until it reads that many (or eof). Is that right? That works great for reading a file, where more bytes (or eof) are always available, but I'm reading from a socket. I was considering a non-blocking approach like:
>
>   pth_fdmode(fd, PTH_FDMODE_NONBLOCK);
>   while (newline not read yet) {
>       pth_select on fd readable, with long timeout;
>       bytes = pth_read(fd, buff, 1024);
>       if (bytes == 0)
>           cleanup on closed connection
>   }

I'm not sure whether I correctly understand your concerns. Yes, pth_readline_ev() blocks, but just the current thread, of course. That's what you usually want in a threaded application: you logically program each thread in blocking mode, but the threading implementation takes care that in case a thread would block, another thread (one that is ready for its next operation) is scheduled for execution in the meantime. Internally Pth uses non-blocking I/O to achieve all this, of course.

If you pth_select() in each thread you don't take advantage of the event scheduling inside Pth. pth_select() is for when one thread wants to poll many file descriptors, not for one thread polling one file descriptor. For that you just use pth_read(). And you get the timeout by using pth_read_ev() instead and passing it a timeout event. The same goes for the wrappers pth_readline() and pth_readline_ev().

> Is there a more efficient approach than this? And should I put a sleep in that loop, or will the scheduler automatically handle everything? (I'll have about 150 threads in this loop simultaneously, each reading from its own socket, with each socket delivering about 1 line of input per second.)

You don't need an explicit sleep there, because Pth's event manager will automatically suspend your thread and schedule others if an I/O operation would block. But keep in mind that this is only true if you use pth_xxx() functions. If you use read(2), sleep(3), etc. directly, Pth has no chance to perform context switches between the threads. You have to give Pth a chance to do this by always going through the Pth API. That's the price for true portability and non-preemptive scheduling. But usually that's just a matter of programming discipline... ;-)

Ralf S. Engelschall
[EMAIL PROTECTED]
www.engelschall.com
Re: Line-oriented blocking input from sockets?
--- Ralf S. Engelschall [EMAIL PROTECTED] wrote:
> On Sun, Jul 13, 2003, Damon Hastings wrote:
>> [...]
>>> See the file test_common.c in the Pth source tree. It provides a pth_readline() function (just a small but sufficient buffered wrapper around pth_read()) which should do exactly what you want.
>>
>> It looks like pth_readline_ev() reads a fixed number of bytes at a time (i.e. 1024) and will block until it reads that many (or eof). Is that right? That works great for reading a file, where more bytes (or eof) are always available, but I'm reading from a socket. I was considering a non-blocking approach like:
>>
>>   pth_fdmode(fd, PTH_FDMODE_NONBLOCK);
>>   while (newline not read yet) {
>>       pth_select on fd readable, with long timeout;
>>       bytes = pth_read(fd, buff, 1024);
>>       if (bytes == 0)
>>           cleanup on closed connection
>>   }
>
> I'm not sure whether I correctly understand your concerns. Yes, pth_readline_ev() blocks, but just the current thread, of course. That's what you usually want in a threaded application: you logically program each thread in blocking mode, but the threading implementation takes care that in case a thread would block, another thread (one that is ready for its next operation) is scheduled for execution in the meantime. Internally Pth uses non-blocking I/O to achieve all this, of course.

Yeah, I guess my explanation was a little spotty. :- What I want is for a thread to block only until a newline is received on the socket it's reading from. Suppose the socket receives the 12 bytes "hello world\n". That has a newline, so I would want pth_readline() to return immediately with the output "hello world\n". But if pth_readline() does a blocking 1024-byte read on the socket, it will find only 12 bytes available and will thus block. Furthermore, the remote host will not send any more bytes because it's waiting for us to respond, and so pth_readline() will time out, or never return.

> Sure, Pth just swaps the stack pointer, of course. Nevertheless, the point made here was that each stack consumes memory. If you have a large number of threads in your system, the allocated stacks alone add up to a rather large memory consumption.

Hey, that's great! I was a little concerned for a moment there. But with it just swapping a few pointers each context switch, Pth should be much faster than Linux Pthreads. (And I have 1GB physical RAM, so no worries about memory consumption.)

Can you give me a ballpark idea of the cost per context switch for 150 threads, almost all of which are waiting on I/O at any given time? And does Pth block the entire process when all its threads are blocked? (I would assume so, if you're using a global select under the covers.)
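The worry in this message rests on the assumption that a 1024-byte read waits for 1024 bytes. With POSIX read(2) on a stream socket that is not the case: the call returns as soon as any data is available, possibly fewer bytes than requested. The Pth manual, as far as it can be relied on here, describes pth_read() as a variant of read(2) that reads "up to" nbytes, so the same should hold there. The sketch below demonstrates the short read with plain read(2) on a local socket pair; it does not use Pth, and the function name short_read_demo is invented for illustration.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo, plain POSIX (no Pth).
 * Send msg over a local socket pair, then ask read(2) for up to 1024 bytes.
 * Returns how many bytes that single read() call delivered, or -1 on error. */
ssize_t short_read_demo(const char *msg, size_t msglen)
{
    int sv[2];
    char buf[1024];
    ssize_t got;

    assert(msg != NULL && msglen <= sizeof buf);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (write(sv[0], msg, msglen) != (ssize_t)msglen) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    /* The peer stays open and sends nothing more, like the waiting client. */
    got = read(sv[1], buf, sizeof buf);
    close(sv[0]);
    close(sv[1]);
    return got;
}
```

Calling short_read_demo("hello world\n", 12) hands back the 12 available bytes from one read() call instead of blocking for the remaining 1012, which is why a buffered line reader built on such a read can return as soon as the newline arrives.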
RE: Line-oriented blocking input from sockets?
> Ah, well, I meant that a state machine must store state explicitly, whereas with threaded code the state is implied in the code flow (in effect, the thread system itself is a state machine.) If each thread executes a simple function like void foo() {A; B; C;}, then the equivalent state machine is simple enough -- it just has to remember whether each thread is at step A, B, or C, using an array of state variables or some such. But now throw in a few nested for's, if's, and local data into the above function... well, you get the picture. I could certainly implement my program as a state machine, but that may make my code harder for others to understand/maintain.

Personally, I completely disagree. It is very hard to understand control flow when all the state is hidden. You see a 'return' statement -- where does that go? What if you want to log the state of a connection to facilitate debugging? Hiding the state on the stack is not good practice.

DS
RE: Line-oriented blocking input from sockets?
--- David Schwartz [EMAIL PROTECTED] wrote:
>> Ah, well, I meant that a state machine must store state explicitly, whereas with threaded code the state is implied in the code flow (in effect, the thread system itself is a state machine.) If each thread executes a simple function like void foo() {A; B; C;}, then the equivalent state machine is simple enough -- it just has to remember whether each thread is at step A, B, or C, using an array of state variables or some such. But now throw in a few nested for's, if's, and local data into the above function... well, you get the picture. I could certainly implement my program as a state machine, but that may make my code harder for others to understand/maintain.
>
> Personally, I completely disagree. It is very hard to understand control flow when all the state is hidden. You see a 'return' statement -- where does that go? What if you want to log the state of a connection to facilitate debugging? Hiding the state on the stack is not good practice.

I don't entirely understand what you're saying -- is this a general argument against threads? I thought the whole point of Pth was to store thread state on the stack...? Well, maybe it would help me understand if we used a concrete example:

    void *threadMain(void *_arg)
    {
        char inputBuff[BUFLEN], moreInput[BUFLEN];
        int client_fd = (int)(long)_arg;

        pth_readline(client_fd, inputBuff, BUFLEN);
        if (strncmp(inputBuff, "GET ", 4) == 0) {
            /* do some stuff */
            pth_write(client_fd, response, strlen(response));
        } else if (strncmp(inputBuff, "PUT ", 4) == 0) {
            pth_readline(client_fd, moreInput, BUFLEN);
            /* do some stuff */
            pth_write(client_fd, response, strlen(response));
        }
        return NULL;
    }

Okay, so this is a really simple routine executed by worker threads to process a line or two of input from a client and send a response. Is this bad coding style? Would you do this with a single-threaded state machine of some sort? Or some other way?
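For comparison, here is one possible shape of the explicit state machine being debated, written as a sketch for the GET/PUT routine in this message. Everything in it (conn, conn_state, conn_on_line, the response strings) is hypothetical and stands in for "do some stuff"; it is not code from anyone in the thread. The caller is assumed to deliver one complete input line at a time.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical per-connection state for the GET/PUT handler. */
typedef enum { WAIT_COMMAND, WAIT_PUT_DATA, DONE } conn_state;

typedef struct {
    conn_state state;
} conn;

void conn_init(conn *c)
{
    assert(c != NULL);
    c->state = WAIT_COMMAND;
}

/* Feed one complete input line to the connection.
 * Returns a response to send, or NULL if there is nothing to send yet. */
const char *conn_on_line(conn *c, const char *line)
{
    assert(c != NULL && line != NULL);
    switch (c->state) {
    case WAIT_COMMAND:
        if (strncmp(line, "GET ", 4) == 0) {
            c->state = DONE;
            return "got GET\n";
        }
        if (strncmp(line, "PUT ", 4) == 0) {
            /* The threaded version remembers this implicitly by being
             * blocked inside its second pth_readline() call. */
            c->state = WAIT_PUT_DATA;
            return NULL;
        }
        c->state = DONE;             /* unknown command: no response */
        return NULL;
    case WAIT_PUT_DATA:
        c->state = DONE;
        return "got PUT\n";
    case DONE:
        break;
    }
    return NULL;
}
```

The trade-off both posters describe is visible here: the state is a named value that can be logged or inspected, but every blocking point in the threaded routine becomes an enum member and a case label that must be kept in sync as the protocol grows.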
Re: Line-oriented blocking input from sockets?
On Fri, Jul 11, 2003, Damon Hastings wrote:
> This is probably a stupid question, but how do you do line-oriented blocking socket reads in Pth? There's a pth_read() which I assume blocks until a specified number of bytes (or eof) are received -- but I'm looking for something like pth_gets() to block until a newline is received. I don't think you could even write a simple webserver in Pth without such a function, though you could of course implement it yourself via a non-blocking pth_read in a loop. I'll implement it myself if need be, though I would rather trust someone else's code, as I've never used Pth before and there are lots of tricky error conditions in socket programming. The documentation at http://www.gnu.org/software/pth/pth-manual.html includes a simple output-only server as a code example -- is there an input/output example somewhere?

See the file test_common.c in the Pth source tree. It provides a pth_readline() function (just a small but sufficient buffered wrapper around pth_read()) which should do exactly what you want.

Ralf S. Engelschall
[EMAIL PROTECTED]
www.engelschall.com
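As a rough idea of what "a small buffered wrapper" around a read call can look like, here is a sketch in plain C. It is not the code from test_common.c or from mmfd; the names linebuf, read_fn, linebuf_init, and linebuf_readline are invented. The read function is supplied by the caller: read(2) is used in the example, and a Pth program would presumably pass pth_read(), which takes the same arguments. Retrying on EINTR is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names throughout; not the test_common.c implementation. */
typedef ssize_t (*read_fn)(int fd, void *buf, size_t nbytes);

typedef struct {
    int     fd;
    read_fn rd;          /* read(2) here; pth_read() in a Pth program */
    size_t  pos, len;    /* unread bytes are buf[pos..len) */
    char    buf[1024];
} linebuf;

void linebuf_init(linebuf *lb, int fd, read_fn rd)
{
    assert(lb != NULL && rd != NULL);
    lb->fd = fd;
    lb->rd = rd;
    lb->pos = lb->len = 0;
}

/* Copy one line (including its '\n') into out and NUL-terminate it.
 * Returns the line length, 0 on EOF with nothing buffered, -1 on error.
 * A line longer than outlen - 1 bytes is returned in pieces. */
ssize_t linebuf_readline(linebuf *lb, char *out, size_t outlen)
{
    size_t n = 0;

    assert(lb != NULL && out != NULL && outlen > 1);
    while (n < outlen - 1) {
        char c;
        if (lb->pos == lb->len) {            /* buffer empty: refill it */
            ssize_t got = lb->rd(lb->fd, lb->buf, sizeof lb->buf);
            if (got < 0)
                return -1;
            if (got == 0)
                break;                       /* EOF */
            lb->pos = 0;
            lb->len = (size_t)got;           /* may be a short read */
        }
        c = lb->buf[lb->pos++];
        out[n++] = c;
        if (c == '\n')
            break;
    }
    out[n] = '\0';
    return (ssize_t)n;
}
```

Because the refill accepts whatever the read call delivers, the function returns as soon as a newline is in the buffer, and bytes received after that newline stay buffered for the next call.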
Re: Line-oriented blocking input from sockets?
--- Ralf S. Engelschall [EMAIL PROTECTED] wrote:
> On Fri, Jul 11, 2003, Damon Hastings wrote:
>> This is probably a stupid question, but how do you do line-oriented blocking socket reads in Pth? There's a pth_read() which I assume blocks until a specified number of bytes (or eof) are received -- but I'm looking for something like pth_gets() to block until a newline is received. I don't think you could even write a simple webserver in Pth without such a function, though you could of course implement it yourself via a non-blocking pth_read in a loop. I'll implement it myself if need be, though I would rather trust someone else's code, as I've never used Pth before and there are lots of tricky error conditions in socket programming. The documentation at http://www.gnu.org/software/pth/pth-manual.html includes a simple output-only server as a code example -- is there an input/output example somewhere?
>
> See the file test_common.c in the Pth source tree. It provides a pth_readline() function (just a small but sufficient buffered wrapper around pth_read()) which should do exactly what you want.

It looks like pth_readline_ev() reads a fixed number of bytes at a time (i.e. 1024) and will block until it reads that many (or eof). Is that right? That works great for reading a file, where more bytes (or eof) are always available, but I'm reading from a socket. I was considering a non-blocking approach like:

    pth_fdmode(fd, PTH_FDMODE_NONBLOCK);
    while (newline not read yet) {
        pth_select on fd readable, with long timeout;
        bytes = pth_read(fd, buff, 1024);
        if (bytes == 0)
            cleanup on closed connection
    }

Is there a more efficient approach than this? And should I put a sleep in that loop, or will the scheduler automatically handle everything? (I'll have about 150 threads in this loop simultaneously, each reading from its own socket, with each socket delivering about 1 line of input per second.)

By the way, thanks for the excellent package -- I don't think I could have 150 threads running without it! :-)

Damon
Re: Line-oriented blocking input from sockets?
On Sun, Jul 13, 2003 at 05:41:14PM -0700, Damon Hastings wrote:
> If the context-switching overhead turns out to be too high, then I will do exactly that. But it will mean maintaining state for each connection, and that state will only grow more complex with future enhancements by myself and others. The big draw to Pth for me is the simplicity of threaded code with the efficiency of a state machine. At least, I think it will be efficient. I don't actually know at this point exactly how expensive a context-switch in Pth is.

Keep in mind that each thread still has its own stack, which can be a lot of overhead as the number of threads grows. You mentioned connection state complexity rather than size, so maybe this isn't a concern for you.

Jason