On 25/04/2013 10:42 PM, james wrote:
> This always concerns me - the design will tend towards having state on the stack etc and 'pull' processing rather than focussing on a clean design for a chain of state machines that handle 'push' of data coming from AIO completion events.
I think "push" and "pull" are a bit ... too vague to critique directly. The underlying IO model admits issuing IO requests and waiting on responses. If you want to issue lots of them and wait on lots of responses, you can use lots of tasks; it will make debugging and understanding the program harder (cf. the variety of hand-rolled mechanisms for event-tracing and causal stack-trace reconstruction in chain-of-state-machine systems). If you want to issue one, wait on response, and then issue another, you can use one task and let its program counter and call stack serialize things; the IO library will interleave you with _other tasks_, but your own task logic remains straight-line.
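To make the two shapes concrete, here is a minimal sketch, not the actual IO library. It assumes modern `std::thread` and `std::sync::mpsc` as stand-ins for tasks and channels, and `fake_request` is a hypothetical placeholder for a blocking IO call.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for a blocking IO request (assumption, not a real API).
fn fake_request(id: u32) -> u32 {
    thread::sleep(Duration::from_millis(10));
    id * 2
}

// One task: issue, wait, issue the next. The program counter and call stack
// serialize the requests, so the logic stays straight-line.
fn sequential(ids: &[u32]) -> Vec<u32> {
    ids.iter().map(|&id| fake_request(id)).collect()
}

// Many tasks: one per request. Replies arrive over a channel in completion
// order, so they are sorted here only to make the result deterministic.
fn concurrent(ids: &[u32]) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();
    let handles: Vec<_> = ids
        .iter()
        .map(|&id| {
            let tx = tx.clone();
            thread::spawn(move || {
                tx.send(fake_request(id)).unwrap();
            })
        })
        .collect();
    drop(tx); // close the channel once every worker's clone is dropped
    for h in handles {
        h.join().unwrap();
    }
    let mut out: Vec<u32> = rx.iter().collect();
    out.sort();
    out
}
```

Calling `sequential(&[1, 2, 3])` waits on each request in turn; `concurrent(&[1, 2, 3])` overlaps the waits at the cost of losing a single call stack to read.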
Our goal is to provide some flexibility to the language user, to choose a balance between straight-line (understandable and debuggable) code and broken-into-pieces (concurrent and interleavable) code. If you want code to interleave, you put it in separate tasks. That should be efficient enough to be practical. If it's not, the task model is not useful. We're betting it will be. I'm sorry if this bet strikes you as irresponsible or unacceptable, but it's the one we're making.
> Specifically, will your IO framework be practical with 50,000 connections in reasonable memory (say a constrained 32 bit process)?
If we assume a page-based stack model, 50k tasks (if they can stay in one page of stack, assuming 4 KB pages) fit in about 200 MB. So yes, or maybe, in terms of memory use. In terms of scheduler latency, it's certainly feasible to either operate a normal scheduling algorithm on 50k tasks, or let the OS kernel do it if you're on something fast like Linux, or just follow task->chan->port->task links and schedule task-to-task as messages flow through (the "newsqueak way", which I suspect we'll wind up doing for latency's sake).
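The memory estimate can be checked with a short sketch. The 4096-byte page size is an assumption (typical for x86), and both function names are hypothetical helpers for illustration.

```rust
/// Bytes of stack needed if every task stays within `pages_per_task` pages.
/// `page_size` is an assumption supplied by the caller (4096 is typical on x86).
fn stack_bytes(tasks: u64, pages_per_task: u64, page_size: u64) -> u64 {
    tasks * pages_per_task * page_size
}

/// Fraction of a full 32-bit address space (4 GiB) that `bytes` would occupy.
/// The user-space share of that 4 GiB is smaller in practice.
fn fraction_of_32bit_space(bytes: u64) -> f64 {
    bytes as f64 / (1u64 << 32) as f64
}
```

With 50,000 tasks at one 4096-byte page each, `stack_bytes(50_000, 1, 4096)` gives 204,800,000 bytes, a little under 5% of a 4 GiB address space. Tasks that grow past one page scale the figure linearly.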
Or, put another way, I'm pretty sure we'll push on the design and implementation to be able to meet numbers like that. We've been doing "million task message-passing tests" since the beginning (on 32-bit -- task stacks used to be only a couple hundred bytes to start). They're usually synthetic and we keep overhauling scheduling and the task model, but this sort of number is not at all outside the ballpark we're looking at. Getting a good balance is an engineering tradeoff between a lot of factors. Ease of debugging is one of them.
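For a rough idea of what a synthetic message-passing test looks like, here is a scaled-down sketch. It is not the actual test suite: it assumes OS threads from `std::thread` in place of lightweight tasks, and `chain_hops` is a hypothetical name.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Pass a counter through a chain of `n` threads. Each thread receives the
/// value, adds 1, and forwards it, so the result equals the number of hops.
fn chain_hops(n: usize) -> usize {
    let (first_tx, mut prev_rx): (Sender<usize>, Receiver<usize>) = channel();
    let mut handles = Vec::with_capacity(n);
    for _ in 0..n {
        let (tx, rx) = channel();
        // This stage reads from the previous link; the new receiver becomes
        // the tail of the chain.
        let inbox = std::mem::replace(&mut prev_rx, rx);
        handles.push(thread::spawn(move || {
            let v = inbox.recv().unwrap();
            tx.send(v + 1).unwrap();
        }));
    }
    first_tx.send(0).unwrap();
    let result = prev_rx.recv().unwrap();
    for h in handles {
        h.join().unwrap();
    }
    result
}
```

`chain_hops(100)` spawns 100 threads and returns the hop count. OS threads will not reach a million on a 32-bit process, which is the gap much smaller task stacks are meant to close.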
> If not, I wonder if there should be more effort on the AIO layer first, and then consider a synchronous shim on top. Also - can I ask that any AIO layer be integrated with a semaphore-type system natively, then at least one can write some subprocess and use shared memory and integrate it into the main loop.
This is too vague to be able to promise much about, but IPC-in-general is certainly part of any core event loops we'd be sitting atop, and I expect some variants of IPC to use shared memory when appropriate. Beyond saying that, it's just a big pile of engineering tradeoffs. Sometimes pipes are fast enough. Sometimes staying-in-process is safe enough. Sometimes subprocesses and shared memory are "more right" for other reasons (sandboxing, say). Hard to be concrete while saying anything general.
-Graydon
_______________________________________________
Rust-dev mailing list
https://mail.mozilla.org/listinfo/rust-dev
