On 07/13/2012 11:58 AM, David Bruant wrote:
On 11/07/2012 03:05, Brian Anderson wrote:
On 07/10/2012 12:13 PM, David Bruant wrote:
(...)
Tasks do have a memory cost, but it is theoretically quite low. Rust
tasks on linux currently have somewhere around 4K of overhead, and
that's around 3K more than we would like and think is possible. Most
of that is dedicated to the call stack, and there are a lot of
potential future optimizations to minimize the cost of creating a task
(by e.g. reusing them).
I haven't implemented a JavaScript event loop, but for comparison,
that's more or less a list where each element is a function with some
arguments. Very likely less than 1K (though certainly with other
downsides).
The intent is that you should not have to think about whether using a
task is too expensive, because it is cheap.
I'm not familiar with how the JavaScript event loop works but I
imagine that it has similar responsibilities to the Rust scheduler.
The idea is that a JavaScript "processing unit" (sorry, I don't know the
correct term for this) has a stack and a message queue (the list I
mentioned above). Each message is a function and some arguments. This
function is called, and when the call (and any nested calls) completes,
the next message is processed. There is no preemption (though there are
some weird, not very standard cases in browsers that break that rule).
The message queue would be the equivalent of the Rust scheduler. I
realized after posting my message that goroutines and Rust tasks could
actually be implemented with a message queue (I'm not asking for that, I
just realized it). The difference is that if some code gets stuck in an
infinite loop, everything else is blocked, and that may not be a good
thing for a systems language. I'm not sure. Maybe "some" preemption and
the ability to kill a message (or the message being processed) could
compensate.
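
For illustration, the run-to-completion message queue described above can
be sketched in present-day Rust (not the 2012 dialect used elsewhere in
this thread). The names EventLoop, Message, post, run and run_demo are
hypothetical and chosen only for this sketch. It is a minimal model of
the idea, not how any JavaScript engine is actually implemented.

```rust
use std::collections::VecDeque;

// A message is a function plus its arguments; here the arguments are
// whatever the closure captured. (Hypothetical type for this sketch.)
type Message = Box<dyn FnOnce(&mut EventLoop)>;

struct EventLoop {
    queue: VecDeque<Message>,
    log: Vec<String>, // records the order in which messages ran
}

impl EventLoop {
    fn new() -> Self {
        EventLoop { queue: VecDeque::new(), log: Vec::new() }
    }

    // Enqueue a message; it does not run until the loop reaches it.
    fn post(&mut self, msg: impl FnOnce(&mut EventLoop) + 'static) {
        self.queue.push_back(Box::new(msg));
    }

    // Run each message to completion before starting the next one.
    // There is no preemption: a message that never returns blocks
    // every message queued behind it.
    fn run(&mut self) {
        while let Some(msg) = self.queue.pop_front() {
            msg(self);
        }
    }
}

fn run_demo() -> Vec<String> {
    let mut el = EventLoop::new();
    el.post(|el| {
        el.log.push("first".to_string());
        // Posting from inside a message only enqueues the new message.
        el.post(|el| el.log.push("third".to_string()));
    });
    el.post(|el| el.log.push("second".to_string()));
    el.run();
    el.log
}
```

Calling run_demo() shows that a message posted while another message is
running is only processed after everything already in the queue.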
To be clear, Rust's scheduling is cooperative, so a badly-behaving Rust
task can block and prevent others from making progress. This is mostly
an issue to be aware of for native bindings.
Instead of calling callbacks in response to events, though, the Rust
scheduler resumes execution of tasks.
I've been thinking about this particular point and about my experience
with node.js.
Node.js has been criticized a lot for not enabling parallelism by default.
By default, your code runs in one system process (it's possible that
some functions use several system threads under the hood, but you can't
do that in JavaScript code), but you can fork/spawn other processes.
There is also a way to know the number of CPUs in your machine:
http://nodejs.org/api/os.html#os_os_cpus
My experience is that it feels very right to spawn as many processes
as you have CPUs (in case your program has a need for such a thing).
In that case, you're not leaving some CPUs unused, but at the same time,
you don't have a lot of processes/threads that your hardware can't run
and that cost memory and scheduling.
From what I understand, this level of control cannot be achieved with
goroutines and Rust tasks. The only primitive means "create a new
concurrency unit and let the system figure it out". The downside of that
is that the Rust runtime needs to create a lot of stacks and do some
scheduling itself. It sounds like it costs more than what can be done in
the Node.js model.
I don't have the perfect solution here, but there is certainly a
middle ground to be found.
Rust does offer some more control than that. The runtime supports
running multiple schedulers simultaneously - particularly so code that
wants to truly block an OS thread (like libuv) can live in its own world.
The scheduler that the main task (and any child tasks) runs in has by
default the same number of threads as there are cores available. For
more control you can create your own schedulers:
let cores = get_num_cores(); // This function doesn't exist yet
for cores.times {
    do spawn_sched(single_threaded) {
        run_program()
    }
}
If run_program never spawns another task, then each call will effectively
have its own OS thread. The tasks will occasionally do yield checks and
context switch back to the scheduler momentarily, but that context switch
could conceivably be optimized out if there is only one task on the
entire scheduler.
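
For readers following along in present-day Rust, the same "one worker per
core" idea can be sketched with OS threads from the standard library. This
is not the 2012 scheduler API shown above: spawn_sched has no direct
equivalent here, and run_program and run_per_core are hypothetical
stand-ins used only for this sketch.

```rust
use std::thread;

// Hypothetical stand-in for the real per-core workload.
fn run_program(worker_id: usize) -> usize {
    worker_id * 2
}

// Spawn one OS thread per available core and collect each result.
fn run_per_core() -> Vec<usize> {
    // available_parallelism can fail on some platforms; fall back to 1.
    let cores = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);

    let handles: Vec<_> = (0..cores)
        .map(|id| thread::spawn(move || run_program(id)))
        .collect();

    // Joining in spawn order keeps results aligned with worker ids.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

Calling run_per_core() returns one result per worker, in worker order.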
(...)
So I have several questions regarding Rust:
* Is synchronous blocking possible?
I don't understand the term 'synchronous blocking' (as opposed to just
'blocking').
It's the same thing. I use both terms interchangeably. Sorry for the
confusion.
Receiving a value from a Rust port does block a task. Sending on a
channel does not (whereas Go channels do block on send). We consider
channels to be asynchronous, based on the sending behavior (vs. Go's
synchronous channels).
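
The send/receive asymmetry can be illustrated with present-day Rust's
std::sync::mpsc, which is not the 2012 port/channel API discussed in this
thread. As a rough analogy only: mpsc::channel has a non-blocking send and
a blocking recv, while mpsc::sync_channel(0) makes send wait for a
receiver, much like an unbuffered Go channel. The function names
send_then_receive and rendezvous are hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

// Asynchronous channel: every send returns immediately, even though
// nothing has been received yet. Only recv blocks.
fn send_then_receive() -> Vec<i32> {
    let (tx, rx) = mpsc::channel();
    for i in 0..3 {
        tx.send(i).expect("receiver is still alive");
    }
    drop(tx); // closing the sending side lets the loop below finish

    let mut got = Vec::new();
    // recv blocks until a message arrives or all senders are gone.
    while let Ok(v) = rx.recv() {
        got.push(v);
    }
    got
}

// Rendezvous channel: with a bound of 0, send blocks until the
// receiver takes the value.
fn rendezvous() -> i32 {
    let (tx, rx) = mpsc::sync_channel::<i32>(0);
    let producer = thread::spawn(move || {
        tx.send(42).expect("receiver is still alive");
    });
    let v = rx.recv().expect("sender is still alive");
    producer.join().expect("producer thread panicked");
    v
}
```

In the first function all three sends complete on a single thread before
any receive happens, which would deadlock on a rendezvous channel.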
* How does Rust deal with concurrent tasks synchronization?
Channels are the primary synchronization primitive in Rust.
* How would you write the above example in Rust?
I would basically write it like the Go example. If it didn't have to
also wait for the timeout then I would instead use a vector of futures.
I would use promises (equivalent of futures) as well. Is there a
future/promise library in Rust?
core::future exists, but could be better. In particular, futures are not
sendable types, which severely limits how they can be composed.
Here is that Go code translated to current Rust.
(...)
Thanks for this example :-)
* Do you think it's satisfying in terms of expressiveness?
No, but not for the reasons you suggest. Rust's split between ports
and channels causes a lot of boilerplate, and the lack of an N-ary
select function or control structure is a big omission. Rust's
libraries in general need to be designed better.
Is that opinion shared by the Rust community? How would you move forward
from that situation?
I think that opinion is shared, but most Rust developers are waiting
until the language settles down before focusing on libraries.
As to the ergonomics of channels in Rust, I'm not sure what the solution
is yet, but the existing channel implementation will likely be going
away entirely. Eric Holk is working on a new primitive communication
type called a 'pipe' that only does 1:1 communication, and can only send
a single message. This alone is much more difficult to use than the
current channels, but is also much faster.
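
To make the "1:1, single message" idea concrete, here is a rough sketch
in present-day Rust. It is not Eric Holk's pipe design or its
implementation. It only shows how a type can enforce "at most one
message" by consuming the endpoint on use. OneshotSender, OneshotReceiver
and oneshot are hypothetical names, and the sketch is built on
std::sync::mpsc purely for convenience.

```rust
use std::sync::mpsc;

// Hypothetical endpoint types. Neither is Clone, so communication
// stays strictly 1:1.
struct OneshotSender<T>(mpsc::SyncSender<T>);
struct OneshotReceiver<T>(mpsc::Receiver<T>);

// A one-slot buffer is allocated up front, so the single send never
// needs to grow a queue.
fn oneshot<T>() -> (OneshotSender<T>, OneshotReceiver<T>) {
    let (tx, rx) = mpsc::sync_channel(1);
    (OneshotSender(tx), OneshotReceiver(rx))
}

impl<T> OneshotSender<T> {
    // Taking self by value means the sender cannot be used again:
    // a second send is a compile-time error.
    // If the receiver is already gone, the value is handed back.
    fn send(self, value: T) -> Result<(), T> {
        self.0.send(value).map_err(|e| e.0)
    }
}

impl<T> OneshotReceiver<T> {
    // Blocks until the message arrives; returns None if the sender
    // was dropped without sending.
    fn recv(self) -> Option<T> {
        self.0.recv().ok()
    }
}
```

A typical use is let (tx, rx) = oneshot(); followed by tx.send(7) on one
side and rx.recv() on the other; calling tx.send a second time does not
compile because tx has been moved.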
On top of that he is building channel contracts that define a protocol
between two pipe endpoints that is enforced by the type system. Channel
contracts additionally allow bounded protocols (ones that don't just
send forever without receiving) to be implemented with fixed size
buffers so that sending a message is never forced to allocate. There
will be default implementations of common protocols.
On top of all that I'm hoping that we define a new channel type that
does N:M communication, is sendable, and doesn't require the
port/channel distinction (as in Go, you can send and receive on the same
object). For simple things you would probably just use a channel, but
for performance you would use pipe protocols.
-Brian