On 07/10/2012 12:13 PM, David Bruant wrote:
Hi,
I've recently watched a Google I/O talk on Go Concurrency Patterns [1]
and I find some patterns worrisome. Go has made some choices, but I'd
like to understand how Rust addresses these issues.
Disclaimer on my background: I've learned programming mostly in C, then
a bit of Lisp, a good share of Java, then for the last few years an
intense dive into JavaScript (including node.js for almost a year).
It's very likely that this cultural background influences (a lot!) my
view on programming languages (something about a hammer and a nail).
Most of my concerns are apparent in the following piece of Go code
(32'41'' in the video):
c := make(chan Result)
go func() { c <- Web(query) }()
go func() { c <- Image(query) }()
go func() { c <- Video(query) }()
timeout := time.After(80 * time.Millisecond)
for i := 0; i < 3; i++ {
    select {
    case result := <-c:
        results = append(results, result)
    case <-timeout:
        fmt.Println("timed out")
        return
    }
}
return
The intention here is to query different servers (Web, Image, Video)
and combine the partial results (the 'append' call) into one result.
Three goroutines (background tasks) are started, and the "main"
goroutine synchronously blocks on either the c channel (to which each
goroutine may eventually write) or the timeout.
My analysis goes as follows:
Due to synchronous wait, the 3 server queries have to be done in 3
goroutines, otherwise, they would block the "main" goroutine or one
another. Each goroutine costs something (memory when it is created, and
certainly scheduling too).
Tasks do have a memory cost, but it is theoretically quite low. Rust
tasks on Linux currently have somewhere around 4K of overhead, and
that's around 3K more than we would like and think is possible. Most of
that is dedicated to the call stack, and there are a lot of potential
future optimizations to minimize the cost of creating a task (by e.g.
reusing them). The intent is that you should not have to think about
whether using a task is too expensive, because it is cheap.
I'm not familiar with how the JavaScript event loop works, but I imagine
that it has similar responsibilities to the Rust scheduler. Instead of
calling callbacks in response to events, though, the Rust scheduler
resumes execution of tasks.
Then, because of synchronous blocking, again, you need a select block to
enable multiplexed listening of blocking waits.
Finally, maybe it was only for the purpose of an example, but the 'i'
variable is an abstraction of exactly nothing. What we really care about
is either that the result is fully composed (after all partial results
came back) or that we timed out, and this code doesn't make this
expectation straightforward to read.
Somewhere in the video it is mentioned that synchronous blocking is a
feature meant for concurrent task synchronization. My experience with
JavaScript is that synchronisation is the exception rather than the
normal case, but that might be the bias I talked about above (and I use
promises for synchronisation for their expressiveness).
So I have several questions regarding Rust:
* Is synchronous blocking possible?
I don't understand the term 'synchronous blocking' (as opposed to just
'blocking'). Receiving a value from a Rust port does block a task.
Sending on a channel does not (whereas Go channels do block on send). We
consider channels to be asynchronous, based on the sending behavior (vs.
Go's synchronous channels).
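
For readers following along in present-day Rust rather than the 2012 dialect used in this thread, the same asymmetry (sending does not block, receiving does) can be sketched with `std::sync::mpsc::channel`. This is only an illustration: the function names `send_then_receive` and `blocking_receive` are made up here, and the modern channel API is not the `comm::port`/`chan` API discussed above.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Sends `n` values while nobody is receiving, then drains them.
fn send_then_receive(n: u32) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();
    // Each send returns immediately: the channel buffers the values.
    for i in 0..n {
        tx.send(i).unwrap();
    }
    // Dropping the only sender closes the channel, so the iterator ends.
    drop(tx);
    // Receiving blocks until a value is available or the channel closes.
    rx.iter().collect()
}

/// The receiving side blocks until another thread delivers a value.
fn blocking_receive() -> &'static str {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(20));
        tx.send("late value").unwrap();
    });
    // Blocks for roughly 20ms, until the spawned thread sends.
    rx.recv().unwrap()
}

fn main() {
    println!("{:?}", send_then_receive(3));
    println!("{}", blocking_receive());
}
```

Go's unbuffered channels, by contrast, would block on the very first send until a receiver is ready.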
* How does Rust deal with concurrent tasks synchronization?
Channels are the primary synchronization primitive in Rust.
* How would you write the above example in Rust?
I would basically write it like the Go example. If it didn't have to
also wait for the timeout then I would instead use a vector of futures.
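
As a rough illustration of the "vector of futures" idea when no timeout is involved, here is a sketch in present-day Rust. It uses `std::thread::JoinHandle` as a stand-in for the futures of 2012 Rust, which is an assumption of this example and not the API referred to above. The `web`, `image` and `video` bodies are hypothetical placeholders.

```rust
use std::thread::{self, JoinHandle};

type Backend = fn(&str) -> String;

// Hypothetical stand-ins for the three backend queries.
fn web(q: &str) -> String { format!("web:{}", q) }
fn image(q: &str) -> String { format!("image:{}", q) }
fn video(q: &str) -> String { format!("video:{}", q) }

/// Starts every query on its own thread, then waits for each in turn.
fn search_all(query: &str) -> Vec<String> {
    let backends: Vec<Backend> = vec![web as Backend, image, video];
    // One pending result per backend, all running concurrently.
    let pending: Vec<JoinHandle<String>> = backends
        .into_iter()
        .map(|backend| {
            let q = query.to_owned();
            thread::spawn(move || backend(&q))
        })
        .collect();
    // Joining in order keeps the results in the order of the backends.
    pending.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    println!("{:?}", search_all("rust"));
}
```

Because each handle is joined in order, no select-like construct is needed. That is also why this shape stops working once a timeout has to be raced against the results.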
Here is that Go code translated to current Rust.
// Boilerplate
use std;
import comm::{port, chan, methods, select2};
import task::{spawn};
import iter::repeat;
import either::{left, right};
import std::timer::delayed_send;
import std::uv_global_loop;
import io::println;
type query = ();
fn web(q: query) -> () { () }
fn image(q: query) -> () { () }
fn video(q: query) -> () { () }
fn main() {
    let query = (); // Our 'query'
    // Need an I/O task in order to send a message after
    // a timeout. This is a temporary wart.
    let iotask = uv_global_loop::get();
    // In rust you need a port for the receiver and a channel
    // for the sender
    let query_port = port();
    let query_chan = query_port.chan();
    let timeout_port = port();
    let timeout_chan = timeout_port.chan();
    let mut results = [];
    spawn(|| query_chan.send(web(query)) );
    spawn(|| query_chan.send(image(query)) );
    spawn(|| query_chan.send(video(query)) );
    // Ask for a message on timeout_chan after 80ms
    delayed_send(iotask, 80, timeout_chan, ());
    for repeat(3) {
        // This is the only form of 'select' we have right now.
        // If you need to wait on more than two ports then you
        // are out of luck
        alt select2(query_port, timeout_port) {
            left(result) {
                results += [result];
            }
            right(timeout) {
                println("timed out");
                break;
            }
        }
    }
}
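
For comparison, here is a hedged sketch of the same search-with-timeout in present-day Rust, which has changed a great deal since the listing above. It assumes one shared `std::sync::mpsc` channel and uses `recv_timeout` against a single deadline instead of `select2`. The `Backend` alias, the `slow` backend and the `search` signature are hypothetical names introduced for this example.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

type Backend = fn(&str) -> String;

// Hypothetical stand-ins for the backend queries.
fn web(q: &str) -> String { format!("web:{}", q) }
fn image(q: &str) -> String { format!("image:{}", q) }
fn video(q: &str) -> String { format!("video:{}", q) }
// A deliberately slow backend, used to exercise the timeout path.
fn slow(q: &str) -> String {
    thread::sleep(Duration::from_millis(300));
    format!("slow:{}", q)
}

/// Returns all results, or `None` if the overall deadline passes first.
fn search(query: &str, backends: Vec<Backend>, timeout: Duration) -> Option<Vec<String>> {
    let (tx, rx) = mpsc::channel();
    let expected = backends.len();
    for backend in backends {
        let tx = tx.clone();
        let q = query.to_owned();
        thread::spawn(move || {
            // Ignore the error if the receiver has already given up.
            let _ = tx.send(backend(&q));
        });
    }
    drop(tx);

    // One deadline for the whole search, like the single 80ms timer.
    let deadline = Instant::now() + timeout;
    let mut results = Vec::with_capacity(expected);
    while results.len() < expected {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(result) => results.push(result),
            // Timed out, or a worker died without sending.
            Err(_) => return None,
        }
    }
    Some(results)
}

fn main() {
    let backends = vec![web as Backend, image, video];
    println!("{:?}", search("rust", backends, Duration::from_millis(80)));
    println!("{:?}", search("rust", vec![web as Backend, slow], Duration::from_millis(80)));
}
```

Looping until `results.len()` reaches the expected count, rather than counting iterations, states the "all partial results arrived or we timed out" condition directly.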
* Do you think it's satisfying in terms of expressiveness?
No, but not for the reasons you suggest. Rust's split between ports and
channels causes a lot of boilerplate, and the lack of an N-ary select
function or control structure is a big omission. Rust's libraries in
general need to be designed better.
Rust is being used in the Servo project, itself aiming at building a web
browser. We've seen in the Snappy effort (and some others before that)
that encouraging asynchronicity is key in building responsive and
efficient software, hence all my questions about Rust's take on
synchronicity.
Here's an experiment.
So if we were doing JavaScript-style concurrency in Rust we might write
let mut results = [];
let mut timeout = false;
let add_result = |result| {
    if !timeout {
        results += [result];
        if results.len() == 3 {
            do_something_with_results(results);
        }
    }
};
// These should all do their work and call callbacks asynchronously
do web(query) |result| {
    add_result(result);
}
do image(query) |result| {
    add_result(result);
}
do video(query) |result| {
    add_result(result);
}
do on_timeout(80) {
    timeout = true;
}
This of course isn't how Rust works because none of those calls can
happen asynchronously on a single task. But task-based concurrency does
allow similar patterns, except that all the callbacks end up in
different tasks and cannot mutate their original environment.
We would make `fn web` spawn another task where it performs the query
and executes the callback.
fn web(q: query, callback: fn~(r: result)) {
    do spawn {
        callback(execute_query(q));
    }
}
And the state we are manipulating would get its own task:
enum msg { msg_result(result), msg_timeout }
let result_chan = do spawn_listener |msg| {
    let mut results = [];
    let mut timeout = false;
    loop {
        alt msg.recv() {
            msg_result(r) { ... }
            msg_timeout { ... }
        }
    }
};
// Wrap each result in the message type the listener expects
let add_result = |result| result_chan.send(msg_result(result));
// etc.
At that point you can call the `web`, `image`, `video` and `on_timeout`
functions in the continuation style. This is a fairly viable pattern in
Rust because we have awesome unique closures that can be safely sent
across tasks. Putting the state into a different task where you can
mutate it involves more work than JavaScript, but with the benefit that
all this work will happen in parallel.
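
The state-owning task and continuation-style calls described above can be sketched in present-day Rust as follows. This is an approximation under stated assumptions: OS threads stand in for tasks, `Send + 'static` closures stand in for unique closures (`fn~`), and the names `Msg`, `spawn_collector` and `query_async` are invented for this example.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

enum Msg {
    QueryResult(String),
    Timeout,
}

/// Spawns a thread that owns the mutable state. It finishes with
/// `Some(results)` once `expected` results arrive, or `None` on timeout.
fn spawn_collector(expected: usize) -> (Sender<Msg>, JoinHandle<Option<Vec<String>>>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        let mut results = Vec::new();
        for msg in rx {
            match msg {
                Msg::QueryResult(r) => {
                    results.push(r);
                    if results.len() == expected {
                        return Some(results);
                    }
                }
                Msg::Timeout => return None,
            }
        }
        None // every sender was dropped before all results arrived
    });
    (tx, handle)
}

/// Continuation style: run a (placeholder) query on its own thread and
/// hand the result to `callback`.
fn query_async<F>(name: &'static str, q: String, callback: F)
where
    F: FnOnce(String) + Send + 'static,
{
    thread::spawn(move || callback(format!("{}:{}", name, q)));
}

fn main() {
    let (tx, collector) = spawn_collector(3);
    for name in ["web", "image", "video"] {
        let tx = tx.clone();
        query_async(name, "rust".to_owned(), move |r| {
            // The callback cannot touch the state; it can only message it.
            let _ = tx.send(Msg::QueryResult(r));
        });
    }
    drop(tx);
    println!("{:?}", collector.join().unwrap());
}
```

A timer thread that sends `Msg::Timeout` after a delay would play the role of `on_timeout`. It is left out here to keep the sketch short.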
Regards,
Brian
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev