On Tue, Oct 29, 2013 at 3:30 PM, Brian Anderson <[email protected]> wrote:
> On 10/28/2013 10:02 PM, Simon Ruggier wrote:
> > Greetings fellow Rustians!
> >
> > First of all, thanks for working on such a great language. I really
> > like the clean syntax, increased safety, separation of data from
> > function definitions, and freedom from having to declare duplicate
> > method prototypes in header files.
> >
> > I've been working on an alternate way to communicate between tasks
> > in Rust, following the same approach as the LMAX Disruptor.[1] I'm
> > hoping to eventually offer a superset of the functionality in the
> > pipes API, and replace them as the default communication mechanism
> > between tasks. Just as with concurrency in general, my main
> > motivation in implementing this is to improve performance. For more
> > information about the disruptor approach, there's a lot of
> > information linked from their home page, in a variety of formats.
>
> This is really exciting work. Thanks for pursuing it. I've been
> interested in exploring something like Disruptor in Rust. The current
> channel types in Rust are indeed slow, and fixing them is the topic of
> https://github.com/mozilla/rust/issues/8568.

I'll start paying attention to that. The Morrison & Afek 2013 paper
looks like something I should read.

> > This is my first major contribution of new functionality to an
> > open-source project, so I didn't want to discuss it in advance until
> > I had a working system to demonstrate. I currently have a very basic
> > proof of concept that achieves almost two orders of magnitude better
> > performance than the pipes API. On my hardware[2], I currently see
> > throughput of about 27 million items per second when synchronizing
> > with a double-checked wait condition protocol between sender and
> > receivers, 80+ million items with no blocking (i.e. busy waiting),
> > and anywhere from 240,000 to 600,000 when using pipes. The LMAX
> > Disruptor library gets up to 110 million items per second on the
> > same hardware (using busy waiting and yielding), so there's
> > definitely still room for significant improvement.
>
> Those are awesome results!

Thanks! When I first brought it up, it was getting about 14 million with
busy waiting. Minimizing the number of atomic operations (even with
relaxed memory ordering) makes a big difference in performance. The 2/3
drop in performance with the blocking wait strategy comes from merely
doing a read-modify-write operation on every send (it currently uses an
atomic swap; I haven't experimented with others yet). To be fair, the
only result I can take credit for is the blocking algorithm. The other
ideas are straight from the original disruptor.
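To make that concrete, here's roughly the shape of the double-checked
wait, heavily simplified. The names are invented for this email, the
real code in the repo is organized differently, and it uses weaker
orderings where it can; I've used SeqCst throughout to keep the sketch
easy to reason about:

    // Rough sketch of the double-checked blocking wait. All names are
    // invented for illustration; not the actual code from my repo.
    use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
    use std::sync::{Condvar, Mutex};

    pub struct SequenceBarrier {
        published: AtomicU64, // highest sequence the sender published
        waiting: AtomicBool,  // set when a receiver is about to block
        lock: Mutex<()>,
        cond: Condvar,
    }

    impl SequenceBarrier {
        // Sender side. The swap is the read-modify-write on every
        // send that accounts for the throughput drop I mentioned: we
        // pay for it even when no receiver is asleep.
        pub fn publish(&self, seq: u64) {
            self.published.store(seq, Ordering::SeqCst);
            if self.waiting.swap(false, Ordering::SeqCst) {
                let _guard = self.lock.lock().unwrap();
                self.cond.notify_all();
            }
        }

        // Receiver side: check once cheaply, and only if the item
        // isn't there yet, announce the intent to sleep and check
        // again before actually sleeping.
        pub fn wait_for(&self, seq: u64) -> u64 {
            loop {
                let avail = self.published.load(Ordering::SeqCst);
                if avail >= seq {
                    return avail;
                }
                let guard = self.lock.lock().unwrap();
                self.waiting.store(true, Ordering::SeqCst);
                if self.published.load(Ordering::SeqCst) >= seq {
                    continue; // the sender got in between our checks
                }
                // Spurious wakeups are fine: we loop and re-check.
                let _guard = self.cond.wait(guard).unwrap();
            }
        }
    }

The second check under the lock is what prevents a lost wakeup when a
publish races with a receiver that's about to sleep; the price is that
the sender pays for the swap on every send, whether or not anyone is
actually asleep.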
> > I've put the code up on GitHub (I'm using rustc from master).[3]
> > Currently, single and multi-stage pipelines of receivers are
> > supported, while many features are missing, like multiple concurrent
> > senders, multiple concurrent receivers, or mutation of the items as
> > they pass through the pipeline. However, given what I have so far,
> > now is probably the right time to start soliciting feedback and
> > advice. I'm looking for review, suggestions/constructive criticism,
> > and guidance about contributing this to the Rust codebase.
>
> I'm not deeply familiar with Disruptor, but I believe that it uses
> bounded queues. My general feeling thus far is that, as the general
> 'go-to' channel type, people should not be using bounded queues that
> block the sender when full because of the potential for unexpected
> deadlocks. I could be convinced otherwise though if it's just not
> possible to have reasonably fast unbounded channels. Note that I don't
> think it's critical for the general-purpose channel to be as fast as
> possible - it's more important to be convenient.

Yes, it does use bounded queues. I'm divided on this, because unbounded
queues can also lead to memory exhaustion and added latency, but I
suspect that for many use cases, you're right.

For performance-critical code, I think there's probably no argument:
once a queue grows too large, it starts causing latency problems (as
with bufferbloat). A queue that accepts an unlimited number of items is
like an API that doesn't let the caller know about errors: the caller
needs to know that a large backlog has built up, and adjust its
behaviour (there's a sketch of the sender-side API I have in mind in a
P.S. below). Because of this, I doubt any performance-critical
application would find it truly optimal to use unbounded queues. My
opinion on this is strongly influenced by this post:
http://mechanical-sympathy.blogspot.co.uk/2012/05/apply-back-pressure-when-overloaded.html

For general usage, though, I need to do more research. Any application
where latency is relevant really should be designed to deal with
back-pressure from queues, but there may be some batch-job-style use
cases where, as you say, it isn't worth the extra effort. On the other
hand, it's worth thinking about how these deadlocks actually occur, and
deciding whether it's reasonable for developers to expect the patterns
that trigger them to work. I'll look into this and see what I come up
with. If there were some general way to mitigate the deadlock issue
within the runtime, that would also solve this problem.

As a last resort, I suspect I could figure out a way to have the sender
resize the buffer when it fills, copy the elements over, and then
switch the consumers over to the larger buffer. I don't know if I could
do it without affecting the fast path on the receiver side.

> Please keep working on this. I'm excited to see your results.

I appreciate the encouragement :)

> > Thanks,
> > Simon
> >
> > [1] http://lmax-exchange.github.io/disruptor/
> > [2] A 2.66GHz Intel P8800 CPU running in a Thinkpad T500 on Linux x86_64
> > [3] https://github.com/sruggier/rust-disruptor
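P.S. To make the back-pressure point concrete, here's a minimal sketch
of the sender-side behaviour I'm arguing for: a bounded channel whose
try_send tells the caller when the queue is full, so the caller can
shed load, batch, or slow down instead of queueing without bound. This
is just an illustration of the idea, not code from my repo:

    // Minimal illustration of back-pressure surfaced at the API
    // level, using a bounded channel with a non-blocking send.
    use std::sync::mpsc::{sync_channel, TrySendError};

    fn main() {
        let (tx, rx) = sync_channel::<u64>(4); // bounded to 4 items

        for i in 0..8 {
            match tx.try_send(i) {
                Ok(()) => {}
                Err(TrySendError::Full(item)) => {
                    // The queue is full, and the caller finds out
                    // about it, instead of latency silently growing.
                    println!("queue full, dropping {}", item);
                }
                Err(TrySendError::Disconnected(_)) => break,
            }
        }

        while let Ok(item) = rx.try_recv() {
            println!("received {}", item);
        }
    }

Whether the right reaction is to drop, batch, or block is up to the
caller; the important part is that the queue surfaces the condition at
all.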
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev
