A small update: I've gotten a resizable version of my disruptor implementation working, and the performance looks pretty good so far. I still have a few loose ends to tie up before I push out the changes, but I hope to have the updated code on GitHub within a couple of weeks, depending on how much time I find to work on it.
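In the meantime, here's a rough sketch of the shape of the design described in the quoted message below. It's deliberately simplified so that it stays short and obviously safe: it uses a per-slot mutex and a buffer that never wraps, whereas the real implementation works on a lock-free ring buffer, and all of the names here are illustrative rather than the actual API in the repository. The point to notice is the receiver's fast path: resizability costs it one extra branch per item ("is this entry a Moved marker?") and no extra atomic operations.

    use std::sync::atomic::{AtomicUsize, Ordering};
    use std::sync::{Arc, Mutex};
    use std::thread;

    // A published entry is either an ordinary value or a marker
    // redirecting receivers to a freshly allocated, larger buffer.
    enum Entry<T> {
        Value(T),
        Moved(Arc<Buffer<T>>),
    }

    struct Buffer<T> {
        slots: Vec<Mutex<Option<Entry<T>>>>,
        published: AtomicUsize, // entries made visible in this buffer
    }

    impl<T> Buffer<T> {
        fn new(cap: usize) -> Arc<Buffer<T>> {
            Arc::new(Buffer {
                slots: (0..cap).map(|_| Mutex::new(None)).collect(),
                published: AtomicUsize::new(0),
            })
        }
    }

    struct Sender<T> { buf: Arc<Buffer<T>>, next: usize }

    impl<T> Sender<T> {
        fn send(&mut self, v: T) {
            // The last slot is reserved for a possible Moved marker, so
            // this demo buffer never wraps (the real ring buffer does).
            if self.next == self.buf.slots.len() - 1 {
                // Full: allocate a larger buffer and redirect receivers.
                let bigger = Buffer::new(self.buf.slots.len() * 2);
                *self.buf.slots[self.next].lock().unwrap() =
                    Some(Entry::Moved(bigger.clone()));
                self.buf.published.fetch_add(1, Ordering::Release);
                self.buf = bigger;
                self.next = 0;
            }
            *self.buf.slots[self.next].lock().unwrap() =
                Some(Entry::Value(v));
            self.next += 1;
            self.buf.published.fetch_add(1, Ordering::Release);
        }
    }

    struct Receiver<T> { buf: Arc<Buffer<T>>, next: usize }

    impl<T> Receiver<T> {
        // The only added cost of resizability on this side is the match
        // below: one branch per item, no extra atomic operations.
        fn recv(&mut self) -> T {
            loop {
                while self.buf.published.load(Ordering::Acquire) <= self.next {
                    thread::yield_now(); // busy wait, as in the prototype
                }
                let entry = self.buf.slots[self.next].lock().unwrap()
                    .take().unwrap();
                self.next += 1;
                match entry {
                    Entry::Value(v) => return v,
                    Entry::Moved(new_buf) => {
                        // The sender already copied pending items over;
                        // just follow it to the new buffer.
                        self.buf = new_buf;
                        self.next = 0;
                    }
                }
            }
        }
    }

    fn main() {
        let buf = Buffer::new(2);
        let mut tx = Sender { buf: buf.clone(), next: 0 };
        let mut rx = Receiver { buf, next: 0 };
        let handle = thread::spawn(move || {
            let mut got = Vec::new();
            for _ in 0..5 {
                got.push(rx.recv());
            }
            got
        });
        for i in 0..5 {
            tx.send(i);
        }
        println!("received: {:?}", handle.join().unwrap());
    }

Again, this is only the control flow; getting the same effect without the per-slot locks is where the real work is.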
On Sat, Nov 9, 2013 at 2:13 PM, Simon Ruggier <simo...@gmail.com> wrote:

> Hi all, I've tentatively come up with a design that would allow the sender
> to reallocate the buffer as necessary, with very little added performance
> cost. The sending side would bear the cost of reallocation, and there would
> be an extra test that receivers would have to make every time they process
> an item (no extra atomic operations needed). However, it may be a few weeks
> or more before I have a working implementation to demonstrate, so I figured
> it might be worthwhile to mention now that I'll be working on this.
>
> Also, I think it would be interesting to investigate doing something like
> the Linux kernel's deadlock detection[1], but generalized to apply to
> bounded queues, and implemented as a static check. I know little about
> this, but even so, I can see how it would be an enormous amount of work. On
> the other hand, I would have thought the same thing about the memory safety
> rules that Rust enforces. I'm hopeful that this will eventually be possible
> as well.
>
> [1] https://www.kernel.org/doc/Documentation/lockdep-design.txt
>
> On Wed, Oct 30, 2013 at 12:55 AM, Simon Ruggier <simo...@gmail.com> wrote:
>
>> On Tue, Oct 29, 2013 at 3:30 PM, Brian Anderson <bander...@mozilla.com> wrote:
>>
>>> On 10/28/2013 10:02 PM, Simon Ruggier wrote:
>>>
>>> Greetings fellow Rustians!
>>>
>>> First of all, thanks for working on such a great language. I really like
>>> the clean syntax, increased safety, separation of data from function
>>> definitions, and freedom from having to declare duplicate method
>>> prototypes in header files.
>>>
>>> I've been working on an alternate way to communicate between tasks in
>>> Rust, following the same approach as the LMAX Disruptor.[1] I'm hoping to
>>> eventually offer a superset of the functionality in the pipes API, and
>>> replace it as the default communication mechanism between tasks. Just as
>>> with concurrency in general, my main motivation in implementing this is
>>> to improve performance. For more information about the disruptor
>>> approach, there's a lot of information linked from their home page, in a
>>> variety of formats.
>>>
>>> This is really exciting work. Thanks for pursuing it. I've been
>>> interested in exploring something like Disruptor in Rust. The current
>>> channel types in Rust are indeed slow, and fixing them is the topic of
>>> https://github.com/mozilla/rust/issues/8568.
>>
>> I'll start paying attention to that. The Morrison & Afek 2013 paper looks
>> like something I should read.
>>
>>> This is my first major contribution of new functionality to an
>>> open-source project, so I didn't want to discuss it in advance until I
>>> had a working system to demonstrate. I currently have a very basic proof
>>> of concept that achieves almost two orders of magnitude better
>>> performance than the pipes API. On my hardware[2], I currently see
>>> throughput of about 27 million items per second when synchronizing with
>>> a double-checked wait condition protocol between sender and receivers,
>>> 80+ million items with no blocking (i.e. busy waiting), and anywhere
>>> from 240,000 to 600,000 when using pipes. The LMAX Disruptor library
>>> gets up to 110 million items per second on the same hardware (using busy
>>> waiting and yielding), so there's definitely still room for significant
>>> improvement.
>>>
>>> Those are awesome results!
>>
>> Thanks!
>> When I first brought it up, it was getting about 14 million with the
>> busy waiting. Minimizing the number of atomic operations (even with
>> relaxed memory ordering) makes a big difference in performance. The 2/3
>> drop in performance with the blocking wait strategy comes from merely
>> doing a read-modify-write operation on every send (it currently uses
>> atomic swap; I haven't experimented with others yet). To be fair, the
>> only result I can take credit for is the blocking algorithm. The other
>> ideas are straight from the original disruptor.
>>
>>> I've put the code up on GitHub (I'm using rustc from master).[3]
>>> Currently, single and multi-stage pipelines of receivers are supported,
>>> while many features are missing, like multiple concurrent senders,
>>> multiple concurrent receivers, or mutation of the items as they pass
>>> through the pipeline. However, given what I have so far, now is probably
>>> the right time to start soliciting feedback and advice. I'm looking for
>>> review, suggestions/constructive criticism, and guidance about
>>> contributing this to the Rust codebase.
>>>
>>> I'm not deeply familiar with Disruptor, but I believe that it uses
>>> bounded queues. My general feeling thus far is that, as the general
>>> 'go-to' channel type, people should not be using bounded queues that
>>> block the sender when full, because of the potential for unexpected
>>> deadlocks. I could be convinced otherwise, though, if it's just not
>>> possible to have reasonably fast unbounded channels. Note that I don't
>>> think it's critical for the general-purpose channel to be as fast as
>>> possible - it's more important to be convenient.
>>
>> Yes, it does. I'm divided on this, because unbounded queues can also
>> lead to memory exhaustion and added latency, but I suspect that for many
>> use cases, you're right. For performance-critical code, I think there's
>> probably no argument: if a queue grows too large, it starts causing
>> latency problems (as with bufferbloat). A queue that accepts an unlimited
>> number of items is like an API that doesn't let the caller know about
>> errors. The caller needs to know that there's a large queue, and adjust
>> its behaviour. Because of this, I doubt any performance-critical
>> application would find it truly optimal to use unbounded queues. My
>> opinion on this is strongly influenced by this post:
>>
>> http://mechanical-sympathy.blogspot.co.uk/2012/05/apply-back-pressure-when-overloaded.html
>>
>> For general usage, though, I need to do more research. Any application
>> where latency is relevant really should be designed to deal with
>> back-pressure from queues, but there may be some batch-job-style use
>> cases where, as you say, it isn't worth the extra effort. On the other
>> hand, it's relevant to think about how deadlocks occur, and decide
>> whether or not it's reasonable for developers to expect to be able to do
>> those things. I'll look into this and see what I come up with.
>>
>> If there were some general way to mitigate the deadlock issue within the
>> runtime, it would also solve this problem.
>>
>> As a last resort, I suspect that I could probably figure out a way to
>> have the sender resize the buffer when it fills, copy the elements over,
>> and then switch the consumers over to the larger buffer. I don't know if
>> I could do it without affecting the fast path on the receiver side.
>>
>>> Please keep working on this. I'm excited to see your results.
>> I appreciate the encouragement :)
>>
>>> Thanks,
>>> Simon
>>>
>>> [1] http://lmax-exchange.github.io/disruptor/
>>> [2] A 2.66GHz Intel P8800 CPU running in a Thinkpad T500 on Linux x86_64
>>> [3] https://github.com/sruggier/rust-disruptor
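P.S. For anyone curious about the double-checked wait condition protocol mentioned above, here's a rough sketch of the idea in isolation: no ring buffer, a single receiver, and illustrative names throughout (Shared, publish, wait_for are not the API in the repository). The sender pays one atomic swap per publish, which is the read-modify-write cost discussed above, and a receiver only touches the mutex and condition variable when it actually has to block.

    use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
    use std::sync::{Arc, Condvar, Mutex};
    use std::thread;
    use std::time::Duration;

    struct Shared {
        published: AtomicUsize, // the sender's sequence counter
        waiting: AtomicBool,    // set by a receiver announcing it may block
        lock: Mutex<()>,
        cond: Condvar,
    }

    // Sender: one atomic swap per publish tells it whether anyone might
    // be asleep; only in that case does it touch the mutex and condvar.
    fn publish(s: &Shared) {
        s.published.fetch_add(1, Ordering::SeqCst);
        if s.waiting.swap(false, Ordering::SeqCst) {
            let _g = s.lock.lock().unwrap();
            s.cond.notify_all();
        }
    }

    // Receiver: the fast path is a single load; the flag plus the second
    // check under the lock close the race with a concurrent publish.
    fn wait_for(s: &Shared, seq: usize) {
        if s.published.load(Ordering::SeqCst) > seq {
            return; // fast path: the item is already visible
        }
        let mut g = s.lock.lock().unwrap();
        while s.published.load(Ordering::SeqCst) <= seq {
            s.waiting.store(true, Ordering::SeqCst);
            // Double check: a publish may have raced with setting the
            // flag; if so, its swap saw the flag or the data is visible.
            if s.published.load(Ordering::SeqCst) > seq {
                break;
            }
            g = s.cond.wait(g).unwrap();
        }
    }

    fn main() {
        let s = Arc::new(Shared {
            published: AtomicUsize::new(0),
            waiting: AtomicBool::new(false),
            lock: Mutex::new(()),
            cond: Condvar::new(),
        });
        let s2 = s.clone();
        let rx = thread::spawn(move || {
            wait_for(&s2, 0);
            println!("receiver: item 0 is ready");
        });
        thread::sleep(Duration::from_millis(10)); // let the receiver block
        publish(&s);
        rx.join().unwrap();
    }

Supporting multiple receivers needs more care with the waiting flag, and whether something cheaper than the swap can work is still an open question.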
_______________________________________________
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev