Great! Glad to hear that the plan is to support both use cases by abstracting over M:N and 1:1 eventually. That will go a long way towards my later experiments with ETL (extract, transform, load) under various OSes and multi-core architectures using parallel task flow (I still don't know what I will call the flow type...lolol)

On Wed, Nov 13, 2013 at 11:50 PM, Brian Anderson <[email protected]> wrote:

> Thanks for the great reply, Alex. This is the approach we are going to take. Rust is not going to move away from green threads; the plan is to support both use cases in the standard library.
>
> On 11/13/2013 10:32 AM, Alex Crichton wrote:
>
>> The situation may not be as dire as you think. The runtime is still in a state of flux, and don't forget that in one summer the entire runtime was rewritten in Rust and entirely redesigned. I personally still think that M:N is a viable model for various applications, and it seems especially unfortunate to remove everything just because it's not tailored to all use cases.
>>
>> Rust made an explicit design decision early on to pursue lightweight/green tasks, and it was made with the understanding that there were drawbacks to the strategy. Using libuv as a backend for driving I/O was also an explicit decision with known drawbacks.
>>
>> That being said, I do not believe that all is lost. I don't believe that the Rust standard library as it is today can support *every* use case, but it's getting to a point where it can get pretty close. In the recent redesign of the I/O implementation, all I/O was abstracted behind trait objects that are synchronous in their interface. This I/O interface is implemented in librustuv by talking to the Rust scheduler under the hood. Additionally, in pull #10457, I'm starting to add support for a native implementation of this I/O interface. The great boon of this strategy is that the std::io primitives have no idea whether their underlying interface is native and blocking or libuv and asynchronous. The exact same Rust code works for one as it does for the other.
>>
>> I personally don't see why the same strategy shouldn't work for the task model as well. When you link a program to the librustuv crate, you're choosing a runtime with M:N scheduling and asynchronous I/O. Perhaps, though, if you didn't link to librustuv, you would get 1:1 scheduling with blocking I/O. You would still have all the benefits of the standard library's communication primitives, spawning primitives, I/O, task-local storage, etc. The only difference is that everything would be powered by OS-level threads instead of Rust-level green tasks.
>>
>> I would very much like to see a standard library which supports this abstraction, and I believe that it is very realistically possible. Right now we have an EventLoop interface, which defines how we interact with I/O and serves as the abstraction between asynchronous I/O and blocking I/O. It sounds like we need a similarly formalized Scheduler interface which abstracts M:N scheduling vs. 1:1 scheduling.
>>
>> The main goal of all of this would be to allow the exact same Rust code to work in both M:N and 1:1 environments. This would allow authors to specialize their code for the task at hand. Those writing web servers would be sure to link to librustuv, but those writing command-line utilities would simply omit it. Additionally, as a library author, I don't really care which implementation you're using. I can write a MySQL database driver, and then you, as a consumer of my library, decide whether my network calls are blocking or not.
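>>
>> To sketch the shape of that abstraction in C terms (purely illustrative; `io_ops`, `blocking_read`, and `evloop_read` are made-up names, not Rust's actual interfaces), the same calling code can sit in front of either a blocking or an event-loop backend:
>>
>> ```
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <unistd.h>
>>
>> /* The program-facing interface: one table of I/O operations. */
>> struct io_ops {
>>     ssize_t (*do_read)(int fd, void *buf, size_t len);
>>     const char *name;
>> };
>>
>> /* Backend A: plain blocking syscalls (a "native" runtime). */
>> static ssize_t blocking_read(int fd, void *buf, size_t len) {
>>     return read(fd, buf, len);
>> }
>>
>> /* Backend B: would hand the fd to an event loop and suspend the
>>    calling green task until it's ready; stubbed out here. */
>> static ssize_t evloop_read(int fd, void *buf, size_t len) {
>>     (void)fd; (void)buf; (void)len;
>>     return -1;
>> }
>>
>> static const struct io_ops native_io = { blocking_read, "native" };
>> static const struct io_ops uv_io     = { evloop_read,  "libuv"  };
>>
>> int main(void) {
>>     /* In the real design this choice would be made by which crate you
>>        link against; an environment variable stands in for that here. */
>>     const struct io_ops *io = getenv("USE_UV") ? &uv_io : &native_io;
>>     char buf[16];
>>     io->do_read(STDIN_FILENO, buf, sizeof buf);
>>     printf("I/O backend: %s\n", io->name);
>>     return 0;
>> }
>> ```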
>>
>> This is a fairly new concept to me (I haven't thought much about it before), but this sounds like it may be the right way forward to address your concerns without compromising too much existing functionality. There would certainly be plenty of work to do in this realm, and I'm not sure whether this goal would block the 1.0 milestone. Ideally, this would be a completely backwards-compatible change, but there could be unintended consequences. As always, this would need plenty of discussion to see whether it is even a reasonable strategy to take.
>>
>> On Wed, Nov 13, 2013 at 2:45 AM, Daniel Micay <[email protected]> wrote:
>>
>>> Before getting into the gritty details of why I think we should consider a path away from M:N scheduling, I'll go over the details of the concurrency model we currently use.
>>>
>>> Rust uses a user-mode scheduler to cooperatively schedule many tasks onto OS threads. Due to the lack of preemption, tasks need to manually yield control back to the scheduler. Performing I/O with the standard library will block the *task*, but yield control back to the scheduler until the I/O is completed.
>>>
>>> The scheduler manages a thread pool where the unit of work is a task, rather than a queue of closures to be executed or data to be passed to a function. A task consists of a stack, a register context, and task-local storage, much like an OS thread.
>>>
>>> In the world of high-performance computing, this is a proven model for maximizing throughput of CPU-bound tasks. By abandoning preemption, there's zero overhead from context switches. For socket servers with only negligible server-side computation, the avoidance of context switching is a boon for scalability and predictable performance.
>>>
>>> # Lightweight?
>>>
>>> Rust's tasks are often called *lightweight*, but at least on Linux the only optimization is the lack of preemption. Since segmented stacks have been dropped, the resident/virtual memory usage will be identical.
>>>
>>> # Spawning performance
>>>
>>> An OS thread can actually spawn nearly as fast as a Rust task on a system with one CPU. On a multi-core system, there's a high chance of the new thread being spawned on a different CPU, resulting in a performance loss.
>>>
>>> Sample C program, if you need to see it to believe it:
>>>
>>> ```
>>> #include <pthread.h>
>>> #include <stddef.h>
>>>
>>> static const size_t n_thread = 100000;
>>>
>>> /* Trivial thread body: returns immediately. */
>>> static void *foo(void *arg) {
>>>     return arg;
>>> }
>>>
>>> int main(void) {
>>>     for (size_t i = 0; i < n_thread; i++) {
>>>         /* pthread calls return 0 on success and a positive error
>>>            code on failure, so test against 0 rather than < 0. */
>>>         pthread_attr_t attr;
>>>         if (pthread_attr_init(&attr) != 0) {
>>>             return 1;
>>>         }
>>>         /* Detached threads release their resources as soon as they exit. */
>>>         if (pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED) != 0) {
>>>             return 1;
>>>         }
>>>         pthread_t thread;
>>>         if (pthread_create(&thread, &attr, foo, NULL) != 0) {
>>>             return 1;
>>>         }
>>>     }
>>>     /* Exit the main thread without tearing down the whole process. */
>>>     pthread_exit(NULL);
>>> }
>>> ```
>>>
>>> Sample Rust program:
>>>
>>> ```
>>> fn main() {
>>>     for _ in range(0, 100000) {
>>>         do spawn {
>>>         }
>>>     }
>>> }
>>> ```
>>>
>>> For both programs, I get around 0.9s consistently when pinned to a core. The Rust version drops to 1.1s when not pinned, and the OS thread one to about 2s. The Rust version drops further when asked to allocate 8MiB stacks like the C one is doing, and will drop more when it has to make `mmap` and `mprotect` calls like the pthread API.
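>>>
>>> (For anyone reproducing the pinned numbers, here's a minimal sketch using the Linux-specific `sched_setaffinity`; running either benchmark under `taskset -c 0` accomplishes the same thing.)
>>>
>>> ```
>>> #define _GNU_SOURCE
>>> #include <sched.h>
>>> #include <stdio.h>
>>>
>>> int main(void) {
>>>     /* Pin the whole process to CPU 0 before spawning anything, so
>>>        every new thread is scheduled on the same core. */
>>>     cpu_set_t set;
>>>     CPU_ZERO(&set);
>>>     CPU_SET(0, &set);
>>>     if (sched_setaffinity(0, sizeof set, &set) != 0) {
>>>         perror("sched_setaffinity");
>>>         return 1;
>>>     }
>>>     /* ... run the spawn loop from the benchmark above ... */
>>>     return 0;
>>> }
>>> ```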
>>>
>>> # Asynchronous I/O
>>>
>>> Rust's requirements for asynchronous I/O would be filled well by direct usage of IOCP on Windows. However, Linux only has solid support for non-blocking sockets, because file operations usually just retrieve a result from the cache and do not truly have to block. This results in libuv being significantly slower than blocking I/O in most common cases, for the sake of scalable socket servers.
>>>
>>> On modern systems with flash memory, including mobile, there is a *consistent* and relatively small worst-case latency for accessing data on disk, so blocking is essentially a non-issue. Memory-mapped I/O is also an incredibly important feature for I/O performance, and there's almost no reason to use traditional I/O on 64-bit. However, it's a no-go with M:N scheduling, because the page faults block the whole scheduler thread.
>>>
>>> # Overview
>>>
>>> Advantages:
>>>
>>> * lack of preemptive/fair scheduling, leading to higher throughput
>>> * very fast context switches to other tasks on the same scheduler thread
>>>
>>> Disadvantages:
>>>
>>> * lack of preemptive/fair scheduling (a lower-level model)
>>> * poor profiler/debugger support
>>> * the async I/O stack is much slower for the common case; for example, `stat` is 35x slower when run in a loop for an mlocate-like utility
>>> * truly blocking code will still block a scheduler thread
>>> * most existing libraries use blocking I/O and OS threads
>>> * cannot directly use the fast, easy-to-use, linker-supported thread-local data (a C sketch follows below)
>>> * many existing libraries rely on thread-local storage, so there's a need to be wary of hidden yields in Rust function calls, and it's very difficult to expose a safe interface to these libraries
>>> * every CPU architecture revision that adds registers needs explicit context-switching support from Rust, and the right variant must be selected at runtime when not targeting a specific CPU (this is currently not done correctly)
>>>
>>> # User-mode scheduling
>>>
>>> Windows 7 introduced user-mode scheduling [1] to replace fibers on 64-bit. Google implemented the same thing for Linux (perhaps even before Windows 7 was released) and plans on pushing for it upstream. [2] The linked video does a better job of covering this than I can.
>>>
>>> User-mode scheduling provides a 1:1 threading model, including full support for normal thread-local data and existing debuggers/profilers. It can yield to the scheduler on system calls and page faults. The operating system is responsible for details like context switching, so a large maintenance/portability burden is dealt with. It narrows the disadvantage list above down to just the point about not having preemptive/fair scheduling, and it doesn't introduce any new ones.
>>>
>>> I hope this is where concurrency is headed, and I hope Rust doesn't miss this boat by concentrating too much on libuv. I think it would allow us to simply drop support for pseudo-blocking I/O in the Go style and ignore asynchronous I/O and non-blocking sockets in the standard library. It may be useful to have the scheduler use them, but it wouldn't be essential.
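>>>
>>> (To make the thread-local point above concrete, here's a minimal C sketch of the linker-supported `__thread` storage class. Under 1:1 scheduling each OS thread simply owns its own copy; under M:N scheduling, a task that migrates to another OS thread across a yield point would silently start reading a different copy.)
>>>
>>> ```
>>> #include <pthread.h>
>>> #include <stdio.h>
>>>
>>> /* Linker-supported TLS: each OS thread gets its own copy of this
>>>    variable, addressed through a segment register with no locking. */
>>> static __thread int counter = 0;
>>>
>>> static void *worker(void *arg) {
>>>     (void)arg;
>>>     for (int i = 0; i < 1000; i++)
>>>         counter++;              /* touches only this thread's copy */
>>>     printf("this thread's counter: %d\n", counter);
>>>     return NULL;
>>> }
>>>
>>> int main(void) {
>>>     pthread_t a, b;
>>>     pthread_create(&a, NULL, worker, NULL);
>>>     pthread_create(&b, NULL, worker, NULL);
>>>     pthread_join(a, NULL);
>>>     pthread_join(b, NULL);      /* both threads print 1000, not 2000 */
>>>     return 0;
>>> }
>>> ```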
>>>
>>> [1] http://msdn.microsoft.com/en-us/library/windows/desktop/dd627187(v=vs.85).aspx
>>> [2] http://www.youtube.com/watch?v=KXuZi9aeGTw

--
-Thad
+ThadGuidry <https://www.google.com/+ThadGuidry>
Thad on LinkedIn <http://www.linkedin.com/in/thadguidry/>
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev
