Stack management for green tasks has historically been based first on 
segmented stacks and later on standard large stacks.

However, I just realized that there is a third alternative which might well be 
better than both of those.

The idea is very simple: a green task would run on a large stack like now, but 
when it is descheduled, a memory buffer of the size of its used stack space is 
allocated, and its stack data is copied there; when it is rescheduled, the 
stack data is copied back to the original address.
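To make the idea concrete, here is a minimal sketch in C of the copy-out /
copy-in step, assuming the scheduler knows the base address and size of the
used stack region (the names saved_stack_t, task_deschedule and
task_reschedule are invented for illustration, not part of any runtime):

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    void  *base;   /* address of the used region on the large stack */
    size_t used;   /* number of bytes actually in use */
    void  *copy;   /* heap buffer holding the saved stack data */
} saved_stack_t;

/* On deschedule: allocate a buffer of exactly the used size and copy
 * the live stack data out, freeing the large stack for another task. */
static int task_deschedule(saved_stack_t *s, void *base, size_t used) {
    s->copy = malloc(used);
    if (!s->copy)
        return -1;
    memcpy(s->copy, base, used);
    s->base = base;
    s->used = used;
    return 0;
}

/* On reschedule: copy the data back to its original address, so all
 * pointers into the stack remain valid, then free the buffer. */
static void task_reschedule(saved_stack_t *s) {
    memcpy(s->base, s->copy, s->used);
    free(s->copy);
    s->copy = NULL;
}
```

The key point is that the data goes back to the exact same address, which is
why pointers into the stack survive the round trip.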

The copying can be performed either by memcpy (perhaps a specialized 
memcpy that assumes alignment and always uses SIMD instructions to copy 
with no checks), or perhaps by a compression algorithm designed for 
maximum speed, such as Snappy or maybe a new ad-hoc design.
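The specialized memcpy could look something like this sketch (my own
illustration: stack_copy64 is a hypothetical helper that assumes both
pointers are 64-byte aligned and the size is a multiple of 64, so there is
no tail handling and the compiler can emit full-width SIMD moves for each
fixed-size chunk):

```c
#include <stddef.h>
#include <string.h>

/* Copy n bytes in fixed 64-byte chunks.  The fixed-size memcpy is a
 * known compiler idiom: it is inlined as straight-line vector loads
 * and stores, with no size checks in the loop body. */
static void stack_copy64(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i += 64)
        memcpy(d + i, s + i, 64);
}
```

The alignment and size assumptions would be guaranteed by the stack
allocator, which is what lets the checks disappear.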

This gives the best possible memory efficiency and thus the maximum 
possible number of tasks (even better than segmented stacks, due to 
precise sizing and optional compression), while adding no overhead to 
code that does not block, either in code generation complexity or in 
runtime cost.

There is of course an additional runtime cost when blocking, but blocking 
already often involves both system calls and context switching, and workloads 
with lots of blocking green tasks are probably I/O bound anyway; furthermore, 
stack data is probably in the CPU cache, so there shouldn't be much penalty.

The only significant drawback is that two green tasks running from the 
same "large" stack can never run concurrently (note that stacks cannot 
simply be swapped between tasks, as that would invalidate borrowed 
pointers on the stack that point into the stack); however, by making 
the number of large stacks a large enough multiple of the number of 
CPU cores, the probability of reduced parallelism due to this issue 
can be made as small as desired.
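The "large enough multiple" claim can be sanity-checked with a
birthday-problem estimate (a simplified model I am adding here, not part of
the original argument: it assumes each runnable task is pinned to one of S
stacks uniformly at random):

```c
/* Probability that, among 'tasks' runnable tasks each pinned to one of
 * 'stacks' large stacks uniformly at random, at least two share a
 * stack (and so cannot run in parallel).  Classic birthday problem:
 * P(collision) = 1 - prod_{i=0}^{tasks-1} (1 - i/stacks). */
static double collision_prob(int tasks, int stacks) {
    double p_all_distinct = 1.0;
    for (int i = 0; i < tasks; i++)
        p_all_distinct *= 1.0 - (double)i / stacks;
    return 1.0 - p_all_distinct;
}
```

For 8 runnable tasks, going from 64 stacks to 8192 stacks drops the
collision probability from tens of percent to well under 1%, which is the
sense in which the loss of parallelism can be made as small as desired.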

In general, one would use an adaptive approach, where this kind of copying 
kicks in only once the number of green tasks becomes large; while not 
copying, one would just optionally use madvise(MADV_DONTNEED) to trim 
unused whole pages of the stack.

Also, if the data to be copied is large (several OS pages), one may use 
mremap() or similar facilities to move the address space mappings instead of 
the data itself.

Some unsafe code that assumes that tasks can access each other's stacks will 
break (probably only the new Mutex implementation), but that can be fixed by 
putting the data in the task structure instead of the stack.

Overall, this should allow writing something like a parallel network server or 
web crawler in Rust in a natural blocking style, while being fully confident in 
its scalability, which is something that is not really possible at the moment.

Once this is in place, the best practices for Rust programs would become:
1. Use native tasks for tasks whose number is bounded at compile time and small
2. Use green tasks for tasks whose number is unbounded or large

One could even automate this to some extent by, by default, spawning a 
native task the first C times a given proc() code address is used to start 
a task (where C = number of CPU cores), and green tasks afterwards.
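That heuristic could be sketched as follows (entirely hypothetical: the
fixed-size table, the names, and the hard-coded core count are all invented
for illustration):

```c
#include <stddef.h>

enum task_kind { TASK_NATIVE, TASK_GREEN };

#define MAX_PROCS 64

/* Per-proc spawn counters, keyed by the proc's code address. */
static struct { void *proc; int count; } spawn_table[MAX_PROCS];
static int ncpus = 4;   /* assume C = 4 cores for this sketch */

/* First C spawns of a given proc address get a native task; later
 * spawns (an unbounded or large population) get green tasks. */
static enum task_kind choose_task_kind(void *proc_addr) {
    for (int i = 0; i < MAX_PROCS; i++) {
        if (spawn_table[i].proc == proc_addr)
            return ++spawn_table[i].count <= ncpus ? TASK_NATIVE
                                                   : TASK_GREEN;
        if (spawn_table[i].proc == NULL) {
            spawn_table[i].proc = proc_addr;
            spawn_table[i].count = 1;
            return TASK_NATIVE;
        }
    }
    return TASK_GREEN;   /* table full: fall back to green */
}
```

A real runtime would want a proper hash table and atomic counters, but the
shape of the policy is the same.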

What do you think?
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev
