In late September, I wrote Taskpools (<https://github.com/status-im/nim-taskpools>), a threadpool library with the following goals:
* lightweight: the threadpool depends on few moving parts, and unused memory is reclaimed.
* energy-efficient: unused threads are parked to save on power consumption.
* easily auditable and maintainable: Taskpools is used in blockchain, where correctness and maintainability are of utmost importance as reputation and money are on the line.

The library has been running 24/7 for the last couple of months on our Ethereum fleet in test networks, and on some nodes in production.

Example usage: <https://github.com/status-im/nim-taskpools/blob/26e3b1e/examples/e01_simple_tasks.nim>

```nim
import ../taskpools/taskpools

block: # Async without result

  proc display_int(x: int) =
    stdout.write(x)
    stdout.write(" - SUCCESS\n")

  proc main() =
    echo "\nSanity check 1: Printing 123456 654321 in parallel"

    var tp = Taskpool.new(numThreads = 4)
    tp.spawn display_int(123456)
    tp.spawn display_int(654321)
    tp.shutdown()

  main()

block: # Async/Await

  var tp: Taskpool

  proc async_fib(n: int): int =
    if n < 2:
      return n

    let x = tp.spawn async_fib(n-1)
    let y = async_fib(n-2)

    result = sync(x) + y

  proc main2() =
    echo "\nSanity check 2: fib(20)"

    tp = Taskpool.new()
    let f = async_fib(20)
    tp.shutdown()

    doAssert f == 6765

  main2()
```

The primitives are:

* `Taskpool.new()`
* `tp.shutdown()`
* `let fut = tp.spawn fn(a, b, c)`
* `tp.sync(fut)` (~await)
* `tp.syncAll()` blocks until all tasks are done, in particular tasks that don't return anything
* `fut.isSpawned()`
* `fut.isReady()`

The main logic is really short, just under 500 lines of code: <https://github.com/status-im/nim-taskpools/blob/26e3b1e/taskpools/taskpools.nim>.

The threadpool is work-stealing based and lock-free, except for the logic that puts threads to sleep. The lock-free alternative, "eventcount", is quite error-prone: it would need formal verification to ensure no deadlocks are possible, and it would run contrary to the "easily auditable/maintainable" goal.

Compared to Weave, here are the main differences:

* Taskpools work-stealing is shared-memory based; Weave is message-passing based.
  This has the advantage that no cooperation is needed: if a thread is blocked (say on IO), other threads will always make progress. It has the disadvantage that advanced scheduling and load-balancing techniques, like stealing many tasks depending on perf indicators or adaptive loop splitting, are impossible (?).
* No parallelFor (data parallelism) and no deferred event scheduling/fine-grained dependencies (dataflow parallelism).
* Less efficient load balancing for very short tasks (~10µs) or splittable loops. Tasks in the 500µs range should reach the same performance.
* More overhead for very short tasks. Weave has an adaptive memory pool based on state-of-the-art memory allocators (Snmalloc and Mimalloc), while each task generates many allocations in Taskpools.

Note that despite all those perceived shortcomings, Taskpools should be a high-performing threadpool even compared to those of other languages, and especially compared to the one included in the standard library.

While it relies on the Nim 1.6 `std/tasks`, it has a compatibility shim to support Nim 1.2 and Nim 1.4 as well.
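
To round out the primitives listed above, here is a minimal sketch that combines a task returning a value with a fire-and-forget task, then uses `syncAll`, `isReady` and `sync`. The procs `square`, `report` and `runDemo` are hypothetical names for illustration. The sketch assumes the package is installed so that `import taskpools` resolves, that the program is compiled with `--threads:on`, and it uses the free-standing `sync(fut)` form from the example above.

```nim
# Compile with: nim c --threads:on demo.nim
import taskpools  # assumption: package installed, e.g. through nimble

proc square(x: int): int =
  ## Task with a result: spawning it yields a future.
  x * x

proc report(x: int) =
  ## Task without a result: it can only be awaited through syncAll.
  stdout.write($x & "\n")

proc runDemo(): int =
  var tp = Taskpool.new(numThreads = 2)

  let fut = tp.spawn square(7)   # future for the computed value
  tp.spawn report(42)            # fire-and-forget task

  tp.syncAll()                   # block until every spawned task has finished

  # Non-blocking check; the result should be available after syncAll.
  echo "ready: ", fut.isReady()

  result = sync(fut)             # ~await: retrieve the value
  tp.shutdown()

doAssert runDemo() == 49
```

`syncAll` is the only way to wait for tasks like `report` that return nothing, so calling it before `shutdown` is a reasonable habit when such tasks are in flight.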