@mratsim Your article about multithreading flavors is great. I have read it multiple times over the last few months, and your knowledge of taskpools is far beyond mine.
But I think my use of terms was confusing. When I talked about M:N difficulties, I meant the difficulty of using stackful coroutines inside an M:N multithreaded environment. Moving a closure to another thread isn't difficult. Moving a closure that holds GC memory (like strings) under ORC is more challenging, but no doubt achievable. Moving a started stackful coroutine is a more complex problem IMHO. Moving a started coroutine can be (is certainly?) unsafe, because it can contain GC memory, which is hard to track. Using only stack variables and managed memory inside a coroutine that is moved between threads is safe (depending on how it is done) but not practical. And because stackful coroutines are long-lived, there is IMHO a risk of load imbalance if they are never moved between threads.
That's probably why Go chose its own calling convention for its functions: to make GC safety possible (with all the drawbacks you mentioned, notably the difficulty and overhead of calling C). However, I haven't done much research on how the GC can handle a shared heap stack or how unsafe that is. I stopped when I realised that closures could be passed safely without std/tasks and that the compilation cost of std/tasks was too high (long compilation times and large binaries are among the drawbacks of std/asyncdispatch I tried to avoid, even if this is a small concern).
Excuse me if I misuse the term M:N or mix up some concepts; I am not entirely familiar with them. Go's GC certainly has a very different policy for tracking coroutines' stacks, and bringing that to Nim could mean changing the way GC memory is tracked, which is clearly not worth it. I don't know the internals well enough to make firm claims. That's why I think stackless coroutines are a better fit for an M:N environment: passing a continuation is much easier and safer. (Even if I am not a fan of this kind of programming, I must point out its merits!)
However, simply not moving started coroutines could be a viable option, and not so hard to implement. As for implementation, taskpools seems great and has a very straightforward API; that's solid work indeed! Dispatching a coroutine is not very difficult: for my single-threaded dispatcher I took some inspiration from std/asyncdispatch, and reusing the same concepts was very straightforward (to simplify: callback -> coroutine, calling a callback -> resuming a coroutine). I chose to build strong cancellation (timeout) support into my API, but it could have been achieved just as well with a callback dispatcher.
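To make the "calling a callback -> resuming a coroutine" mapping concrete, here is a minimal sketch of a single-threaded dispatcher with timeout-based cancellation. It is written in Python with generators rather than in Nim, and it is not the actual dispatcher discussed above: `Dispatcher`, `Cancelled`, `worker` and `demo` are hypothetical names, and the virtual clock is an assumption made only to keep the example deterministic.

```python
import heapq
from itertools import count


class Cancelled(Exception):
    """Raised inside a coroutine when its deadline expires (hypothetical name)."""


class Dispatcher:
    """Single-threaded dispatcher: where a callback dispatcher would call a
    callback, this one resumes a generator-based coroutine."""

    def __init__(self):
        self.now = 0.0        # virtual clock (assumption: no real sleeping)
        self._timers = []     # heap of (wake_time, seq, coroutine, deadline)
        self._seq = count()   # tie-breaker so the heap never compares generators

    def spawn(self, coro, timeout=None):
        deadline = None if timeout is None else self.now + timeout
        self._schedule(self.now, coro, deadline)

    def _schedule(self, wake, coro, deadline):
        heapq.heappush(self._timers, (wake, next(self._seq), coro, deadline))

    def run(self):
        while self._timers:
            wake, _, coro, deadline = heapq.heappop(self._timers)
            self.now = max(self.now, wake)
            try:
                if deadline is not None and self.now >= deadline:
                    # Timeout: cancel by raising inside the suspended coroutine.
                    deadline = None  # cancel only once, so cleanup code can run
                    delay = coro.throw(Cancelled())
                else:
                    # Normal case: resume the coroutine until its next yield.
                    delay = next(coro)
            except (StopIteration, Cancelled):
                continue  # the coroutine finished or was cancelled
            wake = self.now + delay
            if deadline is not None:
                wake = min(wake, deadline)  # never sleep past the deadline
            self._schedule(wake, coro, deadline)


def worker(name, steps, pause, log):
    """Example coroutine: each yield asks the dispatcher to sleep `pause`."""
    try:
        for i in range(steps):
            log.append((name, i))
            yield pause
        log.append((name, "done"))
    except Cancelled:
        log.append((name, "cancelled"))


def demo():
    log = []
    d = Dispatcher()
    d.spawn(worker("fast", 2, 1.0, log))
    d.spawn(worker("slow", 5, 2.0, log), timeout=3.0)
    d.run()
    return log
```

In this sketch, `demo()` runs `fast` to completion, while `slow` is cancelled once its 3.0 timeout elapses on the virtual clock. The same timer heap could just as well store plain callbacks, which is why the cancellation support does not depend on coroutines as such.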