The problem is not exactly the stack itself, but the memory it holds. Let me 
illustrate with a multithreaded NimGo version:
    
    
    #[
       Please note that the use of a Coroutine here is pointless and only serves
       as a demonstration. In this example, the number of pending coroutines is
       limited.
    ]#
    
    
    proc myCorofunction() =
        ## main thread
        var myString = "Hello"
        var myRef = RefObject(data: mydata)
        suspendUntilLater()
        ## Maybe in another thread now?
        echo myString #-> this is unsafe!
        echo repr(myRef) #-> this is unsafe!
    
    proc userMainFunction() =
        var coro = newCoroutine(myCorofunction)
        # ASM jump to a stack that contains only a pointer to myCorofunction
        resume(coro)
    
    proc suspendUntilLater() =
        var coro = getThreadLocalCurrentCoroutine()
        GlobalDistpatcher.pendingQueue.push coro
        suspend(coro) #-> ASM jump to the parent stack
    
    proc dispatcherLoop() =
         ## Work stealing implementation
         while true:
            var coroStolen = GlobalDistpatcher.pendingQueue.tryPop()
            # ASM jump to another stack that was created in another thread
            resume(coroStolen)
    
    var myUserThread: Thread[void]
    var myWorkerThreads: array[4, Thread[void]]
    for i in 0..high(myWorkerThreads):
        createThread(myWorkerThreads[i], dispatcherLoop)
    createThread(myUserThread, userMainFunction)
    joinThread(myUserThread)
    joinThreads(myWorkerThreads)
    
    

Is the code above safe with ARC/ORC? I don't think so (in fact, I don't even 
understand why the single-threaded version is safe, but I have not used 
valgrind or a similar tool to check for leaks, so...).

Furthermore, is a work-stealing implementation appropriate when there can be 
both CPU-bound and I/O-bound tasks? I/O-bound tasks suspend often but are 
frequent, whereas a CPU-bound task almost never suspends. So a work-stealing 
implementation would starve the event loop as soon as there are several 
CPU-bound tasks. We could reserve threads for I/O, but that would mean either 
idle threads or a speed close to that of a single-threaded I/O dispatcher. It 
would also require distinguishing I/O-bound coroutines from CPU-bound ones, 
which seems almost impossible.
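
To make the starvation concern concrete, here is a small single-threaded simulation. It is only a sketch, not NimGo code. It assumes a shared FIFO ready queue, workers that never preempt a running task, and a made-up cost model in which each task has a `burst` (the number of ticks it runs before suspending). The names `Task`, `burst` and `startTimes` are hypothetical and exist only for this illustration.

```nim
type
  Task = object
    name: string
    burst: int  # ticks the task runs before it suspends (assumed cost model)

proc startTimes(workers: int, tasks: seq[Task]): seq[int] =
  ## Returns the tick at which each task first gets a worker, assuming a
  ## shared FIFO queue and cooperative (non-preemptive) workers.
  doAssert workers > 0
  var freeAt = newSeq[int](workers)  # tick at which each worker becomes idle
  for task in tasks:
    # Hand the task to the worker that becomes idle first.
    var best = 0
    for w in 1 ..< workers:
      if freeAt[w] < freeAt[best]:
        best = w
    result.add freeAt[best]
    freeAt[best] += task.burst

when isMainModule:
  let cpu = Task(name: "cpu", burst: 1000)
  let io = Task(name: "io", burst: 1)
  # Four workers, four CPU-bound tasks queued ahead of one I/O-bound task.
  echo startTimes(4, @[cpu, cpu, cpu, cpu, io])  # @[0, 0, 0, 0, 1000]
  # With one worker left free, the I/O-bound task starts immediately.
  echo startTimes(4, @[cpu, cpu, cpu, io])       # @[0, 0, 0, 0]
```

Under these assumptions, the I/O-bound task waits a full CPU burst as soon as the number of CPU-bound tasks reaches the number of workers. That is the starvation described above.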
