> On Aug 31, 2017, at 7:50 PM, Pierre Habouzit <[email protected]> wrote:
>
>> On Aug 31, 2017, at 11:35 AM, Joe Groff via swift-evolution
>> <[email protected]> wrote:
>>
>> The coroutine proposal as it stands essentially exposes raw delimited
>> continuations. While this is a flexible and expressive feature in the
>> abstract, for the concrete purpose of representing asynchronous coroutines,
>> it provides weak user-level guarantees about where their code might be
>> running after being resumed from suspension, and puts a lot of pressure on
>> APIs to be well-behaved in this respect. And if we're building toward
>> actors, where async actor methods should be guaranteed to run "in the
>> actor", I think we'll *need* something more than the bare-bones delimited
>> continuation approach to get there. I think the proposal's desire to keep
>> coroutines independent of a specific runtime model is a good idea, but I
>> also think there are a couple possible modifications we could add to the
>> design to make it easier to reason about what context things run in for any
>> runtime model that benefits from async/await:
>>
>> # Coroutine context
>>
>> Associating a context value with a coroutine would let us thread useful
>> information through the execution of the coroutine. This is particularly
>> useful for GCD, so you could attach a queue, QoS, and other attributes to
>> the coroutine, since these aren't reliably available from the global
>> environment. It could be a performance improvement even for things like
>> per-pthread queues, since coroutine context should be cheaper to access than
>> pthread_self.
>
>> [...]
>
>
> YES!
>
> We need that. You're very focused on performance and affinity and whatnot
> here, but knowing where the completion will run upfront is critical for
> priority inheritance purposes.
>
> This is exactly the spirit of the mail I just wrote in reply to Chris a bit
> earlier tonight. Execution context matters to the OS, a lot.
>
> The OS needs to know two things:
> - where is the precursor of this coroutine (which work is preventing the
> coroutine from executing)
> - where will the coroutine go (which for GCD is critical because the OS
> lazily attributes threads, so any typical OS primitive to raise an existing
> thread priority doesn't work)
>
> In other words, a coroutine needs:
> - various tags (QoS, logging context, ...)
> - precursors / reverse dependencies
> - where it will execute (whether it's a dispatch queue or a runloop is
> completely irrelevant though).
>
>
> And then if you do it that way when the precursor fires and allows for your
> coroutine to be scheduled, then it can actually schedule it right away on the
> right execution context and minimize context switches (which are way worse
> than shared mutable state for your performance).
>
>
>> # `onResume` hooks
>>
>> Relying on coroutine context alone still leaves responsibility wholly on
>> suspending APIs to pay attention to the coroutine context and schedule the
>> continuation correctly. You'd still have the expression problem when
>> coroutine-spawning APIs from one framework interact with suspending APIs
>> from another framework that doesn't understand the spawning framework's
>> desired scheduling policy. We could provide some defense against this by
>> letting the coroutine control its own resumption with an "onResume" hook,
>> which would run when a suspended continuation is invoked instead of
>> immediately resuming the coroutine.
>> That would let the coroutine-aware
>> dispatch_async example from above do something like this, to ensure the
>> continuation always ends up back on the correct queue:
>>
>> extension DispatchQueue {
>>   func `async`(_ body: () async -> ()) {
>>     dispatch_async(self, {
>>       beginAsync(
>>         context: self,
>>         body: { await body() },
>>         onResume: { continuation in
>>           // Defensively hop to the right queue
>>           dispatch_async(self, continuation)
>>         }
>>       )
>>     })
>>   }
>> }
>>
>> This would let spawning APIs provide a stronger guarantee that the spawned
>> coroutine is always executing as if scheduled by a specific
>> queue/actor/event loop/HWND/etc., even if later suspended by an async API
>> working in a different paradigm. This would also let you more strongly
>> associate a coroutine with a future object representing its completion:
>>
>> class CoroutineFuture<T> {
>>   enum State {
>>     case busy // currently running
>>     case suspended(() -> ()) // suspended
>>     case success(T) // completed with success
>>     case failure(Error) // completed with error
>>   }
>>
>>   var state: State = .busy
>>
>>   init(_ body: () async -> T) {
>>
>>     beginAsync(
>>       body: {
>>         do {
>>           self.state = .success(await body())
>>         } catch {
>>           self.state = .failure(error)
>>         }
>>       },
>>       onResume: { continuation in
>>         assert(self.state == .busy, "already running?!")
>>         self.state = .suspended(continuation)
>>       }
>>     )
>>   }
>>
>>   // Return the result of the future, or try to make progress computing it
>>   func poll() throws -> T? {
>>     switch state {
>>     case .busy:
>>       return nil
>>     case .suspended(let cont):
>>       cont()
>>       switch state {
>>       case .success(let value):
>>         return value
>>       case .failure(let error):
>>         throw error
>>       case .busy, .suspended:
>>         return nil
>>       }
>>     case .success(let value):
>>       return value
>>     case .failure(let error):
>>       throw error
>>     }
>>   }
>> }
>>
>>
>> A downside of this design is that it incurs some cost from defensive
>> rescheduling on the continuation side, and also prevents writing APIs that
>> intentionally change context across an `await`, like a theoretical
>> "goToMainThread()" function (though you could do that by spawning a
>> semantically-independent coroutine associated with the main thread, which
>> might be a better design anyway).
>
> Given the limitations, I'm very skeptical. Also in general
> suspending/resuming work is very difficult to handle for a runtime
> (implementation-wise), has large memory costs, and breaks priority inversion
> avoidance. dispatch_suspend()/dispatch_resume() is one of the banes of my
> existence when it comes to dispatch API surface. It only makes sense for
> dispatch sources, where "I don't want to receive these events anymore for a
> while" is a perfectly valid thing to say or do. But suspending a queue or
> work is ripping the carpet from under the feet of the OS as you just
> basically make all work that is depending on the suspended one invisible and
> impossible to reason about.

Sorry, I was using the term 'suspend' somewhat imprecisely. I was specifically referring to an operation that semantically pauses the coroutine and gives you its continuation closure, to be handed off as a completion handler or something of that sort, not something that would block the thread or suspend the queue. Execution would return back up to the non-async layer at the point this happens.

-Joe

>
> The proper way to do something akin to suspension is really to "fail" your
> operation with a "You need to redrive me later", or implement an event
> monitoring system inside the subsystem providing the Actor that wants
> suspension to have the client handle the redrive/monitoring, this way the
> priority relationship is established and the OS can reason about it. Said
> another way, the Actor should fail with an error that gives you some kind of
> "resume token" that the requestor can hold and redrive according to his own
> rules and in a way that it is clear he's the waiter. Most of the time
> suspension() is a waiting-on-behalf-of relationship and this is a bad thing
> to build (except in priority-homogeneous environments, which iOS/macOS are
> *not*).
>
> Also implementing the state you described requires more synchronization than
> you want to be useful: if you want to take action after observing a state,
> then you really really really don't want that state to change while you
> perform the consequence. The "on$Event" hook approach (which dispatch uses
> for dispatch sources e.g.) is much better because the ordering and
> serialization is provided by the actor itself. The only states that are valid
> to expose as a getter are states that you cannot go back from: success,
> failure, error, canceled are all perfectly fine states to expose as getters
> because they only change state once. .suspended/.busy is not such a thing.
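
The distinction drawn in the quoted paragraph, between one-way states that are safe to expose as a getter and transient ones that are better delivered through an "on$Event" hook, might look roughly like this. It is only a sketch: `OneShot`, `Outcome`, `onFinished` and `finish` are hypothetical names, not part of dispatch or the coroutine proposal, and it is single-threaded for brevity where a real implementation would rely on the owning actor or queue for ordering.

```swift
// Hypothetical type for illustration; not a dispatch or proposal API.
final class OneShot<T> {
    // Only states that can never be left again are modelled here.
    enum Outcome {
        case success(T)
        case failure(Error)
        case canceled
    }

    // Safe to read at any time: it goes from nil to a value exactly once.
    private(set) var outcome: Outcome?
    private var handlers: [(Outcome) -> Void] = []

    // "on$Event"-style hook: the object decides when the handler runs,
    // so callers never have to observe a transient busy/suspended state.
    func onFinished(_ handler: @escaping (Outcome) -> Void) {
        if let outcome = outcome {
            handler(outcome)
        } else {
            handlers.append(handler)
        }
    }

    // First call wins; later calls are ignored because the state is terminal.
    func finish(_ newOutcome: Outcome) {
        guard outcome == nil else { return }
        outcome = newOutcome
        let pending = handlers
        handlers.removeAll()
        for handler in pending { handler(newOutcome) }
    }
}

var seen: [String] = []
let op = OneShot<Int>()
op.onFinished { outcome in
    if case .success(let value) = outcome { seen.append("got \(value)") }
}
op.finish(.success(42))
op.finish(.canceled) // ignored: the outcome is already terminal
```

There is deliberately no `.busy` or `.suspended` case to read back, so a caller cannot act on a state that may already have changed.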
>
> FWIW dispatch sources, and more importantly dispatch mach channels (which is
> the private interface that is used to implement XPC Connections) have a
> design that tries really really really hard to not fall into any of these
> pitfalls, are priority inheritance friendly, execute on *distributed*
> execution contexts, and have a state machine exposed through "on$Event"
> callbacks. We should benefit from the many years of experience that are
> condensed in these implementations when thinking about Actors and the
> primitives they provide.
>
> -Pierre
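
For reference, the narrower sense of 'suspend' used in the reply above (pause the coroutine, hand its continuation to someone else as a completion handler, and return to the non-async caller without blocking anything) can be sketched with plain closures and no coroutine support at all. `ManualEvent`, `wait(then:)` and `fire()` are hypothetical stand-ins for a suspending API, and the "rest of the coroutine" is written out by hand as a closure.

```swift
// Hypothetical stand-in for a suspending API. It never blocks a thread or
// suspends a queue: it stores the continuation it is given and returns.
final class ManualEvent {
    private var waiters: [() -> Void] = []

    func wait(then continuation: @escaping () -> Void) {
        waiters.append(continuation)
    }

    // Whoever fires the event decides when the continuations run.
    func fire() {
        let pending = waiters
        waiters.removeAll()
        for continuation in pending { continuation() }
    }
}

var log: [String] = []
let event = ManualEvent()

func coroutineBody() {
    log.append("before suspend")
    // Everything after the suspension point becomes the continuation.
    event.wait {
        log.append("after resume")
    }
    // Returning here is the "suspend": control goes back up to the caller.
}

coroutineBody()
log.append("back in non-async caller")
event.fire()
// log is now ["before suspend", "back in non-async caller", "after resume"]
```

Nothing in this sketch says which queue or thread `fire()` runs the continuation on, which is the gap the coroutine context and `onResume` ideas in the quoted proposal are meant to close.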
_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
