On Tue, Feb 2, 2021 at 10:35 PM Niklas Keller <m...@kelunik.com> wrote:
> Hey Nikita,
>
>> Thank you for the proposal. Ergonomics of async I/O in PHP certainly leave
>> something to be desired right now, and improvements in this area are
>> welcome.
>>
>> Despite your explanations in the RFC and this thread, I'm still having a
>> hard time understanding the purpose of the FiberScheduler.
>>
>> My current understanding is that the FiberScheduler is a special type of
>> fiber that cannot be explicitly scheduled by the user -- it is
>> automatically scheduled by Fiber::suspend() and automatically un-scheduled
>> by Fiber::resume() or Fiber::throw(). It's the fiber that runs between
>> fibers :) Does that sound accurate?
>
> Yes, that's accurate. Fibers are used for cooperative multi-tasking and
> there's usually a single scheduler responsible for the scheduling. Multiple
> schedulers would block each other or busy wait. So having multiple
> schedulers is strongly discouraged in long running applications, however,
> it might be acceptable in traditional applications, i.e. PHP-FPM. In
> PHP-FPM, multiple schedulers partially blocking each other is still better
> than blocking entirely for every I/O operation.
>
>> What's not clear to me is why the scheduling fiber needs to be
>> distinguished from other fibers. If we want to stick with the general
>> approach, why is Fiber::suspend($scheduler) not Fiber::transferTo($fiber),
>> where $fiber would be the fiber serving as scheduler (but otherwise a
>> normal Fiber)? I would expect that context-switching between arbitrary
>> fibers would be both most expressive, and make for the smallest interface.
>
> There are a few reasons to make a difference here:
>
> - SchedulerFibers are run to completion at script end, which isn't the
>   case for normal fibers.
> - Terminating fibers need a fiber to return to. For schedulers it's fine
>   if a resumed fiber terminates, for normal fibers it should be an exception
>   if the scheduler fiber terminates without explicitly resuming the suspended
>   fiber.
> - Keeping the previous fiber for each suspension point is complicated if
>   not impossible to get right and generally complicates the implementation
>   and cognitive load, see following example:
>
> main -> A -> B -> C -> A (terminates) -> C (previous) -> B (terminates) ->
> C (previous, terminates) -> main
>
> In the example above, the previous fiber linked list from C back to main
> needs to be optimized at some point, otherwise A and B need to be kept in
> memory and thus leak memory until C is resumed.
>
> I'm sure Aaron can present a few other reasons to keep the separation.
>

Thanks, I didn't consider the terminating fiber case here. But possibly a
combination of the two would work? That is, you generally have
Fiber::suspend() return to the parent fiber, and make the parent fiber
terminating while still having children an error condition. Additionally,
you provide something like Fiber::runAsParent() to allow "adopting" a fiber
under a new parent, that serves as scheduler (this is to cover the {main} ->
scheduler use case).

If you stick to the FiberScheduler concept, then you might want to consider
inverting the API. Right now you're basically using a standard Fiber API,
with the difference that suspend() accepts a FiberScheduler, which is
unintuitive to me. If Fibers require a scheduler anyway, why are the suspend
and resume methods not on the scheduler?

class FiberScheduler {
    function suspend(Fiber $fiber);
    function start(Fiber $fiber);
    function resume(Fiber $fiber);
}

Both methods are bound to the scheduler in that "suspend" suspends back to a
certain scheduler, while "resume" resumes a fiber such that it will return
back to this scheduler on termination. This also makes it more obvious that,
for example, it's not possible to just do a "$fiber->start()" without having
created a scheduler first (though it does not make it obvious that the call
has to be from within the scheduler).
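
To make the shape of that inverted API a bit more concrete, here is a rough
userland sketch. It is only an illustration under several assumptions: it is
written against the Fiber class as it later shipped in PHP 8.1 (which has no
FiberScheduler), the names SketchScheduler, run() and makeTask() are
hypothetical, and the scheduler here is a plain object driven from {main}
rather than a fiber of its own. It shows the call shape, not the RFC's
semantics.

```php
<?php
// Hypothetical sketch: suspend/start/resume live on the scheduler object.
final class SketchScheduler
{
    /** Fibers waiting to be started or resumed by run(). */
    private SplQueue $ready;

    public function __construct()
    {
        $this->ready = new SplQueue();
    }

    /** Queue a fiber that has not been started yet. */
    public function start(Fiber $fiber): void
    {
        $this->ready->enqueue($fiber);
    }

    /** Queue a suspended fiber so the next run() resumes it. */
    public function resume(Fiber $fiber): void
    {
        $this->ready->enqueue($fiber);
    }

    /** Must be called from inside $fiber; hands control back to run(). */
    public function suspend(Fiber $fiber): void
    {
        if (Fiber::getCurrent() !== $fiber) {
            throw new LogicException('suspend() must be called from inside the given fiber');
        }
        Fiber::suspend();
    }

    /** Drain the ready queue; every fiber returns here when it suspends or ends. */
    public function run(): void
    {
        while (!$this->ready->isEmpty()) {
            $fiber = $this->ready->dequeue();
            if (!$fiber->isStarted()) {
                $fiber->start();
            } elseif ($fiber->isSuspended()) {
                $fiber->resume();
            }
        }
    }
}

// Hypothetical helper: builds a task that suspends once via the scheduler.
function makeTask(string $name, SketchScheduler $scheduler, ArrayObject $log): Closure
{
    return function () use ($name, $scheduler, $log): void {
        $log->append("$name: before suspend");
        $scheduler->suspend(Fiber::getCurrent());
        $log->append("$name: after resume");
    };
}

$log = new ArrayObject();
$scheduler = new SketchScheduler();
$a = new Fiber(makeTask('a', $scheduler, $log));
$b = new Fiber(makeTask('b', $scheduler, $log));

$scheduler->start($a);
$scheduler->start($b);
$scheduler->run();       // both fibers run up to their suspension point

$scheduler->resume($b);  // the scheduler decides the resume order
$scheduler->resume($a);
$scheduler->run();       // both fibers run to completion
```

In this sketch a fiber can only get going through a scheduler object, which
is the property the inverted API is meant to make obvious. It does not
address what should happen when the scheduler itself terminates while fibers
are still suspended.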
Regards,
Nikita

>> The more limited alternative is to instead have Fiber::suspend() return to
>> the parent fiber (the one that resume()d it). Here, the parent fiber
>> effectively becomes the scheduler fiber. If I understand right, the reason
>> why you don't want to use that approach, is that it doesn't allow you to
>> call some AMP function from the {main} fiber, create the scheduler there
>> and then treat {main} just like any other fiber. Is that correct?
>
> Correct, this wouldn't allow top-level Fiber::suspend(). It would also
> make the starting / previously resuming party responsible for resuming the
> fiber instead of the fiber being able to "choose" the scheduler for a
> specific suspension point. One fiber would thus be effectively limited to a
> single scheduler.
>
> Best,
> Niklas