On Tue, Feb 2, 2021 at 10:35 PM Niklas Keller <m...@kelunik.com> wrote:

> Hey Nikita,
>
>> Thank you for the proposal. Ergonomics of async I/O in PHP certainly leave
>> something to be desired right now, and improvements in this area are
>> welcome.
>>
>> Despite your explanations in the RFC and this thread, I'm still having a
>> hard time understanding the purpose of the FiberScheduler.
>>
>> My current understanding is that the FiberScheduler is a special type of
>> fiber that cannot be explicitly scheduled by the user -- it is
>> automatically scheduled by Fiber::suspend() and automatically un-scheduled
>> by Fiber::resume() or Fiber::throw(). It's the fiber that runs between
>> fibers :) Does that sound accurate?
>>
>
> Yes, that's accurate. Fibers are used for cooperative multi-tasking and
> there's usually a single scheduler responsible for the scheduling. Multiple
> schedulers would block each other or busy wait. So having multiple
> schedulers is strongly discouraged in long-running applications; however,
> it might be acceptable in traditional applications, i.e. PHP-FPM. In
> PHP-FPM, multiple schedulers partially blocking each other is still better
> than blocking entirely for every I/O operation.
>
>
>> What's not clear to me is why the scheduling fiber needs to be
>> distinguished from other fibers. If we want to stick with the general
>> approach, why is Fiber::suspend($scheduler) not Fiber::transferTo($fiber),
>> where $fiber would be the fiber serving as scheduler (but otherwise a
>> normal Fiber)? I would expect that context-switching between arbitrary
>> fibers would be both most expressive, and make for the smallest interface.
>>
>
> There are a few reasons to make a difference here:
>
> - Scheduler fibers are run to completion at script end, which isn't the
> case for normal fibers.
> - Terminating fibers need a fiber to return to. For schedulers it's fine
> if a resumed fiber terminates; for normal fibers it should be an exception
> if the scheduler fiber terminates without explicitly resuming the suspended
> fiber.
> - Keeping the previous fiber for each suspension point is complicated, if
> not impossible, to get right and generally complicates the implementation
> and adds cognitive load; see the following example:
>
> main -> A -> B -> C -> A (terminates) -> C (previous) -> B (terminates) ->
> C (previous, terminates) -> main
>
> In the example above, the previous fiber linked list from C back to main
> needs to be optimized at some point, otherwise A and B need to be kept in
> memory and thus leak memory until C is resumed.
>
> I'm sure Aaron can present a few other reasons to keep the separation.
>

Thanks, I didn't consider the terminating fiber case here. But possibly a
combination of the two would work? That is, you generally have
Fiber::suspend() return to the parent fiber, and make it an error condition
for the parent fiber to terminate while it still has children. Additionally,
you provide something like Fiber::runAsParent() to allow "adopting" a fiber
under a new parent that serves as scheduler (this is to cover the {main}
-> scheduler use case).
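
To make the "suspend returns to the parent" part concrete, here is a minimal
sketch. It assumes the Fiber class as it later shipped in PHP 8.1 (new
Fiber(), start(), resume(), Fiber::suspend()), not the RFC draft discussed in
this thread. The $log array and the fiber names are illustrative only, and the
hypothetical Fiber::runAsParent() and the "parent terminates with children"
error are not shown.

```php
<?php
// Assumption: PHP >= 8.1, where Fiber::suspend() hands control back to
// whichever context called start() or resume() on the suspending fiber.
$log = [];

$child = new Fiber(function () use (&$log): void {
    $log[] = 'child: started';
    // Control goes back to the parent fiber, which acts as the scheduler.
    $value = Fiber::suspend('first');
    $log[] = "child: resumed with $value";
});

$parent = new Fiber(function () use ($child, &$log): void {
    // start() returns once the child suspends (or terminates).
    $yielded = $child->start();
    $log[] = "parent: child suspended with $yielded";
    // resume() returns here when the child terminates.
    $child->resume('go');
    $log[] = 'parent: child terminated';
});

// {main} starts the parent; it returns once the parent fiber has finished.
$parent->start();
```

In this arrangement the child never names a scheduler: whoever resumed it is
the one it suspends back to.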

If you stick to the FiberScheduler concept, then you might want to consider
inverting the API. Right now you're basically using a standard Fiber API,
with the difference that suspend() accepts a FiberScheduler, which is
unintuitive to me. If Fibers require a scheduler anyway, why are the
suspend and resume methods not on the scheduler?

// API sketch only: method bodies omitted.
class FiberScheduler {
    public function suspend(Fiber $fiber);
    public function start(Fiber $fiber);
    public function resume(Fiber $fiber);
}

Both methods are bound to the scheduler in that "suspend" suspends back to
a certain scheduler, while "resume" resumes a fiber such that it will
return back to this scheduler on termination. This also makes it more
obvious that, for example, it's not possible to just do a "$fiber->start()"
without having created a scheduler first (though it does not make it
obvious that the call has to be from within the scheduler).
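
As a rough illustration of how such an inverted API might be used, the sketch
below layers a toy scheduler on top of the Fiber class that later shipped in
PHP 8.1. Everything here is an assumption for illustration: ToyScheduler, its
ownership check, and the reading of suspend(Fiber $fiber) as "suspend the
currently running fiber back to this scheduler" are hypothetical and not part
of the RFC.

```php
<?php
// Assumption: PHP >= 8.1 Fiber API. ToyScheduler is a hypothetical wrapper.
final class ToyScheduler
{
    /** Fibers started through this scheduler (weakly referenced). */
    private WeakMap $owned;

    public function __construct()
    {
        $this->owned = new WeakMap();
    }

    /** Start a fiber so that it suspends and terminates back to this scheduler. */
    public function start(Fiber $fiber): mixed
    {
        $this->owned[$fiber] = true;
        return $fiber->start();
    }

    /** Resume a fiber previously started by this scheduler. */
    public function resume(Fiber $fiber, mixed $value = null): mixed
    {
        if (!isset($this->owned[$fiber])) {
            throw new LogicException('Fiber was not started by this scheduler');
        }
        return $fiber->resume($value);
    }

    /** Suspend the running fiber; it must be $fiber and owned by this scheduler. */
    public function suspend(Fiber $fiber, mixed $value = null): mixed
    {
        if (Fiber::getCurrent() !== $fiber || !isset($this->owned[$fiber])) {
            throw new LogicException('Can only suspend the running, owned fiber');
        }
        return Fiber::suspend($value);
    }
}

$scheduler = new ToyScheduler();
$events = [];

$task = new Fiber(function () use ($scheduler, &$events): void {
    $events[] = 'task: before suspend';
    $scheduler->suspend(Fiber::getCurrent());
    $events[] = 'task: after resume';
});

$scheduler->start($task);   // returns when the task suspends
$events[] = 'scheduler: task suspended';
$scheduler->resume($task);  // returns when the task terminates
$events[] = 'scheduler: task terminated';
```

Because start(), resume() and suspend() all go through the scheduler object,
a fiber cannot be driven without one, which is the property the inverted API
is meant to make visible.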

Regards,
Nikita


>> The more limited alternative is to instead have Fiber::suspend() return to
>> the parent fiber (the one that resume()d it). Here, the parent fiber
>> effectively becomes the scheduler fiber. If I understand right, the reason
>> why you don't want to use that approach, is that it doesn't allow you to
>> call some AMP function from the {main} fiber, create the scheduler there
>> and then treat {main} just like any other fiber. Is that correct?
>>
>
> Correct, this wouldn't allow top-level Fiber::suspend(). It would also
> make the starting / previously resuming party responsible for resuming the
> fiber instead of the fiber being able to "choose" the scheduler for a
> specific suspension point. One fiber would thus be effectively limited to a
> single scheduler.
>
> Best,
> Niklas
>
