> > There should be a defined ordering (or at least, some guarantees).
The execution order, which is *part of the contract*, is as follows:

1. Microtasks are executed first.
2. Then I/O events and OS signals are processed.
3. Then timer events are executed.
4. Only after that are fibers scheduled for execution.

In the current implementation, fibers are stored in *a queue without priorities* (this is not a random choice). During one cycle period, only one fiber is taken from the queue. This results in the following code (I've removed unnecessary details):

```c
do {
    execute_microtasks_handler();

    has_handles = execute_callbacks_handler(
        circular_buffer_is_not_empty(&ASYNC_G(deferred_resumes))
    );

    execute_microtasks_handler();

    bool was_executed = execute_next_fiber_handler();

    if (UNEXPECTED(
        false == has_handles
        && false == was_executed
        && zend_hash_num_elements(&ASYNC_G(fibers_state)) > 0
        && circular_buffer_is_empty(&ASYNC_G(deferred_resumes))
        && circular_buffer_is_empty(&ASYNC_G(microtasks))
        && resolve_deadlocks()
    )) {
        break;
    }
} while (zend_hash_num_elements(&ASYNC_G(fibers_state)) > 0
    || circular_buffer_is_not_empty(&ASYNC_G(microtasks))
    || reactor_loop_alive_fn()
);
```

Looking at the details, you will also notice that microtasks are executed twice, before and after event processing, because an event handler might enqueue a microtask, and the loop ensures that such code executes as early as possible.

The contract for the execution order of microtasks and events is important because it must be taken into account when developing event handlers. The concurrent iterator relies on this rule. However, making assumptions about *when* a fiber will be executed is *not* part of the contract, if only because this algorithm can be changed at any moment.

```php
// Execution is paused until the fiber completes
$result = Async\await($fiber); // immediately enter $fiber without queuing
```

So is it possible to change the execution order and optimize context switches? Yes, there are ways to do this.
However, it would require modifying the Fiber code, possibly in a significant way (I haven't explored this aspect in depth). But let's consider whether this would even be a good idea.

Suppose we have a web server. A single thread is handling five requests, and they all compete with each other because this is a typical application interacting with MySQL: in each Fiber, you send a query and want the result back as quickly as possible. In what case should we create a new coroutine within a request handler? Usually when we want to run something in the background while continuing to process the request and return a response as soon as possible. In this paradigm, it is beneficial to execute coroutines in the order they were enqueued. For other scenarios, it might be better for a child coroutine to execute immediately; in that case, those scenarios should be considered, and it may be worth introducing specific semantics for them.

> Won't php code behave exactly the same as it did before once enabling the scheduler?

Suppose we have a sleep() function. Normally, it calls php_sleep((unsigned int) num), which blocks the execution of the thread. But we need to add an alternative path:

```c
if (IN_ASYNC_CONTEXT) {
    async_wait_timeout((unsigned int) num * 1000, NULL);
    RETURN_LONG(0);
}
```

The IN_ASYNC_CONTEXT condition consists of two points:

- The current execution context is inside a *Fiber*.
- The *Scheduler* is active.

What's the difference? If the *Scheduler* is not active, calling sleep() will block the entire *Thread*, because without an event loop it simply cannot handle concurrency correctly. However, if the *Scheduler* is active, the code will set up handlers and return control to the "main loop", which will pick the next Fiber from the queue, and so on. This means that *without a Scheduler and Reactor, concurrent execution is impossible* (without additional effort).
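To illustrate what this asynchronous sleep() path implies in practice, here is a minimal sketch. This uses the RFC's proposed Async\async / Async\launchScheduler functions (shown in the example later in this thread) and cannot be run on stock PHP today; it only demonstrates the intended behavior, namely that sleep() suspends the calling Fiber rather than the whole thread, so two sleeping coroutines overlap:

```php
<?php
// Sketch against the proposed True Async API (not runnable on stock PHP).
$start = microtime(true);

Async\async(function () {
    sleep(1); // suspends only this Fiber, not the whole thread
    echo "first done\n";
});

Async\async(function () {
    sleep(1); // runs concurrently with the first coroutine
    echo "second done\n";
});

Async\launchScheduler();

// With an active Scheduler, both coroutines should complete in
// roughly one second total rather than two.
printf("elapsed: %.1fs\n", microtime(true) - $start);
```

Without a running Scheduler, the same code would take about two seconds, since each sleep() would block the thread.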
From the perspective of a PHP developer, if they are working with *AMPHP/Swoole*, nothing changes, because the code inside the if condition will *never execute* in their case.

Does this change the execution order inside a *Fiber*? No. If you had code working with *RabbitMQ sockets*, and you copied this code into a *Fiber* and then enabled concurrency, it would work exactly the same way. If the code used *blocking sockets*, the *Fiber* would yield control to the *Scheduler*. And if two such *Fibers* are running, they will start working with *RabbitMQ sequentially*. Of course, each *Fiber* should use a different socket.

The same applies to *CURL*. Do you have an existing module that sends requests to a service using *CURL* in a synchronous style? Just copy the code into a coroutine.

This means *almost* 98% transparency. Why *almost*? Because there might be nuances in *helper functions* and *internal states*. There may also be *differences in OS state management* or the *file system*, which could affect the final result.

> How will a library take advantage of this feature if it cannot be certain the scheduler is running or not? Do I need to write a library for async and another version for non-async? Or do all the async functions with this feature work without the scheduler running, or do they throw a catchable error?

This means that the launchScheduler() function should be called *only once* during the entire lifecycle of the application. If an error occurs and is not handled, the application should *terminate*. This is not a technical limitation but rather a *logical constraint*. If launchScheduler() were replaced with a CLI option, such as php --enable-scheduler, where the *Scheduler* is implicitly activated, then it would be like the *last line of code*: it must exist *only once*.

> Will this change the way OS signals are handled then? Will it break compatibility if a library uses pcntl traps and I'm using true async traps too?
> Note there are several different ways (timeout) signals are handled in PHP -- so if (per chance) the scheduler could always be running, maybe we can unify the way signals are handled in PHP.

Regarding this phrase in the RFC: it refers to the *window close event* on Windows, which provides a few seconds before the process is forcibly terminated.

There are signals intended for *application termination*, such as *SIGBREAK* or *CTRL-C*, which should typically be handled in *only one place* in the application. Developers are often tempted to install signal handlers in multiple locations, making the code dependent on the environment. But more importantly, this *should not happen at all*. *True Async* explicitly defines a *Flow* for emergency or unexpected application termination. Attempting to disrupt this *Flow* by adding a custom termination signal handler introduces *ambiguity*. There should be *only one* termination handler, and at the end of its execution it *must* call gracefulShutdown. As for *pcntl*, this will need to be tested.

> What if it never resumes at all?

If a *Fiber* is never resumed, it means the application has completely crashed with no way to recover :) The RFC has *two sections* dedicated to this issue: *Cancellation Operation* + *Graceful Shutdown*. If the application *terminates due to an unhandled exception*, *all Fibers must be executed*. Any *Fiber* can be canceled *at any time*, and there is *no need* to use *explicit Cancellation*, which I personally find an inconvenient pattern.

> The RFC doesn't mention the stack trace. Will it throw away any information about the inner exception?

This is literally an *"exception transfer"*. The stack trace will be exactly the same as if the exception were thrown at the call site. To be honest, I haven't had enough time to thoroughly test this.
Let's try it:

```php
<?php

Async\async(function() {
    echo "async function 1\n";

    Async\async(function() {
        echo "2\n";
        throw new Error("Error");
    });
});

echo "start\n";
try {
    Async\launchScheduler();
} catch (\Throwable $exception) {
    print_r($exception);
}
echo "end\n";
?>
```

This prints:

```
Error Object
(
    [message:protected] => Error
    [string:Error:private] =>
    [code:protected] => 0
    [file:protected] => async.php
    [line:protected] => 8
    [trace:Error:private] => Array
        (
            [0] => Array
                (
                    [function] => {closure:{closure:async.php:3}:6}
                    [args] => Array
                        (
                        )
                )
            [1] => Array
                (
                    [file] => async.php
                    [line] => 14
                    [function] => Async\launchScheduler
                    [args] => Array
                        (
                        )
                )
        )
    [previous:Error:private] =>
)
```

Seems perfectly correct.

> What will calling exit or die do?

I completely forgot about them! Well, of course, Swoole overrides them. This needs to be added to the TODO.

Why is this the case? For example, consider a *long-running* application where a *service* is a class that remains in memory continuously. The *web server* receives an HTTP request and starts a *Fiber* for each request. Each request has its own *User Session ID*. You want to call a service function, but you *don't* want to pass the *Session ID* every time, because there are also *5-10 other request-related variables*. However, you *cannot* simply store the *Session ID* in a class property, because *context switching is unpredictable*: at one moment you're handling *Request #1*, and a second later you're already processing *Request #2*.

When a *Fiber* creates another *Fiber*, it copies a *reference* to the *context object*, which has *minimal performance impact* while maintaining execution *environment consistency*.

*Closure variables work as expected*: they are pure *closures* with no modifications. I didn't mean that *True Async* breaks anything at the language level.
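As a sketch of how this context mechanism might look in code (the Async\Context class and its set()/get() methods here are hypothetical names invented for illustration, not the RFC's confirmed API):

```php
<?php
// Hypothetical context API; names are illustrative only.
Async\async(function () {
    // The request handler stores its Session ID in the current
    // Fiber's context instead of a global or class property.
    Async\Context::current()->set('session_id', 'abc123');

    Async\async(function () {
        // The child Fiber received a reference to the parent's
        // context, so the Session ID is visible here without being
        // passed explicitly through every function signature.
        $sessionId = Async\Context::current()->get('session_id');
        echo "handling request for session {$sessionId}\n";
    });
});

Async\launchScheduler();
```

Contrast this with keeping the Session ID in shared mutable state, which is exactly what unpredictable context switching makes unsafe.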
The issue is *logical*: you *cannot* use a *global variable* in two *Fibers*, modify it, read it, and expect its state to remain *consistent*.

> By this point we have covered FiberHandle, Resume, and Contexts. Now we have Futures? Can we simplify this to just Futures? Why do we need all these different ways to handle execution?

*Futures* and *Notifiers* are two different patterns:

- A *Future* changes its state *only once*.
- A *Notifier* generates *one or more* events.
- Internally, *Future* uses *Notifier*.

In the *RFC*, I mention that these are essentially *two APIs*:

- *High-level API*
- *Low-level API*

One of the open questions is whether both APIs should remain in *PHP-land*. The *low-level API* allows close interaction with the *event loop*, which might be useful if someone wants to write a *service* in PHP that requires this level of control. Additionally, this API helps *minimize Fiber context switches*, since its callbacks execute *without switching*. This is *both an advantage and a disadvantage*.

> It's also not clear what the value of most of these functions is.

Your comment made me think, especially in the context of anti-patterns. And I agree that it's better to remove unnecessary methods than to let programmers shoot themselves in the foot.

> As for the single producer method, I am not sure why you would use this.

Yes, in other languages there are no explicit restrictions. If the single-producer approach is indeed rarely used, then it's not such an important feature to include. However, I lack certainty about whether it's truly a rare case. On the other hand, these functions are inexpensive to implement and do not affect performance. They do have another drawback, though: they increase the number of behavioral variants in a single class, which seems a more significant disadvantage than the frequency of use.

> It isn't clear what happens when `trySend` fails. Is this an error or does nothing?
Yes, this is a documentation oversight. I'll add it to the TODO.

Thinking it through, there may be cases where `trySend` is valid. Code using `tryReceive` could be useful where a channel is used to implement a pool: suppose you need to retrieve an object from the pool, but if it's not available, you'd prefer to do something else (like throw an exception) rather than block the fiber. Overall, though, you're right: it's an antipattern. It's better to implement the pool as an explicit class and reserve channels for their classic use.

> Can you expand on what this means in the RFC? Why expose it if it shouldn't be used?

I answered a similar question above.

> I also noticed that you seem to be relying heavily on the current implementation to define behavior.

I love an iterative approach: prototype => RFC => prototype => RFC.

Thank you for the excellent remarks and analysis!

Ed.