pnowojski commented on a change in pull request #10009: [FLINK-14304] Avoid
task starvation with mailbox
URL: https://github.com/apache/flink/pull/10009#discussion_r342974709
##########
File path:
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/mailbox/MailboxExecutor.java
##########
@@ -30,7 +27,44 @@
import java.util.concurrent.RejectedExecutionException;
/**
- * Interface for an {@link Executor} build around a {@link Mailbox}-based
execution model.
+ * Interface for an {@link Executor} build around a mailbox-based execution
model (see {@link TaskMailbox}). {@code
+ * MailboxExecutor} can also execute downstream messages of a mailbox by
yielding control from the task thread.
+ *
+ * <p>All submission functions can be called from any thread and will enqueue
the action for further processing in a
+ * FIFO fashion.
+ *
+ * <p>The yielding functions avoid the following situation: One operator
cannot fully process an input record and
+ * blocks the task thread until some resources are available. However, since
the introduction of the mailbox model
+ * blocking the task thread will not only block new inputs but also all events
from being processed. If the resources
+ * depend on downstream operators being able to process such events (e.g.,
timers), then we may easily arrive at some
+ * livelocks.
+ *
+ * <p>The yielding functions will only process events from the operator itself
and any downstream operator. Events of upstream
+ * operators are only processed when the input has been fully processed or if
they yield themselves. This method avoid
+ * congestion and potential deadlocks, but will process {@link Mail}s slightly
out-of-order, effectively creating a view
+ * on the mailbox that contains no message from upstream operators.
+ *
+ * <p><b>All yielding functions must be called in the mailbox thread</b> (see
{@link TaskMailbox#isMailboxThread()}) to not
+ * violate the single-threaded execution model. There are two typical cases,
both waiting until the resource is
+ * available. The main difference is if the resource becomes available through
a mailbox message itself or not.
+ *
+ * <p>If the resource becomes available through a mailbox mail, we can
effectively block the task thread.
+ * Implicitly, this requires the mail to be enqueued by a different thread.
+ * <pre>{@code
+ * while (resource not available) {
+ * mailboxExecutor.yield();
+ * }
+ * }</pre>
+ *
+ * <p>If the resource becomes available through an external mechanism or the
corresponding mail needs to be enqueued
+ * in the task thread, we cannot block.
+ * <pre>{@code
+ * while (resource not available) {
+ * if (!mailboxExecutor.tryYield()) {
+ * do stuff or sleep for a small amount of time
+ * }
+ * }
+ * }</pre>
*/
Review comment:
aren't we missing `@PublicEvolving` annotation here?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services