[
https://issues.apache.org/jira/browse/FLINK-39914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chanhae Oh updated FLINK-39914:
-------------------------------
Description:
Execution.deploy() now creates TaskDeploymentDescriptor asynchronously, which
means the IO executor thread calls jobMasterMainThreadExecutor.execute() to
hand off the work.
The problem is that forMainThread() has an assertion inside execute() that
checks whether the caller is the test thread — and since it's being called from
the IO executor thread instead, this assertion fires and throws an
AssertionError.
This error then bubbles up through the CompletableFuture chain, flipping the
producer vertex into FAILED state, which causes an IllegalStateException when
the consumer vertex tries to read from it.
was:
With the async TaskDeploymentDescriptor creation introduced in
Execution.deploy(), the IO executor thread now calls
jobMasterMainThreadExecutor.execute() to enqueue the TaskDeploymentDescriptor
creation step.
However, the test was using
ComponentMainThreadExecutorServiceAdapter.forMainThread(), whose execute()
method asserts that the caller is the registered main thread (the test thread).
Since Java assertions (-ea) are enabled in the test JVM, this triggers an
AssertionError when the IO executor thread calls execute().
The error propagates through the CompletableFuture chain, causing the producer
vertex to transition to FAILED state, which subsequently leads to an
IllegalStateException when the consumer vertex attempts to consume partitions
from the failed producer.
> Fix flaky TaskDeploymentDescriptorFactoryTest#testHybridVertexFinish caused
> by async TDD creation
> -------------------------------------------------------------------------------------------------
>
> Key: FLINK-39914
> URL: https://issues.apache.org/jira/browse/FLINK-39914
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Task
> Affects Versions: 2.3.0
> Reporter: Chanhae Oh
> Priority: Minor
> Fix For: 2.3.0
>
>
> Execution.deploy() now creates TaskDeploymentDescriptor asynchronously, which
> means the IO executor thread calls jobMasterMainThreadExecutor.execute() to
> hand off the work.
> The problem is that forMainThread() has an assertion inside execute() that
> checks whether the caller is the test thread — and since it's being called
> from the IO executor thread instead, this assertion fires and throws an
> AssertionError.
> This error then bubbles up through the CompletableFuture chain, flipping the
> producer vertex into FAILED state, which causes an IllegalStateException when
> the consumer vertex tries to read from it.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)