zentol commented on a change in pull request #15945:
URL: https://github.com/apache/flink/pull/15945#discussion_r636142237
##########
File path:
flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/DefaultSchedulerTest.java
##########
@@ -1131,6 +1131,51 @@ public void testExceptionHistoryWithRestartableFailure()
{
failingException,
updateStateTriggeringJobFailureTimestamp)));
}
+ @Test
+ public void testExceptionHistoryWithPreDeployFailure() {
+ final JobGraph jobGraph = singleNonParallelJobVertexJobGraph();
+
+ // disable auto-completing slot requests to simulate timeout
+ executionSlotAllocatorFactory
+ .getTestExecutionSlotAllocator()
+ .disableAutoCompletePendingRequests();
+ final DefaultScheduler scheduler = createSchedulerAndStartScheduling(jobGraph);
+
+ executionSlotAllocatorFactory.getTestExecutionSlotAllocator().timeoutPendingRequests();
+
+ final ArchivedExecutionVertex taskFailureExecutionVertex =
+ Iterables.getOnlyElement(
+ scheduler
+ .requestJob()
+ .getArchivedExecutionGraph()
+ .getAllExecutionVertices());
+
+ // pending slot request timeout triggers a NoResourceAvailableException
+ final NoResourceAvailableException noResourceAvailableException =
+ new NoResourceAvailableException();
+ final long updateStateTriggeringRestartTimestamp =
+ initiateFailure(
Review comment:
Why do we need to initiate the failure? Shouldn't this have already
happened when the pending requests timed out?
##########
File path:
flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/DefaultSchedulerTest.java
##########
@@ -1131,6 +1131,51 @@ public void testExceptionHistoryWithRestartableFailure()
{
failingException,
updateStateTriggeringJobFailureTimestamp)));
}
+ @Test
+ public void testExceptionHistoryWithPreDeployFailure() {
+ final JobGraph jobGraph = singleNonParallelJobVertexJobGraph();
+
+ // disable auto-completing slot requests to simulate timeout
+ executionSlotAllocatorFactory
+ .getTestExecutionSlotAllocator()
+ .disableAutoCompletePendingRequests();
+ final DefaultScheduler scheduler = createSchedulerAndStartScheduling(jobGraph);
+
+ executionSlotAllocatorFactory.getTestExecutionSlotAllocator().timeoutPendingRequests();
+
+ final ArchivedExecutionVertex taskFailureExecutionVertex =
+ Iterables.getOnlyElement(
+ scheduler
+ .requestJob()
+ .getArchivedExecutionGraph()
+ .getAllExecutionVertices());
+
+ // pending slot request timeout triggers a NoResourceAvailableException
+ final NoResourceAvailableException noResourceAvailableException =
+ new NoResourceAvailableException();
+ final long updateStateTriggeringRestartTimestamp =
+ initiateFailure(
+ scheduler,
+ taskFailureExecutionVertex.getCurrentExecutionAttempt().getAttemptId(),
+ noResourceAvailableException);
+ taskRestartExecutor.triggerNonPeriodicScheduledTask();
+
+ // the TaskManagerLocation of the failed ExecutionVertex is still not set
Review comment:
```suggestion
// sanity check that the TaskManagerLocation of the failed task is indeed null, as expected
```