rkhachatryan commented on a change in pull request #16523:
URL: https://github.com/apache/flink/pull/16523#discussion_r677350866
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java
##########
@@ -1435,8 +1437,18 @@ private boolean transitionState(
getAttemptId(),
currentState,
targetState);
- } else {
- if (LOG.isInfoEnabled()) {
+ } else if (LOG.isInfoEnabled()) {
+ Optional<NoResourceAvailableException> noResourceException =
+ findThrowable(error,
NoResourceAvailableException.class);
+ if (noResourceException.isPresent()) {
+ LOG.info(
+ "{} ({}) switched from {} to {}",
+ getVertex().getTaskNameWithSubtaskIndex(),
+ getAttemptId(),
+ currentState,
+ targetState,
+ noResourceException.get());
+ } else {
Review comment:
I think it's still necessary because `NoResourceAvailableException` can
be wrapped in a `CompletionException`, and the latter carries its own
stacktrace:
```
11239 [flink-akka.actor.default-dispatcher-7] INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Custom
Source -> Map -> Sink: Unnamed (1/4) (f2065a556b8c1e6f9ae996644439b6e2)
switched from SCHEDULED to FAILED on [unassigned resource].
java.util.concurrent.CompletionException:
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException:
Could not acquire the minimum required resources.
at
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
~[?:1.8.0_271]
at
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
~[?:1.8.0_271]
at
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607)
~[?:1.8.0_271]
at
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
~[?:1.8.0_271]
at
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
~[?:1.8.0_271]
at
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
~[?:1.8.0_271]
at
org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge$PendingRequest.failRequest(DeclarativeSlotPoolBridge.java:545)
~[classes/:?]
at
org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge.cancelPendingRequests(DeclarativeSlotPoolBridge.java:127)
~[classes/:?]
at
org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge.failPendingRequests(DeclarativeSlotPoolBridge.java:355)
~[classes/:?]
at
org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge.notifyNotEnoughResourcesAvailable(DeclarativeSlotPoolBridge.java:344)
~[classes/:?]
at
org.apache.flink.runtime.jobmaster.JobMaster.notifyNotEnoughResourcesAvailable(JobMaster.java:800)
~[classes/:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
~[?:1.8.0_271]
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
~[?:1.8.0_271]
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[?:1.8.0_271]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_271]
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:301)
~[classes/:?]
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212)
~[classes/:?]
at
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
~[classes/:?]
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
~[classes/:?]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
[akka-actor_2.11-2.5.21.jar:2.5.21]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
[akka-actor_2.11-2.5.21.jar:2.5.21]
at
scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
[scala-library-2.11.12.jar:?]
at
akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
[akka-actor_2.11-2.5.21.jar:2.5.21]
at
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
[scala-library-2.11.12.jar:?]
at
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
[scala-library-2.11.12.jar:?]
at
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
[scala-library-2.11.12.jar:?]
at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
[akka-actor_2.11-2.5.21.jar:2.5.21]
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
[akka-actor_2.11-2.5.21.jar:2.5.21]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
[akka-actor_2.11-2.5.21.jar:2.5.21]
at akka.actor.ActorCell.invoke(ActorCell.scala:561)
[akka-actor_2.11-2.5.21.jar:2.5.21]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
[akka-actor_2.11-2.5.21.jar:2.5.21]
at akka.dispatch.Mailbox.run(Mailbox.scala:225)
[akka-actor_2.11-2.5.21.jar:2.5.21]
Caused by:
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException:
Could not acquire the minimum required resources.
11247 [flink-akka.actor.default-dispatcher-7] INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the
results produced by task execution f2065a556b8c1e6f9ae996644439b6e2.
11251 [flink-akka.actor.default-dispatcher-7] INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job Flink Streaming
Job (00f78e7db...
```
This stacktrace seems unnecessary too. WDYT?
After stripping the `CompletionException`, the transition is logged like this:
```
10957 [flink-akka.actor.default-dispatcher-2] INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Custom
Source -> Map -> Sink: Unnamed (1/4) (efc741037c36846d9baaecb1a0b5b346)
switched from SCHEDULED to FAILED
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException:
Could not acquire the minimum required resources.
10963 [flink-akka.actor.default-dispatcher-2] INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the
results produced by
```
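To illustrate where the wrapper comes from, here is a minimal plain-JDK sketch (not the code in this PR). `CompletionExceptionDemo`, `NoResourceException`, and the local `stripCompletionException` helper are hypothetical stand-ins I made up for the example. A dependent stage created with `thenApply` sees the failure wrapped in a `CompletionException`, which matches the `encodeThrowable` / `uniApply` frames in the first log above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class CompletionExceptionDemo {

    /** Hypothetical stand-in for Flink's NoResourceAvailableException. */
    static class NoResourceException extends Exception {
        NoResourceException(String message) {
            super(message);
        }
    }

    /** Fails a source future and returns the throwable stored in a dependent stage. */
    static Throwable failureSeenByDependentStage(Throwable cause) {
        CompletableFuture<String> source = new CompletableFuture<>();
        CompletableFuture<Integer> dependent = source.thenApply(String::length);
        source.completeExceptionally(cause);
        // handle() receives the throwable exactly as the dependent stage stores it.
        return dependent.handle((value, error) -> error).join();
    }

    /** Hypothetical helper: unwraps (possibly nested) CompletionExceptions. */
    static Throwable stripCompletionException(Throwable throwable) {
        Throwable current = throwable;
        while (current instanceof CompletionException && current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        Throwable seen = failureSeenByDependentStage(
                new NoResourceException("Could not acquire the minimum required resources."));
        // The dependent stage reports the wrapper, not the original exception.
        System.out.println(seen.getClass().getName());
        // Stripping the wrapper recovers the original cause for logging.
        System.out.println(stripCompletionException(seen).getClass().getName());
    }
}
```

If I remember correctly, Flink's `ExceptionUtils` already offers a helper along these lines, so the PR would not need its own; the local version above is only there to keep the sketch self-contained.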