[GitHub] [flink] zhuzhurk commented on a diff in pull request #19747: [FLINK-17295][runtime] Refactor the ExecutionAttemptID to consist of ExecutionGraphID, ExecutionVertexID and attemptNumber

GitBox Tue, 17 May 2022 23:17:05 -0700


zhuzhurk commented on code in PR #19747:
URL: https://github.com/apache/flink/pull/19747#discussion_r875505631



##########
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionAttemptID.java:
##########
@@ -68,19 +117,32 @@ public boolean equals(Object obj) {
             return true;
         } else if (obj != null && obj.getClass() == getClass()) {
             ExecutionAttemptID that = (ExecutionAttemptID) obj;
-            return that.executionAttemptId.equals(this.executionAttemptId);
+            return that.executionGraphId.equals(this.executionGraphId)
+                    && that.executionVertexId.equals(this.executionVertexId)
+                    && that.attemptNumber == this.attemptNumber;
         } else {
             return false;
         }
     }
 
     @Override
     public int hashCode() {
-        return executionAttemptId.hashCode();
+        return Objects.hash(executionGraphId, executionVertexId, attemptNumber);
     }
 
     @Override
     public String toString() {
-        return executionAttemptId.toString();
+        return String.format(
+                "%s_%s_%d", executionGraphId.toString(), executionVertexId, attemptNumber);
+    }
+
+    public String getLogString() {
+        if (DefaultExecutionGraph.LOG.isDebugEnabled()) {
+            return toString();
+        } else {
+            return String.format(
+                    "%s_%s_%d",
+                    executionGraphId.toString().substring(0, 4), executionVertexId, attemptNumber);

Review Comment:
   > It might also make sense to return a more structured representation that actually tells the reader what they are looking at.
   
   Agreed. I'm actually planning to open a separate JIRA and PR to refine the
   logs that contain an `ExecutionAttemptID`.
   Currently, an execution is usually represented as
   "`job vertex name` (`subtaskIndex+1`/`vertex parallelism`) (`attemptId`)",
   which may be redundant after this refactoring. I'm planning to change the
   format to
   "`job vertex name` (`short ExecutionGraphID`:`JobVertexID`) (`subtaskIndex+1`/`vertex parallelism`) (`#attemptNumber`)"
   and to avoid displaying the `ExecutionAttemptID` directly. The displayed
   `JobVertexID` can also help distinguish job vertices with the same name,
   which is common in DataStream jobs (e.g. multiple `Map` vertices).
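
To make the proposed format concrete, here is a minimal sketch of how such a string could be assembled. The class `ExecutionLogFormat` and its parameters are hypothetical and not part of Flink; the IDs are passed as plain strings, and the 4-character short ID is only an assumption that mirrors the `substring(0, 4)` in the diff above.

```java
// Hypothetical helper for illustration only; not part of Flink.
final class ExecutionLogFormat {

    private ExecutionLogFormat() {}

    /**
     * Builds "name (shortGraphId:jobVertexId) (subtaskIndex+1/parallelism) (#attemptNumber)".
     * The short graph ID is the first 4 characters of the full ID (assumed length).
     */
    static String format(
            String jobVertexName,
            String executionGraphId,
            String jobVertexId,
            int subtaskIndex,
            int parallelism,
            int attemptNumber) {
        // Guard against IDs shorter than 4 characters.
        String shortGraphId =
                executionGraphId.substring(0, Math.min(4, executionGraphId.length()));
        return String.format(
                "%s (%s:%s) (%d/%d) (#%d)",
                jobVertexName,
                shortGraphId,
                jobVertexId,
                subtaskIndex + 1,
                parallelism,
                attemptNumber);
    }
}
```

With made-up IDs, `ExecutionLogFormat.format("Map", "a1b2c3d4", "bc764cd8", 0, 4, 2)` would give `Map (a1b2:bc764cd8) (1/4) (#2)`, so two `Map` vertices in the same job differ by the `JobVertexID` part.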
   
   The logs are spread across multiple classes and need further examination.
   Therefore I'd like to remove the current
   `ExecutionAttemptID#getLogString()` and do this work in a separate task.
   
   WDYT?


