zhuzhurk commented on code in PR #546:
URL: https://github.com/apache/flink-web/pull/546#discussion_r898915849
##########
_posts/2022-06-06-adaptive-batch-scheduler.md:
##########
@@ -54,11 +54,16 @@ The adaptive batch scheduler only automatically decides parallelism for operator
 # Implementation Details
-In this section, we will elaborate the details of the implementation. To automatically decide parallelism of operators, we introduced the following changes:
+In this section, we will elaborate the details of the implementation. Before that, we need to briefly introduce some concepts involved:
+
+- [JobVertex](https://github.com/apache/flink/blob/release-1.15/flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/JobVertex.java) and [JobGraph](https://github.com/apache/flink/blob/release-1.15/flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/JobGraph.java): A job vertex is an operator chain formed by chaining several operators together for better performance. The job graph is a data flow consisting of job vertices.
+- [ExecutionVertex](https://github.com/apache/flink/blob/release-1.15/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionVertex.java) and [ExecutionGraph](https://github.com/apache/flink/blob/release-1.15/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java): An execution vertex represents a parallel subtask of a job vertex, which will eventually be instantiated as a physical task. For example, a job vertex with a parallelism of 100 will generate 100 execution vertices. The execution graph is the physical execution topology consisting of all execution vertices.
+
+More details about the above concepts can be found in the [Flink documentation](https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/internals/job_scheduling/#jobmanager-data-structures).
To be precise, the adaptive batch scheduler actually automatically decides the parallelism of job vertices (in the previous sections, in order not to introduce more concepts, **operator** was used to refer to **job vertex**, but they are actually slightly different). We introduced the following changes to automatically decide parallelism of job vertices:

Review Comment:
> (in the previous sections, in order not to introduce more concepts, **operator** was used to refer to **job vertex**, but they are actually slightly different)

I think the correct logic is "the adaptive batch scheduler automatically decides the parallelism of job vertices. In this way, it decides the parallelism of the operators within that vertex."
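
To make the suggested wording concrete, here is a minimal sketch of the relationship it describes. It uses hypothetical Python classes and helper names (`JobVertexSketch`, `decide_parallelism`, `operator_parallelism`, `expand_to_execution_vertices`), not Flink's actual `JobVertex` or `ExecutionVertex` API, and the `-1` placeholder for an undecided parallelism is only a convention of this sketch. It assumes the parallelism is set once on the job vertex, is shared by every operator chained into that vertex, and determines how many execution vertices are generated.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class JobVertexSketch:
    """Hypothetical stand-in for a job vertex: a chain of operators."""
    name: str
    operators: List[str]
    parallelism: int = -1  # -1 means "not decided yet" in this sketch


def decide_parallelism(vertex: JobVertexSketch, parallelism: int) -> None:
    # The parallelism is decided once, on the job vertex as a whole.
    vertex.parallelism = parallelism


def operator_parallelism(vertex: JobVertexSketch) -> Dict[str, int]:
    # Every chained operator takes the parallelism of its job vertex.
    return {op: vertex.parallelism for op in vertex.operators}


def expand_to_execution_vertices(vertex: JobVertexSketch) -> List[str]:
    # One execution vertex (parallel subtask) per unit of parallelism.
    return [f"{vertex.name}#{i}" for i in range(vertex.parallelism)]


vertex = JobVertexSketch(name="map-filter", operators=["map", "filter"])
decide_parallelism(vertex, 100)
print(operator_parallelism(vertex))
print(len(expand_to_execution_vertices(vertex)))
```

In this sketch the operators never receive a parallelism of their own; they only read it from the vertex, which is the ordering the suggested sentence expresses.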
