[
https://issues.apache.org/jira/browse/FLINK-34535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822472#comment-17822472
]
yuanfenghu edited comment on FLINK-34535 at 3/1/24 9:45 AM:
------------------------------------------------------------
Hi [~jeyhunkarimov]
Thank you for your comment. My idea is simple: when using Flink SQL through
sql-client or sql-gateway, I would like a way to set the parallelism of each
vertex in the generated Flink job independently.
For example, I have a simple job whose generated JobGraph is:
             -> window1 -> sink1
KAFKA SOURCE -> window2 -> sink2
             -> window3 -> sink3
In Flink SQL, I can set the parallelism of the entire job with set
parallelism.default=x, but I cannot set the parallelism of each vertex
individually.
So suppose the EXPLAIN syntax returned the JobGraph information of the whole
job, including the vertex IDs:
                  -> window1(id2) -> sink1(id5)
KAFKA SOURCE(id1) -> window2(id3) -> sink2(id6)
                  -> window3(id4) -> sink3(id7)
Combined with the parameter pipeline.jobvertex-parallelism-overrides, for
example:
set pipeline.jobvertex-parallelism-overrides=id1:1,id2:2,id3:2,id4:2,id5:1,id6:1,id7:1
I can change the graph to:
                      -> window1(id2 p=2) -> sink1(id5 p=1)
KAFKA SOURCE(id1 p=1) -> window2(id3 p=2) -> sink2(id6 p=1)
                      -> window3(id4 p=2) -> sink3(id7 p=1)
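As a rough illustration of the override value used above, the following Python
sketch renders a mapping of vertex ID to parallelism into the comma-separated
id:parallelism form. The helper name format_parallelism_overrides is
hypothetical, and id1..id7 are the placeholder IDs from this comment; real
JobVertex IDs in a running job will look different.

```python
def format_parallelism_overrides(overrides):
    """Render {vertex_id: parallelism} as 'id:p,id:p,...' (hypothetical helper)."""
    parts = []
    for vertex_id, parallelism in overrides.items():
        if parallelism < 1:
            raise ValueError(f"parallelism for {vertex_id} must be >= 1")
        parts.append(f"{vertex_id}:{parallelism}")
    return ",".join(parts)


# Placeholder vertex IDs taken from the example graph in this comment.
overrides = {
    "id1": 1,  # KAFKA SOURCE
    "id2": 2,  # window1
    "id3": 2,  # window2
    "id4": 2,  # window3
    "id5": 1,  # sink1
    "id6": 1,  # sink2
    "id7": 1,  # sink3
}

statement = (
    "set pipeline.jobvertex-parallelism-overrides="
    + format_parallelism_overrides(overrides)
)
print(statement)
```

The printed line is the same set statement shown earlier in this comment.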
So I hope this information can be included in the EXPLAIN output. This is not
necessarily the best approach; perhaps it could be done through
`COMPILE PLAN` instead?
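To make the intended workflow concrete, here is a minimal Python sketch that
reads vertex IDs from a plan in JSON form and derives an override string from
them. The JSON shape (a "nodes" list with "id", "description" and
"parallelism" fields) is an assumption made for this sketch, since the
proposed EXPLAIN output does not exist yet; the function names and the
prefix-matching rule are also hypothetical.

```python
import json

# Hypothetical plan output. The field names are assumptions for this sketch,
# not a confirmed EXPLAIN format.
SAMPLE_PLAN = json.dumps({
    "nodes": [
        {"id": "id1", "description": "KAFKA SOURCE", "parallelism": 4},
        {"id": "id2", "description": "window1", "parallelism": 4},
        {"id": "id5", "description": "sink1", "parallelism": 4},
    ]
})


def list_vertices(plan_json):
    """Return (id, description, current parallelism) for every plan node."""
    plan = json.loads(plan_json)
    return [(n["id"], n["description"], n["parallelism"]) for n in plan["nodes"]]


def pick_overrides(vertices, parallelism_by_prefix):
    """Choose a parallelism per vertex by matching its description prefix."""
    overrides = {}
    for vertex_id, description, _current in vertices:
        for prefix, parallelism in parallelism_by_prefix.items():
            if description.startswith(prefix):
                overrides[vertex_id] = parallelism
                break  # first matching prefix wins
    return overrides


vertices = list_vertices(SAMPLE_PLAN)
overrides = pick_overrides(vertices, {"KAFKA": 1, "window": 2, "sink": 1})
override_value = ",".join(f"{v}:{p}" for v, p in overrides.items())
print("set pipeline.jobvertex-parallelism-overrides=" + override_value)
```

Vertices whose description matches no prefix are simply left out, so they
would keep whatever parallelism the job default gives them.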
was (Author: JIRAUSER296932):
Hi [~jeyhunkarimov]
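Thanks for your comment. My idea is actually simple: when using flink sql,
sql-client or sql-gateway, I would like a way to set the parallelism of each
vertex individually in the generated Flink job.
For example, I now have a simple job whose generated jobgraph is: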
             -> window1 -> sink1
KAFKA SOURCE -> window2 -> sink2
             -> window3 -> sink3
In flinksql, I can set the parallelism of the entire job through set
parallelism.default=x, but if I want to set the parallelism of each vertex, I
cannot do it.
So if I get the jobgraph information of the entire job through the explain
syntax:
                  -> window1(id2) -> sink1(id5)
KAFKA SOURCE(id1) -> window2(id3) -> sink2(id6)
                  -> window3(id4) -> sink3(id7)
Combined with the parameter: pipeline.jobvertex-parallelism-overrides
> Support JobPlanInfo for the explain result
> ------------------------------------------
>
> Key: FLINK-34535
> URL: https://issues.apache.org/jira/browse/FLINK-34535
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / Planner
> Reporter: yuanfenghu
> Priority: Major
> Labels: pull-request-available
>
> In the Flink Sql Explain syntax, we can set ExplainDetails to plan
> JSON_EXECUTION_PLAN, but we cannot plan JobPlanInfo. If we can explain this
> part of the information, referring to JobPlanInfo, I can combine it with the
> parameter `pipeline.jobvertex-parallelism-overrides` to set up my task
> parallelism
--
This message was sent by Atlassian Jira
(v8.20.10#820010)