[ 
https://issues.apache.org/jira/browse/FLINK-34535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822472#comment-17822472
 ] 

yuanfenghu edited comment on FLINK-34535 at 3/1/24 9:45 AM:
------------------------------------------------------------

Hi [~jeyhunkarimov] 

Thank you for your comment. My idea is actually very simple: when using Flink 
SQL through sql-client or sql-gateway, I would like a way to set the 
parallelism of each vertex in the generated Flink job independently.

For example, suppose I have a simple job whose generated JobGraph is:

 

             -> window1 -> sink1
KAFKA SOURCE -> window2 -> sink2
             -> window3 -> sink3

 
In Flink SQL, I can set the parallelism of the entire job with set 
parallelism.default=x, but I cannot set the parallelism of each vertex 
individually.

 
So if I could get the JobGraph information of the entire job, including the 
vertex IDs, through the explain syntax:

 

                  -> window1(id2) -> sink1(id5)
KAFKA SOURCE(id1) -> window2(id3) -> sink2(id6)
                  -> window3(id4) -> sink3(id7)

 
Combined with the parameter pipeline.jobvertex-parallelism-overrides, for 
example:

set pipeline.jobvertex-parallelism-overrides=id1:1,id2:2,id3:2,id4:2,id5:1,id6:1,id7:1

 
I can change the graph to:

                      -> window1(id2 p=2) -> sink1(id5 p=1)
KAFKA SOURCE(id1 p=1) -> window2(id3 p=2) -> sink2(id6 p=1)
                      -> window3(id4 p=2) -> sink3(id7 p=1)

 
So I hope that the explain output can include this information. This is not 
necessarily the best approach; perhaps it could be done through 
`COMPILE PLAN` instead?
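
As a minimal sketch of how the override value shown earlier in this comment could be assembled once the vertex IDs are known: the helper name build_overrides is hypothetical, and id1..id7 are the placeholder IDs from the example (real JobVertexIDs are hex strings). The quoted SET form is the Flink SQL `SET 'key' = 'value';` syntax.

```python
def build_overrides(parallelism_by_vertex):
    """Render a {vertex_id: parallelism} mapping as 'id1:1,id2:2,...'.

    Hypothetical helper for illustration; not part of Flink.
    """
    parts = []
    for vertex_id, parallelism in parallelism_by_vertex.items():
        if parallelism < 1:
            raise ValueError(f"parallelism for {vertex_id} must be >= 1")
        parts.append(f"{vertex_id}:{parallelism}")
    return ",".join(parts)


# Placeholder IDs taken from the example graph; real JobVertexIDs are hex strings.
overrides = build_overrides({
    "id1": 1,  # KAFKA SOURCE
    "id2": 2, "id3": 2, "id4": 2,  # window1..window3
    "id5": 1, "id6": 1, "id7": 1,  # sink1..sink3
})

# Statement to run in sql-client or submit through sql-gateway.
statement = f"SET 'pipeline.jobvertex-parallelism-overrides' = '{overrides}';"
print(statement)
```

The mapping keeps insertion order, so the rendered value lists the vertices in the same order as the example.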
 

 
 


was (Author: JIRAUSER296932):
Hi [~jeyhunkarimov] 

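Thank you for your comment. My idea is actually very simple: when using flink 
sql, sql-client or sql-gateway, I hope to have a way to set the parallelism of 
each vertex in the generated Flink job individually.

For example, I now have a simple job and the jobgraph generated is: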

 

             -> window1 -> sink1
KAFKA SOURCE -> window2 -> sink2
             -> window3 -> sink3

 
In flinksql, I can set the parallelism of the entire job through set 
parallelism.default=x, but if I want to set the parallelism of each vertex, I 
cannot do it.

So if I get the jobgraph information of the entire job through the explain 
syntax:

                  -> window1(id2) -> sink1(id5)
KAFKA SOURCE(id1) -> window2(id3) -> sink2(id6)
                  -> window3(id4) -> sink3(id7)

Combined with the parameter: pipeline.jobvertex-parallelism-overrides
 
 

> Support JobPlanInfo for the explain result
> ------------------------------------------
>
>                 Key: FLINK-34535
>                 URL: https://issues.apache.org/jira/browse/FLINK-34535
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Planner
>            Reporter: yuanfenghu
>            Priority: Major
>              Labels: pull-request-available
>
> In the Flink SQL EXPLAIN syntax, we can set ExplainDetails to 
> JSON_EXECUTION_PLAN, but there is no option to output JobPlanInfo. If EXPLAIN 
> could output this information (see JobPlanInfo), I could combine it with the 
> parameter `pipeline.jobvertex-parallelism-overrides` to set the parallelism 
> of each vertex in my job.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
