[jira] [Commented] (FLINK-14676) Introduce parallelism inference for InputFormatTableSource

Jingsong Lee (Jira) Mon, 11 Nov 2019 01:31:35 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-14676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16971408#comment-16971408
 ]


Jingsong Lee commented on FLINK-14676:
--------------------------------------

Hi [~jark], the parallelism of StreamTableSource needs more consideration. 
We cannot safely set the parallelism of the DataStream it returns, because 
that DataStream could be an intermediate node rather than a real source, and 
changing its parallelism is dangerous. That is why FLINK-13494 reverted the 
previous implementation.

I think we can revisit the parallelism of StreamTableSource after the table 
source refactoring. For this ticket, we should limit the scope to 
InputFormatTableSource, which is not controversial and is very important. 

BTW, can you assign this to me?

> Introduce parallelism inference for InputFormatTableSource
> ----------------------------------------------------------
>
>                 Key: FLINK-14676
>                 URL: https://issues.apache.org/jira/browse/FLINK-14676
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / Planner
>            Reporter: Jingsong Lee
>            Priority: Major
>             Fix For: 1.10.0
>
>
> FLINK-12801 introduced a parallelism setting for tables, but because a 
> TableSource generates a DataStream, and that DataStream may not be a real 
> source, this can lead to shuffle errors. So FLINK-13494 removed that 
> implementation.
> In this ticket, I would like to introduce parallelism inference only for 
> InputFormatTableSource. The RowCount of an InputFormatTableSource is more 
> accurate than that of downstream stages, so it is worth inferring its 
> parallelism automatically.
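
To make the idea in the description concrete, here is a minimal sketch of how a source parallelism might be derived from an estimated row count. It is not the implementation from this ticket and it is not a Flink API: the class name, the method name, and the parameters rowsPerSubtask and maxParallelism are all hypothetical, chosen only for illustration.

```java
// Illustrative sketch only: none of these names come from Flink.
final class ParallelismInference {

    private ParallelismInference() {}

    /**
     * Infers a source parallelism from an estimated row count.
     *
     * @param estimatedRowCount row count reported by the source statistics (assumed input)
     * @param rowsPerSubtask    hypothetical target number of rows per parallel subtask
     * @param maxParallelism    hypothetical upper bound on the inferred parallelism
     */
    static int inferParallelism(long estimatedRowCount, long rowsPerSubtask, int maxParallelism) {
        if (rowsPerSubtask <= 0 || maxParallelism <= 0) {
            throw new IllegalArgumentException("rowsPerSubtask and maxParallelism must be positive");
        }
        if (estimatedRowCount <= 0) {
            // Unknown or empty statistics: fall back to a single subtask.
            return 1;
        }
        // Ceiling division written so that it cannot overflow for large row counts.
        long needed = estimatedRowCount / rowsPerSubtask
                + (estimatedRowCount % rowsPerSubtask == 0 ? 0 : 1);
        // Cap the result at the configured upper bound.
        return (int) Math.min(needed, (long) maxParallelism);
    }
}
```

With these assumed parameters, 2,500,000 rows at 1,000,000 rows per subtask would give a parallelism of 3, and a very large table would be capped at maxParallelism. Restricting this to InputFormatTableSource matches the point above: the inferred value is applied to a real source, not to a DataStream that might be an intermediate node.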



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
