[jira] [Updated] (FLINK-22456) Support InitializeOnMaster and FinalizeOnMaster to be used in InputFormat

2021-05-03 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-22456:
---
Component/s: (was: Runtime / Task)
 Runtime / Coordination
 API / DataStream

> Support InitializeOnMaster and FinalizeOnMaster to be used in InputFormat
> -
>
> Key: FLINK-22456
> URL: https://issues.apache.org/jira/browse/FLINK-22456
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Runtime / Coordination
>Reporter: Li
>Priority: Minor
>  Labels: pull-request-available
>
>         In _InputOutputFormatVertex_, _initializeGlobal_ and _finalizeGlobal_ 
> are only called when the format is an _OutputFormat_; they are not called 
> for an _InputFormat_.
>         FLINK-1722 says that _HadoopOutputFormats_ use them to do work 
> before and after the task, and it only added support for _initializeGlobal_ 
> and _finalizeGlobal_ in _OutputFormat_.
>         I don't know why _InputFormat_ is not supported. Can anyone tell me 
> why?
>         But I think _InitializeOnMaster_ and _FinalizeOnMaster_ should also 
> be supported in _InputFormat_.
>         For example, in an offline task that uses _JdbcInputFormat_, the user 
> could use _initializeGlobal_ to query the total record count of the task and 
> then create InputSplits based on that count. While the task is running, the 
> user could expose a progress metric by dividing the number of records read so 
> far by the total record count, and could even estimate the remaining time of 
> the task. This would help users view task progress and remaining time 
> through external systems.
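
The arithmetic behind that example can be sketched in plain Java. This is only an illustration under assumptions: the class and method names below are hypothetical and are not part of Flink, and the total count is assumed to have been obtained once on the master (for instance with a COUNT query issued from _initializeGlobal_, as the issue proposes).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Flink API: plain arithmetic only.
final class SplitProgressSketch {

    /** Divides totalCount rows into at most numSplits contiguous {offset, length} ranges. */
    static List<long[]> planSplits(long totalCount, int numSplits) {
        if (totalCount < 0 || numSplits <= 0) {
            throw new IllegalArgumentException("totalCount must be >= 0 and numSplits > 0");
        }
        List<long[]> splits = new ArrayList<>();
        long base = totalCount / numSplits;       // rows every split gets
        long remainder = totalCount % numSplits;  // first 'remainder' splits get one extra row
        long offset = 0;
        for (int i = 0; i < numSplits && offset < totalCount; i++) {
            long length = base + (i < remainder ? 1 : 0);
            splits.add(new long[] {offset, length});
            offset += length;
        }
        return splits;
    }

    /** Fraction of the task completed, in [0.0, 1.0]: records read divided by total count. */
    static double progress(long recordsRead, long totalCount) {
        if (totalCount <= 0) {
            return 1.0; // nothing to read, so treat the task as complete
        }
        return Math.min(1.0, (double) recordsRead / totalCount);
    }

    /** Linear estimate of remaining time; returns -1 while no record has been read yet. */
    static long estimateRemainingMillis(long recordsRead, long totalCount, long elapsedMillis) {
        if (recordsRead <= 0) {
            return -1;
        }
        long remaining = Math.max(0, totalCount - recordsRead);
        return (long) (elapsedMillis * (double) remaining / recordsRead);
    }
}
```

With a total of 10 rows and 3 splits, planSplits yields the ranges {0, 4}, {4, 3} and {7, 3}. The remaining-time estimate assumes a constant read rate, which is a simplification.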



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22456) Support InitializeOnMaster and FinalizeOnMaster to be used in InputFormat

2021-04-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-22456:
---
Labels: pull-request-available  (was: )

> Support InitializeOnMaster and FinalizeOnMaster to be used in InputFormat
> -
>
> Key: FLINK-22456
> URL: https://issues.apache.org/jira/browse/FLINK-22456
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: Li
>Priority: Minor
>  Labels: pull-request-available


