[
https://issues.apache.org/jira/browse/FLINK-22456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17331687#comment-17331687
]
Li commented on FLINK-22456:
----------------------------
cc [~lzljs3620320]
> Support InitializeOnMaster and FinalizeOnMaster to be used in InputFormat
> -------------------------------------------------------------------------
>
> Key: FLINK-22456
> URL: https://issues.apache.org/jira/browse/FLINK-22456
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Task
> Reporter: Li
> Priority: Minor
>
> In _InputOutputFormatVertex_, _initializeGlobal_ and _finalizeGlobal_
> are only called when the format is an _OutputFormat_; they are never
> called for an _InputFormat_.
> According to FLINK-1722, _HadoopOutputFormats_ use these hooks to do
> some work before and after the task, and _initializeGlobal_ and
> _finalizeGlobal_ are only supported for _OutputFormat_.
> I don't know why _InputFormat_ is not supported. Can anyone tell me
> why?
> I think _InitializeOnMaster_ and _FinalizeOnMaster_ should also
> be supported for _InputFormat_.
> For example, in an offline task using _JdbcInputFormat_, the user could use
> _initializeGlobal_ to query the total record count of the task and then
> create InputSplits based on that count. While the task is running, the user
> could expose a progress metric by dividing the number of records read so far
> by the total count, and could even estimate the remaining time of the task.
> Being able to view task progress and remaining time through external systems
> is very helpful for users.
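
The use case described above could look roughly like the following sketch. It is not Flink code: the _InitializeOnMaster_ interface is redeclared locally so that the example compiles on its own, and _CountingInputFormat_, _RangeSplit_ and the count supplier (a stand-in for something like a SELECT COUNT(*) query) are hypothetical names. It also runs in a single process, so it ignores how the count obtained on the master would be shipped to the tasks.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

public class InputFormatProgressSketch {

    /** Local stand-in for Flink's InitializeOnMaster, so the sketch compiles without Flink. */
    interface InitializeOnMaster {
        void initializeGlobal(int parallelism) throws IOException;
    }

    /** Hypothetical split: a half-open range [start, end) of row offsets. */
    static final class RangeSplit {
        final long start;
        final long end;

        RangeSplit(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Hypothetical input format that learns the total record count once, up front. */
    static final class CountingInputFormat implements InitializeOnMaster {
        private final LongSupplier countQuery; // assumption: stands in for e.g. SELECT COUNT(*)
        private long totalCount = -1;          // -1 means initializeGlobal has not run yet
        private long recordsRead = 0;

        CountingInputFormat(LongSupplier countQuery) {
            this.countQuery = countQuery;
        }

        @Override
        public void initializeGlobal(int parallelism) throws IOException {
            long count = countQuery.getAsLong();
            if (count < 0) {
                throw new IOException("count query returned a negative value");
            }
            totalCount = count;
        }

        private void requireInitialized() {
            if (totalCount < 0) {
                throw new IllegalStateException("initializeGlobal has not been called");
            }
        }

        /** Divides the total count into numSplits contiguous ranges of near-equal size. */
        List<RangeSplit> createSplits(int numSplits) {
            requireInitialized();
            List<RangeSplit> splits = new ArrayList<>();
            long base = totalCount / numSplits;
            long remainder = totalCount % numSplits;
            long start = 0;
            for (int i = 0; i < numSplits; i++) {
                long size = base + (i < remainder ? 1 : 0);
                splits.add(new RangeSplit(start, start + size));
                start += size;
            }
            return splits;
        }

        void recordRead() {
            recordsRead++;
        }

        /** Fraction of records read so far, in [0, 1]. */
        double progress() {
            requireInitialized();
            return totalCount == 0 ? 1.0 : (double) recordsRead / totalCount;
        }

        /** Linear estimate of the remaining time; returns -1 while nothing has been read. */
        long estimateRemainingMillis(long elapsedMillis) {
            requireInitialized();
            if (recordsRead == 0) {
                return -1;
            }
            double millisPerRecord = (double) elapsedMillis / recordsRead;
            return Math.round(millisPerRecord * (totalCount - recordsRead));
        }
    }

    public static void main(String[] args) throws IOException {
        CountingInputFormat format = new CountingInputFormat(() -> 1000L);
        format.initializeGlobal(4);
        List<RangeSplit> splits = format.createSplits(4);
        for (int i = 0; i < 250; i++) {
            format.recordRead();
        }
        System.out.println(splits.size() + " splits, progress=" + format.progress()
                + ", remaining ms=" + format.estimateRemainingMillis(5_000));
    }
}
```

The remaining-time estimate assumes a constant read rate, which is only a rough approximation for real JDBC sources.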