Li created FLINK-22456:
--------------------------
Summary: Support InitializeOnMaster and FinalizeOnMaster to be
used in InputFormat
Key: FLINK-22456
URL: https://issues.apache.org/jira/browse/FLINK-22456
Project: Flink
Issue Type: Improvement
Components: Runtime / Task
Reporter: Li
In _InputOutputFormatVertex_, _initializeGlobal_ and _finalizeGlobal_
are only called when the format is an _OutputFormat_; they are never called
for an _InputFormat_.
FLINK-1722 says that _HadoopOutputFormats_ use these hooks to do work
before and after the task, and that change only supports _initializeGlobal_
and _finalizeGlobal_ in _OutputFormat_.
I don't know why _InputFormat_ is not supported. Can anyone tell me why?
But I think _InitializeOnMaster_ and _FinalizeOnMaster_ should also be
supported in _InputFormat_.
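To make the proposal concrete, here is a minimal sketch of the master-side logic I have in mind. It is not the actual _InputOutputFormatVertex_ code: _FormatVertexSketch_ and _RecordingFormat_ are hypothetical names, and the two interfaces are simplified stand-ins with the same method shape as Flink's _InitializeOnMaster_ and _FinalizeOnMaster_. The only point is that the hooks are invoked for every format that implements them, input or output.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Stand-ins with the same method shape as Flink's master-side hooks.
interface InitializeOnMaster {
    void initializeGlobal(int parallelism) throws IOException;
}

interface FinalizeOnMaster {
    void finalizeGlobal(int parallelism) throws IOException;
}

// Hypothetical, simplified vertex: formats are held as plain objects.
class FormatVertexSketch {
    private final List<Object> inputFormats = new ArrayList<>();
    private final List<Object> outputFormats = new ArrayList<>();
    private final int parallelism;

    FormatVertexSketch(int parallelism) {
        this.parallelism = parallelism;
    }

    void addInputFormat(Object format) {
        inputFormats.add(format);
    }

    void addOutputFormat(Object format) {
        outputFormats.add(format);
    }

    // Proposed behavior: consider input formats as well as output formats.
    private List<Object> allFormats() {
        List<Object> all = new ArrayList<>(inputFormats);
        all.addAll(outputFormats);
        return all;
    }

    void initializeOnMaster() {
        for (Object format : allFormats()) {
            if (format instanceof InitializeOnMaster) {
                try {
                    ((InitializeOnMaster) format).initializeGlobal(parallelism);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
    }

    void finalizeOnMaster() {
        for (Object format : allFormats()) {
            if (format instanceof FinalizeOnMaster) {
                try {
                    ((FinalizeOnMaster) format).finalizeGlobal(parallelism);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
    }
}

// Hypothetical format that records which hooks were called, for demonstration.
class RecordingFormat implements InitializeOnMaster, FinalizeOnMaster {
    final List<String> calls = new ArrayList<>();

    @Override
    public void initializeGlobal(int parallelism) {
        calls.add("init:" + parallelism);
    }

    @Override
    public void finalizeGlobal(int parallelism) {
        calls.add("finalize:" + parallelism);
    }
}
```

With this shape, a format registered through _addInputFormat_ receives both calls, which is what the current code only does for output formats.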
For example, in an offline job that uses _JdbcInputFormat_, the user could use
_initializeGlobal_ to query the total record count for the job and then
create the InputSplits based on that count. While the job is running, the user
could expose a progress metric by dividing the number of records read so far
by the total number of records, and could even estimate the remaining time of
the job. Being able to view job progress and remaining time through external
systems is very helpful for users.
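The arithmetic behind this use case can be sketched as follows. _CountingInputSketch_ and all of its members are hypothetical and are not part of _JdbcInputFormat_; the _LongSupplier_ stands in for a count query against the database. The sketch runs in a single JVM and deliberately ignores how a total computed on the master would reach the parallel tasks, which a real implementation would have to solve.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical helper showing split creation and progress from a total count.
class CountingInputSketch {
    private final LongSupplier countQuery; // stands in for a count query
    private long totalCount = -1;
    private long recordsRead = 0;

    CountingInputSketch(LongSupplier countQuery) {
        this.countQuery = countQuery;
    }

    // Same shape as InitializeOnMaster#initializeGlobal: runs once up front.
    public void initializeGlobal(int parallelism) {
        totalCount = countQuery.getAsLong();
    }

    // Divide [0, totalCount) into numSplits ranges of {offset, length}.
    List<long[]> createSplitRanges(int numSplits) {
        List<long[]> ranges = new ArrayList<>();
        long base = totalCount / numSplits;
        long remainder = totalCount % numSplits;
        long offset = 0;
        for (int i = 0; i < numSplits; i++) {
            // The first 'remainder' splits take one extra record each.
            long length = base + (i < remainder ? 1 : 0);
            ranges.add(new long[] {offset, length});
            offset += length;
        }
        return ranges;
    }

    void recordRead() {
        recordsRead++;
    }

    // Fraction of records read so far, in the range 0.0 to 1.0.
    double progress() {
        return totalCount <= 0 ? 0.0 : (double) recordsRead / totalCount;
    }

    // Linear extrapolation; returns -1 until at least one record is read.
    long estimateRemainingMillis(long elapsedMillis) {
        if (recordsRead == 0) {
            return -1;
        }
        return elapsedMillis * (totalCount - recordsRead) / recordsRead;
    }
}
```

The remaining-time estimate assumes a constant read rate, so it is only a rough indicator.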
--
This message was sent by Atlassian Jira
(v8.3.4#803005)