[ https://issues.apache.org/jira/browse/FLINK-22456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated FLINK-22456:
-----------------------------------
    Labels: pull-request-available  (was: )

> Support InitializeOnMaster and FinalizeOnMaster to be used in InputFormat
> -------------------------------------------------------------------------
>
>                 Key: FLINK-22456
>                 URL: https://issues.apache.org/jira/browse/FLINK-22456
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Task
>            Reporter: Li
>            Priority: Minor
>              Labels: pull-request-available
>
> In _InputOutputFormatVertex_, _initializeGlobal_ and _finalizeGlobal_
> are only called when the format is an _OutputFormat_; they are never
> called for an _InputFormat_.
> FLINK-1722 says that _HadoopOutputFormats_ use them to do work before
> and after the task, and that change only added support for
> _initializeGlobal_ and _finalizeGlobal_ in _OutputFormat_.
> I don't know why _InputFormat_ is not supported. Can anyone tell me why?
> I think _InitializeOnMaster_ and _FinalizeOnMaster_ should also be
> supported in _InputFormat_.
> For example, in an offline task that uses _JdbcInputFormat_, the user
> could use _initializeGlobal_ to query the total record count of the
> task and then create InputSplits based on that count. While the task is
> running, the user could expose a progress metric by dividing the number
> of records read so far by the total number of records, and could even
> estimate the remaining time of the task. Being able to view task
> progress and remaining time through external systems is very helpful
> for users.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
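
The progress indicator described in the issue can be sketched as follows. This is a minimal illustration of the arithmetic only, not Flink code: the class name ProgressEstimator and its methods are hypothetical, and it assumes the total record count has already been obtained (for example by a count query run once before the splits are created) and that records are read at a roughly constant rate.

```java
import java.time.Duration;
import java.util.Optional;

/** Hypothetical helper, not part of Flink: tracks read progress against a known total. */
final class ProgressEstimator {
    private final long totalRecords;
    private long recordsRead;

    ProgressEstimator(long totalRecords) {
        if (totalRecords < 0) {
            throw new IllegalArgumentException("totalRecords must be >= 0");
        }
        this.totalRecords = totalRecords;
    }

    /** Call once for every record the input format emits. */
    void recordRead() {
        recordsRead++;
    }

    /** Fraction of records read, in [0, 1]; an empty input counts as complete. */
    double progress() {
        if (totalRecords == 0) {
            return 1.0;
        }
        return Math.min(1.0, (double) recordsRead / totalRecords);
    }

    /**
     * Linear extrapolation of the remaining time from the elapsed time.
     * Returns empty until at least one record has been read.
     */
    Optional<Duration> estimateRemaining(Duration elapsed) {
        if (recordsRead == 0) {
            return Optional.empty();
        }
        long remaining = Math.max(0, totalRecords - recordsRead);
        long millis = Math.round((double) elapsed.toMillis() * remaining / recordsRead);
        return Optional.of(Duration.ofMillis(millis));
    }
}
```

With a total of 4 records, after 1 record has been read in 10 seconds, progress() returns 0.25 and estimateRemaining(Duration.ofSeconds(10)) yields 30 seconds. In a real job the progress value would be exposed as a metric, and the counts of parallel subtasks would have to be combined, which this sketch does not attempt.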