[
https://issues.apache.org/jira/browse/FLINK-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019866#comment-17019866
]
Guowei Ma commented on FLINK-2646:
----------------------------------
I want to understand what specific scenarios the jira wants to resolve. After
reading some mail/doc[1][2] and some scenarios I encountered. I summary it as
below(correct me if I miss something):
Currently, the scenarios that _close_ interface could not be satisfied
# The user wants to accelerate the failover process. Currently, some users
implement the _close_ interface to flush the memory data to the external system
because the job would deal with the bounded stream sometimes. However, it slows
down the failover process because when canceling the task the Flink would also
call the close method which might do some heavy i/o processing.
# The user wants the exactly once semantics for the bounded stream. If the
user implements the _close_ interface which commits the results some results
would be committed multi-times because when failover occurs some messages would
be replayed. If the user implements the _close_ interface which does not commit
the result some results would be lost.
Because many users implement the _close_ interface to release the resources so
we could not break this semantics that whenever a task is terminated the
_close_ method should always be called.
If Flink provides an interface such as `_closeAtEndofStream_` I think we could
resolve the second problem in most situations. However I think this also needs
some other efforts such as dedupe the commit at the _close_ or using the
_finalizeOnMaster_ callback.
[1]
https://lists.apache.org/thread.html/4cf28a9fa3732dfdd9e673da6233c5288ca80b20d58cee130bf1c141%40%3Cuser.flink.apache.org%3E
[2]
https://docs.google.com/document/d/1SXfhmeiJfWqi2ITYgCgAoSDUv5PNq1T8Zu01nR5Ebog/edit#
> User functions should be able to differentiate between successful close and
> erroneous close
> -------------------------------------------------------------------------------------------
>
> Key: FLINK-2646
> URL: https://issues.apache.org/jira/browse/FLINK-2646
> Project: Flink
> Issue Type: Improvement
> Components: API / DataStream
> Affects Versions: 0.10.0
> Reporter: Stephan Ewen
> Assignee: Kostas Kloudas
> Priority: Major
> Labels: usability
>
> Right now, the {{close()}} method of rich functions is invoked in case of
> proper completion, and in case of canceling in case of error (to allow for
> cleanup).
> In certain cases, the user function needs to know why it is closed, whether
> the task completed in a regular fashion, or was canceled/failed.
> I suggest to add a method {{closeAfterFailure()}} to the {{RichFunction}}. By
> default, this method calls {{close()}}. The runtime is the changed to call
> {{close()}} as part of the regular execution and {{closeAfterFailure()}} in
> case of an irregular exit.
> Because by default all cases call {{close()}} the change would not be API
> breaking.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)