[
https://issues.apache.org/jira/browse/FLINK-17769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17126505#comment-17126505
]
Piotr Nowojski commented on FLINK-17769:
----------------------------------------
1.
As you wrote, I think this option is not a good one because of the duplicated
logging problem.
2.
The problem will be that we do not clean up all of the resources. In this method we
are supposed to clean up everything that we can, regardless of the errors.
3.
Maybe, if there were no other way.
One remark: I think the problem might be a bit more common. There are other places
that are logging errors in {{cleanUpInvoke}}.
What about an option 4: remember the first exception and suppress the later
ones, similar to how {{TaskExecutor#stopTaskExecutorServices}} does it, for
example? Keep in mind that an exception thrown from {{cleanUpInvoke}} is
already subject to a similar logic in {{StreamTask#invoke}} if
{{runMailboxLoop}} or {{afterInvoke}} has thrown some exception. So if there
was a previous exception thrown during normal execution, an error thrown from
{{cleanUpInvoke}} would be suppressed.
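
To make option 4 concrete, here is a minimal sketch of the "remember the first exception, suppress the later ones" pattern using only the JDK's {{Throwable#addSuppressed}}. It is not Flink's code: the names {{CleanupSketch}}, {{CleanupStep}}, {{rememberFirst}} and {{cleanUpAll}} are hypothetical and only illustrate the idea that every cleanup step still runs, nothing is logged in between, and a single exception carrying the later failures is thrown at the end.

```java
import java.util.ArrayList;
import java.util.List;

public class CleanupSketch {

    /** A cleanup step that may fail (hypothetical helper type). */
    @FunctionalInterface
    public interface CleanupStep {
        void run() throws Exception;
    }

    /** Keeps the first exception and attaches any later one to it as suppressed. */
    public static Exception rememberFirst(Exception next, Exception first) {
        if (first == null) {
            return next;
        }
        if (next != first) {
            // addSuppressed rejects self-suppression, hence the identity check
            first.addSuppressed(next);
        }
        return first;
    }

    /** Runs every step regardless of earlier failures, then rethrows the first failure. */
    public static void cleanUpAll(List<CleanupStep> steps) throws Exception {
        Exception first = null;
        for (CleanupStep step : steps) {
            try {
                step.run();
            } catch (Exception e) {
                // no logging here; the failure is reported once, by the caller
                first = rememberFirst(e, first);
            }
        }
        if (first != null) {
            throw first;
        }
    }

    public static void main(String[] args) {
        List<String> ran = new ArrayList<>();
        List<CleanupStep> steps = new ArrayList<>();
        steps.add(() -> { ran.add("a"); throw new IllegalStateException("first failure"); });
        steps.add(() -> ran.add("b"));
        steps.add(() -> { ran.add("c"); throw new RuntimeException("second failure"); });
        try {
            cleanUpAll(steps);
        } catch (Exception e) {
            System.out.println(ran + " " + e.getMessage()
                    + " suppressed=" + e.getSuppressed().length);
        }
    }
}
```

In this sketch all three steps run even though the first one fails, and the second failure travels with the first one as a suppressed exception instead of being logged on its own. A caller that already holds an exception from normal execution could apply the same {{rememberFirst}} step to whatever {{cleanUpAll}} throws.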
> Wrong order of log events on a task failure
> -------------------------------------------
>
> Key: FLINK-17769
> URL: https://issues.apache.org/jira/browse/FLINK-17769
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Task
> Reporter: Robert Metzger
> Priority: Critical
> Fix For: 1.11.0
>
>
> In this example, errors from the {{close()}} method call are logged before
> the {{switched from RUNNING to FAILED}} log line with the actual exception
> (which is confusing, because the exceptions coming from {{close()}} could be
> considered as the failure root cause, because they are first in the log)
> {code}
> 2020-05-14 10:12:42,660 INFO
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer [] -
> Started Kinesis producer instance for region 'eu-central-1'
> 2020-05-14 10:12:42,660 DEBUG
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure [] -
> Creating operator state backend for
> StreamSource_cbc357ccb763df2852fee8c4fc7d55f2_(1/1) with empty state.
> 2020-05-14 10:12:42,823 INFO
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer [] -
> Closing producer
> 2020-05-14 10:12:42,823 INFO
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer [] -
> Flushing outstanding 2 records
> 2020-05-14 10:12:42,826 ERROR
> org.apache.flink.streaming.runtime.tasks.StreamTask [] - Error
> during disposal of stream operator.
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.DaemonException:
> The child process has been shutdown and can no longer accept messages.
> 2020-05-14 10:12:42,834 WARN org.apache.flink.runtime.taskmanager.Task
> [] - Source: Custom Source -> Sink: Unnamed (1/1)
> (4a49aea047aeb3e67cf79c788df0e558) switched from RUNNING to FAILED.
> {code}