Xintong Song created FLINK-19068:
------------------------------------
Summary: Filter verbose pod events for
KubernetesResourceManagerDriver
Key: FLINK-19068
URL: https://issues.apache.org/jira/browse/FLINK-19068
Project: Flink
Issue Type: Improvement
Components: Deployment / Kubernetes
Reporter: Xintong Song
The status of a Kubernetes pod consists of many detailed fields. Currently, Flink
receives pod {{MODIFIED}} events from the {{KubernetesPodsWatcher}} on every
single change to these fields, many of which Flink does not care about.
The verbose events do not affect the functionality of Flink, but they pollute
the logs with repeated messages, because Flink only looks at the fields it is
interested in, and those fields are identical across the events.
E.g., when a task manager is stopped due to idle timeout, Flink receives 3
events:
* MODIFIED: container terminated
* MODIFIED: {{deletionGracePeriodSeconds}} changes from 30 to 0, which is a
Kubernetes internal status change after containers are gracefully terminated
* DELETED: Flink removes metadata of the terminated pod
Among the 3 messages, Flink is only interested in the 1st MODIFIED message, but
will try to process all of them because the container status is terminated.
I propose to filter the verbose events in
{{KubernetesResourceManagerDriver.PodCallbackHandlerImpl}}, so that it only
processes the status changes Flink is interested in. This probably requires
recording the status of all living pods, so that incoming events can be compared
against the recorded status to detect changes.
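A rough sketch of the idea, to make the proposal concrete. It is not the actual Flink code: {{PodEventFilter}}, {{PodState}}, {{onModified}} and {{onDeleted}} are hypothetical names, and it assumes the only field Flink cares about is whether the pod has terminated. It records the last seen state per living pod and reports whether an incoming event carries a change.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Flink's API. */
final class PodEventFilter {

    /** Assumption: the only pod state of interest is running vs. terminated. */
    enum PodState { RUNNING, TERMINATED }

    // Last state seen for each living pod, keyed by pod name.
    private final Map<String, PodState> lastSeen = new HashMap<>();

    /**
     * Records the state carried by a MODIFIED event.
     * Returns true if the state differs from the recorded one (or the pod is
     * new), i.e. the event should be processed.
     */
    boolean onModified(String podName, PodState state) {
        PodState previous = lastSeen.put(podName, state);
        return previous != state;
    }

    /**
     * Forgets the pod on a DELETED event.
     * Returns false if the pod was already recorded as terminated, because the
     * termination has been handled by an earlier MODIFIED event. An unknown or
     * still-running pod returns true so that the deletion is not missed.
     */
    boolean onDeleted(String podName) {
        PodState previous = lastSeen.remove(podName);
        return previous != PodState.TERMINATED;
    }
}
```

With the idle timeout example above, only the first MODIFIED event (container terminated) would be processed. The second MODIFIED event and the DELETED event would be dropped, because the recorded state is already terminated.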
--
This message was sent by Atlassian Jira
(v8.3.4#803005)