[
https://issues.apache.org/jira/browse/FLINK-37730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gyula Fora closed FLINK-37730.
------------------------------
Fix Version/s: kubernetes-operator-1.12.0
Resolution: Fixed
> Collect job exceptions as kubernetes events
> -------------------------------------------
>
> Key: FLINK-37730
> URL: https://issues.apache.org/jira/browse/FLINK-37730
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Reporter: Robert Metzger
> Assignee: Santwana Verma
> Priority: Major
> Labels: pull-request-available
> Fix For: kubernetes-operator-1.12.0
>
>
> As far as I understand, the Flink Kubernetes Operator does not currently track
> a job's exception history, which is listed in the JobManager UI.
> Exposing the exception history in the CR is not feasible due to size concerns.
> Exposing the exception history as Kubernetes events seems to be a reasonable
> middle ground. Events have a default expiration of 1 hour on the Kubernetes
> API server.
> We could introduce a config parameter for the number of exceptions from the
> history to replicate into k8s events.
> Assume a Flink job has 5 exceptions and the user has configured a history
> size of 4. FKO will regularly check whether exception events exist (matched
> by exception timestamp) for the 4 most recent exceptions. Any events that
> are missing will be created.
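
The check described above could be sketched roughly as follows. This is only an
illustration of the proposal, not the operator's actual code: the names
JobException, missing_exception_events and history_size are hypothetical, and
existing events are modeled as a plain set of exception timestamps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobException:
    # Hypothetical stand-in for one entry of the job's exception history.
    timestamp_ms: int
    message: str


def missing_exception_events(history, existing_event_timestamps, history_size):
    """Return the newest `history_size` exceptions that have no event yet."""
    if history_size <= 0:
        # Guard: a slice of [-0:] would return the whole history.
        return []
    newest = sorted(history, key=lambda e: e.timestamp_ms)[-history_size:]
    # An exception counts as covered when an event with its timestamp exists.
    return [e for e in newest if e.timestamp_ms not in existing_event_timestamps]


# The job has 5 exceptions and the configured history size is 4.
history = [
    JobException(ts, f"failure {i}")
    for i, ts in enumerate([1000, 2000, 3000, 4000, 5000], start=1)
]
# Assumed state: events already exist for two of the exceptions.
existing = {2000, 3000}

to_create = missing_exception_events(history, existing, history_size=4)
```

With these inputs the oldest exception (timestamp 1000) falls outside the
configured window and is ignored, so only the exceptions at 4000 and 5000 would
be turned into new events.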