[ https://issues.apache.org/jira/browse/FLINK-26236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17494727#comment-17494727 ]
Gyula Fora commented on FLINK-26236:
------------------------------------
It seems the operator SDK provides out-of-the-box logic for retrying
errors and setting a custom status by implementing a simple interface.
We should probably use this directly instead of catching the errors and
setting the error status ourselves:
[https://javaoperatorsdk.io/docs/features]
{{public interface ErrorStatusHandler<T extends HasMetadata> {}}
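
To make the idea concrete, here is a minimal, self-contained sketch of capping retries and skipping retries for fatal errors. It deliberately does not use the operator SDK: RetrySketch, Status, FatalReconcileException, MAX_FAILED_ATTEMPTS and reconcileOnce are all hypothetical names made up for illustration, and the real ErrorStatusHandler method signature should be taken from the SDK docs linked above.

```java
public class RetrySketch {

    /** Hypothetical exception type for errors that must not be retried (e.g. config errors). */
    static class FatalReconcileException extends RuntimeException {
        FatalReconcileException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for the reconciliation status stored on the resource. */
    static class Status {
        int retryCount = 0;     // consecutive failed attempts
        String error = null;    // last error message, null if the last attempt succeeded
        boolean gaveUp = false; // true once no further retries will be scheduled
    }

    /** Assumed cap on consecutive failed attempts; the value is arbitrary. */
    static final int MAX_FAILED_ATTEMPTS = 3;

    /**
     * Runs a single reconcile attempt and records the outcome in the status.
     * Returns true if another attempt should be scheduled.
     */
    static boolean reconcileOnce(Runnable reconcile, Status status) {
        try {
            reconcile.run();
            // Success: clear any previous error state.
            status.retryCount = 0;
            status.error = null;
            status.gaveUp = false;
            return false;
        } catch (FatalReconcileException e) {
            // Fatal: record the error and never retry.
            status.error = e.getMessage();
            status.gaveUp = true;
            return false;
        } catch (RuntimeException e) {
            // Recoverable: count the failure and retry until the cap is reached.
            status.retryCount++;
            status.error = e.getMessage();
            status.gaveUp = status.retryCount >= MAX_FAILED_ATTEMPTS;
            return !status.gaveUp;
        }
    }

    public static void main(String[] args) {
        Status status = new Status();
        Runnable alwaysFails = () -> {
            throw new IllegalStateException("cluster unreachable");
        };
        int attempts = 0;
        do {
            attempts++;
        } while (reconcileOnce(alwaysFails, status));
        System.out.println(attempts + " attempts, gaveUp=" + status.gaveUp);
    }
}
```

With the SDK, the bookkeeping in the two catch branches would presumably live in the error status handler rather than in our own try/catch, which is the point of the suggestion above.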
> Track and cap retries in ReconciliationStatus
> ---------------------------------------------
>
> Key: FLINK-26236
> URL: https://issues.apache.org/jira/browse/FLINK-26236
> Project: Flink
> Issue Type: Sub-task
> Components: Kubernetes Operator
> Reporter: Gyula Fora
> Priority: Major
>
> At the moment we retry errors indefinitely. As suggested by
> [[email protected]] we should cap the number of retries (or the time spent
> retrying).
> For this we can include a retry count in the reconciliation status.
> We should also distinguish fatal errors (such as config errors) from
> recoverable ones with a different exception type; fatal errors should not
> be retried.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)