[
https://issues.apache.org/jira/browse/FLINK-36404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hong Liang Teoh updated FLINK-36404:
------------------------------------
Fix Version/s: prometheus-connector-1.0.0
> PrometheusSinkWriteException thrown by the response callback may not cause
> job to fail
> --------------------------------------------------------------------------------------
>
> Key: FLINK-36404
> URL: https://issues.apache.org/jira/browse/FLINK-36404
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Prometheus
> Affects Versions: prometheus-connector-1.0.0
> Reporter: Lorenzo Nicora
> Priority: Critical
> Fix For: prometheus-connector-1.0.0
>
>
> *Issue*
> A {{PrometheusSinkWriteException}} thrown by {{HttpResponseCallback}} does not
> cause the httpclient IOReactor to fail. The exception is swallowed, which
> prevents the job from failing.
> A related problem: exceptions from the IOReactor eventually cause the response
> callback's {{failed}} method to be called. Allowing the user to set
> DISCARD_AND_CONTINUE on generic exceptions thrown by the client may hide
> rethrown exceptions. There is also no real use for not failing on a generic
> unhandled exception from the client.
> *Solution*
> 1. Intercept {{PrometheusSinkWriteException}} further up the httpclient stack by
> adding to the client an {{IOSessionListener}} that can rethrow those exceptions,
> causing the reactor to actually fail and, consequently, the operator to
> fail as well.
> 2. Remove the ability to configure the error handling behaviour for generic
> exceptions thrown by the httpclient. The job should always fail.
> 3. When the httpclient IOReactor fails, a long chain of exceptions is logged.
> To keep the actual root cause evident, the response callback should log at
> ERROR level when the exception happens.
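
The swallowing behaviour and the listener-based fix described above can be illustrated with a minimal, stdlib-only Java sketch. It does not use the connector or httpclient APIs: {{SinkWriteException}}, {{FatalErrorListener}}, {{runCallback}} and {{write}} are hypothetical names invented for illustration, and a single-threaded {{ExecutorService}} stands in for the IOReactor. The only assumption carried over from the issue is that an exception thrown inside a callback running on the reactor thread never reaches the caller unless something intercepts it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class SwallowedCallbackSketch {

    /** Hypothetical stand-in for the connector's PrometheusSinkWriteException. */
    static class SinkWriteException extends RuntimeException {
        SinkWriteException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for a listener hooked into the client's I/O layer. */
    interface FatalErrorListener {
        void onException(Throwable t);
    }

    /**
     * Runs the callback on a worker thread, roughly the way an I/O reactor
     * would. Anything the callback throws is captured by the Future returned
     * from submit(), which nobody reads, so the caller never sees it unless
     * a listener is supplied.
     */
    static void runCallback(Runnable callback, FatalErrorListener listener) {
        ExecutorService reactor = Executors.newSingleThreadExecutor();
        reactor.submit(() -> {
            try {
                callback.run();
            } catch (RuntimeException e) {
                if (listener != null) {
                    listener.onException(e); // interception point
                }
                throw e; // still swallowed by the unread Future
            }
        });
        reactor.shutdown();
        try {
            reactor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    /** Returns the fatal error visible to the caller, or null if it was swallowed. */
    static Throwable write(boolean withListener) {
        AtomicReference<Throwable> fatal = new AtomicReference<>();
        Runnable failingCallback = () -> {
            throw new SinkWriteException("remote-write request rejected");
        };
        FatalErrorListener listener = withListener ? fatal::set : null;
        runCallback(failingCallback, listener);
        return fatal.get();
    }

    public static void main(String[] args) {
        System.out.println("without listener: " + write(false));
        System.out.println("with listener:    " + write(true));
    }
}
```

In this sketch the caller only learns about the failure when a listener is registered. A real writer would then rethrow the recorded exception so that the operator fails, which is the role the issue assigns to the {{IOSessionListener}}.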