[
https://issues.apache.org/jira/browse/KAFKA-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yash Mayya reassigned KAFKA-14353:
----------------------------------
Assignee: (was: Yash Mayya)
> Kafka Connect REST API configuration validation timeout improvements
> --------------------------------------------------------------------
>
> Key: KAFKA-14353
> URL: https://issues.apache.org/jira/browse/KAFKA-14353
> Project: Kafka
> Issue Type: Improvement
> Components: connect
> Reporter: Yash Mayya
> Priority: Minor
> Labels: kip-required
>
> Kafka Connect currently defines a default REST API request timeout of [90
> seconds|https://github.com/apache/kafka/blob/5e399fe6f3aa65b42b9cdbf1c4c53f6989a570f0/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectResource.java#L30].
> If a REST API request takes longer than this timeout value, a {{500 Internal
> Server Error}} response is returned with the message "Request timed out".
> The {{POST /connectors}} and {{PUT /connectors/\{connector}/config}}
> endpoints, which create or update connectors, internally validate the
> connector configuration (the details vary by connector plugin) before
> writing a message to the Connect cluster's config topic. If the
> configuration validation takes longer than 90 seconds, the connector is
> still eventually created once the validation completes, even though a
> {{500 Internal Server Error}} response has already been returned to the
> user. This makes for a fairly confusing user experience.
> Furthermore, the problem is exacerbated because config validation can occur
> twice for a single request. If Kafka Connect is running in distributed mode
> and the initial request is made to a worker that is not the leader, the
> request to create or update a connector is forwarded to the worker that is
> currently the leader of the group. In this case, the config validation occurs
> both on the initial worker and on the leader (assuming that the first
> validation succeeds). This means that if each config validation takes longer
> than 45 seconds, the original create / update connector request will time
> out.
> Slow config validations can occur in certain exceptional scenarios. Consider
> a database connector with elaborate validation logic that queries the
> information schema for a list of tables and views in order to validate the
> user's connector configuration. If the database has a very large number of
> tables and views and is under heavy query load, such information schema
> queries can take a long time to complete.
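
The two effects described in the issue can be illustrated with a small, self-contained Java sketch. This is not Kafka Connect code: the class name, method names, and response strings are hypothetical, only the 90-second timeout comes from the description above, and the model ignores everything except validation time (for example, the cost of forwarding the request to the leader). The first part shows the arithmetic behind the 45-second threshold; the second part, scaled down to milliseconds, shows how a caller can time out while the background work still finishes.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names exist in Kafka Connect.
public class ValidationTimeoutSketch {

    // The 90-second default REST request timeout described in the issue.
    static final Duration REQUEST_TIMEOUT = Duration.ofSeconds(90);

    // Validation runs once on the receiving worker and, if the request is
    // forwarded, once more on the leader (simplified: forwarding cost ignored).
    static Duration totalValidationTime(Duration perValidation, boolean forwardedToLeader) {
        return forwardedToLeader ? perValidation.multipliedBy(2) : perValidation;
    }

    static boolean requestTimesOut(Duration perValidation, boolean forwardedToLeader) {
        return totalValidationTime(perValidation, forwardedToLeader).compareTo(REQUEST_TIMEOUT) > 0;
    }

    // Simulates a request handler that waits on slow work with a timeout.
    // Returns {response seen by the caller, eventual outcome of the work}.
    static String[] simulate(long validationMillis, long timeoutMillis) {
        CompletableFuture<String> creation = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(validationMillis); // stands in for a slow config validation
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return "connector created";
        });

        String response;
        try {
            creation.get(timeoutMillis, TimeUnit.MILLISECONDS);
            response = "2xx success";
        } catch (TimeoutException e) {
            // Timing out the wait does not cancel the underlying work.
            response = "500 Internal Server Error: Request timed out";
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }

        // The work still runs to completion after the caller has given up.
        String outcome = creation.join();
        return new String[] {response, outcome};
    }

    public static void main(String[] args) {
        // 46 s per validation, forwarded to the leader: 92 s > 90 s.
        System.out.println(requestTimesOut(Duration.ofSeconds(46), true));  // true
        // Same validation on the leader itself: 46 s <= 90 s.
        System.out.println(requestTimesOut(Duration.ofSeconds(46), false)); // false

        String[] result = simulate(200, 50);
        System.out.println(result[0] + " / " + result[1]);
    }
}
```

In the simulation the caller sees an error response while the outcome is still "connector created", which is the mismatch the issue describes.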