[ 
https://issues.apache.org/jira/browse/KAFKA-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yash Mayya updated KAFKA-14353:
-------------------------------
    Priority: Minor  (was: Major)

> Kafka Connect REST API configuration validation timeout improvements
> --------------------------------------------------------------------
>
>                 Key: KAFKA-14353
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14353
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>            Reporter: Yash Mayya
>            Assignee: Yash Mayya
>            Priority: Minor
>              Labels: kip-required
>
> Kafka Connect currently defines a default REST API request timeout of [90 
> seconds|https://github.com/apache/kafka/blob/5e399fe6f3aa65b42b9cdbf1c4c53f6989a570f0/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectResource.java#L30].
> If a REST API request takes longer than this timeout, a {{500 Internal 
> Server Error}} response is returned with the message "Request timed out".
> The {{POST /connectors}} and {{PUT /connectors/\{connector}/config}} 
> endpoints, which are used to create or update connectors, validate the 
> connector configuration (the details of which vary depending on the 
> connector plugin) before writing a message to the Connect cluster's config 
> topic. If the configuration validation takes longer than 90 seconds, the 
> request fails with a {{500 Internal Server Error}} response, yet the 
> connector is still created once the config validation completes, which makes 
> for a confusing user experience.
> This situation is made worse because config validation can occur twice for 
> a single request. If Kafka Connect is running in distributed mode and the 
> initial request is made to a worker that is not the leader of the group, the 
> request is forwarded to the worker that is currently the leader. In this 
> case, the config validation occurs on both the initial worker and the leader 
> (assuming that the first config validation is successful). This means that 
> if a config validation takes longer than 45 seconds to complete each time, 
> the original create / update connector request will time out.
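
The arithmetic in the paragraph above can be sketched as a small model. This is only an illustration: it assumes the request time is dominated by config validation and ignores forwarding and config topic write latency. The function and parameter names are hypothetical and are not part of Kafka Connect.

```python
# Simplified model of the timeout behaviour described in the issue.
# Assumption: validation time dominates; network and config topic
# write latency are ignored.

DEFAULT_REST_REQUEST_TIMEOUT_SECONDS = 90.0  # default timeout cited in the issue


def request_times_out(validation_seconds: float,
                      forwarded_to_leader: bool,
                      timeout_seconds: float = DEFAULT_REST_REQUEST_TIMEOUT_SECONDS) -> bool:
    """Return True if the create / update request would exceed the REST timeout."""
    # A forwarded request is validated on the initial worker and again on the leader.
    validations = 2 if forwarded_to_leader else 1
    return validation_seconds * validations > timeout_seconds
```

Under this model, a 50 second validation fits within the timeout when the request reaches the leader directly, but exceeds it when the request is forwarded from another worker.
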
> Slow config validations can occur in certain exceptional scenarios. Consider 
> a database connector with elaborate validation logic that queries the 
> information schema for a list of tables and views in order to validate the 
> user's connector configuration. If the database has a very large number of 
> tables and views and is under heavy query load, such information schema 
> queries can be very slow to complete.
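
A client that receives the "Request timed out" error described above cannot tell from the response alone whether the connector will still be created. One possible client-side workaround, sketched here under assumptions, is to poll {{GET /connectors/\{connector}}} for a while after the error. The function names and retry parameters below are hypothetical and are not part of Kafka Connect.

```python
import time
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable


def connector_exists(base_url: str, name: str) -> bool:
    """Return True if GET /connectors/{name} responds with 200, False on 404."""
    url = f"{base_url.rstrip('/')}/connectors/{urllib.parse.quote(name, safe='')}"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # any other HTTP error is left to the caller


def wait_for_connector(exists: Callable[[], bool],
                       attempts: int = 12,
                       delay_seconds: float = 10.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until the connector shows up or the attempts run out."""
    for attempt in range(attempts):
        if exists():
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before the next check
    return False
```

A caller might use it as `wait_for_connector(lambda: connector_exists(worker_url, "my-connector"))`, where `worker_url` and the connector name are placeholders for the caller's own values. The check is passed in as a callable so that the polling loop can be exercised without a running Connect cluster.
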



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
