[
https://issues.apache.org/jira/browse/IGNITE-19910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ilya Shishkov updated IGNITE-19910:
-----------------------------------
Description:
Currently, in CDC through Kafka applications, a single timeout property
({{kafkaRequestTimeout}}) is used for all Kafka-related operations instead of
the built-in timeouts of the Kafka clients API (moreover, its default value of
3 seconds does not correspond to the Kafka clients' defaults):
||Client||Timeout||Default value, s||
|{{KafkaProducer}}|{{delivery.timeout.ms}}|120|
|{{KafkaProducer}}|{{request.timeout.ms}}|30|
|{{KafkaConsumer}}|{{default.api.timeout.ms}}|60|
|{{KafkaConsumer}}|{{request.timeout.ms}}|30|
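For reference, the defaults above correspond to the following client configuration keys. A minimal sketch (the property names are the real Kafka configuration keys; the values are the documented defaults, in milliseconds):

```java
import java.util.Properties;

public class KafkaTimeoutDefaults {
    // Producer-side timeouts with their documented Kafka defaults.
    static Properties producerDefaults() {
        Properties p = new Properties();
        p.put("delivery.timeout.ms", "120000"); // total time to report success/failure of send()
        p.put("request.timeout.ms", "30000");   // max wait for a single broker request
        return p;
    }

    // Consumer-side timeouts with their documented Kafka defaults.
    static Properties consumerDefaults() {
        Properties p = new Properties();
        p.put("default.api.timeout.ms", "60000"); // default for blocking APIs like commitSync()
        p.put("request.timeout.ms", "30000");
        return p;
    }

    public static void main(String[] args) {
        System.out.println(producerDefaults());
        System.out.println(consumerDefaults());
    }
}
```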
The table below describes the places where {{kafkaRequestTimeout}} is
_explicitly specified_ instead of relying on the default timeouts:
||CDC application||API||Default timeout||
|ignite-cdc.sh:
{{IgniteToKafkaCdcStreamer}}|{{KafkaProducer#send}}|{{delivery.timeout.ms}} *|
|kafka-to-ignite.sh:
{{KafkaToIgniteCdcStreamerApplier}}|{{KafkaConsumer#commitSync}}|{{default.api.timeout.ms}}|
|kafka-to-ignite.sh:
{{KafkaToIgniteCdcStreamerApplier}}|{{KafkaConsumer#close}}|{{KafkaConsumer#DEFAULT_CLOSE_TIMEOUT_MS}}
(30s)|
|kafka-to-ignite.sh:
{{KafkaToIgniteMetadataUpdater}}|{{KafkaConsumer#partitionsFor}}|{{default.api.timeout.ms}}|
|kafka-to-ignite.sh:
{{KafkaToIgniteMetadataUpdater}}|{{KafkaConsumer#endOffsets}}|{{request.timeout.ms}}|
\* - the call waits for the future for the specified timeout
({{kafkaRequestTimeout}}), but the future fails by itself if the delivery
timeout is exceeded.
All of the above methods fail with an exception when the specified timeout is
exceeded, so the timeout _should not be too low for them_.
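The double-timeout situation from the footnote above can be illustrated with a plain {{CompletableFuture}} as a stand-in for the future returned by {{KafkaProducer#send}} (the class and method names here are illustrative only, not Kafka API):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class FutureTimeoutSketch {
    // Waits on the future with an external timeout, as the streamer does with
    // kafkaRequestTimeout. Returns true if the caller gave up before the future
    // completed or failed on its own (as it would via delivery.timeout.ms).
    static boolean waitTimedOut(CompletableFuture<?> sendFuture, long timeoutMs) {
        try {
            sendFuture.get(timeoutMs, TimeUnit.MILLISECONDS);
            return false; // completed within the external timeout
        } catch (TimeoutException e) {
            return true;  // external timeout fired; the future itself is still pending
        } catch (InterruptedException | ExecutionException e) {
            return false; // the future finished (failed) on its own before the external timeout
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Void> pendingSend = new CompletableFuture<>(); // never completes here
        System.out.println(waitTimedOut(pendingSend, 100)); // true: the caller times out first
        System.out.println(pendingSend.isDone());           // false: the "delivery timeout" has not fired yet
    }
}
```

So two independent timeouts race: whichever of {{kafkaRequestTimeout}} and the future's own delivery timeout fires first decides how the operation fails.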
On the other hand, kafka-to-ignite.sh also invokes {{KafkaConsumer#poll}} with
the {{kafkaRequestTimeout}} timeout, but {{#poll}} simply waits for data until
the specified timeout expires. So, {{#poll}} _should be called quite often_ and
we _should not set too large a timeout_ for it; otherwise, we can face
replication delays when some topic partitions have no new data. This is
undesirable behavior, because such partitions will wait to be processed.
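To illustrate why a large {{#poll}} timeout delays the loop, here is a sketch using {{BlockingQueue#poll}} as a stdlib stand-in for {{KafkaConsumer#poll}} (the analogy and helper name are ours, not Kafka API): the call returns as soon as data is available, but blocks for the full timeout when there is nothing new.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class PollTimeoutSketch {
    // Polls the queue with a timeout and measures how long the call blocked.
    static long pollMillis(BlockingQueue<String> q, long timeoutMs) {
        long start = System.nanoTime();
        try {
            q.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        BlockingQueue<String> withData = new LinkedBlockingQueue<>();
        withData.add("record");
        // Data present: poll returns almost immediately, even with a large timeout.
        System.out.println(pollMillis(withData, 1_000));
        // Empty "partition": poll blocks for the whole timeout before returning.
        System.out.println(pollMillis(new LinkedBlockingQueue<>(), 200));
    }
}
```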
*Retries:*
{{request.timeout.ms}} [2, 4] is used as the maximum timeout for each single
request. In case of a timeout, the request is retried. The minimal number of
retries equals the ratio of the total operation timeout (explicitly set as an
argument, or the default) to {{request.timeout.ms}}.
Obviously, {{kafkaRequestTimeout}} currently has to be N times greater than
{{request.timeout.ms}} in order to make request retries possible.
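Under that reasoning, the minimal retry count is a simple integer ratio. A sketch with the values from the tables above ({{minRetries}} is an illustrative helper, not an existing API):

```java
public class RetryMath {
    // Minimal number of request attempts that fit into a total operation timeout:
    // totalTimeout / request.timeout.ms, truncated.
    static long minRetries(long totalTimeoutMs, long requestTimeoutMs) {
        return totalTimeoutMs / requestTimeoutMs;
    }

    public static void main(String[] args) {
        // Consumer defaults: default.api.timeout.ms = 60 s, request.timeout.ms = 30 s,
        // i.e. two attempts fit into one blocking call.
        System.out.println(minRetries(60_000, 30_000)); // 2
        // Current CDC default: kafkaRequestTimeout = 3 s < request.timeout.ms = 30 s,
        // so not even one full request attempt fits, and no retry is possible.
        System.out.println(minRetries(3_000, 30_000)); // 0
    }
}
```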
*Conclusion:*
# It seems that the better approach is to rely on the Kafka clients' timeouts,
because they provide all the functionality necessary to perform retries and
handle timeout issues.
# {{kafkaRequestTimeout}} should be used only for {{KafkaConsumer#poll}}; its
default value of 3 s can remain the same.
----
Links:
# https://kafka.apache.org/27/documentation.html#producerconfigs_delivery.timeout.ms
# https://kafka.apache.org/27/documentation.html#producerconfigs_request.timeout.ms
# https://kafka.apache.org/27/documentation.html#consumerconfigs_default.api.timeout.ms
# https://kafka.apache.org/27/documentation.html#consumerconfigs_request.timeout.ms
> CDC through Kafka: refactor timeouts
> ------------------------------------
>
> Key: IGNITE-19910
> URL: https://issues.apache.org/jira/browse/IGNITE-19910
> Project: Ignite
> Issue Type: Task
> Components: extensions
> Reporter: Ilya Shishkov
> Priority: Minor
> Labels: IEP-59, ise
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)