[ https://issues.apache.org/jira/browse/IGNITE-19369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nikolay Izhikov updated IGNITE-19369:
-------------------------------------
Priority: Blocker (was: Minor)
> Metadata topic offset must be stored only after commit
> ------------------------------------------------------
>
> Key: IGNITE-19369
> URL: https://issues.apache.org/jira/browse/IGNITE-19369
> Project: Ignite
> Issue Type: Improvement
> Components: extensions
> Reporter: Nikolay Izhikov
> Assignee: Nikolay Izhikov
> Priority: Blocker
> Labels: IEP-59, ise
>
> Currently, when CDC through Kafka is used, replication between clusters can
> be delayed when {{KafkaToIgniteCdcStreamerApplier}} tries to update binary
> metadata and marshaller mappings.
> The delays are caused by calls to {{KafkaConsumer#poll}} in
> {{KafkaToIgniteMetadataUpdater#updateMetadata}} when the metadata topic is
> empty:
> # When the first {{KafkaToIgniteCdcStreamerApplier}} encounters
> {{META_UPDATE_MARKER}}, it calls
> {{KafkaToIgniteMetadataUpdater#updateMetadata}}, which in turn calls
> {{KafkaConsumer#poll}}, which returns immediately [1] when data is present in
> the metadata topic. If there are only a few binary types and mappings to
> update, the first {{KafkaToIgniteCdcStreamerApplier}} will consume all
> entries from the metadata topic.
> # {{KafkaToIgniteCdcStreamerApplier}} then calls
> {{KafkaToIgniteMetadataUpdater#updateMetadata}} for each partition with a
> meta update marker. Because the metadata topic is now empty, every
> subsequent call waits for {{kafkaReqTimeout}}.
> # There is also a bottleneck when multiple applier threads try to update
> metadata and call the synchronized method
> {{KafkaToIgniteMetadataUpdater#updateMetadata}}, because
> {{KafkaToIgniteMetadataUpdater}} is shared between applier threads.
> # Because {{META_UPDATE_MARKER}} is sent twice to each Kafka partition of
> the event topic from every node (first for type mapping updates, then for
> binary type updates), delays of up to
> {{clusterSize x (topicPartitions x 2 - 1) x kafkaReqTimeout}} are possible.
> # Data updates are blocked for Kafka partitions with unprocessed update
> markers.
> # For example, with the default timeout and 16 Kafka partitions, the _last
> partition will be consumed after 1.5 minutes_ in the case of two one-node
> clusters.
> Links:
> # [https://kafka.apache.org/27/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#poll-java.time.Duration-]
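
The worst-case estimate in the description can be checked with a short calculation. The sketch below is illustrative only: the class and method names are hypothetical, {{clusterSize}} is read as the number of nodes in the source cluster, and the 3-second {{kafkaReqTimeout}} is an assumption chosen because it is consistent with the quoted figure of about 1.5 minutes for 16 partitions.

```java
import java.time.Duration;

public class MetaUpdateDelay {
    /**
     * Worst-case delay from the issue description:
     * clusterSize x (topicPartitions x 2 - 1) x kafkaReqTimeout.
     */
    public static Duration worstCaseDelay(int clusterSize, int topicPartitions, Duration kafkaReqTimeout) {
        // Each partition receives two markers per node; only the very first
        // poll finds data, so the remaining polls each wait the full timeout.
        long blockingPolls = (long) clusterSize * (topicPartitions * 2L - 1);

        return kafkaReqTimeout.multipliedBy(blockingPolls);
    }

    public static void main(String[] args) {
        // Assumed timeout of 3 seconds; not stated in the issue text.
        Duration timeout = Duration.ofSeconds(3);

        // One-node source cluster, 16 Kafka partitions.
        Duration delay = worstCaseDelay(1, 16, timeout);

        System.out.println(delay.getSeconds() + " s"); // prints "93 s"
    }
}
```

Under these assumptions the result is 31 blocking polls, or 93 seconds, which matches the roughly 1.5 minutes mentioned in the example.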
--
This message was sent by Atlassian Jira
(v8.20.10#820010)