[ https://issues.apache.org/jira/browse/KAFKA-12254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dhruvil Shah updated KAFKA-12254:
---------------------------------
    Description: 
`MirrorSourceConnector` implements the logic for replicating data, 
configurations, and other metadata between the source and destination clusters. 
This includes the tasks below:
 # `refreshTopicPartitions` for syncing topics / partitions from source to 
destination.
 # `syncTopicConfigs` for syncing topic configurations from source to 
destination.

A limitation is that `computeAndCreateTopicPartitions` creates topics with 
default configurations on the destination cluster. A separate async task, 
`syncTopicConfigs`, is responsible for syncing the topic configs. Until that 
sync runs, topic configurations can differ between the two clusters.

In the worst case, this could lead to data loss, e.g. when a compacted topic 
being mirrored between clusters is incorrectly created on the destination with 
the default configuration of `cleanup.policy = delete` and keeps that setting 
until the configurations are synced via `syncTopicConfigs`.

Here is an example of the divergence. Note that `cleanup.policy=compact` is 
missing on the destination topic:

Source Topic:

```

Topic: foobar PartitionCount: 1 ReplicationFactor: 1 Configs: cleanup.policy=compact,segment.bytes=1073741824

```

Destination Topic:

```

Topic: A.foobar PartitionCount: 1 ReplicationFactor: 1 Configs: segment.bytes=1073741824

```

A safer approach is to ensure that the right configurations are set on the 
destination cluster before data is replicated to it.
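
To illustrate the idea, here is a minimal sketch in plain Python. It is not 
MirrorMaker code and uses no Kafka client; every function name below 
(`remote_topic_name`, `plan_topic_creation`, `config_divergence`) is 
hypothetical. It assumes the source alias is `A`, as suggested by the 
`A.foobar` example above, and shows (1) how the divergence could be detected 
and (2) a creation request that carries the source configs up front, so the 
destination topic would never exist with default settings.

```python
# Hypothetical sketch: none of these names come from MirrorMaker itself.

def remote_topic_name(source_alias, topic):
    """Build the 'alias.topic' name seen in the example (A.foobar)."""
    return f"{source_alias}.{topic}"


def plan_topic_creation(source_alias, topic, partitions, source_configs,
                        excluded=frozenset()):
    """Describe a topic creation that includes the source configs.

    `excluded` is an assumed knob for properties that should not be copied.
    """
    configs = {k: v for k, v in source_configs.items() if k not in excluded}
    return {
        "name": remote_topic_name(source_alias, topic),
        "partitions": partitions,
        "configs": configs,
    }


def config_divergence(source_configs, destination_configs):
    """Return {key: (source value, destination value)} for differing keys."""
    keys = source_configs.keys() | destination_configs.keys()
    return {
        k: (source_configs.get(k), destination_configs.get(k))
        for k in keys
        if source_configs.get(k) != destination_configs.get(k)
    }


# Configs taken from the example in this description.
source = {"cleanup.policy": "compact", "segment.bytes": "1073741824"}
destination_today = {"segment.bytes": "1073741824"}

# Current behavior: the compaction policy is absent on the destination.
print(config_divergence(source, destination_today))

# Proposed behavior: the creation request already carries the source configs.
request = plan_topic_creation("A", "foobar", 1, source)
print(request["name"], request["configs"])
```

In a real fix, the `configs` mapping would be passed along with the topic 
creation call rather than applied later by the async sync task.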


> MirrorMaker 2.0 creates destination topic with default configs
> --------------------------------------------------------------
>
>                 Key: KAFKA-12254
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12254
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Dhruvil Shah
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
