[ 
https://issues.apache.org/jira/browse/SOLR-18077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SOLR-18077:
----------------------------------
    Labels: pull-request-available  (was: )

> CrossDC Consumer - out-of-order Kafka partition processing
> ----------------------------------------------------------
>
>                 Key: SOLR-18077
>                 URL: https://issues.apache.org/jira/browse/SOLR-18077
>             Project: Solr
>          Issue Type: Bug
>          Components: module - crossDC
>    Affects Versions: 9.10
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When mirrored requests are submitted to Kafka in {{KafkaMirroringSink}}, the 
> default partitioner ({{BuiltInPartitioner}}) is used, which submits 
> messages to partitions in batches, switching between partitions in a 
> round-robin fashion.
> The same partitioner will be used (see below) by the MirrorMaker when adding 
> messages to the target Kafka topic. Then the 
> {{KafkaCrossDcConsumer.pollAndProcessRequests()}} method retrieves new 
> records, BUT it iterates over partitions in an essentially arbitrary order 
> because {{ConsumerRecords.partitions}} is a HashMap.
> This means that the batches of messages retrieved from multiple partitions 
> are no longer necessarily in the same order as they were submitted. If 
> requests in these batches from multiple partitions refer to the same 
> collection then they may be applied out of order, leading to data divergence.
> One possible solution is to explicitly use a different partitioning scheme 
> when submitting messages from {{KafkaMirroringSink}}. This happens 
> automatically when the {{ProducerRecord}} key is explicitly set, and we can use 
> the {{collection}} name as the key. This way all requests for the same 
> collection will end up in the same partition, thus preserving the ordering.
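
To illustrate the proposed fix, here is a minimal, Kafka-free sketch of key-based
partitioning. It is not the code from the pull request. The class name
KeyedPartitionSketch, the methods partitionFor and route, and the sample
collection names are all hypothetical. Kafka hashes the serialized key bytes with
murmur2; String.hashCode() is used below only as a stand-in to keep the example
dependency-free. In the real producer the key would presumably be supplied via
the ProducerRecord(topic, key, value) constructor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class KeyedPartitionSketch {

    // Stand-in for key-based partitioning: a deterministic hash of the key,
    // reduced modulo the partition count. Kafka applies murmur2 to the
    // serialized key bytes; String.hashCode() is an assumption made for brevity.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    // Appends each request to the partition chosen by its collection name.
    // Each request is a two-element array: {collection, payload}.
    static Map<Integer, List<String[]>> route(List<String[]> requests, int numPartitions) {
        Map<Integer, List<String[]>> partitions = new TreeMap<>();
        for (String[] request : requests) {
            int p = partitionFor(request[0], numPartitions);
            partitions.computeIfAbsent(p, k -> new ArrayList<>()).add(request);
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Hypothetical mirrored requests, listed in submission order.
        List<String[]> requests = List.of(
            new String[] {"products", "add doc1"},
            new String[] {"users", "add u1"},
            new String[] {"products", "delete doc1"},
            new String[] {"users", "update u1"});

        Map<Integer, List<String[]>> partitions = route(requests, 4);
        for (Map.Entry<Integer, List<String[]>> e : partitions.entrySet()) {
            for (String[] r : e.getValue()) {
                System.out.println("partition " + e.getKey() + ": " + r[0] + " -> " + r[1]);
            }
        }
    }
}
```

Because every request for a given collection lands in a single partition, and
Kafka preserves order within a partition, the order in which the consumer visits
partitions no longer affects the relative order of requests for that collection.
Requests for different collections may still interleave differently, which is
harmless for this issue.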



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
