[ https://issues.apache.org/jira/browse/KAFKA-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ewen Cheslack-Postava resolved KAFKA-3810.
------------------------------------------
    Resolution: Fixed
 Fix Version/s: 0.10.1.0

Issue resolved by pull request 1484
[https://github.com/apache/kafka/pull/1484]

> replication of internal topics should not be limited by replica.fetch.max.bytes
> -------------------------------------------------------------------------------
>
>                 Key: KAFKA-3810
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3810
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Onur Karaman
>            Assignee: Onur Karaman
>             Fix For: 0.10.1.0
>
>
> From the kafka-dev mailing list discussion: [\[DISCUSS\] scalability limits in the coordinator|http://mail-archives.apache.org/mod_mbox/kafka-dev/201605.mbox/%3ccamquqbzddtadhcgl6h4smtgo83uqt4s72gc03b3vfghnme3...@mail.gmail.com%3E]
> There's a scalability limit on the new consumer / coordinator regarding the amount of group metadata we can fit into one message. This restricts a combination of consumer group size, topic subscription sizes, topic assignment sizes, and any remaining member metadata.
> Under more strenuous use cases like mirroring clusters with thousands of topics, this limitation can be reached even after applying gzip to the __consumer_offsets topic.
> Various options were proposed in the discussion:
> # Config change: reduce the number of consumers in the group. This isn't always a realistic answer in more strenuous use cases like MirrorMaker clusters or for auditing.
> # Config change: split the group into smaller groups which together cover the full set of topics. This gives each group member a smaller subscription (ex: g1 has topics starting with a-m while g2 has topics starting with n-z). This would be operationally painful to manage.
> # Config change: split the topics among members of the group. Again, this gives each group member a smaller subscription. This would also be operationally painful to manage.
> # Config change: bump up KafkaConfig.messageMaxBytes (a topic-level config) and KafkaConfig.replicaFetchMaxBytes (a broker-level config). Applying messageMaxBytes to just the __consumer_offsets topic seems relatively harmless, but bumping up the broker-level replicaFetchMaxBytes would probably need more attention (see the config sketch after this list).
> # Config change: try different compression codecs. Based on 2 minutes of googling, it seems like lz4 and snappy are faster than gzip but have worse compression, so this probably won't help.
> # Implementation change: support sending the regex over the wire instead of the fully expanded topic subscriptions. I think people said in the past that different languages have subtle differences in regex, so this doesn't play nicely with cross-language groups.
> # Implementation change: maybe we can reverse the mapping? Instead of mapping from member to subscriptions, we can map a subscription to a list of members (see the Java sketch after this list).
> # Implementation change: maybe we can try to break apart the subscription and assignments from the same SyncGroupRequest into multiple records? They can still go to the same message set and get appended together. This way the limit becomes the segment size, which shouldn't be a problem. This can be tricky to get right because we're currently keying these messages on the group, so records from the same rebalance might accidentally compact one another, but my understanding of compaction isn't that great.
> # Implementation change: try to apply some tricks on the assignment serialization to make it smaller.
> # Config and Implementation change: bump up the __consumer_offsets topic messageMaxBytes and (from [~junrao]) fix how we deal with the case when a message is larger than the fetch size. Today, if a message is larger than the fetch size, the consumer will get stuck. Instead, we can simply return the full message if it's larger than the fetch size, without requiring the consumer to manually adjust the fetch size.
> # Config and Implementation change: same as above, but only apply the special fetch logic when fetching from internal topics.
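
For concreteness, here is a minimal sketch of the config-change workaround from option 4. The ZooKeeper address and the 2 MB value below are illustrative assumptions, not values from the discussion; the topic-level override for __consumer_offsets is max.message.bytes, and the corresponding broker-level fetch limit is replica.fetch.max.bytes:

    # Topic-level override on the internal offsets topic (illustrative 2 MB):
    bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
      --entity-type topics --entity-name __consumer_offsets \
      --add-config max.message.bytes=2097152

    # Broker-level setting in server.properties; must be at least as large as
    # the biggest message any replica has to fetch:
    replica.fetch.max.bytes=2097152

Note that the broker-level bump applies to replica fetches for every topic, which is the motivation for this issue's title: replication of internal topics should not be limited by replica.fetch.max.bytes.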
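
And a rough sketch of the reverse-mapping idea from option 7, in plain Java (the class and method names are hypothetical, not Kafka code): keying the metadata by subscription instead of by member stores each distinct topic set once, which helps most when many members, as in a MirrorMaker group, share an identical subscription.

    import java.util.*;

    // Hypothetical illustration of option 7: invert member -> topics into
    // topics -> members so that identical subscriptions are stored once.
    public class SubscriptionIndex {
        static Map<Set<String>, List<String>> invert(Map<String, Set<String>> memberToTopics) {
            Map<Set<String>, List<String>> topicsToMembers = new HashMap<>();
            for (Map.Entry<String, Set<String>> e : memberToTopics.entrySet())
                topicsToMembers.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
            return topicsToMembers;
        }

        public static void main(String[] args) {
            // In a mirroring group, every member subscribes to the same
            // (potentially huge) topic set.
            Set<String> allTopics = new TreeSet<>(Arrays.asList("topicA", "topicB", "topicC"));
            Map<String, Set<String>> memberToTopics = new HashMap<>();
            memberToTopics.put("consumer-1", allTopics);
            memberToTopics.put("consumer-2", allTopics);
            // Prints a single entry mapping the shared topic set to both members.
            System.out.println(invert(memberToTopics));
        }
    }

With N members sharing one subscription, the inverted form stores the topic list once plus N member ids, rather than N copies of the topic list.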