[jira] [Commented] (KAFKA-3015) Improve JBOD data balancing

Joe Stein (JIRA) Tue, 22 Dec 2015 22:23:02 -0800

    [ 
https://issues.apache.org/jira/browse/KAFKA-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069215#comment-15069215
 ]


Joe Stein commented on KAFKA-3015:
----------------------------------

Can we do both this and KIP-18 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-18+-+JBOD+Support at the 
same time, and provide a per-topic option for which one a topic uses? I haven't 
looked at KAFKA-2188 in a while, but if that is also a good direction for folks 
we should talk about picking it back up too. It's a little stale, but it could 
be revived with a rebase, reviews, and fixes if folks need Kafka brokers to 
stay up on disk failure without RAID. So there would be at least 3 parts to 
this. There may be other items in the "JBOD" realm folks want to work on too.

> Improve JBOD data balancing
> ---------------------------
>
>                 Key: KAFKA-3015
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3015
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Jay Kreps
>
> When running with multiple data directories (i.e. JBOD) we currently place 
> partitions entirely within one data directory. This tends to lead to poor 
> balancing across disks as some topics have more throughput/retention and not 
> all disks get data from all topics. You can't fix this problem with smarter 
> partition placement strategies because ultimately you don't know, when a 
> partition is created, when or how heavily it will be used (this is a subtle 
> point, and the tendency is to try to think of some more sophisticated way to 
> place partitions based on current data size, but this is actually 
> exceptionally dangerous and can lead to much worse imbalance when creating 
> many partitions at once, as they would all go to the disk with the least 
> data). We don't support online rebalancing across directories/disks so this 
> imbalance is a big problem and limits the usefulness of this configuration. 
> Implementing online rebalancing of data across disks without downtime is 
> actually quite hard and requires lots of I/O since you have to actually 
> rewrite full partitions of data.
> An alternative would be to place each partition in *all* directories/drives 
> and round-robin *segments* within the partition across the directories. So 
> the layout would be something like:
>   drive-a/mytopic-0/
>       0000000.data
>       0000000.index
>       0024680.data
>       0024680.index
>   drive-b/mytopic-0/
>       0012345.data
>       0012345.index
>       0036912.data
>       0036912.index
> This is a little harder to implement than the current approach but not very 
> hard, and it is a lot easier than implementing online data balancing across 
> disks while retaining the current approach. I think this could easily be done 
> in a backwards compatible way.
> I think the balancing you would get from this in most cases would be good 
> enough to make JBOD the default configuration. Thoughts?

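The quoted warning about size-based placement can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names and disk sizes are hypothetical, and it assumes new partitions are empty when created, so disk sizes do not change while a batch is being placed. It is not Kafka's actual placement code.

```python
def place_by_least_data(disk_bytes, num_partitions):
    """Assign each new partition to the disk currently holding the least data.

    Assumption: new partitions are empty at creation time, so disk_bytes
    stays constant during the batch and every partition picks the same disk.
    """
    counts = [0] * len(disk_bytes)
    for _ in range(num_partitions):
        target = min(range(len(disk_bytes)), key=lambda i: disk_bytes[i])
        counts[target] += 1
    return counts


def place_round_robin(num_disks, num_partitions):
    """Size-agnostic rotation across disks, shown only for comparison."""
    counts = [0] * num_disks
    for p in range(num_partitions):
        counts[p % num_disks] += 1
    return counts
```

With three disks holding 500, 300, and 100 units, `place_by_least_data([500, 300, 100], 9)` returns `[0, 0, 9]`: all nine new partitions go to the emptiest disk. `place_round_robin(3, 9)` returns `[3, 3, 3]`.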

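The proposal to round-robin segments of one partition across all data directories could be sketched roughly as follows. The helper names, the 7-digit zero-padded file names, and the `.data`/`.index` suffixes simply mirror the illustrative layout in the quoted description, with `drive-a` and `drive-b` as example directory names. None of this is taken from Kafka's code.

```python
from collections import defaultdict


def segment_directory(data_dirs, segment_index):
    """Pick the data directory for the Nth segment of a partition by rotation."""
    return data_dirs[segment_index % len(data_dirs)]


def layout(data_dirs, partition, base_offsets):
    """Map each '<dir>/<partition>/' path to the segment files it would hold."""
    result = defaultdict(list)
    for index, base in enumerate(sorted(base_offsets)):
        directory = segment_directory(data_dirs, index)
        name = f"{base:07d}"  # hypothetical naming, mirrors the quoted example
        result[f"{directory}/{partition}/"] += [f"{name}.data", f"{name}.index"]
    return dict(result)
```

Calling `layout(["drive-a", "drive-b"], "mytopic-0", [0, 12345, 24680, 36912])` places the segments starting at offsets 0 and 24680 under `drive-a/mytopic-0/` and those starting at 12345 and 36912 under `drive-b/mytopic-0/`. Because consecutive segments alternate directories, each disk receives a share of every partition's data regardless of how heavily the partition is used.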

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
