b-goyal commented on PR #484: URL: https://github.com/apache/kafka/pull/484#issuecomment-1997293906
Currently, when a log is created, the broker counts the partitions in each data directory and chooses the directory with the fewest partitions. However, the sizes of different TopicPartitions vary widely, so usage differs greatly between logDirs. Since each logDir usually corresponds to one disk, disk usage across disks becomes very unbalanced.

A possible solution is to reassign partitions from high-usage logDirs to low-usage logDirs. I changed the format of /admin/reassign_partitions and added a replicaDirs field. During partition reassignment, when the broker's LogManager.createLog() is invoked, the specified logDir is chosen if a replicaDir is given; otherwise the logDir with the fewest partitions is chosen.

The old /admin/reassign_partitions:

```
{
  "version": 1,
  "partitions": [
    {
      "topic": "Foo",
      "partition": 1,
      "replicas": [1, 2, 3]
    }
  ]
}
```

The new /admin/reassign_partitions:

```
{
  "version": 1,
  "partitions": [
    {
      "topic": "Foo",
      "partition": 1,
      "replicas": [1, 2, 3],
      "replicaDirs": {
        "1": "/data1/kafka_data",
        "3": "/data10/kafka_data"
      }
    }
  ]
}
```
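To make the selection rule concrete, here is a minimal sketch in Python rather than the broker's Scala code. The function `choose_log_dir`, the partition counts, and the check that a requested directory is one of the broker's configured logDirs are all assumptions for illustration, not part of the patch. It also assumes that a broker with no entry in replicaDirs (broker 2 in the example) falls back to the fewest-partitions rule.

```python
import json

# Hypothetical helper, NOT Kafka's actual LogManager code.
def choose_log_dir(partition_counts, broker_id, replica_dirs=None):
    """Pick the log directory for a new replica on the given broker.

    partition_counts: maps each configured logDir to its current partition count.
    replica_dirs: optional "replicaDirs" mapping (broker id as string -> logDir).
    """
    requested = (replica_dirs or {}).get(str(broker_id))
    if requested is not None:
        # Assumption: a directory the broker is not configured with is rejected.
        if requested not in partition_counts:
            raise ValueError(f"unknown log directory: {requested}")
        return requested
    # Fallback (existing behaviour): the directory with the fewest partitions.
    return min(partition_counts, key=partition_counts.get)


reassignment = json.loads("""
{"version": 1,
 "partitions": [
   {"topic": "Foo", "partition": 1, "replicas": [1, 2, 3],
    "replicaDirs": {"1": "/data1/kafka_data", "3": "/data10/kafka_data"}}
 ]}
""")
entry = reassignment["partitions"][0]

# Made-up partition counts for the broker's two log directories.
counts = {"/data1/kafka_data": 12, "/data10/kafka_data": 3}

dir_broker1 = choose_log_dir(counts, 1, entry.get("replicaDirs"))  # explicit entry
dir_broker2 = choose_log_dir(counts, 2, entry.get("replicaDirs"))  # no entry, fallback
```

With the old format there is no replicaDirs key, so `entry.get("replicaDirs")` is `None` and every broker takes the fallback path, which keeps the change backward compatible in this sketch.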