b-goyal commented on PR #484:
URL: https://github.com/apache/kafka/pull/484#issuecomment-1997293906
Currently, when a log is created, its directory is chosen by counting the
partitions in each data directory and picking the directory with the fewest
partitions.
However, the sizes of different TopicPartitions vary widely, so usage can
differ greatly between logDirs. Since each logDir usually corresponds to a
disk, disk usage across disks ends up very imbalanced.
One possible solution is to reassign partitions from high-usage logDirs to
low-usage logDirs. I changed the format of /admin/reassign_partitions to add a
replicaDirs field. During partition reassignment, when the broker's
LogManager.createLog() is invoked, if a replicaDir is specified, the specified
logDir is chosen; otherwise the logDir with the fewest partitions is
chosen.
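To make the selection rule concrete, here is a minimal sketch in Python (not the Scala code in this PR). The function name `choose_log_dir`, its parameters, and the validation of an unknown directory are assumptions made for illustration; only the "use the specified dir, else fall back to fewest partitions" rule comes from the description above.

```python
from typing import Dict, Optional


def choose_log_dir(partition_counts: Dict[str, int],
                   requested_dir: Optional[str] = None) -> str:
    """Pick a log directory for a new replica.

    partition_counts maps each configured logDir to the number of
    partitions it currently holds (hypothetical input shape).
    """
    if requested_dir is not None:
        # Assumption: a directory the broker does not know about is
        # rejected; the PR may handle this case differently.
        if requested_dir not in partition_counts:
            raise ValueError(f"unknown log dir: {requested_dir}")
        return requested_dir
    # Fallback: existing behaviour, the dir with the fewest partitions.
    return min(partition_counts, key=partition_counts.get)
```

For example, `choose_log_dir({"/data1": 5, "/data2": 2})` falls back to `/data2`, while passing `"/data1"` as `requested_dir` overrides the count-based choice.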
The old /admin/reassign_partitions format:
```
{"version":1,
"partitions":
[
{
"topic" : "Foo",
"partition": 1,
"replicas": [1, 2, 3]
}
]
}
```
The new /admin/reassign_partitions format:
```
{"version":1,
"partitions":
[
{
"topic" : "Foo",
"partition": 1,
"replicas": [1, 2, 3],
"replicaDirs": {"1":"/data1/kafka_data", "3":"/data10/kafka_data" }
}
]
}
```
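As a rough illustration of how a reader of this node might consume the new field, the Python sketch below looks up the requested directory for one broker. The helper name `requested_dir_for` is hypothetical, and treating a missing `replicaDirs` entry as "no preference" (so old-format data still parses) is an assumption, not something this PR states.

```python
import json
from typing import Optional

# Sample payload in the new format, with illustrative paths.
EXAMPLE = """
{"version":1,
 "partitions": [
   {"topic": "Foo", "partition": 1, "replicas": [1, 2, 3],
    "replicaDirs": {"1": "/data1/kafka_data", "3": "/data10/kafka_data"}}
 ]}
"""


def requested_dir_for(reassignment_json: str, topic: str,
                      partition: int, broker_id: int) -> Optional[str]:
    """Return the logDir requested for this replica, or None."""
    data = json.loads(reassignment_json)
    for entry in data["partitions"]:
        if entry["topic"] == topic and entry["partition"] == partition:
            # JSON object keys are strings, so broker ids are looked up
            # as strings; a missing field means no directory preference.
            return entry.get("replicaDirs", {}).get(str(broker_id))
    return None
```

With this payload, brokers 1 and 3 get an explicit directory, while broker 2 has no entry and would fall back to the fewest-partitions rule.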