Hi all,
I’ve recently noticed that our broker log.dirs are using up different amounts
of storage. We use JBOD for our brokers, with 12 log.dirs, 1 on each disk.
One of our topics is larger than the others, and has 12 partitions.
Replication factor is 3, and we have 4 brokers. Each broker then has to store
9 partitions for this topic (12*3/4 == 9).
I guess I had originally assumed that Kafka would be smart enough to spread
partitions for a given topic across each of the log.dirs as evenly as it could.
However, on some brokers this one topic has 2 partitions in a single log.dir,
meaning that the storage taken up on a single disk by this topic on those
brokers is twice what it should be.
e.g.
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 1.8T 1.2T 622G 66% /var/spool/kafka/a
/dev/sdb3 1.8T 1.7T 134G 93% /var/spool/kafka/b
…
$ du -sh /var/spool/kafka/{a,b}/data/webrequest_upload-*
501G a/data/webrequest_upload-4
500G b/data/webrequest_upload-11
501G b/data/webrequest_upload-8
This also means that those over-populated disks have more writes to do. My I/O
is imbalanced!
This is sort of documented at http://kafka.apache.org/documentation.html:
"If you configure multiple data directories partitions will be assigned
round-robin to data directories. Each partition will be entirely in one of the
data directories. If data is not well balanced among partitions this can lead
to load imbalance between disks."
But my data is well balanced among partitions! It’s just that multiple
partitions are assigned to a single disk.
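To make that distinction concrete, here's a toy calculation using the sizes from the du output above (the per-dir assignment mirrors what I'm seeing on one broker; sizes rounded for illustration):

```python
# Illustrative only: equal-sized partitions can still produce unequal
# per-disk usage when two of them land in the same log.dir.
PARTITION_SIZE_GB = 500  # roughly what `du -sh` shows above

# Assignment on one broker: dir "b" got two partitions of this topic,
# dir "a" got one.
assignment = {
    "/var/spool/kafka/a": ["webrequest_upload-4"],
    "/var/spool/kafka/b": ["webrequest_upload-8", "webrequest_upload-11"],
}

# Per-dir usage is just (number of partitions) * (partition size)
usage = {d: len(parts) * PARTITION_SIZE_GB for d, parts in assignment.items()}
print(usage)
```

Disk "b" ends up carrying twice the bytes (and twice the write load) of disk "a", even though every partition holds the same amount of data. So the data *is* balanced among partitions; it's the partition-to-dir assignment that's lumpy.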
Anyyyyyyway, on to a question: Is it possible to move partitions between
log.dirs? Is there tooling to do so? Poking around in there, it looks like it
might be as simple as shutting down the broker, moving the partition directory,
and then editing both replication-offset-checkpoint and
recovery-point-offset-checkpoint files so that they say the appropriate things
in the appropriate directories, and then restarting the broker.
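For the record, here's a rough Python sketch of the checkpoint-file edit I have in mind. It assumes the files use the text layout I see on disk (a version line, an entry-count line, then one "topic partition offset" line per partition); the helper name and paths are mine, and this is only what I *think* the edit would look like, not a vetted procedure:

```python
def move_checkpoint_entry(src_file, dst_file, topic, partition):
    """Remove the `topic partition offset` entry from the checkpoint file in
    the old log.dir and append it to the checkpoint file in the new log.dir.

    Assumed format (both replication-offset-checkpoint and
    recovery-point-offset-checkpoint appear to share it):
        line 1: version (e.g. "0")
        line 2: number of entries
        then:   "<topic> <partition> <offset>" per entry
    """
    def read(path):
        with open(path) as f:
            lines = f.read().splitlines()
        version, entries = lines[0], lines[2:]  # lines[1] is the entry count
        return version, entries

    src_version, src_entries = read(src_file)
    dst_version, dst_entries = read(dst_file)

    # Pull the matching entry out of the source and append it to the dest.
    prefix = f"{topic} {partition} "
    moved = [e for e in src_entries if e.startswith(prefix)]
    src_entries = [e for e in src_entries if not e.startswith(prefix)]
    dst_entries += moved

    # Rewrite both files with corrected entry counts.
    for path, version, entries in (
        (src_file, src_version, src_entries),
        (dst_file, dst_version, dst_entries),
    ):
        with open(path, "w") as f:
            f.write("\n".join([version, str(len(entries))] + entries) + "\n")
```

The idea being: stop the broker, `mv` the partition directory, run this against both checkpoint files in the old and new log.dirs, start the broker.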
Someone tell me that this is a horrible idea. :)
-Ao