On 13 April 2018 05:32, Chad William Seys wrote:
I think your observations suggest that, to a first approximation,
filling drives with bytes to the same absolute level is better for
performance than filling drives to the same percentage full. Assuming
random distribution of PGs, this would cause the smallest drives to be
as active as the largest drives.
E.g. if every drive had 1TB of data, each would be equally likely to
contain the PG of interest.
Of course, as more data was added the smallest drives could not hold
more and the larger drives would become more active, but at least the
smaller drives would be as active as possible.
But in this case you would have a steep drop-off in performance. When
you reach the fill level where the small drives do not accept more data,
you would suddenly hit a performance cliff where only your larger disks
are doing new writes, and only the larger disks are doing reads on new
data.
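To put rough numbers on that cliff, here is a small Python sketch. It is
only a toy model, not how Ceph actually places data: the drive sizes and
both function names are made up for illustration, and it ignores CRUSH,
replication and PG granularity. It compares the share of new writes each
drive takes when all drives are filled to the same absolute level with
the share when writes are spread in proportion to capacity.

```python
def equal_bytes_write_shares(capacities_tb, level_tb):
    """Share of new writes per drive when every drive is filled to the
    same absolute level (level_tb). Full drives take no new writes."""
    active = [cap > level_tb for cap in capacities_tb]
    n_active = sum(active)
    if n_active == 0:
        return [0.0] * len(capacities_tb)
    return [1.0 / n_active if is_active else 0.0 for is_active in active]


def capacity_weighted_write_shares(capacities_tb):
    """Share of new writes per drive when drives are filled to the same
    percentage, i.e. writes are proportional to capacity."""
    total = sum(capacities_tb)
    return [cap / total for cap in capacities_tb]


if __name__ == "__main__":
    # Hypothetical cluster: two 1 TB drives and two 4 TB drives.
    capacities = [1.0, 1.0, 4.0, 4.0]
    for level in (0.5, 0.9, 1.0, 2.0):
        shares = equal_bytes_write_shares(capacities, level)
        print(f"equal bytes, {level} TB per drive: {shares}")
    print("capacity weighted:", capacity_weighted_write_shares(capacities))
```

In this toy cluster, equal-bytes filling spreads new writes over all four
drives until each holds 1 TB, after which the two large drives each take
half of all new writes. With capacity-proportional filling the small
drives are less busy from the start, but the shares do not change as the
cluster fills.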
It is also easier to make the logical connection while you are
installing new nodes/disks than a year later, when your cluster just
happens to reach that fill level.
It would also be an easier job to balance disks between nodes when you
are adding OSDs anyway and the new ones are mostly empty, rather than
when your small OSDs are full and your large disks have significant
data on them.
ceph-users mailing list