It sounds like an I/O bottleneck (either max IOPS or max throughput) in
If you are only looking for cold storage for archival data, then it may be
OK (if it doesn't matter how long it takes to write the data).
If this is production data with any sort of IOPS load or data change
rate, I'd be concerned.
Spinning disks that are too big will get killed on seek times. Too many
spinners that are too big will likely bottleneck the I/O controller. It
would be better to use more, cheaper nodes to get many more, smaller disks
(2TB max). More disks, more I/O controllers, and more motherboards mean
more performance. Think "scale out" in the number of nodes, not "scale up"
of the individual nodes.
Software Defined Storage Engineer
On 1/9/2020 3:52 PM, Stefan Priebe - Profihost AG wrote:
As a starting point the current idea is to use something like:
4-6 nodes with 12x 12TB disks each
AMD EPYC 7302P 3GHz, 16C/32T
Things to discuss:
- EC or 3 replicas? We'll use BlueStore with compression.
- Do we need something like Intel Optane for WAL / DB or not?
Since we started using Ceph we have mostly been committed to SSDs, so we
have no HDD knowledge in house.
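As a rough aid for the EC-versus-replicas question, the sketch below compares usable capacity for the proposed 4-6 nodes with 12x 12TB disks each. The 4+2 erasure-coding profile is only an illustrative assumption (no profile is named in the thread), and the figures ignore compression gains and the free-space headroom a cluster needs in practice.

```python
# Usable capacity for the proposed hardware under two redundancy schemes.
# Assumption: the EC profile k=4, m=2 is hypothetical, chosen for illustration.

DISKS_PER_NODE = 12
DISK_TB = 12

def raw_tb(nodes):
    """Raw capacity in TB for a given number of nodes."""
    return nodes * DISKS_PER_NODE * DISK_TB

def usable_tb(raw, data_parts, total_parts):
    """Usable TB when data_parts out of total_parts stored chunks hold data."""
    return raw * data_parts / total_parts

for nodes in (4, 6):
    raw = raw_tb(nodes)
    replica3 = usable_tb(raw, 1, 3)   # 3 replicas: one third is usable
    ec_4_2 = usable_tb(raw, 4, 6)     # EC 4+2: four sixths are usable
    print(f"{nodes} nodes: raw {raw} TB, 3x replica {replica3:.0f} TB, "
          f"EC 4+2 {ec_4_2:.0f} TB")
```

Note that an EC k+m pool with a per-host failure domain needs at least k+m hosts, so the 4+2 example would only fit the 6-node variant.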
On 09.01.20 at 16:49, Stefan Priebe - Profihost AG wrote:
On 09.01.2020 at 16:10, Wido den Hollander <w...@42on.com> wrote:
On 1/9/20 2:27 PM, Stefan Priebe - Profihost AG wrote:
On 09.01.20 at 14:18, Wido den Hollander wrote:
On 1/9/20 2:07 PM, Daniel Aberger - Profihost AG wrote:
On 09.01.20 at 13:39, Janne Johansson wrote:
I'm currently trying to work out a concept for a Ceph cluster which can
be used as a target for backups and which satisfies the following
- approx. write speed of 40,000 IOPS and 2500 MByte/s
You might need to have a large (at least non-1) number of writers to get
to that sum of operations, as opposed to trying to reach it with one
single stream written from one single client.
We are aiming for about 100 writers.
So if I read it correctly the writes will be 64k each.
Maybe ;-) see below
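The 64k figure follows from dividing the stated throughput by the stated IOPS. A minimal sketch of that arithmetic, assuming decimal units (1 MByte = 1000 KB) and that the two targets describe the same write stream:

```python
# Average write size implied by the targets quoted in the thread.
TARGET_IOPS = 40_000
TARGET_MBYTE_PER_S = 2500

def avg_write_size_kb(iops, mbyte_per_s):
    """Average size of one write in KB (1 MByte = 1000 KB)."""
    return mbyte_per_s * 1000 / iops

avg_kb = avg_write_size_kb(TARGET_IOPS, TARGET_MBYTE_PER_S)
print(f"{avg_kb} KB per write")  # 62.5 KB, i.e. roughly 64k
```

This is only an average; the real mix of write sizes may look quite different.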
That should be doable, but you probably want something like NVMe for DB+WAL.
You might want to tune it so that larger writes also go into the WAL to
speed up the ingress writes. But you mainly want more spindles rather
than fewer.
I would like to give a little more insight into this and, most probably,
some overhead we currently have in those numbers. Those values come from
our old classic RAID storage boxes. They use btrfs + zlib compression +
subvolumes for those backups, and we've collected those numbers from all
of them.
The new system should just replicate snapshots from the live ceph.
Hopefully being able to use Erasure Coding and compression? ;-)
Compression might work, but only if the data is compressible.
EC usually writes very fast, so that's good. I would recommend a lot of
spindles though. More spindles == more OSDs == more performance.
So instead of using 12TB drives you can consider 6TB or 8TB drives.
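To make the "more spindles" point concrete, the sketch below holds raw capacity fixed and counts how many drives of each size are needed, then multiplies by a per-drive IOPS figure. The 150 IOPS per HDD is an assumed ballpark for illustration, not a measured value, and the 576 TB corresponds to the 4-node, 12x 12TB variant from the proposal. The result is raw backend IOPS; replication or EC turns each client write into several backend writes, so client-visible IOPS will be lower.

```python
import math

RAW_TB = 576        # 4 nodes x 12 disks x 12 TB, from the proposal
IOPS_PER_HDD = 150  # assumption: rough figure for one spinning disk

def spindles_for_capacity(raw_tb, disk_tb):
    """Number of drives needed to reach raw_tb with drives of disk_tb each."""
    return math.ceil(raw_tb / disk_tb)

def aggregate_iops(raw_tb, disk_tb, iops_per_disk=IOPS_PER_HDD):
    """Summed backend IOPS of all drives, ignoring controller limits."""
    return spindles_for_capacity(raw_tb, disk_tb) * iops_per_disk

for disk_tb in (12, 8, 6):
    n = spindles_for_capacity(RAW_TB, disk_tb)
    print(f"{disk_tb} TB drives: {n} spindles, "
          f"~{aggregate_iops(RAW_TB, disk_tb)} backend IOPS")
```

Under these assumptions, halving the drive size doubles the spindle count and with it the aggregate backend IOPS for the same raw capacity.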
Currently we have a lot of 5TB 2.5-inch drives in place, so we could use
them. We would like to start with around 4000 IOPS and 250 MB per second
while using 24-drive boxes. We could place one or two NVMe PCIe cards in
them.
ceph-users mailing list