On Mon, Jul 16, 2018 at 11:41 PM Christian Wimmer <[email protected]> wrote:
> Hi all,
>
> I am trying to use Ceph with RGW to store lots (>300M) of small files (80%
> 2-15kB, 20% up to 500kB).
> After some testing, I wonder if Ceph is the right tool for that.
>
> Does anybody of you have experience with this use case?
>
> Things I came across:
> - EC pools: default stripe-width is 4kB. Does it make sense to lower the
> stripe width for small objects or is EC a bad idea for this use case?
> - Bluestore: bluestore min alloc size is per default 64kB. Would it be
> better to lower it to say 2kB or am I better off with Filestore (probably
> not if I want to store a huge amount of small files)?
> - Bluestore / RocksDB: RocksDB seems to consume a lot of disk space when
> storing lots of files.
> For example: I have OSDs with about 500k onodes (which should translate
> to 500k stored objects, right?) and the DB size is about 30GB. That's about
> 63kB per onode - which is a lot, considering the original object is about
> 5kB.
>

Those numbers seem a little large to me (although with erasure coding they
could make sense due to the "object info" replication across shards), but in
general I would not expect Ceph or RGW to be a good fit for files which tend
to be that small from a data storage efficiency standpoint.

That said, you've got about 4TB of data there. Are you sure some large SSDs
in a RAID1(+0) or something wouldn't fulfill your needs? ;)

If you're more concerned about scaling out the IO than the ratio of data
stored to data used, Ceph may still be a good choice. *shrug*
-Greg

>
> Thanks,
> Christian
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
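For reference, the overhead figures quoted in the thread can be checked with a few lines of arithmetic. This is only a rough sketch, not a model of how BlueStore actually accounts for space. It assumes the "30GB" DB size is binary gigabytes, that each object is rounded up to a whole number of allocation units, and it ignores erasure coding, compression, and metadata. The function names are made up for illustration.

```python
KIB = 1024
GIB = 1024 ** 3


def db_bytes_per_onode(db_size_bytes, onode_count):
    """Average RocksDB footprint per onode (simple division, nothing more)."""
    return db_size_bytes / onode_count


def allocated_bytes(object_size, min_alloc_size):
    """Round an object size up to a whole number of allocation units.

    Assumption: a simplified view in which every object occupies
    ceil(size / min_alloc_size) units on disk.
    """
    units = -(-object_size // min_alloc_size)  # integer ceiling division
    return units * min_alloc_size


# Figures from the original post: ~30GB of DB for ~500k onodes.
per_onode = db_bytes_per_onode(30 * GIB, 500_000)
print(f"DB space per onode: {per_onode / KIB:.1f} KiB")

# A ~5kB object under the two allocation sizes discussed in the post.
object_size = 5 * KIB
for min_alloc in (64 * KIB, 2 * KIB):
    used = allocated_bytes(object_size, min_alloc)
    print(f"min_alloc={min_alloc // KIB} KiB: {used // KIB} KiB on disk, "
          f"{used / object_size:.1f}x amplification")
```

Under these assumptions the per-onode figure comes out near the 63kB quoted in the post, and a 5kB object occupies a full 64kB unit at the 64kB allocation size versus 6kB at 2kB.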
