Re: [ceph-users] bluestore object overhead
Does the bluestore min alloc size apply to 4k block-size files [1]?

[1] https://github.com/ceph/ceph/blob/master/src/common/config_opts.h#L1063

On Wed, Apr 19, 2017 at 4:51 PM, Gregory Farnum wrote:
> I can't speak with authority on bluestore, but at those total sizes I
> think you're just seeing the effects of the internal journaling.

--
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
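As a rough illustration of why an allocation unit matters for a dataset of many small files, here is a minimal sketch of the rounding arithmetic. It assumes a store that rounds every file up to a whole number of allocation units; whether bluestore's min alloc size behaves exactly this way is the open question in this thread, not something the sketch settles. The function name, the file sizes and the unit sizes are all made up for the example.

```python
def allocated_bytes(file_sizes, alloc_unit):
    """Total bytes consumed if each file is rounded up to whole allocation units.

    Simplified model: ignores metadata, journals and any packing of small
    files, so it is only an estimate of allocation rounding.
    """
    return sum((size + alloc_unit - 1) // alloc_unit * alloc_unit
               for size in file_sizes)


# Hypothetical dataset: 1,000 files of 1.5 KiB each (not the poster's data).
sizes = [1536] * 1000
logical = sum(sizes)

for unit in (2048, 4096, 65536):
    used = allocated_bytes(sizes, unit)
    print(f"alloc unit {unit:>6} B: {used} B allocated, "
          f"{used / logical:.2f}x the logical size")
```

In this model the overhead comes entirely from how file sizes fall relative to the unit, so feeding in the real file sizes of a dataset (for example from `os.path.getsize`) gives a quick estimate to compare against what the store reports.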
Re: [ceph-users] bluestore object overhead
On Wed, Apr 19, 2017 at 1:49 PM, Pavel Shub wrote:
> Sorry, I meant the disk space taken up by the files. I have a dataset
> with lots of small files; my sample set is 2.5gb in total size and
> takes 5gb on a filesystem with a 4kb block size. When I put the files
> inside ceph bluestore they take up 6gb. Does bluestore have an
> internal block size? Is there a way to adjust it? For comparison I
> created a filestore OSD with 2kb block size and the data took up only
> 4.5gb.

I can't speak with authority on bluestore, but at those total sizes I
think you're just seeing the effects of the internal journaling.
Re: [ceph-users] bluestore object overhead
On Wed, Apr 19, 2017 at 4:33 PM, Gregory Farnum wrote:
> You're going to need to be clearer about what you mean by 2x overhead.
> Bluestore itself has a minimum size beneath which it will journal
> objects and then copy them into place, which might be considered 2x
> overhead. If you're talking about total number of cluster-wide disk
> ops, there's also a CephFS log which journals metadata updates that
> get flushed out to backing objects later, which might be considered 2x
> overhead. But I don't know what you mean just based on a ceph df. :)
> -Greg

Sorry, I meant the disk space taken up by the files. I have a dataset
with lots of small files; my sample set is 2.5gb in total size and
takes 5gb on a filesystem with a 4kb block size. When I put the files
inside ceph bluestore they take up 6gb. Does bluestore have an internal
block size? Is there a way to adjust it? For comparison I created a
filestore OSD with 2kb block size and the data took up only 4.5gb.

- Pavel
Re: [ceph-users] bluestore object overhead
On Wed, Apr 19, 2017 at 1:26 PM, Pavel Shub wrote:
> Hey All,
>
> I'm running a test of bluestore in a small VM and seeing 2x overhead
> for each object in cephfs. Here's the output of df detail
> https://gist.github.com/pavel-citymaps/868a7c4b1c43cea9ab86cdf2e79198ee
>
> This is on a VM with all daemons & 20gb disk, all pools are of size 1.
> Is this the expected amount of overhead per object? Is there any way
> to tweak bluestore settings?

You're going to need to be clearer about what you mean by 2x overhead.
Bluestore itself has a minimum size beneath which it will journal
objects and then copy them into place, which might be considered 2x
overhead. If you're talking about total number of cluster-wide disk
ops, there's also a CephFS log which journals metadata updates that
get flushed out to backing objects later, which might be considered 2x
overhead. But I don't know what you mean just based on a ceph df. :)
-Greg
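The "journal, then copy into place" behaviour described above can be put into a toy model to show where a 2x figure could come from. This is only a sketch of that description, not bluestore's actual logic: the function name and the 64 KiB threshold are hypothetical, and the model counts bytes written to the device, which is not the same thing as space that stays allocated afterwards.

```python
def device_bytes_written(write_size, journal_threshold):
    """Bytes written to the device for one object write in a toy model.

    Writes smaller than journal_threshold are assumed to be written twice:
    once to a journal and once when copied into their final location.
    Larger writes are assumed to go straight into place.
    """
    if write_size < journal_threshold:
        return 2 * write_size
    return write_size


# Hypothetical threshold, chosen only for illustration.
THRESHOLD = 64 * 1024

for size in (4 * 1024, 32 * 1024, 64 * 1024, 1024 * 1024):
    written = device_bytes_written(size, THRESHOLD)
    print(f"{size:>8} B write -> {written:>8} B written "
          f"({written / size:.0f}x)")
```

Under this model a workload made up entirely of small objects doubles its write traffic, while the journal copy is transient. Whether any of that appears as used space in a ceph df is something the thread leaves open.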