Re: [ceph-users] bluestore object overhead

2017-04-19 Thread Jason Dillaman
Does the bluestore min alloc size apply for 4k block-size files [1]?

[1] https://github.com/ceph/ceph/blob/master/src/common/config_opts.h#L1063
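
To make the question concrete: if the store rounds every object up to a multiple of some allocation unit, small files consume more disk than their logical size. Below is a minimal sketch of that arithmetic, assuming plain round-up and ignoring metadata, compression, and journaling. The function name `allocated_bytes`, the sample file sizes, and the candidate unit sizes are all hypothetical; they are not taken from Ceph or from the dataset discussed in this thread.

```python
def allocated_bytes(file_sizes, alloc_unit):
    """Total space used if each file is rounded up to a multiple of alloc_unit.

    Assumption: one allocation per file, simple round-up, no metadata overhead.
    """
    if alloc_unit <= 0:
        raise ValueError("alloc_unit must be positive")
    # -(-size // alloc_unit) is integer ceiling division.
    return sum(-(-size // alloc_unit) * alloc_unit for size in file_sizes)


# Hypothetical small-file sample (bytes), for illustration only.
sizes = [500, 1500, 2500, 6000]

for unit in (2048, 4096, 65536):
    used = allocated_bytes(sizes, unit)
    print(f"unit={unit:>6}  logical={sum(sizes)}  allocated={used}  "
          f"ratio={used / sum(sizes):.2f}")
```

The smaller the files are relative to the allocation unit, the larger the ratio becomes, which is why the effective unit size matters for a dataset made up mostly of small files.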

On Wed, Apr 19, 2017 at 4:51 PM, Gregory Farnum wrote:
> On Wed, Apr 19, 2017 at 1:49 PM, Pavel Shub wrote:
>> On Wed, Apr 19, 2017 at 4:33 PM, Gregory Farnum wrote:
>>> On Wed, Apr 19, 2017 at 1:26 PM, Pavel Shub wrote:
>>>> Hey All,
>>>>
>>>> I'm running a test of bluestore in a small VM and seeing 2x overhead
>>>> for each object in cephfs. Here's the output of df detail
>>>> https://gist.github.com/pavel-citymaps/868a7c4b1c43cea9ab86cdf2e79198ee
>>>>
>>>> This is on a VM with all daemons & 20gb disk, all pools are of size 1.
>>>> Is this the expected amount of overhead per object? Is there any way to
>>>> tweak bluestore settings?
>>>
>>> You're going to need to be clearer about what you mean by 2x overhead.
>>> Bluestore itself has a minimum size beneath which it will journal
>>> objects and then copy them into place, which might be considered 2x
>>> overhead. If you're talking about total number of cluster-wide disk
>>> ops, there's also a CephFS log which journals metadata updates that
>>> get flushed out to backing objects later, which might be considered 2x
>>> overhead. But I don't know what you mean just based on a ceph df. :)
>>> -Greg
>>
>> Sorry, I meant the disk space taken up by the files. I have a dataset
>> with lots of small files; my sample set is 2.5gb in total size and takes
>> up 5gb on a filesystem with a 4kb block size. When I put the files inside ceph
>> bluestore they take up 6gb. Does bluestore have an internal block
>> size? Is there a way to adjust it? For comparison I created a
>> filestore OSD with 2kb block size and the data took up only 4.5gb.
>
> I can't speak with authority on bluestore, but at those total sizes I
> think you're just seeing the effects of the internal journaling.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] bluestore object overhead

2017-04-19 Thread Gregory Farnum
On Wed, Apr 19, 2017 at 1:49 PM, Pavel Shub wrote:
> On Wed, Apr 19, 2017 at 4:33 PM, Gregory Farnum wrote:
>> On Wed, Apr 19, 2017 at 1:26 PM, Pavel Shub wrote:
>>> Hey All,
>>>
>>> I'm running a test of bluestore in a small VM and seeing 2x overhead
>>> for each object in cephfs. Here's the output of df detail
>>> https://gist.github.com/pavel-citymaps/868a7c4b1c43cea9ab86cdf2e79198ee
>>>
>>> This is on a VM with all daemons & 20gb disk, all pools are of size 1.
>>> Is this the expected amount of overhead per object? Is there any way to
>>> tweak bluestore settings?
>>
>> You're going to need to be clearer about what you mean by 2x overhead.
>> Bluestore itself has a minimum size beneath which it will journal
>> objects and then copy them into place, which might be considered 2x
>> overhead. If you're talking about total number of cluster-wide disk
>> ops, there's also a CephFS log which journals metadata updates that
>> get flushed out to backing objects later, which might be considered 2x
>> overhead. But I don't know what you mean just based on a ceph df. :)
>> -Greg
>
> Sorry, I meant the disk space taken up by the files. I have a dataset
> with lots of small files; my sample set is 2.5gb in total size and takes
> up 5gb on a filesystem with a 4kb block size. When I put the files inside ceph
> bluestore they take up 6gb. Does bluestore have an internal block
> size? Is there a way to adjust it? For comparison I created a
> filestore OSD with 2kb block size and the data took up only 4.5gb.

I can't speak with authority on bluestore, but at those total sizes I
think you're just seeing the effects of the internal journaling.


Re: [ceph-users] bluestore object overhead

2017-04-19 Thread Pavel Shub
On Wed, Apr 19, 2017 at 4:33 PM, Gregory Farnum wrote:
> On Wed, Apr 19, 2017 at 1:26 PM, Pavel Shub wrote:
>> Hey All,
>>
>> I'm running a test of bluestore in a small VM and seeing 2x overhead
>> for each object in cephfs. Here's the output of df detail
>> https://gist.github.com/pavel-citymaps/868a7c4b1c43cea9ab86cdf2e79198ee
>>
>> This is on a VM with all daemons & 20gb disk, all pools are of size 1.
>> Is this the expected amount of overhead per object? Is there any way to
>> tweak bluestore settings?
>
> You're going to need to be clearer about what you mean by 2x overhead.
> Bluestore itself has a minimum size beneath which it will journal
> objects and then copy them into place, which might be considered 2x
> overhead. If you're talking about total number of cluster-wide disk
> ops, there's also a CephFS log which journals metadata updates that
> get flushed out to backing objects later, which might be considered 2x
> overhead. But I don't know what you mean just based on a ceph df. :)
> -Greg

Sorry, I meant the disk space taken up by the files. I have a dataset
with lots of small files; my sample set is 2.5gb in total size and takes
up 5gb on a filesystem with a 4kb block size. When I put the files inside ceph
bluestore they take up 6gb. Does bluestore have an internal block
size? Is there a way to adjust it? For comparison I created a
filestore OSD with 2kb block size and the data took up only 4.5gb.

- Pavel


Re: [ceph-users] bluestore object overhead

2017-04-19 Thread Gregory Farnum
On Wed, Apr 19, 2017 at 1:26 PM, Pavel Shub wrote:
> Hey All,
>
> I'm running a test of bluestore in a small VM and seeing 2x overhead
> for each object in cephfs. Here's the output of df detail
> https://gist.github.com/pavel-citymaps/868a7c4b1c43cea9ab86cdf2e79198ee
>
> This is on a VM with all daemons & 20gb disk, all pools are of size 1.
> Is this the expected amount of overhead per object? Is there any way to
> tweak bluestore settings?

You're going to need to be clearer about what you mean by 2x overhead.
Bluestore itself has a minimum size beneath which it will journal
objects and then copy them into place, which might be considered 2x
overhead. If you're talking about total number of cluster-wide disk
ops, there's also a CephFS log which journals metadata updates that
get flushed out to backing objects later, which might be considered 2x
overhead. But I don't know what you mean just based on a ceph df. :)
-Greg
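
The journal-then-copy behaviour described above can be sketched as a write-volume calculation. This is only an illustration, not BlueStore's actual logic: the 64 KiB threshold, the function name `device_bytes_written`, and the sample write sizes are assumptions made for the example. The sketch counts bytes written to the device, not space that stays in use.

```python
def device_bytes_written(write_sizes, journal_threshold=64 * 1024):
    """Bytes hitting the device if writes below the threshold are written twice.

    Assumption: a write smaller than journal_threshold is first journaled and
    then copied into place (2x); larger writes go straight to their final
    location (1x). The threshold value is hypothetical.
    """
    total = 0
    for size in write_sizes:
        if size < journal_threshold:
            total += 2 * size  # journal copy + final copy
        else:
            total += size      # written once, in place
    return total


# Hypothetical workload (bytes): two small writes and one large write.
writes = [4096, 16384, 1048576]
logical = sum(writes)
physical = device_bytes_written(writes)
print(f"logical={logical}  written={physical}  "
      f"amplification={physical / logical:.3f}")
```

For a workload made up entirely of writes under the threshold, this model gives exactly 2x the logical bytes, which is one reading of "2x overhead".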