On Thu, Aug 27, 2015 at 3:43 AM, huang jun <[email protected]> wrote:
> hi, Ilya
>
> 2015-08-26 23:56 GMT+08:00 Ilya Dryomov <[email protected]>:
>> On Wed, Aug 26, 2015 at 6:22 PM, Haomai Wang <[email protected]> wrote:
>>> On Wed, Aug 26, 2015 at 11:16 PM, huang jun <[email protected]> wrote:
>>>> hi, all
>>>> We created a 2TB rbd image and mapped it locally. Formatting it
>>>> as xfs with 'mkfs.xfs /dev/rbd0' took 318 seconds, whereas a local
>>>> physical disk of the same size needs just 6 seconds.
>>>>
>>>
>>> I think librbd has two PRs related to this.
>>>
>>>> After debugging, we found that the rbd module goes through two
>>>> steps during formatting:
>>>> a) send 233093 DELETE requests to the osds (number_of_requests =
>>>> 2TB / 4MB); this step took almost 92 seconds.
>>>
>>> I guess this (https://github.com/ceph/ceph/pull/4221/files) may help
>>
>> It's submitting deletes for non-existent objects, not zeroing. The
>> only thing that will really help here is the addition of rbd object map
>> support to the kernel client. That could happen in 4.4, but 4.5 is
>> a safer bet.
>>
>>>
>>>> b) send 4238 messages like this: [set-alloc-hint object_size 4194304
>>>> write_size 4194304,write 0~512] to the osds; this took 227 seconds.
>>>
>>> I think kernel rbd also needs to use
>>> https://github.com/ceph/ceph/pull/4983/files
>>
>> set-alloc-hint may be a problem, but I think a bigger problem is the
>> size of the write. Are all those writes 512 bytes long?
>>
> In another test formatting a 2TB rbd device, there were:
> 2 messages, each writing 131072 bytes
> 4000 messages, each writing 262144 bytes
> 112 messages, each writing 4096 bytes
> 194 messages, each writing 512 bytes

So the majority of the writes are not 512 bytes long. I don't think
disabling set-alloc-hint (and, as of now at least, you can't disable it
anyway) would drastically change the numbers. If you are doing mkfs
right after creating and mapping an image for the first time, you can
add the -K option to mkfs (e.g. 'mkfs.xfs -K /dev/rbd0'), which tells it
not to try to discard. As for the write phase, I can't suggest anything
off hand.
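
To put rough numbers on the breakdown you posted, here is a quick sketch
in Python. The counts and sizes are taken straight from your second
test; the variable names are just mine, and this only tallies the
figures, it says nothing about where the time goes.

```python
# Write-size breakdown from the second test: (message count, bytes per write)
writes = [(2, 131072), (4000, 262144), (112, 4096), (194, 512)]

total_msgs = sum(count for count, _ in writes)
total_bytes = sum(count * size for count, size in writes)

# Share of each write size, by number of messages and by bytes written
for count, size in writes:
    msg_share = 100 * count / total_msgs
    byte_share = 100 * count * size / total_bytes
    print(f"{size:>7} B: {count:>5} msgs, "
          f"{msg_share:5.1f}% of msgs, {byte_share:6.2f}% of bytes")

print(f"total: {total_msgs} msgs, {total_bytes} bytes")
```

By this tally, 4000 of the 4308 writes (roughly 93%) are 262144 bytes
each, and the 512-byte writes are only 194 of them.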

Thanks,
Ilya