>>Could you describe more about 2x70000 iops?
>>So you mean 8 OSDs, each backed with SSD, can achieve 14w (140000) iops?
It's a small rbd (10G), so most reads hit the buffer cache.
But yes, it's able to deliver 140000 iops with 8 OSDs. (I also checked the stats
in the ceph cluster to be sure.)
(And I'm not CPU-bound on the OSD nodes.)
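
The pgmap lines quoted below are just the cluster log on the mon node; something
like this shows them (rough sketch, exact invocation from memory):

# stream the cluster log; pgmap lines report client MB/s and op/s
ceph -w
# one-shot status summary as a cross-check
ceph -s
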
>> 2014-10-31 05:58:34.231037 mon.0 [INF] pgmap v7109: 1264 pgs: 1264
>> active+clean; 165 GB data, 109 GB used, 6226 GB / 6335 GB avail; 560 MB/s
>> rd, 140 kop/s
Here is the ceph.conf of the OSD nodes:
[global]
fsid = c29f4643-9577-4671-ae25-59ad14550aba
auth_cluster_required = none
auth_service_required = none
auth_client_required = none
filestore_xattr_use_omap = true
debug lockdep = 0/0
debug context = 0/0
debug crush = 0/0
debug buffer = 0/0
debug timer = 0/0
debug journaler = 0/0
debug osd = 0/0
debug optracker = 0/0
debug objclass = 0/0
debug filestore = 0/0
debug journal = 0/0
debug ms = 0/0
debug monc = 0/0
debug tp = 0/0
debug auth = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug perfcounter = 0/0
debug asok = 0/0
debug throttle = 0/0
osd_op_threads = 5
filestore_op_threads = 4
osd_op_num_threads_per_shard = 1
osd_op_num_shards = 25
filestore_fd_cache_size = 64
filestore_fd_cache_shards = 32
osd_enable_op_tracker = false
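
If you want to double-check what a running OSD actually picked up, the admin
socket shows it; a quick sketch, assuming osd.0 and the default asok path:

# dump the effective config of one OSD (run on the OSD host) and grep the sharding knobs
ceph daemon osd.0 config show | grep -E 'osd_op_num_shards|osd_op_num_threads_per_shard|osd_enable_op_tracker'

# the debug settings can also be pushed to all OSDs at runtime
ceph tell osd.* injectargs '--debug-ms 0/0 --debug-osd 0/0'
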
>>Is it read or write? Could you give fio options?
Random read 4K.
Here is the fio config:
[global]
ioengine=aio
invalidate=1
rw=randread
bs=4K
direct=1
numjobs=1
group_reporting=1
size=10G
[test1]
iodepth=64
filename=/dev/rbd/test/test
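
For completeness, the filename= device is the kernel rbd mapping of image "test"
in pool "test"; roughly (the job file name here is made up):

# map the image with krbd; udev creates /dev/rbd/<pool>/<image>
rbd map test/test

# run the job file above against it
fio randread-rbd.fio
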
On 1 client node, I can't reach more than 50000 iops with 6 OSDs, or 70000 iops
with 8 OSDs.
(I tried increasing numjobs to have more fio processes, and also running against
2 different rbd volumes at the same time, but performance is the same.)
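
For illustration, one way to hit 2 rbd volumes at once is a second job section in
the same file; the second device path is hypothetical (just another mapped image):

[test1]
iodepth=64
filename=/dev/rbd/test/test

[test2]
iodepth=64
filename=/dev/rbd/test/test2
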
>> 2014-10-31 05:57:30.078348 mon.0 [INF] pgmap v7070: 1264 pgs: 1264
>> active+clean; 165 GB data, 109 GB used, 6226 GB / 6335 GB avail; 290 MB/s
>> rd, 74572 op/s
But if I launch the same fio test from another client node, I can reach the same
70000 iops there at the same time.
>> 2014-10-31 05:58:34.231037 mon.0 [INF] pgmap v7109: 1264 pgs: 1264
>> active+clean; 165 GB data, 109 GB used, 6226 GB / 6335 GB avail; 560 MB/s
>> rd, 140 kop/s
----- Original Message -----
From: "Haomai Wang" <[email protected]>
To: "Alexandre DERUMIER" <[email protected]>
Cc: "Sage Weil" <[email protected]>, "Christoph Hellwig" <[email protected]>,
"Ceph Devel" <[email protected]>
Sent: Thursday, 30 October 2014 18:05:26
Subject: Re: krbd blk-mq support ?
Could you describe more about 2x70000 iops?
So you mean 8 OSDs, each backed with SSD, can achieve 14w (140000) iops?
Is it read or write? Could you give fio options?
On Fri, Oct 31, 2014 at 12:01 AM, Alexandre DERUMIER
<[email protected]> wrote:
>>>I'll try to add more OSDs next week; if it scales, that's very good news!
>
> I just tried to add 2 more OSDs.
>
> I can now reach 2x 70000 iops on 2 client nodes (vs 2x 50000 previously),
>
> and kworker CPU usage is also lower (84% vs 97%).
> (I don't understand why exactly.)
>
> So, thanks for the help, everybody!
>
>
>
>
>
> ----- Original Message -----
>
> From: "Alexandre DERUMIER" <[email protected]>
> To: "Sage Weil" <[email protected]>
> Cc: "Christoph Hellwig" <[email protected]>, "Ceph Devel"
> <[email protected]>
> Sent: Thursday, 30 October 2014 09:11:11
> Subject: Re: krbd blk-mq support ?
>
>>>Hmm, this is probably the messenger.c worker then that is feeding messages
>>>to the network. How many OSDs do you have? It should be able to scale
>>>with the number of OSDs.
>
> Thanks, Sage, for your reply.
>
> Currently 6 OSDs (SSD) on the test platform.
>
> But I can reach 2x 50000 iops on the same rbd volume with 2 clients on 2
> different hosts.
> Do you think the messenger.c worker can be the bottleneck in this case?
>
>
> I'll try to add more OSDs next week; if it scales, that's very good news!
>
>
>
>
>
>
>
> ----- Original Message -----
>
> From: "Sage Weil" <[email protected]>
> To: "Alexandre DERUMIER" <[email protected]>
> Cc: "Christoph Hellwig" <[email protected]>, "Ceph Devel"
> <[email protected]>
> Sent: Wednesday, 29 October 2014 16:00:56
> Subject: Re: krbd blk-mq support ?
>
> On Wed, 29 Oct 2014, Alexandre DERUMIER wrote:
>> >>Oh, that's without the blk-mq patch?
>>
>> Yes, sorry, I don't know how to use perf with a custom-compiled kernel.
>> (Usually I'm using perf from Debian, with the linux-tools package provided
>> with the Debian kernel package.)
>>
>> >>Either way the profile doesn't really sum up to a fully used up cpu.
>>
>> But I see mostly the same behaviour with or without the blk-mq patch: I
>> always have 1 kworker at around 97-100% CPU (1 core) for 50000 iops.
>>
>> I also tried to map the rbd volume with nocrc; it goes up to 60000 iops
>> with the same kworker at around 97-100% CPU.
>
> Hmm, this is probably the messenger.c worker then that is feeding messages
> to the network. How many OSDs do you have? It should be able to scale
> with the number of OSDs.
>
> sage
>
>
>>
>>
>>
>> ----- Original Message -----
>>
>> From: "Christoph Hellwig" <[email protected]>
>> To: "Alexandre DERUMIER" <[email protected]>
>> Cc: "Ceph Devel" <[email protected]>
>> Sent: Tuesday, 28 October 2014 19:07:25
>> Subject: Re: krbd blk-mq support ?
>>
>> On Mon, Oct 27, 2014 at 11:00:46AM +0100, Alexandre DERUMIER wrote:
>> > >>Can you do a perf report -ag and then a perf report to see where these
>> > >>cycles are spent?
>> >
>> > Yes, sure.
>> >
>> > I have attached the perf report to this mail.
>> > (This is with kernel 3.14; I don't have access to my 3.18 host for now.)
>>
>> Oh, that's without the blk-mq patch?
>>
>> Either way the profile doesn't really sum up to a fully used up
>> cpu. Sage, Alex - are there any ordering constraints in the rbd client?
>> If not we could probably aim for per-cpu queues using blk-mq and a
>> socket per cpu or similar.
--
Best Regards,
Wheat