I know that FileStore's ondisk_finisher handles C_OSD_OpCommit, and from
"journaled_completion_queued" to "op_commit" takes 3.6 seconds, so the time is
probably spent inside ReplicatedPG::op_commit.
Through OpTracker, I found that ReplicatedPG::op_commit first takes the PG
lock, which sometimes takes 0.5 to 1 second, so the whole ondisk_finisher must
wait up to 1 second. How can the PG lock be avoided in
ReplicatedPG::op_commit?
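Not Ceph code, but a minimal Python sketch of the pattern described above: a
single finisher thread must take a per-PG lock before running each completion
callback, so one slow lock acquisition stalls every completion queued behind
it. All names here (`pg_lock`, `finisher`, `op_commit`) are illustrative
stand-ins, not the actual Ceph implementation.

```python
import queue
import threading

pg_lock = threading.Lock()      # stands in for the per-PG mutex
completions = queue.Queue()     # stands in for ondisk_finisher's queue
done = []

def op_commit(op_id):
    # Like ReplicatedPG::op_commit, take the PG lock before doing the work;
    # if another thread holds it, the finisher thread blocks right here.
    with pg_lock:
        done.append(op_id)

def finisher():
    # A single finisher thread drains the queue serially, so one slow
    # lock acquisition delays every completion queued behind it.
    while True:
        op_id = completions.get()
        if op_id is None:       # sentinel: stop the thread
            break
        op_commit(op_id)

t = threading.Thread(target=finisher)
t.start()
for i in range(5):
    completions.put(i)
completions.put(None)
t.join()
print(done)  # completions run strictly in queue order: [0, 1, 2, 3, 4]
```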
thanks
[email protected]
From: Guang Yang
Sent: 2014-07-01 11:39
To: baijiaruo
Cc: ceph-users
Subject: Re: [ceph-users] Ask a performance question for the RGW
On Jun 30, 2014, at 3:59 PM, [email protected] wrote:
> Hello,
> thanks for answering the question.
> But when there are fewer than 50 thousand objects, the latency is still very
> high. Looking at the write ops for the bucket index object, from
> "journaled_completion_queued" to "op_commit" takes 3.6 seconds, which means
> that from "journal write finished" to "op_commit" takes 3.6 seconds.
> I can't understand this; what happened?
Operations updating the same bucket index object get serialized; one
possibility is that those operations were stuck there waiting for other ops to
finish their work.
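A back-of-the-envelope sketch of that serialization effect. The numbers below
are assumptions for illustration, not measurements: if every in-flight write
must update the same bucket index object, and each bucket_prepare_op holds it
for a fixed service time, the last of n queued ops waits roughly n times that.

```python
# Illustrative queueing math for ops serialized on one bucket index object.
# service_time and in_flight are assumed values, not measured ones.
service_time = 0.02   # seconds each bucket_prepare_op holds the index object
in_flight = 200       # concurrent writers (rest-bench was run with -t 200)

# Under strict serialization, the i-th queued op waits i * service_time.
worst_case_wait = (in_flight - 1) * service_time
average_wait = (in_flight - 1) * service_time / 2
print(round(worst_case_wait, 2), round(average_wait, 2))  # 3.98 1.99
```

With those assumed values the worst case lands near the observed 3.6-second
gap and the average near the 1.5-second mean latency, which is consistent
with serialization on the index object being the cause.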
>
> thanks
> [email protected]
>
> From: Guang Yang
> Sent: 2014-06-30 14:57
> To: baijiaruo
> Cc: ceph-users
> Subject: Re: [ceph-users] Ask a performance question for the RGW
> Hello,
> There is a known limitation of bucket scalability, and there is a blueprint
> tracking it -
> https://wiki.ceph.com/Planning/Blueprints/Submissions/rgw%3A_bucket_index_scalability.
>
> For the time being, I would recommend sharding at the application level
> (creating multiple buckets) to work around this limitation.
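A minimal sketch of that application-level sharding, assuming you control the
object names. The shard count and naming scheme here are illustrative choices,
not anything built into RGW: spreading keys over many buckets spreads the
index updates over many bucket index objects.

```python
import hashlib

NUM_SHARDS = 16  # illustrative; size this to your expected object count

def shard_bucket(base_bucket: str, object_key: str) -> str:
    """Map an object key to one of NUM_SHARDS buckets so that bucket
    index updates spread across many index objects instead of one."""
    digest = hashlib.md5(object_key.encode()).hexdigest()
    shard = int(digest, 16) % NUM_SHARDS
    return f"{base_bucket}-{shard}"

# Reads use the same function on the same key, so no lookup table is needed.
print(shard_bucket("photos", "2014/06/30/img_0001.jpg"))
```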
>
> Thanks,
> Guang
>
> On Jun 30, 2014, at 2:54 PM, [email protected] wrote:
>
> >
> > hello, everyone!
> >
> > When I use rest-bench to test RGW performance, the command is:
> > ./rest-bench --access-key=ak --secret=sk --bucket=bucket_name --seconds=600
> > -t 200 -b 524288 --no-cleanup write
> >
> > Test result:
> > Total time run: 362.962324
> > Total writes made: 48189
> > Write size: 524288
> > Bandwidth (MB/sec): 66.383
> > Stddev Bandwidth: 40.7776
> > Max bandwidth (MB/sec): 173
> > Min bandwidth (MB/sec): 0
> > Average Latency: 1.50435
> > Stddev Latency: 0.910731
> > Max latency: 9.12276
> > Min latency: 0.19867
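The figures above are internally consistent, which points at latency rather
than raw bandwidth as the bottleneck. A quick cross-check, using only numbers
from the rest-bench output:

```python
# Cross-check the rest-bench output: throughput implied by the raw counts
# versus throughput implied by concurrency and average latency.
total_time = 362.962324   # seconds
total_writes = 48189
write_size = 524288       # bytes (512 KiB)
threads = 200             # rest-bench -t 200
avg_latency = 1.50435     # seconds

bandwidth = total_writes * write_size / total_time / (1024 * 1024)
print(round(bandwidth, 3))  # ~66.38 MB/sec, matching the reported value

# Little's law: concurrency = throughput * latency, so ops/sec ~= threads / latency
ops_per_sec = threads / avg_latency
print(round(ops_per_sec * write_size / (1024 * 1024), 1))  # also ~66 MB/sec
```

Since both routes give the same ~66 MB/sec, throughput is fully determined by
the 200 threads divided by the 1.5-second latency: cutting latency is the only
way to raise bandwidth at this concurrency.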
> >
> > My environment is 4 hosts and 40 disks (OSDs), but the test result is very
> > bad: the average latency is 1.5 seconds, and I find that writing the object
> > metadata is very slow. Because it puts so many objects into one bucket, and
> > writing the object metadata calls the method "bucket_prepare_op", the test
> > finds this op is very slow. I found the OSD which contains the bucket index
> > object and looked at "bucket_prepare_op" via dump_historic_ops:
> > { "description": "osd_op(client.4742.0:87613 .dir.default.4243.3 [call
> > rgw.bucket_prepare_op] 3.3670fe74 e317)",
> > "received_at": "2014-06-30 13:35:55.409597",
> > "age": "51.148026",
> > "duration": "4.130137",
> > "type_data": [
> > "commit sent; apply or cleanup",
> > { "client": "client.4742",
> > "tid": 87613},
> > [
> > { "time": "2014-06-30 13:35:55.409660",
> > "event": "waiting_for_osdmap"},
> > { "time": "2014-06-30 13:35:55.409669",
> > "event": "queue op_wq"},
> > { "time": "2014-06-30 13:35:55.896766",
> > "event": "reached_pg"},
> > { "time": "2014-06-30 13:35:55.896793",
> > "event": "started"},
> > { "time": "2014-06-30 13:35:55.896796",
> > "event": "started"},
> > { "time": "2014-06-30 13:35:55.899450",
> > "event": "waiting for subops from [40,43]"},
> > { "time": "2014-06-30 13:35:55.899757",
> > "event": "commit_queued_for_journal_write"},
> > { "time": "2014-06-30 13:35:55.899799",
> > "event": "write_thread_in_journal_buffer"},
> > { "time": "2014-06-30 13:35:55.899910",
> > "event": "journaled_completion_queued"},
> > { "time": "2014-06-30 13:35:55.899936",
> > "event": "journal first callback"},
> > { "time": "2014-06-30 13:35:55.899944",
> > "event": "queuing ondisk"},
> > { "time": "2014-06-30 13:35:56.142104",
> > "event": "sub_op_commit_rec"},
> > { "time": "2014-06-30 13:35:56.176950",
> > "event": "sub_op_commit_rec"},
> > { "time": "2014-06-30 13:35:59.535301",
> > "event": "op_commit"},
> > { "time": "2014-06-30 13:35:59.535331",
> > "event": "commit_sent"},
> > { "time": "2014-06-30 13:35:59.539723",
> > "event": "op_applied"},
> > { "time": "2014-06-30 13:35:59.539734",
> > "event": "done"}]]},
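To make the stall visible, the per-stage durations can be computed from the
event timestamps in the dump. A small sketch, with the timestamps copied from
the output above (truncated to the events around the gap):

```python
from datetime import datetime

# (event, timestamp) pairs copied from the dump_historic_ops output above,
# keeping the second sub_op_commit_rec (the later of the two).
events = [
    ("queuing ondisk",    "2014-06-30 13:35:55.899944"),
    ("sub_op_commit_rec", "2014-06-30 13:35:56.176950"),
    ("op_commit",         "2014-06-30 13:35:59.535301"),
    ("commit_sent",       "2014-06-30 13:35:59.535331"),
]

def ts(s):
    return datetime.strptime(s, "%Y-%m-%d %H:%M:%S.%f")

# Print how long each stage took relative to the previous event.
for (prev_name, prev_t), (name, t) in zip(events, events[1:]):
    delta = (ts(t) - ts(prev_t)).total_seconds()
    print(f"{prev_name} -> {name}: {delta:.3f}s")
# The sub_op_commit_rec -> op_commit gap dominates: ~3.358 seconds.
```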
> >
> > So why is it so slow from "journaled_completion_queued" to "op_commit",
> > and what happened?
> > thanks
> >
> > [email protected]
> > _______________________________________________
> > ceph-users mailing list
> > [email protected]
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com