On 12/22/2014 07:04 PM, Gregory Farnum wrote:
> On Mon, Dec 22, 2014 at 8:20 AM, Wido den Hollander <[email protected]> wrote:
>> Hi,
>>
>> While investigating slow requests on a Firefly (0.80.7) cluster I
>> looked at the historic ops from the admin socket.
>>
>> On an OSD which had just spat out some slow requests I noticed:
>>
>> "received_at": "2014-12-22 17:08:41.496391",
>> "age": "9.948475",
>> "duration": "5.915489"
>>
>> { "time": "2014-12-22 17:08:41.496687",
>> "event": "waiting_for_osdmap"},
>> { "time": "2014-12-22 17:08:46.216946",
>> "event": "reached_pg"},
>>
>> It spent almost 5 seconds in "waiting_for_osdmap".
>>
>> Another request:
>>
>> "received_at": "2014-12-22 17:08:41.499092",
>> "age": "9.945774",
>> "duration": "9.851261",
>>
>> { "time": "2014-12-22 17:08:41.499322",
>> "event": "waiting_for_osdmap"},
>> { "time": "2014-12-22 17:08:51.349938",
>> "event": "reached_pg"}
>>
>> How should I interpret this? What is the OSD actually doing?
>>
>> In this case it is an RBD workload, with all clients running 0.80.5
>> librados.
>>
>> The mons are in quorum, time is in sync, and there are no osdmap
>> changes happening at the moment.
>>
>> An earlier thread [0] suggested that it might also be a PG issue
>> where requests are serialized.
>>
>> I do on occasion see disks spiking to 100% busy for some time, but I
>> just want to understand waiting_for_osdmap better so I can fully
>> grasp what is happening there.
>
> What message types are these? The waiting_for_osdmap state is supposed
> to cover only that, but there might be some overlooked blocking points
> or something.
For example, two ops:
#1:
{ "description": "osd_sub_op(client.2433432.0:61603164 20.424
19038c24\/rbd_data.d7c912ae8944a.00000000000008b6\/head\/\/20 [] v
63283'8301089 snapset=0=[]:[] snapc=0=[])",
"received_at": "2014-12-22 19:26:37.458680",
"age": "2.719850",
"duration": "2.520937",
"type_data": [
"commit sent; apply or cleanup",
[
{ "time": "2014-12-22 19:26:37.458914",
"event": "waiting_for_osdmap"},
{ "time": "2014-12-22 19:26:39.310569",
"event": "reached_pg"},
{ "time": "2014-12-22 19:26:39.310728",
"event": "started"},
{ "time": "2014-12-22 19:26:39.310951",
"event": "started"},
{ "time": "2014-12-22 19:26:39.979292",
"event": "commit_queued_for_journal_write"},
{ "time": "2014-12-22 19:26:39.979348",
"event": "write_thread_in_journal_buffer"},
{ "time": "2014-12-22 19:26:39.979594",
"event": "journaled_completion_queued"},
{ "time": "2014-12-22 19:26:39.979617",
"event": "commit_sent"}]]},
#2:
{ "description": "osd_sub_op(client.2188703.0:10420738 20.641
6673ee41\/rbd_data.9497e32794ff7.0000000000000454\/head\/\/20 [] v
63283'5215076 snapset=0=[]:[] snapc=0=[])",
"received_at": "2014-12-22 19:26:38.040551",
"age": "2.137979",
"duration": "1.537128",
"type_data": [
"started",
[
{ "time": "2014-12-22 19:26:38.040717",
"event": "waiting_for_osdmap"},
{ "time": "2014-12-22 19:26:39.577609",
"event": "reached_pg"},
{ "time": "2014-12-22 19:26:39.577624",
"event": "started"},
{ "time": "2014-12-22 19:26:39.577679",
"event": "started"}]]},
Could this have something to do with the number of RBD snapshots, since
I see snapc involved in both ops?
This cluster has about 4,000 RBD images and about 1,000 of them have a
snapshot.
The total number of OSDs in this case is 174.
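
To see where the time goes, the event timestamps in the dumps above can
be diffed directly. A rough Python sketch, assuming the Firefly-era
JSON layout shown above and an example admin socket path (both may
differ on other clusters/versions):

#!/usr/bin/env python
# Rough sketch: show how long each historic op spent between events.
# The socket path and the top-level key ("Ops" vs "ops") are
# assumptions; adjust for your cluster and Ceph version.
import json
import subprocess
from datetime import datetime

SOCK = "/var/run/ceph/ceph-osd.0.asok"  # example socket path

def ts(s):
    # Timestamps look like "2014-12-22 19:26:37.458680"
    return datetime.strptime(s, "%Y-%m-%d %H:%M:%S.%f")

raw = subprocess.check_output(
    ["ceph", "--admin-daemon", SOCK, "dump_historic_ops"])
doc = json.loads(raw.decode("utf-8"))

for op in doc.get("Ops", doc.get("ops", [])):
    print(op["description"])
    # In the dumps above, the last element of type_data is the
    # event list; earlier elements hold the flag point etc.
    events = op["type_data"][-1]
    prev = None
    for ev in events:
        t = ts(ev["time"])
        gap = (t - prev).total_seconds() if prev else 0.0
        print("  +%9.6fs  %s" % (gap, ev["event"]))
        prev = t

For the two sub-ops above that would show ~1.85s and ~1.54s spent
between waiting_for_osdmap and reached_pg.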
> -Greg
>
--
Wido den Hollander
42on B.V.
Ceph trainer and consultant
Phone: +31 (0)20 700 9902
Skype: contact42on