Hi,
While investigating slow requests on a Firefly (0.80.7) cluster I looked
at the historic ops from the admin socket.
On an OSD which had just logged some slow requests I noticed:
"received_at": "2014-12-22 17:08:41.496391",
"age": "9.948475",
"duration": "5.915489"
{ "time": "2014-12-22 17:08:41.496687",
"event": "waiting_for_osdmap"},
{ "time": "2014-12-22 17:08:46.216946",
"event": "reached_pg"},
It spent about 4.7 seconds in "waiting_for_osdmap".
Another request:
"received_at": "2014-12-22 17:08:41.499092",
"age": "9.945774",
"duration": "9.851261",
{ "time": "2014-12-22 17:08:41.499322",
"event": "waiting_for_osdmap"},
{ "time": "2014-12-22 17:08:51.349938",
"event": "reached_pg"}
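To put exact numbers on the gap between "waiting_for_osdmap" and
"reached_pg", here is a minimal Python sketch. It assumes the event
entries have been copied by hand into lists (as done below with the two
requests above); the helper name event_gaps is just an illustration and
is not part of any Ceph tooling.

```python
from datetime import datetime

# Timestamp format as it appears in the historic ops output above.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def event_gaps(events):
    """Return (event, next_event, seconds) for each consecutive pair of events."""
    times = [datetime.strptime(e["time"], FMT) for e in events]
    return [
        (events[i]["event"], events[i + 1]["event"],
         (times[i + 1] - times[i]).total_seconds())
        for i in range(len(events) - 1)
    ]


# Event entries copied from the two requests quoted above.
first_request = [
    {"time": "2014-12-22 17:08:41.496687", "event": "waiting_for_osdmap"},
    {"time": "2014-12-22 17:08:46.216946", "event": "reached_pg"},
]
second_request = [
    {"time": "2014-12-22 17:08:41.499322", "event": "waiting_for_osdmap"},
    {"time": "2014-12-22 17:08:51.349938", "event": "reached_pg"},
]

for label, events in (("first", first_request), ("second", second_request)):
    for start, end, seconds in event_gaps(events):
        print(f"{label}: {start} -> {end}: {seconds:.6f}s")
```

For these two requests that works out to roughly 4.72 and 9.85 seconds
respectively, so nearly all of the reported duration falls between
these two events.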
How should I interpret this? What is the OSD actually doing during that
time?
In this case it is a RBD workload with all clients running with 0.80.5
librados.
The mons are in quorum, time is in sync, and there are no osdmap
changes happening at this moment.
An earlier thread [0] suggested that it might also be a PG issue where
requests are serialized.
I do occasionally see disks spiking to 100% busy for some time, but I
want to understand waiting_for_osdmap better so I can tell what is
actually happening there.
[0]: https://www.mail-archive.com/[email protected]/msg12754.html
--
Wido den Hollander
42on B.V.
Ceph trainer and consultant
Phone: +31 (0)20 700 9902
Skype: contact42on