On Mon, Oct 12, 2015 at 2:22 PM, Deneau, Tom <tom.den...@amd.com> wrote:
> I have a small ceph cluster (3 nodes, 5 osds each, journals all just 
> partitions
> on the spinner disks) and I have noticed that when I hit it with a bunch of
> rados bench clients all doing writes of large (40M objects) with --no-cleanup,
> the rados bench commands seem to finish OK but I often get health warnings 
> like
>     HEALTH_WARN 4 requests are blocked > 32 sec;
>                 2 osds have slow requests 3 ops are blocked > 32.768 sec on 
> osd.9
>                 1 ops are blocked > 32.768 sec on osd.10
>                 2 osds have slow requests
> After a couple of minutes, health goes to HEALTH_OK.
>
> But if I go to the node containing osd.10 for example and do dump_historic_ops
> I do get lots of around 20-sec durations but nothing over 32 sec.
>
> The 20-sec or so ops are always  "ack+ondisk+write+known_if_redirected"
> with type_data = "commit sent: apply or cleanup"
> and the following are typical event timings
>
>                                initiated: 14:06:58.205937
>                               reached_pg: 14:07:01.823288, gap=  3617.351
>                                  started: 14:07:01.823359, gap=     0.071
>                waiting for subops from 3: 14:07:01.855259, gap=    31.900
>          commit_queued_for_journal_write: 14:07:03.132697, gap=  1277.438
>           write_thread_in_journal_buffer: 14:07:03.143356, gap=    10.659
>              journaled_completion_queued: 14:07:04.175863, gap=  1032.507
>                                op_commit: 14:07:04.585040, gap=   409.177
>                               op_applied: 14:07:04.589751, gap=     4.711
>                 sub_op_commit_rec from 3: 14:07:14.682925, gap= 10093.174
>                              commit_sent: 14:07:14.683081, gap=     0.156
>                                     done: 14:07:14.683119, gap=     0.038
>
> Should I expect to see a historic op with duration greater than 32 sec?

IIRC, the historic ops generally consist of up to the N longest
operations in the last Y time period, on a rolling basis. Something
like 20 ops from the last 6 minutes. So if you get a report of a
32-second blocked op and immediately go to the OSD, it should have
that op in either in-progress or historic ops.

One possibility is that if you've got peering happening, it's got ops
blocked on that but they get thrown out because the OSD turns out not
to be responsible for them.
-Greg
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to