Re: [ceph-users] More than 50% osds down, CPUs still busy; will the cluster recover without help?

Gregory Farnum Wed, 18 Mar 2015 07:02:41 -0700

On Wed, Mar 18, 2015 at 3:28 AM, Chris Murray <[email protected]> wrote:
> Hi again Greg :-)
>
> No, it doesn't seem to progress past that point. I started the OSD again a 
> couple of nights ago:
>
> 2015-03-16 21:34:46.221307 7fe4a8aa7780 10 journal op_apply_finish 13288339 
> open_ops 1 -> 0, max_applied_seq 13288338 -> 13288339
> 2015-03-16 21:34:46.221445 7fe4a8aa7780  3 journal journal_replay: r = 0, 
> op_seq now 13288339
> 2015-03-16 21:34:46.221513 7fe4a8aa7780  2 journal read_entry 3951706112 : 
> seq 13288340 1755 bytes
> 2015-03-16 21:34:46.221547 7fe4a8aa7780  3 journal journal_replay: applying 
> op seq 13288340
> 2015-03-16 21:34:46.221579 7fe4a8aa7780 10 journal op_apply_start 13288340 
> open_ops 0 -> 1
> 2015-03-16 21:34:46.221610 7fe4a8aa7780 10 
> filestore(/var/lib/ceph/osd/ceph-1) _do_transaction on 0x3142480
> 2015-03-16 21:34:46.221651 7fe4a8aa7780 15 
> filestore(/var/lib/ceph/osd/ceph-1) _omap_setkeys meta/16ef7597/infos/head//-1
> 2015-03-16 21:34:46.222017 7fe4a8aa7780 10 filestore oid: 
> 16ef7597/infos/head//-1 not skipping op, *spos 13288340.0.1
> 2015-03-16 21:34:46.222053 7fe4a8aa7780 10 filestore  > header.spos 0.0.0
> 2015-03-16 21:34:48.096002 7fe49a5ac700 20 
> filestore(/var/lib/ceph/osd/ceph-1) sync_entry woke after 5.000178
> 2015-03-16 21:34:48.096037 7fe49a5ac700 10 journal commit_start 
> max_applied_seq 13288339, open_ops 1
> 2015-03-16 21:34:48.096040 7fe49a5ac700 10 journal commit_start waiting for 1 
> open ops to drain
>
> There's the success line for 13288339, like you mentioned. But not one for 
> 13288340.
>
> Intriguing. So, those same 1755 bytes seem problematic every time the journal 
> is replayed? Interestingly, there is a lot (in time, not exactly data mass or 
> IOPs, but still more than 1755 bytes!) of activity while the log is at this 
> line:
>
> 2015-03-16 21:34:48.096040 7fe49a5ac700 10 journal commit_start waiting for 1 
> open ops to drain
>
> ... but then the IO ceases and the log still doesn't go any further. I wonder 
> why 13288339 doesn't have that same  'waiting for ... open ops to drain' 
> line. Or the 'woke after' one for that matter.
>
> While there is activity on sdb, it 'pulses' every 10 seconds or so, like this:
>
> sdb              20.00         0.00      3404.00          0       3404
> sdb              16.00         0.00      2100.00          0       2100
> sdb              10.00         0.00      1148.00          0       1148
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               1.00         0.00       496.00          0        496
> sdb              32.00         0.00      4940.00          0       4940
> sdb               8.00         0.00      1144.00          0       1144
> sdb               1.00         0.00         4.00          0          4
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb              17.00         0.00      3340.00          0       3340
> sdb              23.00         0.00      3368.00          0       3368
> sdb               1.00         0.00         4.00          0          4
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb               0.00         0.00         0.00          0          0
> sdb              13.00         0.00      3332.00          0       3332
> sdb              18.00         0.00      2360.00          0       2360
> sdb              59.00         0.00      7464.00          0       7464
> sdb               0.00         0.00         0.00          0          0
>
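
A rough way to put a number on the "every 10 seconds or so" in the iostat output above is to look at the gaps between burst starts. This is only a sketch: it assumes the third column is kB_wrtn/s and that iostat was sampling once per second, neither of which is stated above. The names `KB_WRITTEN` and `burst_starts` are made up for illustration.

```python
# Third column of the iostat rows quoted above, in order
# (assumed to be kB_wrtn/s at a 1-second sampling interval).
KB_WRITTEN = (
    [3404, 2100, 1148] + [0] * 7
    + [496, 4940, 1144, 4] + [0] * 7
    + [3340, 3368, 4] + [0] * 7
    + [3332, 2360, 7464, 0]
)


def burst_starts(samples):
    """Return indices where writes resume after an idle (zero) sample."""
    starts = []
    prev_idle = True  # treat the time before the first sample as idle
    for i, kb in enumerate(samples):
        idle = kb == 0
        if prev_idle and not idle:
            starts.append(i)
        prev_idle = idle
    return starts


starts = burst_starts(KB_WRITTEN)
# Distance, in samples, between consecutive burst starts.
gaps = [b - a for a, b in zip(starts, starts[1:])]
print(starts, gaps)
```

If the one-second assumption holds, the gaps are the pulse period in seconds.
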
> I was hoping Google may have held some clues, but it seems I'm the only one 
> :-)
>
> https://www.google.co.uk/?gws_rd=ssl#q=%22journal+commit_start+waiting+for%22+%22open+ops+to+drain%22
>
> I tried removing "compress-force=lzo" from "osd mount options btrfs" in 
> ceph.conf, in case it was the compression algorithm which was 'blocking' the 
> writes, but the same seems to be observed. Last entry in log, activity, 
> activity stops, log doesn't progress further.


Ooooohhhhhh... I'm not terribly familiar with the code that's stuck,
but the last time this came up, btrfs compression was known to be broken
under the Ceph workload. :( I think it tended to crash rather than
hang like this, so I'm a bit surprised, but if this op is touching a
"broken" file or something, that could explain it.
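
To check whether it is the same op hanging on every replay, something like the sketch below could scan the OSD log for ops that have an op_apply_start line but no matching op_apply_finish. It assumes only the two line formats visible in the log excerpt quoted above (unwrapped, one entry per line); `unfinished_ops` and `SAMPLE` are hypothetical names, not anything from Ceph itself.

```python
import re

# Matches the journal replay markers seen in the quoted log excerpt.
OP_RE = re.compile(r"journal op_apply_(start|finish) (\d+)")


def unfinished_ops(lines):
    """Return sorted op seqs with an op_apply_start but no op_apply_finish."""
    open_seqs = set()
    for line in lines:
        m = OP_RE.search(line)
        if m is None:
            continue
        kind, seq = m.group(1), int(m.group(2))
        if kind == "start":
            open_seqs.add(seq)
        else:
            # A finish whose start is outside the scanned window is ignored.
            open_seqs.discard(seq)
    return sorted(open_seqs)


# Three entries from the excerpt, joined back into single lines.
SAMPLE = [
    "2015-03-16 21:34:46.221307 7fe4a8aa7780 10 journal op_apply_finish "
    "13288339 open_ops 1 -> 0, max_applied_seq 13288338 -> 13288339",
    "2015-03-16 21:34:46.221579 7fe4a8aa7780 10 journal op_apply_start "
    "13288340 open_ops 0 -> 1",
    "2015-03-16 21:34:48.096040 7fe49a5ac700 10 journal commit_start "
    "waiting for 1 open ops to drain",
]

print(unfinished_ops(SAMPLE))
```

Running it over the logs from several restart attempts and comparing the results would show whether the stuck seq is always the same one.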

> On another note, I'm not getting my own e-mails back to myself from the list 
> anymore. In case I should be troubleshooting this side, is/isn't this 
> happening to others?

Uh, no idea. I think that's a config option you can set on mailman?
But I'm using Gmail so this has always been hidden from me.
-Greg
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
