Re: [ceph-users] disk timeouts in libvirt/qemu VMs...
The exclusive-lock feature should only require grabbing the lock on the
very first IO, so if this is an issue that pops up after extended use,
it's most likely either not related to exclusive-lock, or you had a
client<->OSD link hiccup. In the latter case, you will see a log message
like "image watch failed" in your logs. Since this isn't something that
we have run into during our regular testing, I would greatly appreciate
it if someone could capture a "gcore" dump from a running-but-stuck
process and use "ceph-post-file" to provide us with the dump (along with
the versions of the installed RPMs/DEBs so we can configure the proper
debug symbols).

On Thu, Mar 30, 2017 at 7:18 AM, Peter Maloney wrote:
> [snip]

--
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
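For anyone able to reproduce this, the capture Jason asks for might look like the following sketch (a guess at the workflow, not a tested recipe: the qemu process name, output paths, and description string are placeholders, and it assumes gdb's `gcore` and the `ceph-post-file` helper are installed on the compute host):

```shell
# Pid of the stuck qemu process (process name is an assumption --
# adjust for your hypervisor).
pid=$(pidof qemu-system-x86_64 | awk '{print $1}')

# Dump a core of the still-running process without killing it;
# this writes /tmp/qemu-stuck.<pid>.
gcore -o /tmp/qemu-stuck "$pid"

# Record installed package versions so matching debug symbols
# can be located later.
dpkg -l | grep -Ei 'ceph|rbd|qemu' > /tmp/qemu-stuck-versions.txt

# Upload both files to the Ceph developers' drop point.
ceph-post-file -d "stuck qemu process, exclusive-lock hang" \
    /tmp/qemu-stuck."$pid" /tmp/qemu-stuck-versions.txt
```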
Re: [ceph-users] disk timeouts in libvirt/qemu VMs...
On 03/28/17 17:28, Brian Andrus wrote:
> [snip]

Why do you need it enabled in jewel-era instances? With jewel you can
set them on the fly, and live migrate the VM to get the client to update
its usage of it.

I couldn't find any difference except removing big images is faster with
object-map (which depends on exclusive-lock). So I can't imagine why it
can be required.

And how long did you test it? I tested it a few weeks ago for about a
week, with no hangs. Normally there are hangs after a few days. And I
have permanently disabled it since the 20th, without any hangs since.
And I'm gradually adding back the VMs that died when they were there,
starting with the worst offenders. With that small time, I'm still very
convinced.

And did you test other features? I suspected exclusive-lock, so I only
tested removing that one, which required removing object-map and
fast-diff too, so I didn't test those 2 separately.
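The on-the-fly change Peter describes would look roughly like this sketch (pool and image names are placeholders; `rbd feature disable` is the relevant command, and the three features go together because object-map and fast-diff depend on exclusive-lock):

```shell
pool="rbd"          # placeholder pool name
image="vm-disk-1"   # placeholder image name

# Disable the dependent features together; rbd will not remove
# exclusive-lock while object-map or fast-diff is still enabled.
rbd feature disable "${pool}/${image}" fast-diff object-map exclusive-lock

# Verify which features remain on the image.
rbd info "${pool}/${image}"

# The running qemu/librbd client only picks up the change when it
# reopens the image, e.g. after a live migration of the VM.
```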
Re: [ceph-users] disk timeouts in libvirt/qemu VMs...
Just adding some anecdotal input. It likely won't be ultimately helpful
other than a +1.

Seemingly, we also have the same issue since enabling exclusive-lock on
images. We experienced these messages at a large scale when making a
CRUSH map change a few weeks ago that resulted in many VMs experiencing
the blocked-task kernel messages, requiring reboots.

We've since disabled it on all images we can, but there are still
jewel-era instances that cannot have the feature disabled. Since
disabling the feature, I have not observed any cases of blocked tasks,
but so far, given the limited timeframe, I'd consider that anecdotal.

On Mon, Mar 27, 2017 at 12:31 PM, Hall, Eric wrote:
> [snip]

--
Brian Andrus | Cloud Systems Engineer | DreamHost
brian.and...@dreamhost.com | www.dreamhost.com
Re: [ceph-users] disk timeouts in libvirt/qemu VMs...
Eric,

If you already have debug level 20 logs captured from one of these
events, I would love to be able to take a look at them to see what's
going on. Depending on the size, you could either attach the log to a
new RBD tracker ticket [1] or use the ceph-post-file helper to upload a
large file.

Thanks,
Jason

[1] http://tracker.ceph.com/projects/rbd/issues

On Mon, Mar 27, 2017 at 3:31 PM, Hall, Eric wrote:
> [snip]
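For reference, the client-side debug level 20 logging Eric mentions is typically enabled with a ceph.conf fragment along these lines on the compute host (the exact subsystem list and log path here are assumptions; the log file must be writable by the qemu process):

```ini
[client]
    debug rbd = 20
    debug rados = 20
    debug objectcacher = 20
    log file = /var/log/ceph/qemu-guest.$pid.log
```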
Re: [ceph-users] disk timeouts in libvirt/qemu VMs...
On Mon, Mar 27, 2017 at 11:17 PM, Peter Maloney
<peter.malo...@brockmann-consult.de> wrote:
> I can't guarantee it's the same as my issue, but from that it sounds
> the same.
> [snip]
Re: [ceph-users] disk timeouts in libvirt/qemu VMs...
I can't guarantee it's the same as my issue, but from that it sounds the
same.

Jewel 10.2.4, 10.2.5 tested
hypervisors are proxmox qemu-kvm, using librbd
3 ceph nodes with mon+osd on each

- faster journals, more disks, bcache, rbd_cache, fewer VMs on ceph,
iops and bw limits on the client side, jumbo frames, etc. all
improve/smooth out performance and mitigate the hangs, but don't prevent
them.
- hangs are usually associated with blocked requests (I set the
complaint time to 5s to see them)
- hangs are very easily caused by rbd snapshot + rbd export-diff to do
incremental backup (one snap persistent, plus one more during backup)
- when a qemu VM's io hangs, I have to kill -9 the qemu process for it
to stop. Some broken VMs don't appear to be hung until I try to live
migrate them (live migrating all VMs helped test solutions)

Finally I have a workaround... disable the exclusive-lock, object-map,
and fast-diff rbd features (and restart clients via live migrate).
(object-map and fast-diff appear to have no effect on diff or
export-diff... so I don't miss them). I'll file a bug at some point
(after I move all VMs back and see if it is still stable). And one other
user on IRC said this solved the same problem (also using rbd
snapshots).

And strangely, they don't seem to hang if I put back those features
until a few days later (making testing much less easy... but now I'm
very sure removing them prevents the issue).

I hope this works for you (and maybe gets some attention from devs too),
so you don't waste months like me.

On 03/27/17 19:31, Hall, Eric wrote:
> [snip]
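The 5-second complaint time Peter refers to is the OSD op complaint threshold (30 seconds by default in jewel); lowering it, e.g. as in this sketch, only makes slow/blocked requests show up in the cluster log sooner, and does not change OSD behavior:

```ini
[osd]
    osd op complaint time = 5
```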
[ceph-users] disk timeouts in libvirt/qemu VMs...
In an OpenStack (mitaka) cloud, backed by a ceph cluster (10.2.6 jewel),
using libvirt/qemu (1.3.1/2.5) hypervisors on Ubuntu 14.04.5 compute and
ceph hosts, we occasionally see hung processes (usually during boot, but
otherwise as well), with errors reported in the instance logs as shown
below. Configuration is vanilla, based on the openstack/ceph docs.

Neither the compute hosts nor the ceph hosts appear to be overloaded in
terms of memory or network bandwidth, none of the 67 osds are over 80%
full, nor do any of them appear to be overwhelmed in terms of IO.
Compute hosts and ceph cluster are connected via a relatively quiet 1Gb
network, with an IBoE net between the ceph nodes. Neither network
appears overloaded.

I don’t see any related (to my eye) errors in client or server logs,
even with 20/20 logging from various components (rbd, rados, client,
objectcacher, etc.) I’ve increased the qemu file descriptor limit
(currently 64k... overkill for sure.)

It “feels” like a performance problem, but I can’t find any capacity
issues or constraining bottlenecks.

Any suggestions or insights into this situation are appreciated. Thank
you for your time,
--
Eric


[Fri Mar 24 20:30:40 2017] INFO: task jbd2/vda1-8:226 blocked for more than 120 seconds.
[Fri Mar 24 20:30:40 2017] Not tainted 3.13.0-52-generic #85-Ubuntu
[Fri Mar 24 20:30:40 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Mar 24 20:30:40 2017] jbd2/vda1-8 D 88043fd13180 0 226 2 0x
[Fri Mar 24 20:30:40 2017] 88003728bbd8 0046 88042690 88003728bfd8
[Fri Mar 24 20:30:40 2017] 00013180 00013180 88042690 88043fd13a18
[Fri Mar 24 20:30:40 2017] 88043ffb9478 0002 811ef7c0 88003728bc50
[Fri Mar 24 20:30:40 2017] Call Trace:
[Fri Mar 24 20:30:40 2017] [] ? generic_block_bmap+0x50/0x50
[Fri Mar 24 20:30:40 2017] [] io_schedule+0x9d/0x140
[Fri Mar 24 20:30:40 2017] [] sleep_on_buffer+0xe/0x20
[Fri Mar 24 20:30:40 2017] [] __wait_on_bit+0x62/0x90
[Fri Mar 24 20:30:40 2017] [] ? generic_block_bmap+0x50/0x50
[Fri Mar 24 20:30:40 2017] [] out_of_line_wait_on_bit+0x77/0x90
[Fri Mar 24 20:30:40 2017] [] ? autoremove_wake_function+0x40/0x40
[Fri Mar 24 20:30:40 2017] [] __wait_on_buffer+0x2a/0x30
[Fri Mar 24 20:30:40 2017] [] jbd2_journal_commit_transaction+0x185d/0x1ab0
[Fri Mar 24 20:30:40 2017] [] ? try_to_del_timer_sync+0x4f/0x70
[Fri Mar 24 20:30:40 2017] [] kjournald2+0xbd/0x250
[Fri Mar 24 20:30:40 2017] [] ? prepare_to_wait_event+0x100/0x100
[Fri Mar 24 20:30:40 2017] [] ? commit_timeout+0x10/0x10
[Fri Mar 24 20:30:40 2017] [] kthread+0xd2/0xf0
[Fri Mar 24 20:30:40 2017] [] ? kthread_create_on_node+0x1c0/0x1c0
[Fri Mar 24 20:30:40 2017] [] ret_from_fork+0x7c/0xb0
[Fri Mar 24 20:30:40 2017] [] ? kthread_create_on_node+0x1c0/0x1c0