Sounds familiar... this was also discussed in the thread "disk timeouts in libvirt/qemu VMs..."

We have not had this issue since reverting exclusive-lock, although it was
suggested at the time that exclusive-lock was not the cause. So far the revert
has held up for us, with not a single corrupt filesystem since then.

On some images (ones created post-Jewel upgrade) the feature could not be
disabled, but these don't seem to be affected. Of course, we never did
pinpoint the cause of the timeouts, so it's entirely possible something else
was causing them, but no other major changes went into effect.
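
For anyone wanting to try the same revert, here is a rough sketch of how the
disable commands could be generated. It assumes the stock `rbd feature disable`
CLI and that fast-diff and object-map (the other two features enabled in the
original report) have to be switched off before exclusive-lock, since they
depend on it. The helper name `disable_commands` and the image spec in the
usage note are made up for illustration, and the sketch only builds the
argument lists, it does not run anything.

```python
# Features in the order they must be disabled: fast-diff depends on
# object-map, which in turn depends on exclusive-lock.
# Assumption: journaling (which also depends on exclusive-lock) is not
# enabled; it is not handled by this sketch.
DISABLE_ORDER = ["fast-diff", "object-map", "exclusive-lock"]


def disable_commands(image_spec, enabled_features):
    """Build one `rbd feature disable` argv per enabled feature, dependents first."""
    return [
        ["rbd", "feature", "disable", image_spec, feature]
        for feature in DISABLE_ORDER
        if feature in enabled_features
    ]
```

Calling `disable_commands("rbd/vm-disk-1", {"layering", "exclusive-lock", "object-map"})`
(a hypothetical image) yields the object-map command first and the
exclusive-lock command second. Each list can be handed to `subprocess.run`.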

One thing to look for that might confirm it's the same issue is timeouts in
the guest VM. Most OS kernels will report a hung task in conjunction with the
hang/lock-up/corruption. I'm wondering if you're seeing that too.
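
A minimal sketch of that check for Linux guests, assuming the usual kernel
hung-task warning of the form "INFO: task NAME:PID blocked for more than N
seconds." The function name `find_hung_tasks` is illustrative; feed it the
text of `dmesg` or the kernel log collected inside the guest.

```python
import re

# Matches the hung-task warning printed by the Linux kernel, e.g.
# "INFO: task jbd2/vda1-8:250 blocked for more than 120 seconds."
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>.+?):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (task name, pid, seconds) for every hung-task warning in log_text."""
    hits = []
    for line in log_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            hits.append(
                (match.group("name"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits
```

An empty result does not rule the problem out (other guest OSes log hangs
differently), but hits on the filesystem journal or disk I/O tasks around the
time of the corruption would point to the same symptom.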

On Wed, May 3, 2017 at 10:49 PM, Stefan Priebe - Profihost AG <
[email protected]> wrote:

> Hello,
>
> since we upgraded from hammer to jewel 10.2.7 and enabled
> exclusive-lock,object-map,fast-diff, we've had problems with corrupted VM
> filesystems.
>
> Sometimes the VMs are just crashing with FS errors and a restart can
> solve the problem. Sometimes the whole VM is not even bootable and we
> need to import a backup.
>
> All of them have the same problem that you can't revert to an older
> snapshot. The rbd command just hangs at 99% forever.
>
> Is this a known issue? Is there anything we can check?
>
> Greets,
> Stefan
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Brian Andrus | Cloud Systems Engineer | DreamHost
[email protected] | www.dreamhost.com