Hi list,

we are running a Ceph cluster (12.2.5) as the backend for our OpenStack cloud.
Yesterday our datacenter had a power outage. As if that weren't enough, networking problems also left our Ceph cluster partitioned.

First of all, thanks a lot to the Ceph developers. After the network was back to normal, Ceph recovered by itself. You saved us from a lot of downtime, lack of sleep and insanity.

Now to our problem/question: after Ceph had recovered, we tried to bring up our VMs. Their Cinder volumes are stored in Ceph. None of the VMs started, because of I/O errors during boot:

    [ 4.393246] JBD2: recovery failed
    [ 4.395949] EXT4-fs (vda1): error loading journal
    [ 4.400811] VFS: Dirty inode writeback failed for block device vda1 (err=-5).
    mount: mounting /dev/vda1 on /root failed: Input/output error
    done.
    Begin: Running /scripts/local-bottom ... done.
    Begin: Running /scripts/init-bottom ... mount: mounting /dev on /root/dev failed: No such file or directory

We tried several methods to recover the disks, but each failed for a different reason. What helped in the end was rebuilding the object map of each image:

    rbd object-map rebuild volumes/<uuid>

From what we understood, the object map is a feature that speeds things up internally in Ceph. How can it lead to I/O errors in our VMs?

Is this the expected way to recover? Did we miss something? Is there any documentation describing what leads to invalid object maps and how to recover from them? (We did not find a doc on that topic...)

Regards,
Christian
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
