On 11/04/14 09:07, Wido den Hollander wrote:
>
>> On 11 April 2014 at 8:50, Josef Johansson <[email protected]> wrote:
>>
>> Hi,
>>
>> On 11/04/14 07:29, Wido den Hollander wrote:
>>>> On 11 April 2014 at 7:13, Greg Poirier <[email protected]> wrote:
>>>>
>>>> One thing to note...
>>>> All of our KVM VMs have to be rebooted. This is something I wasn't
>>>> expecting. Tried waiting for them to recover on their own, but that's
>>>> not happening. Rebooting them restores service immediately. :/ Not ideal.
>>>>
>>> A reboot isn't really required though. It could be that the VM itself is
>>> in trouble, but from a librados/librbd perspective I/O should simply
>>> continue as soon as an osdmap has been received without the "full" flag.
>>>
>>> It could be that you have to wait some time before the VM continues.
>>> This can take up to 15 minutes.
>> With other storage solutions you would have to change the timeout value
>> for each disk, e.g. from 60 to 180 seconds, for the VMs to survive
>> storage problems.
>> Does Ceph handle this differently somehow?
>>
> It's not that RBD does it differently. Librados simply blocks the I/O, and
> thus so does librbd, which then causes Qemu to block.
>
> I've seen VMs survive RBD issues for periods longer than 60 seconds. Gave
> them some time and they continued again.
>
> Which exact setting are you talking about? I'm talking about a Qemu/KVM VM
> running with a VirtIO drive.

cat /sys/block/*/device/timeout
(http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009465)
This file is non-existent for my Ceph VirtIO drive however, so it seems RBD handles this. I only have para-virtualized VMs to compare with right now, and they don't have it inside the VM, but that's expected. From my understanding it would have been there if it were an HVM. Whenever the timeout was reached, an error occurred and the disk was set to read-only mode.

Cheers,
Josef

> Wido
>
>> Cheers,
>> Josef
>>> Wido
>>>
>>>> On Thu, Apr 10, 2014 at 10:12 PM, Greg Poirier
>>>> <[email protected]> wrote:
>>>>
>>>>> Going to try increasing the full ratio. Disk utilization wasn't really
>>>>> growing at an unreasonable pace. I'm going to keep an eye on it for the
>>>>> next couple of hours and down/out the OSDs if necessary.
>>>>>
>>>>> We have four more machines that we're in the process of adding (which
>>>>> doubles the number of OSDs), but got held up by some networking nonsense.
>>>>>
>>>>> Thanks for the tips.
>>>>>
>>>>>
>>>>> On Thu, Apr 10, 2014 at 9:51 PM, Sage Weil <[email protected]> wrote:
>>>>>
>>>>>> On Thu, 10 Apr 2014, Greg Poirier wrote:
>>>>>>> Hi,
>>>>>>> I have about 200 VMs with a common RBD volume as their root
>>>>>>> filesystem and a number of additional filesystems on Ceph.
>>>>>>>
>>>>>>> All of them have stopped responding. One of the OSDs in my cluster
>>>>>>> is marked full. I tried stopping that OSD to force things to
>>>>>>> rebalance or at least go into degraded mode, but nothing is
>>>>>>> responding still.
>>>>>>>
>>>>>>> I'm not exactly sure what to do or how to investigate. Suggestions?
>>>>>> Try marking the osd out or partially out (ceph osd reweight N .9) to
>>>>>> move some data off, and/or adjust the full ratio up (ceph pg
>>>>>> set_full_ratio .95). Note that this becomes increasingly dangerous as
>>>>>> OSDs get closer to full; add some disks.
>>>>>>
>>>>>> sage

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
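Sage's advice above can be sketched as a short shell session. This is only an illustration, assuming a hypothetical full OSD with id 12; the commands match the Ceph CLI of that era (`ceph pg set_full_ratio` was later superseded by `ceph osd set-full-ratio`), and they should only be run against a live cluster by someone who understands the rebalancing impact.

```shell
# Identify which OSDs are flagged near-full or full
ceph health detail

# Partially "out" the full OSD (hypothetical id 12) so CRUSH moves
# roughly 10% of its data onto other OSDs
ceph osd reweight 12 0.9

# Or, as a stopgap, raise the full ratio slightly; this becomes
# increasingly dangerous as the OSD approaches 100% utilization
ceph pg set_full_ratio 0.95
```

Note that neither command adds capacity; as Sage says, the real fix is to add disks before the cluster fills up again.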
