Hi,

You didn't state which versions of Ceph and KVM/QEMU you're using. I think it wasn't until QEMU 1.5.0 (1.4.2+?) that an async patch from Inktank was accepted into mainline, which significantly helps in situations like this.
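If you're not sure what you're running, it only takes a minute to check on the hypervisor and on one of the cluster nodes; a minimal sketch, assuming the standard binary names for your distribution (they may differ by packaging):

    $ qemu-system-x86_64 --version   # on the hypervisor; some distros ship it as 'kvm'
    $ ceph --version                 # on a mon or osd node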
If you're not using that, on top of not limiting recovery threads, you'll probably see issues like the ones you describe (see the example recovery settings below the quoted message). Also, more nodes make recovery easier on the entire cluster, so it might make sense to add smaller ones if/when you expand it.

Cheers,
Martin

On Tue, Dec 3, 2013 at 7:09 AM, 飞 <[email protected]> wrote:
> hello, I'm testing Ceph as storage for KVM virtual machine images.
> My cluster has 3 mons and 3 data nodes; every data node has 8x 2TB SATA
> HDDs and 1 SSD for the journal.
> When I shut down one data node to imitate a server fault, the cluster
> begins to recover. During recovery I can see many blocked requests, and
> the KVM VMs crash (they crash because they think their disk is offline).
> How can I solve this issue? Any ideas? Thank you.
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
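To expand on the recovery-thread point above: this is the kind of throttling I had in mind. Treat the exact values as a starting sketch rather than a recommendation, and tune for your hardware and how much client latency you can tolerate during recovery:

    # in ceph.conf on the OSD nodes, [osd] section
    osd max backfills = 1
    osd recovery max active = 1
    osd recovery op priority = 1
    osd client op priority = 63

    # or inject at runtime without restarting the OSDs:
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

Lower backfill/recovery settings make recovery take longer but leave more IOPS for client traffic, which is usually what keeps the VMs from stalling.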
_______________________________________________ ceph-users mailing list [email protected] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
