Dear list,

Sorry, this is a long message and my English may not be the best :(

I have a cluster with two hosts and DRBD replication. I don't want to use HA; I 
only want more resiliency in case a host fails, and to avoid downtime for the 
VMs when there is a kernel update and I need to reboot one of the PVE nodes, so 
two hosts should not be a problem.

I don't have powerful machines or hardware RAID. Each server has 2x2TB disks: 
one partition is in a software RAID 1 for the system, and the second partition 
is dedicated to the DRBD resources.

I'm on the very latest version of Proxmox 4 with a community subscription and 
my servers are up to date. The link between the servers really does provide 
1 Gb/s of bandwidth (tested with iperf) with a latency of less than 2 ms.

I use DRBD 9 since it's the version installed with PVE 4 by default, but I'm a 
little worried because the Proxmox DRBD9 documentation says it is a technology 
preview, and even LINBIT says DRBD 9.0.x is not production ready... so I 
configure it the old-fashioned way, as the following page advises, with two 
resources to avoid complicated split-brain recovery: 
https://pve.proxmox.com/wiki/DRBD
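
For context, on top of the two DRBD devices I follow the same layout as that 
wiki page: one LVM volume group per DRBD device, each declared as shared LVM 
storage in Proxmox. Roughly like this (the VG and storage names here are only 
examples, not necessarily the exact ones I use):

root@virt1 ~ # pvcreate /dev/drbd1 && vgcreate drbdvg-r1 /dev/drbd1
root@virt1 ~ # pvcreate /dev/drbd2 && vgcreate drbdvg-r2 /dev/drbd2

root@virt1 ~ # cat /etc/pve/storage.cfg
lvm: drbd-r1
        vgname drbdvg-r1
        content images
        shared

lvm: drbd-r2
        vgname drbdvg-r2
        content images
        shared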

I hope that by doing it this way I avoid the "not production ready" issue, 
since I believe it is the drbdmanage tool that is not production ready rather 
than DRBD itself, but maybe I'm wrong...?
Is anyone using DRBD 9 in production? Successfully? Please share your 
experience.

My problem is almost always reproducible (9 times out of 10): when I try to 
clone a VM from one resource to the other, or to move the disk of a VM from one 
resource to the other, it ends in a server crash, even without a single VM 
running on the cluster!

I understand it's a resource-intensive operation, but at worst it should just 
be slow; my servers should not crash...

What happens: I launch the operation (the problem usually appears with "big" 
disks, 100 GB for example; with 5 GB it usually completes). At the beginning it 
seems to work, and I monitor the operation with several open SSH consoles 
running iotop, iftop, iostat, top... on both servers.

After a while, sometimes at 10%, sometimes at 20% of the operation, the 
progress window stops moving and the target server turns red in the web 
interface. I can see that the server where I launched the operation (the 
source server) stops doing read I/O, there is almost no more traffic on the 
DRBD network link, and the server that was "receiving" the data (the target 
server) starts to become unresponsive. There is no more I/O on either server, 
and one of them, usually the target server, starts to show a very high load: 
at first it is only 3, 4, 5... but the longer I wait, the more the load grows 
(in the end it can reach 30 or 35!!!)

There is a process eating CPU, and it is drbd, even though it is no longer 
doing anything (and I don't know how to stop it: kill and kill -9 don't work, 
and neither does "service drbd stop")...
There is plenty of free memory, almost no more I/O, and the CPU is used but 
not that much, as you can see: 
https://cloud.ipgenius.fr/index.php/s/UWGbnePdxkiee5A
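
For what it is worth, next time it hangs I plan to check whether the drbd 
kernel threads are stuck in uninterruptible sleep (which would explain why 
kill has no effect) and to dump the blocked tasks into the kernel log, 
assuming the open consoles still respond enough for that:

root@virt1 ~ # ps -eo pid,stat,wchan:32,comm | grep -i drbd   # "D" in STAT = uninterruptible sleep, kill cannot touch it
root@virt1 ~ # echo w > /proc/sysrq-trigger                   # ask the kernel to log all blocked (D state) tasks
root@virt1 ~ # dmesg | tail -n 100                            # read back the stack traces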

In the end, I see these errors in my SSH consoles:

Message from syslogd@virt1 at Apr  1 11:24:09 ...
 kernel:[79286.931625] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! 
[kvm:9177]

Message from syslogd@virt1 at Apr  1 11:24:37 ...
 kernel:[79314.904263] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! 
[kvm:9177]

and I can't do anything more: I can't even connect over SSH or run any command 
in the consoles that are already open. The only way to get my server back is a 
hard reboot :(

After that, the resources quickly resync, and I end up with an LVM disk that 
was created but is of course unusable... so I delete it...

Here are my DRBD config files, which are as basic as possible:

root@virt1 ~ # cat /etc/drbd.d/global_common.conf
global {
  usage-count no;
}

common {

        startup {
                wfc-timeout 0;
        }

        handlers {
                split-brain "/usr/lib/drbd/notify-split-brain.sh root";
                out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
        }

        options {
                cpu-mask 0;
        }

        net {
                protocol C;
                allow-two-primaries;
                cram-hmac-alg sha1;
                sndbuf-size 0;
                max-buffers 8000;
                max-epoch-size 8000;
                verify-alg sha1;
        }

        disk {
                resync-rate 40M;
                on-io-error detach;
        }
}
root@virt1 ~ # cat /etc/drbd.d/r1.res
resource r1 {
                device /dev/drbd1;
                disk /dev/sda3;
                meta-disk internal;
                
        net {
                shared-secret "xxxxxxxxxx";
        }
        
        on virt1 {  
                address 10.200.10.1:7788;     
        }
        
        on virt2 {
                address 10.200.10.2:7788;
        }
}
root@virt1 ~ # cat /etc/drbd.d/r2.res
resource r2 {
                device /dev/drbd2;
                disk /dev/sdb3;
                meta-disk internal;

        net {
                shared-secret "xxxxxxxxxx";      
        }
                 
        on virt1 {
                address 10.200.10.1:7789;
        }
        
        on virt2 {
                address 10.200.10.2:7789;
        }
}
root@virt1 ~ # 

I'm not sure about the values on the line with the buffer directives 
(sndbuf-size, max-buffers, max-epoch-size), but all the rest is very standard.

For information: apart from this specific operation (copying from one resource 
to the other), the cluster works correctly. I created 3 or 4 Linux VMs and 3 
Windows VMs on each host and ran very stressful tests (on all resources: CPU, 
RAM and disk read/write) in every VM for more than 24 hours. Of course the 
cluster was quite loaded, the network links sometimes showed as congested and 
the VMs were not very fast, but it worked very well and stayed stable!

What I need above everything else is stability; performance is far down my 
priority list. I would like to put this cluster into production, so I really 
need it to be stable...

Can someone help me? If you need more config files or any other information, I 
can provide anything you ask for. Is there a way to limit the resources 
allocated to DRBD?
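
For example, I wonder whether replacing the fixed resync-rate with DRBD's 
dynamic resync controller would keep it from starving the rest of the system. 
Maybe something like this in the disk section (the values below are only a 
first guess on my part, not something I have tested):

        disk {
                on-io-error detach;
                c-plan-ahead 20;     # enable the dynamic resync controller (units of 0.1 s)
                c-fill-target 1M;    # amount of resync data to keep in flight on the link
                c-min-rate 4M;       # floor for the resync rate
                c-max-rate 40M;      # ceiling for the resync rate
        }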

Also, if there is a real DRBD specialist out there: my company is not rich and 
cannot afford to pay a lot, but paid help is something that can be discussed.

Anyway, thank you in advance for taking the time to read my message, and thank 
you even more if you can provide an answer!

One last thing, not directly related to the subject:
There is no /proc/drbd anymore, and I'm a little lost without it because I was 
used to it.
Of course, I know there are other commands (drbdadm status, drbd-overview, 
drbdsetup events2...), but the Nagios plugin I was using to monitor DRBD will 
not work since it parses the /proc/drbd output, and I have not found a newer 
Nagios plugin adapted to this new version. If someone has one, could you 
please share it?

If nobody answers this, I will write one myself, so if anyone is interested in 
it, let me know.
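
In case it helps, this is the kind of minimal check I have in mind. It is only 
a sketch built around the "drbdsetup status" output (the script name and the 
list of "bad" states are my own guesses, so treat it as a starting point 
rather than a finished plugin):

#!/bin/bash
# check_drbd9.sh - minimal Nagios-style check for DRBD 9 (sketch, untested)
# Parses "drbdsetup status" since /proc/drbd no longer exists with DRBD 9.

STATUS=$(drbdsetup status 2>&1)
if [ $? -ne 0 ]; then
        echo "CRITICAL: drbdsetup status failed: $STATUS"
        exit 2
fi

# Treat degraded disk states or broken connections as a problem.
if echo "$STATUS" | grep -Eq 'Inconsistent|Outdated|Diskless|StandAlone|Unconnected|Connecting'; then
        echo "CRITICAL: DRBD resource(s) degraded"
        echo "$STATUS"
        exit 2
fi

echo "OK: all DRBD resources healthy"
exit 0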

Best regards,

        
Jean-Laurent Ivars 
Responsable Technique | Technical Manager
22, rue Robert - 13007 Marseille 
Tel: 09 84 56 64 30 - Mobile: 06.52.60.86.47 
Linkedin: http://fr.linkedin.com/in/jlivars/  |  Viadeo: 
http://www.viadeo.com/fr/profile/jean-laurent.ivars  |  www.ipgenius.fr