Thanks for the answer ... If I edit /etc/default/redhat-cluster-pve and comment out the last line, will that do the job?
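To be explicit, I mean something like this, assuming the last line of the stock file is the FENCE_JOIN entry (please double-check your own copy before changing anything):

    # /etc/default/redhat-cluster-pve
    # With FENCE_JOIN commented out, the node should not join the fence
    # domain, i.e. fencing stays disabled on this node.
    #FENCE_JOIN="yes"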
Fábio Rabelo

2013/3/12 Steffen Wagner <[email protected]>

> Hi,
>
> I had a similar problem with 2.2.
> I had rgmanager for the HA features running on high-end hardware (Dell,
> QNAP and Cisco). After about three days one of the nodes (it wasn't always
> the same one!) left quorum (the log said something like 'node 2 left, x
> nodes remaining in cluster, fencing node 2'). After that the node was
> always successfully fenced... so I disabled fencing and switched it to
> manual. Then the node didn't shut down anymore. It remained online with
> all VMs, but the cluster said the node was offline (at reboot the node got
> stuck at the pve rgmanager service; only a hard reset was possible).
>
> In the end I disabled HA and now run the nodes only in cluster mode
> without fencing... working so far (3 months) without any problems... a
> pity, because I want to use the HA features, but I don't know what's
> wrong.
>
> My network setup is similar to Fábio's. I'm using VLANs, one for the
> storage interface and one for everything else.
>
> For now I think I'll stay on 2.2 and not upgrade to 2.3 until everyone on
> the mailing list is happy :-)
>
>
> Kind regards,
> Steffen Wagner
> --
>
> Im Obersteig 31
> 76879 Hochstadt/Pfalz
>
> E [email protected]
> M 01523/3544688
> F 06347/918474
>
> Fábio Rabelo <[email protected]> wrote:
>
> >2013/3/12 Andreu Sànchez i Costa <[email protected]>
> >
> >> Hello Fábio,
> >>
> >> On 12/03/13 01:00, Fábio Rabelo wrote:
> >>
> >> 2.3 does not have the reliability 1.9 has!
> >>
> >> I have been struggling with it for 3 months, my deadlines are gone, and
> >> I cannot make it work for more than 3 days without an issue ...
> >>
> >> I cannot give an opinion about 2.3, but with 2.2.x it works perfectly.
> >> I only had to change the elevator to deadline because CFQ had
> >> performance problems with our P2000 iSCSI disk array.
> >>
> >> As other list members asked, what are your main problems?
> >>
> >I have already described the problems several times here.
> >
> >This is a five-node cluster, with dual-Opteron Supermicro motherboards.
> >
> >The storage uses the same motherboard as the five nodes, but in a chassis
> >with 16 3.5" drive slots, 12 of them occupied by WD enterprise disks.
> >
> >The storage runs NAS4Free (I already tried FreeNAS, same result).
> >
> >Like I said, since I installed PVE 1.9 everything has worked fine: 9 days
> >now, and counting.
> >
> >Each of the five nodes has 2 embedded network ports, connected to a
> >Linksys switch; I use them to serve the VMs.
> >
> >In one PCIe slot there is an Intel 10 GB card, talking to a Supermicro
> >10 GB switch dedicated to communication between the five nodes and the
> >storage.
> >
> >That switch has no link to anything else.
> >
> >On the storage, I use one of the embedded ports for management, and all
> >images are served through the 10 GB card.
> >
> >After some time, between 1 and 3 days of the system running, the nodes
> >stop talking to the storage.
> >
> >When it happens, the log shows lots of messages like this:
> >
> >Mar 6 17:15:29 nodo-01 pvestatd[2804]: WARNING: storage 'iudice01' is not online
> >Mar 6 17:15:39 nodo-01 pvestatd[2804]: WARNING: storage 'iudice01' is not online
> >Mar 6 17:15:49 nodo-01 pvestatd[2804]: WARNING: storage 'iudice01' is not online
> >Mar 6 17:15:59 nodo-01 pvestatd[2804]: WARNING: storage 'iudice01' is not online
> >Mar 6 17:16:09 nodo-01 pvestatd[2804]: WARNING: storage 'iudice01' is not online
> >Mar 6 17:16:19 nodo-01 pvestatd[2804]: WARNING: storage 'iudice01' is not online
> >Mar 6 17:16:29 nodo-01 pvestatd[2804]: WARNING: storage 'iudice01' is not online
> >Mar 6 17:16:39 nodo-01 pvestatd[2804]: WARNING: storage 'iudice01' is not online
> >Mar 6 17:16:49 nodo-01 pvestatd[2804]: WARNING: storage 'iudice01' is not online
> >Mar 6 17:16:59 nodo-01 pvestatd[2804]: WARNING: storage 'iudice01' is not online
> >
> >After that, if I try to restart the pve daemon, it refuses to.
> >
> >If I try to reboot the server, it stops where the PVE daemon should stop,
> >and stays there forever.
> >
> >The only way to reboot any of the nodes is a hard reset!
> >
> >At first my suspicion fell on the storage, so I changed from FreeNAS to
> >NAS4Free: same thing. Desperation!
> >
> >Then, as a test, I installed PVE 1.9 on all five nodes (I have 2 systems
> >running it for 3 years with no issue; this new system is meant to replace
> >both).
> >
> >Like I said, 9 days and counting!
> >
> >So there is no problem in the hardware, and there is no problem with
> >NAS4Free!
> >
> >What is left?!
> >
> >
> >Fábio Rabelo
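For reference, the two networks described above would look roughly like this on each node, as an /etc/network/interfaces sketch. Interface names and addresses here are just examples for illustration, not taken from the thread:

    # /etc/network/interfaces (illustrative sketch only)

    # embedded ports -> Linksys switch, used to serve the VMs via the default bridge
    auto vmbr0
    iface vmbr0 inet static
            address 192.168.1.11
            netmask 255.255.255.0
            gateway 192.168.1.1
            bridge_ports eth0
            bridge_stp off
            bridge_fd 0

    # Intel 10 GB card -> isolated Supermicro switch, storage traffic only (no gateway)
    auto eth2
    iface eth2 inet static
            address 10.10.10.11
            netmask 255.255.255.0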
_______________________________________________
pve-user mailing list
[email protected]
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
