Hi Szabolcs,

On 10/25/2016 10:01 AM, Szabolcs F. wrote:
> Hi Alwin,
>
> thanks for your hints.
>
>> On which interface is Proxmox running? Are these interfaces clogged
>> because there is some heavy network IO going on?
> I've got my two Intel Gbps network interfaces bonded together (bond0) as
> active-backup and vmbr0 is bridged on this bond, then Proxmox is running on
> this interface. I.e. http://pastebin.com/WZKQ02Qu
> All nodes are configured like this. There is no heavy IO on these
> interfaces, because the storage network uses the separate 10Gbps fiber
> Intel NICs (bond1).

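One quick way to check an active-backup bond like the bond0 you describe is to read the status file the Linux bonding driver exposes under /proc/net/bonding/. The following is only a minimal sketch: the helper name `parse_bonding` and the sample text (with eth0/eth1 as slave names) are made up for illustration, and the real file on your nodes may contain more fields.

```python
import os

# Hypothetical sample of /proc/net/bonding/bond0 (interface names are assumptions).
SAMPLE = """\
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth0
MII Status: up

Slave Interface: eth0
MII Status: up
Link Failure Count: 0

Slave Interface: eth1
MII Status: down
Link Failure Count: 3
"""


def parse_bonding(text):
    """Collect bond mode, primary/active slave and per-slave link status."""
    info = {"mode": None, "primary": None, "active": None, "slaves": {}}
    current = None  # name of the slave section currently being read
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            info["mode"] = value
        elif key == "Primary Slave":
            info["primary"] = value
        elif key == "Currently Active Slave":
            info["active"] = value
        elif key == "Slave Interface":
            current = value
            info["slaves"][current] = {}
        elif current is not None and key in ("MII Status", "Link Failure Count"):
            info["slaves"][current][key] = value
    return info


if __name__ == "__main__":
    path = "/proc/net/bonding/bond0"
    # Fall back to the sample text when not running on a node with bond0.
    if os.path.exists(path):
        with open(path) as handle:
            text = handle.read()
    else:
        text = SAMPLE
    print(parse_bonding(text))
```

A slave with "MII Status: down" or a growing "Link Failure Count" would be worth comparing against the switch port logs.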
Is your bond working properly? Is the bond on the same switch or on two
different ones? Side note: I usually add the "bond_primary ethX" option to set
the interface that should be primarily used in an active-backup
configuration. :-)

What are the logs on the server showing? You know, syslog, dmesg, pveproxy,
etc. ;-)

>
>> Another guess, are all servers synchronizing with an NTP server and have
>> the correct time?
> Yes, NTP is working properly, the firewall lets all NTP requests go through.
>
>
> On Mon, Oct 24, 2016 at 5:19 PM, Alwin Antreich <sysadmin-...@cognitec.com>
> wrote:
>
>> Hello Szabolcs,
>>
>> On 10/24/2016 03:16 PM, Szabolcs F. wrote:
>>> Hello,
>>>
>>> I've got a Proxmox VE 4.3 cluster of 12 nodes. All of them are Dell C6220
>>> sleds. Each has 2x Intel Xeon E5-2670 CPU and 64GB RAM. I've got two
>>> separate networks: 1Gbps LAN (Cisco 4948 switch) and 10Gbps storage (Cisco
>>> N3K-3064PQ fiber switch). The Dell nodes use the integrated Intel Gbit
>>> adapters for LAN and Intel PCI-E 10Gbps cards for the fiber network (ixgbe
>>> driver). The storage servers are separate, they run FreeNAS and export the
>>> shares with NFS. My virtual machines (I've made about 40 of them so far)
>>> are KVM/QCOW2 and they are stored on the FreeNAS storage. So far so good.
>>> I've been using this environment as a test and was almost ready to push
>>> into production.
>> On which interface is Proxmox running? Are these interfaces clogged
>> because there is some heavy network IO going on?
>>>
>>> But I have a problem with the cluster. From time to time the pveproxy
>>> service dies on the nodes or the web UI lists all nodes (except the one I'm
>>> actually logged into) as unreachable (red cross). Sometimes all nodes are
>>> listed as working (green status) but if I try to connect to a virtual
>>> machine I get a 'connection refused' error. When the cluster acts up I
>>> can't do any VM migration or any other VM management (e.g. console,
>>> start/stop/reset, new VM, etc). When it happens the only way to recover is
>>> powering down all 12 nodes and starting them one after another. Then
>>> everything works properly for a random amount of time: sometimes for weeks,
>>> sometimes for only a few days.
>> Another guess, are all servers synchronizing with an NTP server and have
>> the correct time?
>>>
>>> I followed the network troubleshooting guide with omping, multicast, etc
>>> and confirmed I've got multicast enabled and the troubleshooting didn't
>>> return any error. The /etc/hosts file is configured on all nodes with the
>>> proper hostname/IP list of all nodes.
>>> When trying to do 'service pve-cluster restart' I get these errors:
>>> http://pastebin.com/NXnEf4rd (running pmxcfs manually mounts the /etc/pve
>>> properly, but doesn't fix the cluster/proxy issue)
>>> pvecm status : http://pastebin.com/jsDFkqu3 (I powered down one node,
>>> that's why it's missing)
>>> pvecm nodes : http://pastebin.com/1WR8Yij8
>>> Corosync has a lot of these in the /var/log/daemon.log :
>>> http://pastebin.com/ajhE8Rb9
>>>
>>> Someone please help!
>>>
>>> Thanks,
>>> Szabolcs
>>> _______________________________________________
>>> pve-user mailing list
>>> pve-user@pve.proxmox.com
>>> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>>>
>>
>> --
>> Cheers,
>> Alwin
>

--
Cheers,
Alwin