Dear all, We are running CloudStack 4.2.0 with the KVM hypervisor and Ceph RBD storage. We recently implemented a secondary CloudStack management server and an haproxy load balancer, and tonight we changed our configuration so that the CloudStack agents connect to the LB IP rather than to the CS management server directly.
However, we noted that the agents are regularly disconnected every 2-3 minutes. Here is an excerpt from agent.log:
====
2016-05-01 01:30:10,982 DEBUG [utils.nio.NioConnection] (Agent-Selector:null) Location 1: Socket Socket[addr=/X.X.X.8,port=8250,localport=50613] closed on read. Probably -1 returned: Connection closed with -1 on reading size.
2016-05-01 01:30:10,983 DEBUG [utils.nio.NioConnection] (Agent-Selector:null) Closing socket Socket[addr=/X.X.X.8,port=8250,localport=50613]
2016-05-01 01:30:10,983 DEBUG [cloud.agent.Agent] (Agent-Handler-3:null) Clearing watch list: 2
2016-05-01 01:30:15,984 INFO [cloud.agent.Agent] (Agent-Handler-3:null) Lost connection to the server. Dealing with the remaining commands...
2016-05-01 01:30:20,985 INFO [cloud.agent.Agent] (Agent-Handler-3:null) Reconnecting...
2016-05-01 01:30:20,986 INFO [utils.nio.NioClient] (Agent-Selector:null) Connecting to X.X.X.8:8250
2016-05-01 01:30:21,101 INFO [utils.nio.NioClient] (Agent-Selector:null) SSL: Handshake done
2016-05-01 01:30:21,101 INFO [utils.nio.NioClient] (Agent-Selector:null) Connected to X.X.X.8:8250
2016-05-01 01:30:21,133 DEBUG [kvm.resource.LibvirtCapXMLParser] (Agent-Handler-1:null) Found /usr/bin/kvm as a suiteable emulator
2016-05-01 01:30:21,134 DEBUG [kvm.resource.LibvirtComputingResource] (Agent-Handler-1:null) Executing: /bin/bash -c qemu-img --help|grep convert
2016-05-01 01:30:21,152 DEBUG [kvm.resource.LibvirtComputingResource] (Agent-Handler-1:null) Execution is successful.
2016-05-01 01:30:21,152 DEBUG [kvm.resource.LibvirtComputingResource] (Agent-Handler-1:null) convert [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-T src_cache] [-O output_fmt] [-o options] [-s snapshot_id_or_name] [-l snapshot_param] [-S sparse_size] filename [filename2 [...]] output_filename options are: 'none', 'writeback' (default, except for convert), 'writethrough', 'directsync' and 'unsafe' (default for convert)
====
The agent then reconnects, gets disconnected again, and so on; it continues in that loop.

This is the haproxy.cfg configuration for the listener on the NIC facing the hypervisors:
====
listen cloudstack_systemvm_8250
    bind X.X.X.8:8250
    mode tcp
    option tcplog
    balance source
    server management-server-01.xxx.com X.X.X.3:8250 maxconn 32 check
    server management-server-02.xxx.com X.X.X.6:8250 maxconn 32 check
====
Note that .3 and .6 are the first and second CloudStack management servers respectively, while .8 is the IP of the load balancer. We are using a single LB at the moment. Nothing much is found in haproxy.log; see below.
====
May 1 01:14:41 cs-haproxy-02 haproxy[923]: X.X.X.28:50401 [01/May/2016:01:12:50.803] cloudstack_systemvm_8250 cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/110340 8584 cD 0/0/0/0/0 0/0
May 1 01:15:46 cs-haproxy-02 haproxy[923]: X.X.X.28:50402 [01/May/2016:01:14:51.150] cloudstack_systemvm_8250 cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/54920 8234 cD 0/0/0/0/0 0/0
May 1 01:16:47 cs-haproxy-02 haproxy[923]: X.X.X.28:50403 [01/May/2016:01:15:56.075] cloudstack_systemvm_8250 cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/51344 7868 cD 0/0/0/0/0 0/0
May 1 01:17:48 cs-haproxy-02 haproxy[923]: X.X.X.28:50404 [01/May/2016:01:16:57.426] cloudstack_systemvm_8250 cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/50854 7630 cD 0/0/0/0/0 0/0
May 1 01:18:49 cs-haproxy-02 haproxy[923]: X.X.X.28:50405 [01/May/2016:01:17:58.285] cloudstack_systemvm_8250 cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/50955 7630 cD 0/0/0/0/0 0/0
May 1 01:24:49 cs-haproxy-02 haproxy[923]: X.X.X.28:50406 [01/May/2016:01:18:59.245] cloudstack_systemvm_8250 cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/350361 14638 cD 0/0/0/0/0 0/0
May 1 01:28:00 cs-haproxy-02 haproxy[923]: X.X.X.28:50571 [01/May/2016:01:27:09.638] cloudstack_systemvm_8250 cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/50602 2852 cD 0/0/0/0/0 0/0
May 1 01:30:11 cs-haproxy-02 haproxy[923]: X.X.X.28:50613 [01/May/2016:01:29:20.260] cloudstack_systemvm_8250 cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/50876 7630 cD 0/0/0/0/0 0/0
May 1 01:32:11 cs-haproxy-02 haproxy[923]: X.X.X.28:50614 [01/May/2016:01:30:21.142] cloudstack_systemvm_8250 cloudstack_systemvm_8250/management-server-01.xxx.com 1/0/110308 8870 cD 0/0/0/0/0 0/0
====
Note that .28 is the IP address of the hypervisor I used to test the connection to the LB IP. Did I miss anything in the haproxy configuration? Any advice is greatly appreciated. Looking forward to your reply; thank you.
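One thing we are wondering about: the "cD" termination state in those haproxy log entries indicates the session was killed by a client-side timeout while in the data phase, and most session durations are just over 50 seconds, which matches the common `timeout client 50000` found in a typical `defaults` section. Since the agent connection on port 8250 is long-lived and mostly idle between commands, perhaps we should raise the idle timeouts for this listener. A sketch of what we are considering (the 1h values are our guess, not something we have verified against CloudStack's keepalive behaviour):
====
listen cloudstack_systemvm_8250
    bind X.X.X.8:8250
    mode tcp
    option tcplog
    balance source
    # Agent connections are long-lived and mostly idle; override the
    # defaults-section timeouts (often 50s) for this listener only.
    timeout client 1h
    timeout server 1h
    server management-server-01.xxx.com X.X.X.3:8250 maxconn 32 check
    server management-server-02.xxx.com X.X.X.6:8250 maxconn 32 check
====
Does that look like the right direction, or is there a better way to handle long-lived agent sessions through haproxy?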
Cheers.