Hi Yipei,

I just tried to reproduce this and was not successful.
I set up a tenant network, added a web server to it, created a load balancer VIP on the tenant network, and added the web server as a member on the load balancer. I can curl from the tenant network qdhcp- netns without issue.

Are you running a recent version of Octavia? This bug might be impacting you:
https://review.openstack.org/501915

Can you check if this routing table is present inside your amphora netns?

root@amphora-f58cdb7c-8fde-4eea-bd40-780664cfa49f:~# ip netns exec amphora-haproxy ip route show table 1
default via <tenant network gateway> dev eth1 onlink

It could be that since they are all on the same subnet there is a return path issue in the kernel which this patch fixes with a policy based route.

Michael

On Mon, Sep 25, 2017 at 1:33 AM, Yipei Niu <[email protected]> wrote:
> Hi, all,
>
> I encounter some problems when using Octavia. After installing octavia with
> devstack, I create a load balancer named lb1 (VIP: 10.0.1.9, IP of VRRP
> port: 10.0.1.3) for a subnet (10.0.1.0/24), then a listener, a pool, and two
> members. All the resources are created successfully. The two members (VMs)
> reside in the same subnet, whose IP are 10.0.1.6 and 10.0.1.7, respectively.
> To simulate a web server in each VM, I run "while true; do echo -e "HTTP/1.0
> 200 OK\r\n\r\nWelcome to $VM_IP" | sudo nc -l -p 80;done" to listen on port
> 80 and return the VM's IP if the request is accepted. I run "sudo ip netns
> exec qdhcp-XXXX curl -v $VM_IP" to send requests to VMs, the "web servers"
> in VMs work (already added corresponding security rules to the VMs). Then I
> tried to run "sudo ip netns exec qdhcp-XXXX curl -v $VIP" to send requests,
> the VMs do not respond, and finally returns a timeout error.
>
> The configuration details in local.conf are as follows.
>
> [[local|localrc]]
>
> DATABASE_PASSWORD=password
> RABBIT_PASSWORD=password
> SERVICE_PASSWORD=password
> SERVICE_TOKEN=password
> ADMIN_PASSWORD=password
>
> HOST_IP=192.168.56.9
>
> LOGFILE=/opt/stack/logs/stack.sh.log
> VERBOSE=True
> LOG_COLOR=True
> SCREEN_LOGDIR=/opt/stack/logs
>
> # Neutron LBaaS
> enable_plugin neutron-lbaas https://github.com/openstack/neutron-lbaas.git
> enable_plugin octavia https://github.com/openstack/octavia.git
> ENABLED_SERVICES+=,q-lbaasv2
> ENABLED_SERVICES+=,octavia,o-cw,o-hk,o-hm,o-api
>
> disable_service horizon
> disable_service tempest
>
> To investigate the source of the error, I logon to the amphora. The details
> of interfaces of amphora_haproxy network namespace are as follows.
>
> 1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast state
> UP group default qlen 1000
>     link/ether fa:16:3e:42:bf:d9 brd ff:ff:ff:ff:ff:ff
>     inet 10.0.1.3/24 brd 10.0.1.255 scope global eth1
>        valid_lft forever preferred_lft forever
>     inet 10.0.1.9/24 brd 10.0.1.255 scope global secondary eth1:0
>        valid_lft forever preferred_lft forever
>     inet6 fe80::f816:3eff:fe42:bfd9/64 scope link
>        valid_lft forever preferred_lft forever
>
> So I run "sudo ip netns exec amphora-haproxy tcpdump -i eth1 -nn 'tcp'" to
> check whether amphora receive the request. The details are as follows.
>
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
> ^C06:11:49.048973 IP 10.0.1.2.33700 > 10.0.1.9.80: Flags [S], seq
> 1717936594, win 28200, options [mss 1410,sackOK,TS val 28032309 ecr
> 0,nop,wscale 7], length 0
> 06:11:50.031976 IP 10.0.1.2.33700 > 10.0.1.9.80: Flags [S], seq 1717936594,
> win 28200, options [mss 1410,sackOK,TS val 28032559 ecr 0,nop,wscale 7],
> length 0
> 06:11:52.026565 IP 10.0.1.2.33700 > 10.0.1.9.80: Flags [S], seq 1717936594,
> win 28200, options [mss 1410,sackOK,TS val 28033060 ecr 0,nop,wscale 7],
> length 0
> 06:11:56.002577 IP 10.0.1.2.33700 > 10.0.1.9.80: Flags [S], seq 1717936594,
> win 28200, options [mss 1410,sackOK,TS val 28034062 ecr 0,nop,wscale 7],
> length 0
> 06:12:03.909721 IP 10.0.1.2.33700 > 10.0.1.9.80: Flags [S], seq 1717936594,
> win 28200, options [mss 1410,sackOK,TS val 28036064 ecr 0,nop,wscale 7],
> length 0
>
> Based on the trace, we can see that amphora do receive the request, but
> haproxy does not send handshake datagram to respond. Then, to see whether
> haproxy in the amphora listens on the right IP and port, I print
> /var/lib/octavia/ec5ee7c5-7474-424f-9f44-71b338cf3e57/haproxy.cfg in the
> console and see the following info.
>
> # Configuration for lb1
> global
>     daemon
>     user nobody
>     log /dev/log local0
>     log /dev/log local1 notice
>     stats socket /var/lib/octavia/ec5ee7c5-7474-424f-9f44-71b338cf3e57.sock
> mode 0666 level user
>
> defaults
>     log global
>     retries 3
>     option redispatch
>     timeout connect 5000
>     timeout client 50000
>     timeout server 50000
>
> frontend ec5ee7c5-7474-424f-9f44-71b338cf3e57
>     option httplog
>     bind 10.0.1.9:80
>     mode http
>     default_backend 40537c80-979d-49c9-b3ae-8504812c0f42
>
> backend 40537c80-979d-49c9-b3ae-8504812c0f42
>     mode http
>     balance roundrobin
>     server 73dc9a1d-1e92-479b-a6f3-8debd0ea17b8 10.0.1.6:80 weight 1
>     server 4cdca33f-9cde-4ac2-a5bd-550d3e65f0f2 10.0.1.7:80 weight 1
>
> Next, I print the info of all the running haproxy process in the console and
> copy it below.
>
> root 2367 0.0 0.0 4228 740 ? Ss 07:14 0:00
> /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p
> /run/haproxy.pid
> haproxy 2370 0.0 0.5 37692 5340 ? S 07:14 0:00
> /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
> haproxy 2371 0.0 0.0 37692 924 ? Ss 07:14 0:00
> /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
> root 2471 0.0 0.0 4228 636 ? Ss 07:14 0:00
> /usr/sbin/haproxy-systemd-wrapper -f
> /var/lib/octavia/ec5ee7c5-7474-424f-9f44-71b338cf3e57/haproxy.cfg -f
> /var/lib/octavia/haproxy-default-user-group.conf -p
> /var/lib/octavia/ec5ee7c5-7474-424f-9f44-71b338cf3e57/ec5ee7c5-7474-424f-9f44-71b338cf3e57.pid
> -L A2GNEZ_IsG5HmdyY2LmdG3LSOco
> nobody 2477 0.0 0.5 37676 5824 ? S 07:14 0:00
> /usr/sbin/haproxy -f
> /var/lib/octavia/ec5ee7c5-7474-424f-9f44-71b338cf3e57/haproxy.cfg -f
> /var/lib/octavia/haproxy-default-user-group.conf -p
> /var/lib/octavia/ec5ee7c5-7474-424f-9f44-71b338cf3e57/ec5ee7c5-7474-424f-9f44-71b338cf3e57.pid
> -L A2GNEZ_IsG5HmdyY2LmdG3LSOco -Ds
> nobody 2478 0.1 0.3 37676 3140 ? Ss 07:14 0:01
> /usr/sbin/haproxy -f
> /var/lib/octavia/ec5ee7c5-7474-424f-9f44-71b338cf3e57/haproxy.cfg -f
> /var/lib/octavia/haproxy-default-user-group.conf -p
> /var/lib/octavia/ec5ee7c5-7474-424f-9f44-71b338cf3e57/ec5ee7c5-7474-424f-9f44-71b338cf3e57.pid
> -L A2GNEZ_IsG5HmdyY2LmdG3LSOco -Ds
> ubuntu 2485 0.0 0.1 12916 1092 pts/0 S+ 07:36 0:00 grep
> --color=auto haproxy
>
> I run strace to trace the activities of all the above processes. When
> sending request to the VIP, none of the above processes takes action to
> receive the datagram. The details are omitted, since they give little
> information.
>
> Above all, I think the haproxy fail to receive the ingress traffic from the
> IP and port it listens on.
>
> What do you think? Look forward to your valuable comments. Thank you.
>
> Best regards,
> Yipei
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: [email protected]?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
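
The routing-table check suggested at the top of this reply could be scripted roughly as follows. This is only a sketch: it assumes the output of "ip netns exec amphora-haproxy ip route show table 1" has already been captured as text, and the function name and the sample gateway 10.0.1.1 are made up for illustration (the real tenant gateway is not given in this thread). It is not part of Octavia.

```python
def has_policy_default_route(route_output, device="eth1"):
    """Return True if the captured table output has a default route on `device`.

    `route_output` is the text printed by `ip route show table 1`;
    this helper is hypothetical and only parses that text.
    """
    for line in route_output.splitlines():
        fields = line.split()
        # Only lines of the form "default via <gateway> dev <device> ..." count.
        if not fields or fields[0] != "default":
            continue
        if "via" not in fields or "dev" not in fields:
            continue
        dev_index = fields.index("dev") + 1
        if dev_index < len(fields) and fields[dev_index] == device:
            return True
    return False


if __name__ == "__main__":
    # 10.0.1.1 is an assumed gateway address, used only as sample input.
    sample = "default via 10.0.1.1 dev eth1 onlink\n"
    print(has_policy_default_route(sample))
```

If the function returns False for the real output (for example because table 1 is empty), that would be consistent with the amphora image predating the patch linked above, though other causes are possible.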
