I have a cluster set up for load balancing a web-based application that requires persistent connections. I'm using the ipvs sync daemon to keep the connection state information consistent between the director and the backup director. However, after a failover the persistence often does not work, and clients end up connected to a different realserver. I'm running this on kernel 2.6.18 (RHEL 5, so not exactly bleeding edge).
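One way to check for this kind of swap in IPVS debug output such as the logs in this post is to pull the client-to-realserver bindings out of the "Bind-dest" lines and compare them between the two directors. The following is only a minimal sketch, assuming the log format shown in this post; the function names are hypothetical, and skipping the entries with client port 0 rests on the assumption that those are the persistence (control) entries rather than real client connections.

```python
import re

# Matches the "Bind-dest" debug lines as they appear in this post, e.g.
# IPVS: Bind-dest TCP c:159.63.77.30:55274 v:10.204.54.170:443 d:10.204.54.167:443 ...
BIND_RE = re.compile(
    r"Bind-dest \w+ "
    r"c:(?P<client>[\d.]+):(?P<cport>\d+) "
    r"v:(?P<vip>[\d.]+):(?P<vport>\d+) "
    r"d:(?P<dest>[\d.]+):(?P<dport>\d+)"
)


def client_to_realserver(log_text):
    """Map each client IP to the set of realserver IPs it was bound to."""
    mapping = {}
    for m in BIND_RE.finditer(log_text):
        # Assumption: entries with client port 0 are the persistence
        # (control) entries, not real client connections, so skip them.
        if m.group("cport") == "0":
            continue
        mapping.setdefault(m.group("client"), set()).add(m.group("dest"))
    return mapping


def moved_clients(before, after):
    """Return {client: (old_servers, new_servers)} for clients whose
    realserver set differs between the two mappings."""
    return {
        client: (before[client], after[client])
        for client in before
        if client in after and before[client] != after[client]
    }
```

Feeding the director log to client_to_realserver() for the "before" mapping and the backup director log for the "after" mapping, moved_clients() would then list every client that landed on a different realserver after the failover.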
Here's some logging output. I had two clients connected to two different realservers:

159.63.77.30 connected to 10.204.54.167
159.63.77.44 connected to 10.204.54.166

On the director:
----------------
IPVS: p-schedule: src 159.63.77.30:55274 dest 10.204.54.170:443 mnet 159.63.77.16
IPVS: ip_vs_wlc_schedule(): Scheduling...
IPVS: WLC: server 10.204.54.167:443 activeconns 0 refcnt 1 weight 1 overhead 0
IPVS: Bind-dest TCP c:159.63.77.16:0 v:10.204.54.170:443 d:10.204.54.167:443 fwd:R s:0 conn->flags:1183 conn->refcnt:1 dest->refcnt:2
IPVS: Bind-dest TCP c:159.63.77.30:55274 v:10.204.54.170:443 d:10.204.54.167:443 fwd:R s:0 conn->flags:183 conn->refcnt:1 dest->refcnt:3
IPVS: ADDing control for: cp.dst=159.63.77.30:55274 ctl_cp.dst=159.63.77.16:0
Enter: ip_vs_send_async, net/ipv4/ipvs/ip_vs_sync.c line 576
Leave: ip_vs_send_async, net/ipv4/ipvs/ip_vs_sync.c line 582
. . .
IPVS: p-schedule: src 159.63.77.44:3993 dest 10.204.54.170:443 mnet 159.63.77.32
IPVS: ip_vs_wlc_schedule(): Scheduling...
IPVS: WLC: server 10.204.54.166:443 activeconns 0 refcnt 1 weight 1 overhead 0
IPVS: Bind-dest TCP c:159.63.77.32:0 v:10.204.54.170:443 d:10.204.54.166:443 fwd:R s:0 conn->flags:1183 conn->refcnt:1 dest->refcnt:2
IPVS: Bind-dest TCP c:159.63.77.44:3993 v:10.204.54.170:443 d:10.204.54.166:443 fwd:R s:0 conn->flags:183 conn->refcnt:1 dest->refcnt:3
IPVS: ADDing control for: cp.dst=159.63.77.44:3993 ctl_cp.dst=159.63.77.32:0
. . .
bnx2: eth0 NIC Link is Down <-- **Failover**
Enter: ip_vs_del_dest, net/ipv4/ipvs/ip_vs_ctl.c line 997
IPVS: Moving dest 10.204.54.166:443 into trash, dest->refcnt=42
Leave: ip_vs_del_dest, net/ipv4/ipvs/ip_vs_ctl.c line 1024
Enter: ip_vs_del_dest, net/ipv4/ipvs/ip_vs_ctl.c line 997
IPVS: Moving dest 10.204.54.167:443 into trash, dest->refcnt=31
Leave: ip_vs_del_dest, net/ipv4/ipvs/ip_vs_ctl.c line 1024
----------------

On the backup director (starting at the failover time):
----------------
Enter: ip_vs_receive, net/ipv4/ipvs/ip_vs_sync.c line 607
Leave: ip_vs_receive, net/ipv4/ipvs/ip_vs_sync.c line 618
IPVS: ip_vs_sched_getbyname(): sched_name "wlc"
Enter: ip_vs_add_dest, net/ipv4/ipvs/ip_vs_ctl.c line 780
Enter: ip_vs_new_dest, net/ipv4/ipvs/ip_vs_ctl.c line 732
Leave: ip_vs_new_dest, net/ipv4/ipvs/ip_vs_ctl.c line 764
Leave: ip_vs_add_dest, net/ipv4/ipvs/ip_vs_ctl.c line 869
Enter: ip_vs_add_dest, net/ipv4/ipvs/ip_vs_ctl.c line 780
Enter: ip_vs_new_dest, net/ipv4/ipvs/ip_vs_ctl.c line 732
Leave: ip_vs_new_dest, net/ipv4/ipvs/ip_vs_ctl.c line 764
Leave: ip_vs_add_dest, net/ipv4/ipvs/ip_vs_ctl.c line 869
IPVS: p-schedule: src 159.63.77.44:4036 dest 10.204.54.170:443 mnet 159.63.77.32
IPVS: ip_vs_wlc_schedule(): Scheduling...
IPVS: WLC: server 10.204.54.167:443 activeconns 0 refcnt 1 weight 1 overhead 0
IPVS: Bind-dest TCP c:159.63.77.32:0 v:10.204.54.170:443 d:10.204.54.167:443 fwd:R s:0 conn->flags:1183 conn->refcnt:1 dest->refcnt:2
IPVS: Bind-dest TCP c:159.63.77.44:4036 v:10.204.54.170:443 d:10.204.54.167:443 fwd:R s:0 conn->flags:183 conn->refcnt:1 dest->refcnt:3
IPVS: ADDing control for: cp.dst=159.63.77.44:4036 ctl_cp.dst=159.63.77.32:0
IPVS: p-schedule: src 159.63.77.44:4037 dest 10.204.54.170:443 mnet 159.63.77.32
IPVS: Bind-dest TCP c:159.63.77.44:4037 v:10.204.54.170:443 d:10.204.54.167:443 fwd:R s:0 conn->flags:183 conn->refcnt:1 dest->refcnt:4
IPVS: ADDing control for: cp.dst=159.63.77.44:4037 ctl_cp.dst=159.63.77.32:0
IPVS: p-schedule: src 159.63.77.44:4038 dest 10.204.54.170:443 mnet 159.63.77.32
. . .
IPVS: p-schedule: src 159.63.77.30:57154 dest 10.204.54.170:443 mnet 159.63.77.16
IPVS: ip_vs_wlc_schedule(): Scheduling...
IPVS: WLC: server 10.204.54.166:443 activeconns 0 refcnt 1 weight 1 overhead 0
IPVS: Bind-dest TCP c:159.63.77.16:0 v:10.204.54.170:443 d:10.204.54.166:443 fwd:L s:0 conn->flags:1181 conn->refcnt:1 dest->refcnt:2
IPVS: Bind-dest TCP c:159.63.77.30:57154 v:10.204.54.170:443 d:10.204.54.166:443 fwd:L s:0 conn->flags:181 conn->refcnt:1 dest->refcnt:3
IPVS: ADDing control for: cp.dst=159.63.77.30:57154 ctl_cp.dst=159.63.77.16:0
IPVS: p-schedule: src 159.63.77.30:50111 dest 10.204.54.170:443 mnet 159.63.77.16
IPVS: Bind-dest TCP c:159.63.77.30:50111 v:10.204.54.170:443 d:10.204.54.166:443 fwd:L s:0 conn->flags:181 conn->refcnt:1 dest->refcnt:4
IPVS: ADDing control for: cp.dst=159.63.77.30:50111 ctl_cp.dst=159.63.77.16:0
IPVS: p-schedule: src 159.63.77.30:43524 dest 10.204.54.170:443 mnet 159.63.77.16
IPVS: Bind-dest TCP c:159.63.77.30:43524 v:10.204.54.170:443 d:10.204.54.166:443 fwd:L s:0 conn->flags:181 conn->refcnt:1 dest->refcnt:5
----------------

So after the failover, the clients' connections have been reversed:

159.63.77.30 is now connected to 10.204.54.166
159.63.77.44 is now connected to 10.204.54.167

If I run ipvsadm -L on the backup director before the failover, I do see the proper connections, so I know the multicast is getting through.

Any thoughts on what I might be doing wrong?

Thanks,
--Nick

_______________________________________________
LinuxVirtualServer.org mailing list
http://lists.graemef.net/mailman/listinfo/lvs-users
