[
https://issues.apache.org/jira/browse/CLOUDSTACK-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13744681#comment-13744681
]
angeline shen commented on CLOUDSTACK-4199:
-------------------------------------------
Testing RVR using 8/19/13 latest build #465
MS 10.223.195.52
host 10.223.50.3 XS 6.2
1. Advanced zone with a XenServer 6.2 cluster
2. Create a network offering with RVR enabled:
id: 17
name: rvrallow
uuid: bc34384a-7704-4786-87cc-e5cc24f32dcf
unique_name: rvrallow
display_text: rvrallow
nw_rate: NULL
mc_rate: 10
traffic_type: Guest
tags: NULL
system_only: 0
specify_vlan: 0
service_offering_id: NULL
conserve_mode: 0
created: 2013-08-19 22:34:41
removed: NULL
default: 0
availability: Optional
dedicated_lb_service: 1
shared_source_nat_service: 0
sort_key: 0
redundant_router_service: 1
state: Enabled
guest_type: Isolated
elastic_ip_service: 0
eip_associate_public_ip: 0
elastic_lb_service: 0
specify_ip_ranges: 0
inline: 0
is_persistent: 0
internal_lb: 0
public_lb: 1
egress_default_policy: 1
concurrent_connections: NULL
17 rows in set (0.00 sec)
3. As admin, deploy a VM using the above network offering rvrallow:
id: 10
name: z1rvrallowadminV40
uuid: 16eb63bd-53fc-45a9-8d28-c77c82f86dc0
instance_name: i-2-10-VM
state: Running
vm_template_id: 5
guest_os_id: 12
private_mac_address: 02:00:63:30:00:06
private_ip_address: 10.1.1.23
pod_id: 1
data_center_id: 1
host_id: 1
last_host_id: 1
proxy_id: 2
proxy_assign_time: 2013-08-20 00:45:33
vnc_password: YGzUsIS8Z+37W9Udm2+S4zvsV07NT9bP
ha_enabled: 1
limit_cpu_use: 0
update_count: 3
update_time: 2013-08-19 22:46:09
created: 2013-08-19 22:42:40
removed: NULL
type: User
vm_type: User
account_id: 2
domain_id: 1
service_offering_id: 12
reservation_id: 3af80789-c7ea-47e8-a79b-8c1cc6b284de
hypervisor_type: XenServer
disk_offering_id: NULL
cpu: NULL
ram: NULL
owner: 2
speed: 100
host_name: z1rvrallowadminV40
display_name: z1rvrallowadminV40
desired_state: NULL
dynamically_scalable: 1
display_vm: 1
4. The above steps deployed RVR routers without any issues:
id: 11 <=== MASTER
name: r-11-VM
uuid: 67bd2d37-cc5a-45ce-8cf7-74b06e1d9c8f
instance_name: r-11-VM
state: Running
vm_template_id: 1
guest_os_id: 183
private_mac_address: 0e:00:a9:fe:02:c4
private_ip_address: 169.254.2.196
pod_id: 1
data_center_id: 1
host_id: 1
last_host_id: 1
proxy_id: NULL
proxy_assign_time: NULL
vnc_password: trOmaP4q+opdndMZM6Wayg4svTXwTje2WCbEWXYp8Ic=
ha_enabled: 0
limit_cpu_use: 0
update_count: 29
update_time: 2013-08-20 02:25:07
created: 2013-08-19 22:42:41
removed: NULL
type: DomainRouter
vm_type: DomainRouter
account_id: 2
domain_id: 1
service_offering_id: 7
reservation_id: c881947a-1737-410e-9d92-3b63268e6c53
hypervisor_type: XenServer
disk_offering_id: NULL
cpu: NULL
ram: NULL
owner: NULL
speed: NULL
host_name: NULL
display_name: NULL
desired_state: NULL
dynamically_scalable: 0
display_vm: 1
*************************** 12. row ***************************
id: 12 <=== BACKUP
name: r-12-VM
uuid: be11be35-d844-44fd-ae54-4656d9433855
instance_name: r-12-VM
state: Running
vm_template_id: 1
guest_os_id: 183
private_mac_address: 0e:00:a9:fe:02:12
private_ip_address: 169.254.2.18
pod_id: 1
data_center_id: 1
host_id: 1
last_host_id: 1
proxy_id: NULL
proxy_assign_time: NULL
vnc_password: DRNlYWFm2jI1JxaQSm6r2EHvPxT+JMCu5FDxW9QwUhQ=
ha_enabled: 0
limit_cpu_use: 0
update_count: 11
update_time: 2013-08-19 22:45:57
created: 2013-08-19 22:42:41
removed: NULL
type: DomainRouter
vm_type: DomainRouter
account_id: 2
domain_id: 1
service_offering_id: 7
reservation_id: 37d39fd6-335b-4cda-a925-cb9f45b3305f
hypervisor_type: XenServer
disk_offering_id: NULL
cpu: NULL
ram: NULL
owner: NULL
speed: NULL
host_name: NULL
display_name: NULL
desired_state: NULL
dynamically_scalable: 0
display_vm: 1
5. Stop the MASTER VR from CloudStack.
Observations:
i. The MASTER router went into the Stopped state successfully, but the BACKUP
router remained stuck in the "FAULT" state indefinitely.
Here is the keepalived.log from the FAULT router:
root@r-12-VM:~# cat /ramdisk/rrouter/keepalived.log
To backup called
Disable public ip 0
Password server is not running
Stopping DNS forwarder and DHCP server: dnsmasq(not running) ... (warning).
cache internal:
current active connections: 0
connections created: 0 failed: 0
connections updated: 0 failed: 0
connections destroyed: 0 failed: 0
cache external:
current active connections: 0
connections created: 0 failed: 0
connections updated: 0 failed: 0
connections destroyed: 0 failed: 0
traffic processed:
0 Bytes 0 Pckts
multicast traffic (active device=eth0):
16 Bytes sent 24 Bytes recv
1 Pckts sent 2 Pckts recv
0 Error send 0 Error recv
message tracking:
0 Malformed msgs 0 Lost msgs
Conntrackd switch to backup done
Switch conntrackd mode backup 0
Status: BACKUP
To master called
ifdown: interface eth2 not configured
RTNETLINK answers: File exists
Failed to bring up eth2.
RTNETLINK answers: No such process
Enable public ip returned 2
Fail to enable public ip!
Password server is not running
Stopping DNS forwarder and DHCP server: dnsmasq(not running) ... (warning).
Stopping keepalived: keepalived.
Stopping conntrackd.
Status: FAULT (RTNETLINK answers: No such process)
root@r-12-VM:~#
6. MASTER router log:
root@r-11-VM:~# cat /ramdisk/rrouter/keepalived.log
To backup called
Disable public ip 0
Password server is not running
Stopping DNS forwarder and DHCP server: dnsmasq.
cache internal:
current active connections: 0
connections created: 0 failed: 0
connections updated: 0 failed: 0
connections destroyed: 0 failed: 0
cache external:
current active connections: 0
connections created: 0 failed: 0
connections updated: 0 failed: 0
connections destroyed: 0 failed: 0
traffic processed:
0 Bytes 0 Pckts
multicast traffic (active device=eth0):
40 Bytes sent 0 Bytes recv
5 Pckts sent 0 Pckts recv
0 Error send 0 Error recv
message tracking:
0 Malformed msgs 0 Lost msgs
Conntrackd switch to backup done
Switch conntrackd mode backup 0
Status: BACKUP
To master called
ifdown: interface eth2 not configured
Password server is not running
Removed cloud-passwd-srvr iptables rules
Added cloud-passwd-srvr iptables rules
Restarting DNS forwarder and DHCP server: dnsmasq10.1.1.236/24 10.1.1.1/24
.
Enable public ip returned 0
Conntrackd switch to primary done
Switch conntrackd mode primary returned 0
ARPING 10.223.122.18 from 10.223.122.18 eth2
Sent 1 probes (1 broadcast(s))
Received 0 response(s)
ARPING 10.223.122.18 from 10.223.122.18 eth2
Sent 1 probes (1 broadcast(s))
Received 0 response(s)
Status: MASTER
root@r-11-VM:~#
7. MASTER
* ifconfig output
* ifconfig -a output
* /ramdisk/rrouter/keepalived.log
* checkrouter.sh output
d. BACKUP (before and after failover)
* ifconfig output
* ifconfig -a output
* /ramdisk/rrouter/keepalived.log
* checkrouter.sh output
root@r-11-VM:~# ifconfig
eth0 Link encap:Ethernet HWaddr 02:00:67:29:00:02
inet addr:10.1.1.236 Bcast:10.1.1.255 Mask:255.255.255.0
inet6 addr: fe80::67ff:fe29:2/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1614 errors:0 dropped:0 overruns:0 frame:0
TX packets:11505 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:95725 (93.4 KiB) TX bytes:627963 (613.2 KiB)
Interrupt:22
eth1 Link encap:Ethernet HWaddr 0e:00:a9:fe:02:c4
inet addr:169.254.2.196 Bcast:169.254.255.255 Mask:255.255.0.0
inet6 addr: fe80::c00:a9ff:fefe:2c4/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:8826 errors:0 dropped:0 overruns:0 frame:0
TX packets:8048 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1239708 (1.1 MiB) TX bytes:1493980 (1.4 MiB)
Interrupt:23
eth2 Link encap:Ethernet HWaddr 06:23:60:00:00:18
inet addr:10.223.122.18 Bcast:10.223.122.63 Mask:255.255.255.192
inet6 addr: fe80::423:60ff:fe00:18/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:127 errors:0 dropped:0 overruns:0 frame:0
TX packets:218 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:13423 (13.1 KiB) TX bytes:16317 (15.9 KiB)
Interrupt:24
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:12 errors:0 dropped:0 overruns:0 frame:0
TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2470 (2.4 KiB) TX bytes:2470 (2.4 KiB)
root@r-11-VM:~# ifconfig -a
eth0 Link encap:Ethernet HWaddr 02:00:67:29:00:02
inet addr:10.1.1.236 Bcast:10.1.1.255 Mask:255.255.255.0
inet6 addr: fe80::67ff:fe29:2/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1648 errors:0 dropped:0 overruns:0 frame:0
TX packets:11720 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:97847 (95.5 KiB) TX bytes:639723 (624.7 KiB)
Interrupt:22
eth1 Link encap:Ethernet HWaddr 0e:00:a9:fe:02:c4
inet addr:169.254.2.196 Bcast:169.254.255.255 Mask:255.255.0.0
inet6 addr: fe80::c00:a9ff:fefe:2c4/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:9084 errors:0 dropped:0 overruns:0 frame:0
TX packets:8276 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1266636 (1.2 MiB) TX bytes:1530404 (1.4 MiB)
Interrupt:23
eth2 Link encap:Ethernet HWaddr 06:23:60:00:00:18
inet addr:10.223.122.18 Bcast:10.223.122.63 Mask:255.255.255.192
inet6 addr: fe80::423:60ff:fe00:18/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:128 errors:0 dropped:0 overruns:0 frame:0
TX packets:218 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:13451 (13.1 KiB) TX bytes:16317 (15.9 KiB)
Interrupt:24
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:12 errors:0 dropped:0 overruns:0 frame:0
TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2470 (2.4 KiB) TX bytes:2470 (2.4 KiB)
root@r-11-VM:~# checkrouter.sh
Status: MASTER&Bumped: NO
8. BACKUP:
root@r-12-VM:~# ifconfig
eth0 Link encap:Ethernet HWaddr 02:00:50:bd:00:03
inet addr:10.1.1.75 Bcast:10.1.1.255 Mask:255.255.255.0
inet6 addr: fe80::50ff:febd:3/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:31164 errors:0 dropped:0 overruns:0 frame:0
TX packets:8935 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1254816 (1.1 MiB) TX bytes:515974 (503.8 KiB)
Interrupt:24
eth1 Link encap:Ethernet HWaddr 0e:00:a9:fe:02:12
inet addr:169.254.2.18 Bcast:169.254.255.255 Mask:255.255.0.0
inet6 addr: fe80::c00:a9ff:fefe:212/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:29972 errors:0 dropped:0 overruns:0 frame:0
TX packets:27342 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:4248712 (4.0 MiB) TX bytes:5123688 (4.8 MiB)
Interrupt:25
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:2 errors:0 dropped:0 overruns:0 frame:0
TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:214 (214.0 B) TX bytes:214 (214.0 B)
root@r-12-VM:~# ifconfig -a
eth0 Link encap:Ethernet HWaddr 02:00:50:bd:00:03
inet addr:10.1.1.75 Bcast:10.1.1.255 Mask:255.255.255.0
inet6 addr: fe80::50ff:febd:3/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:31197 errors:0 dropped:0 overruns:0 frame:0
TX packets:8935 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1256068 (1.1 MiB) TX bytes:515974 (503.8 KiB)
Interrupt:24
eth1 Link encap:Ethernet HWaddr 0e:00:a9:fe:02:12
inet addr:169.254.2.18 Bcast:169.254.255.255 Mask:255.255.0.0
inet6 addr: fe80::c00:a9ff:fefe:212/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:30080 errors:0 dropped:0 overruns:0 frame:0
TX packets:27442 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:4258400 (4.0 MiB) TX bytes:5139200 (4.9 MiB)
Interrupt:25
eth2 Link encap:Ethernet HWaddr 06:23:60:00:00:18
inet addr:10.223.122.18 Bcast:10.223.122.63 Mask:255.255.255.192
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Interrupt:26
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:2 errors:0 dropped:0 overruns:0 frame:0
TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:214 (214.0 B) TX bytes:214 (214.0 B)
root@r-12-VM:~# checkrouter.sh
Status: FAULT (RTNETLINK answers: No such process)&Bumped: NO
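The two checkrouter.sh results above share one shape: a state, an optional reason in parentheses, and a Bumped flag, joined by "&". When comparing routers it can help to split that line mechanically. The following is only a minimal sketch based on the two output lines shown in this comment; the function name parse_checkrouter is hypothetical and the pattern has not been checked against other router states.

```python
import re

# Pattern inferred from the two sample lines in this comment only:
#   Status: MASTER&Bumped: NO
#   Status: FAULT (RTNETLINK answers: No such process)&Bumped: NO
_CHECKROUTER = re.compile(r"Status: (\w+)(?: \((.*)\))?&Bumped: (\w+)$")


def parse_checkrouter(line):
    """Split one checkrouter.sh output line into state, reason and bumped."""
    match = _CHECKROUTER.match(line.strip())
    if match is None:
        raise ValueError("unrecognised checkrouter output: %r" % line)
    state, reason, bumped = match.groups()
    # 'reason' is None when no parenthesised detail follows the state;
    # 'bumped' is kept as the raw string printed by the script.
    return {"state": state, "reason": reason, "bumped": bumped}


if __name__ == "__main__":
    print(parse_checkrouter("Status: MASTER&Bumped: NO"))
    print(parse_checkrouter(
        "Status: FAULT (RTNETLINK answers: No such process)&Bumped: NO"))
```

Fed the output from r-12-VM, the reason field carries the same RTNETLINK error that ends its keepalived.log, which makes it easier to line the two up.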
> Redundant Virtual Router - no failover occur
> --------------------------------------------
>
> Key: CLOUDSTACK-4199
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4199
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Management Server
> Affects Versions: 4.2.0
> Environment: MS ACS 4.2 campo internal build 341
> host XS 6.2
> Reporter: angeline shen
> Assignee: Sheng Yang
> Fix For: 4.2.0
>
> Attachments: FAULT_logs.tgz, logs.tgz, management-server.log.gz,
> MASTER_logs.tgz, Screenshot-CloudPlatform™ - Mozilla Firefox-3.png,
> Screenshot-CloudPlatform™ - Mozilla Firefox-4.png
>
>
> 1. Create network offering 'egallowrvrnw1' with egress firewall policy
> "allow" and redundant router enabled, in an advanced zone. Create a network
> from this offering and create guest VMs.
> Verify ssh to the VMs. VMs can ping other VMs in this network and reach the
> internet.
> 2. RVR MASTER r-37-VM
> RVR BACKUP r-38-VM
> stop r-37-VM
> Result: r-37-VM state becomes UNKNOWN
> r-38-VM state becomes FAULT
> no failover occurs
> Cannot ssh to existing VMs
> 3. start r-37-VM.
> Result: r-37-VM state becomes MASTER
> r-38-VM state remains FAULT
> VMs can reach other VMs in same network.
> VMs cannot reach internet
> 4. stop r-37-VM
> r-37-VM state becomes UNKNOWN
> r-38-VM state becomes FAULT
> no failover occurs
> Cannot ssh to existing VMs
> r.VirtualNetworkApplianceManagerImpl] (RouterStatusMonitor-1:null) Found 1
> networks to update RvR status.
> 2013-08-08 19:22:44,763 INFO
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RedundantRouterStatusMonitor-6:null) Redundant virtual router (name:
> r-37-VM, id: 37) just switch from MASTER to UNKNOWN
> 2013-08-08 19:22:44,768 DEBUG [agent.transport.Request]
> (RedundantRouterStatusMonitor-6:null) Seq 1-2062888873: Sending { Cmd ,
> MgmtId: 7343890761426, via: 1, Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.CheckRouterCommand":{"accessDetails":{"router.ip":"169.254.3.245","router.name":"r-38-VM"},"wait":30}}]
> }
> 2013-08-08 19:22:44,769 DEBUG [agent.transport.Request]
> (RedundantRouterStatusMonitor-6:null) Seq 1-2062888873: Executing: { Cmd ,
> MgmtId: 7343890761426, via: 1, Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.CheckRouterCommand":
> 2013-08-08 19:22:45,116 INFO
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RedundantRouterStatusMonitor-6:null) Redundant virtual router (name:
> r-38-VM, id: 38) just switch from BACKUP to FAULT
> 2013-08-08 19:22:45,344 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-270:null) Seq 1-2062888874: Response Received:
> 2013-08-08 19:22:45,345 DEBUG [agent.transport.Request]
> (DirectAgent-270:null) Seq 1-2062888874: Processing: { Ans: , MgmtId:
> 7343890761426, via: 1, Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.CheckRouterAnswer":{"state":"FAULT","isBumped":false,"result":true,"details":"Status: FAULT (RTNETLINK answers: No such process)&Bumped: NO","wait":0}}] }
> 2013-08-08 19:22:45,345 DEBUG [agent.transport.Request]
> (RedundantRouterStatusMonitor-6:null) Seq 1-2062888874: Received: { Ans: ,
> MgmtId: 7343890761426, via: 1, Ver: v1, Flags: 10, { CheckRouterAnswer } }
> 2013-08-08 19:22:45,345 DEBUG [agent.manager.AgentManagerImpl]
> (RedundantRouterStatusMonitor-6:null) Details from executing class
> com.cloud.agent.api.CheckRouterCommand: Status: FAULT (RTNETLINK answers: No
> such process)&Bumped: NO
> 2013-08-08 19:22:45,349 INFO
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RedundantRouterStatusMonitor-6:null) Redundant virtual router (name:
> r-38-VM, id: 38) just switch from BACKUP to FAULT
> 2013-08-08 19:22:46,781 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-13:null) Ping from 2
> 2013-08-08 19:22:47,125 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-12:null) Ping from 3
>
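The management-server.log excerpt quoted above records each router state change in a line of the form "Redundant virtual router (name: ..., id: ...) just switch from X to Y". To list those transitions from a pasted excerpt, something like the sketch below may help. It assumes the mail-style "> " quoting and hard wrapping seen here, and the function name router_transitions is hypothetical, not part of CloudStack.

```python
import re

# Message format copied from the INFO lines quoted in this issue.
_TRANSITION = re.compile(
    r"Redundant virtual router \(name: ([^,\s]+), id: (\d+)\) "
    r"just switch from (\w+) to (\w+)"
)


def router_transitions(text):
    """Return (name, id, old_state, new_state) tuples found in a log excerpt."""
    # Drop leading mail quote markers, then undo the hard line wrapping so a
    # message split across several quoted lines can match as one string.
    unquoted = " ".join(line.lstrip("> ").strip() for line in text.splitlines())
    flat = re.sub(r"\s+", " ", unquoted)
    return [(name, int(rid), old, new)
            for name, rid, old, new in _TRANSITION.findall(flat)]


if __name__ == "__main__":
    sample = """\
> 2013-08-08 19:22:44,763 INFO
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RedundantRouterStatusMonitor-6:null) Redundant virtual router (name:
> r-37-VM, id: 37) just switch from MASTER to UNKNOWN
"""
    for transition in router_transitions(sample):
        print(transition)
```

Repeated log entries are returned as repeated tuples, so the BACKUP to FAULT change for r-38-VM, which is logged twice in the excerpt, would appear twice.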