[
https://issues.apache.org/jira/browse/CLOUDSTACK-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13744896#comment-13744896
]
Jayapal Reddy commented on CLOUDSTACK-4199:
-------------------------------------------
This is happening because the master.sh script fails: the
enable_pubip.sh script fails to ifdown and ifup the eth2 interface.
As a result, checkrouter.sh reports the FAULT state; see the command below.
root@r-20-QA:/ramdisk/rrouter# checkrouter.sh
Status: FAULT (RTNETLINK answers: No such process)&Bumped: NO
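For reference, the status line above can be split into its state and bumped
fields with plain shell parameter expansion. This is only a sketch:
parse_router_status is a hypothetical helper, not part of the router scripts,
and the line format is assumed from the single sample output shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the router scripts): split a
# checkrouter.sh result line of the assumed form
#   Status: <STATE> (<detail>)&Bumped: <YES|NO>
# into "<STATE> <BUMPED>".
parse_router_status() {
    line=$1
    status_part=${line%%&*}            # text before the first '&'
    state=${status_part#Status: }      # drop the "Status: " prefix
    state=${state%% *}                 # keep only the first word
    case $line in
        *'&Bumped: '*) bumped=${line#*&Bumped: } ;;
        *)             bumped=UNKNOWN ;;   # no bumped field on this line
    esac
    printf '%s %s\n' "$state" "$bumped"
}

# prints: FAULT NO
parse_router_status 'Status: FAULT (RTNETLINK answers: No such process)&Bumped: NO'
```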
The master.sh script output below shows the failure:
To master called
ifdown: interface eth2 not configured
RTNETLINK answers: File exists
Failed to bring up eth2.
RTNETLINK answers: No such process
Enable public ip returned 2
Fail to enable public ip!
Password server is not running
Stopping DNS forwarder and DHCP server: dnsmasq(not running) ... (warning).
Stopping keepalived: keepalived.
Stopping conntrackd.
Status: FAULT (RTNETLINK answers: No such process)
On the router I observed that when the router goes to the backup state it
runs disable_pubip.sh, which does ifconfig eth2 down.
When the same router switches to master, it tries to ifdown eth2 and ifup
eth2.
ifdown and ifup fail for an interface that was brought down with ifconfig.
ifdown succeeds with the force option: after ifdown eth2 --force, ifup eth2
succeeds.
Fixed this by adding the force option to ifdown.
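
The sequence described above can be sketched as follows. This is an
illustration, not the actual patch to enable_pubip.sh: bring_up_public_if and
the RUN variable are hypothetical names, and RUN=echo only prints the commands
so the sketch can be tried without touching a real interface.

```shell
#!/bin/sh
# Sketch only; bring_up_public_if and RUN are hypothetical, not taken
# from enable_pubip.sh. Set RUN=echo to print the commands instead of
# running them.
RUN=${RUN:-}

bring_up_public_if() {
    ifname=$1
    # Without --force, ifdown reports "interface eth2 not configured"
    # after the interface was taken down with "ifconfig eth2 down",
    # and the following ifup fails.
    $RUN ifdown "$ifname" --force || return 1
    $RUN ifup "$ifname" || return 1
}
```

For a dry run, RUN=echo followed by bring_up_public_if eth2 prints the two
commands in the order they would be executed.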
> Redundant Virtual Router - no failover occur
> --------------------------------------------
>
> Key: CLOUDSTACK-4199
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4199
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Management Server
> Affects Versions: 4.2.0
> Environment: MS ACS 4.2 campo internal build 341
> host XS 6.2
> Reporter: angeline shen
> Assignee: Jayapal Reddy
> Priority: Critical
> Fix For: 4.2.0
>
> Attachments: FAULT_logs.tgz, logs.tgz, management-server.log.gz,
> management-server.log.gz, MASTER_logs.tgz, Screenshot-CloudPlatform™ -
> Mozilla Firefox-3.png, Screenshot-CloudPlatform™ - Mozilla Firefox-4.png
>
>
> 1. Create network offering 'egallowrvrnw1' with egress firewall policy:
>    allow, redundant router, in an advanced zone. Create a network with
>    this offering. Create guest VMs.
>    Verify ssh to VMs. VMs can ping other VMs in this network and reach
>    the internet.
> 2. RVR MASTER r-37-VM
>    RVR BACKUP r-38-VM
>    Stop r-37-VM.
>    Result: r-37-VM state becomes UNKNOWN
>            r-38-VM state becomes FAULT
>            No failover occurs.
>            Cannot ssh to existing VMs.
> 3. Start r-37-VM.
>    Result: r-37-VM state becomes MASTER
>            r-38-VM state remains FAULT
>            VMs can reach other VMs in the same network.
>            VMs cannot reach the internet.
> 4. Stop r-37-VM.
>    Result: r-37-VM state becomes UNKNOWN
>            r-38-VM state becomes FAULT
>            No failover occurs.
>            Cannot ssh to existing VMs.
> r.VirtualNetworkApplianceManagerImpl] (RouterStatusMonitor-1:null) Found 1
> networks to update RvR status.
> 2013-08-08 19:22:44,763 INFO
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RedundantRouterStatusMonitor-6:null) Redundant virtual router (name:
> r-37-VM, id: 37) just switch from MASTER to UNKNOWN
> 2013-08-08 19:22:44,768 DEBUG [agent.transport.Request]
> (RedundantRouterStatusMonitor-6:null) Seq 1-2062888873: Sending { Cmd ,
> MgmtId: 7343890761426, via: 1, Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.CheckRouterCommand":{"accessDetails":{"router.ip":"169.254.3.245","router.name":"r-38-VM"},"wait":30}}]
> }
> 2013-08-08 19:22:44,769 DEBUG [agent.transport.Request]
> (RedundantRouterStatusMonitor-6:null) Seq 1-2062888873: Executing: { Cmd ,
> MgmtId: 7343890761426, via: 1, Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.CheckRouterCommand":
> 2013-08-08 19:22:45,116 INFO
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RedundantRouterStatusMonitor-6:null) Redundant virtual router (name:
> r-38-VM, id: 38) just switch from BACKUP to FAULT
> 2013-08-08 19:22:45,344 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-270:null) Seq 1-2062888874: Response Received:
> 2013-08-08 19:22:45,345 DEBUG [agent.transport.Request]
> (DirectAgent-270:null) Seq 1-2062888874: Processing: { Ans: , MgmtId:
> 7343890761426, via: 1, Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.CheckRouterAnswer":{"state":"FAULT","isBumped":false,"result":true,"details":"Status: FAULT (RTNETLINK answers: No such process)&Bumped: NO","wait":0}}] }
> 2013-08-08 19:22:45,345 DEBUG [agent.transport.Request]
> (RedundantRouterStatusMonitor-6:null) Seq 1-2062888874: Received: { Ans: ,
> MgmtId: 7343890761426, via: 1, Ver: v1, Flags: 10, { CheckRouterAnswer } }
> 2013-08-08 19:22:45,345 DEBUG [agent.manager.AgentManagerImpl]
> (RedundantRouterStatusMonitor-6:null) Details from executing class
> com.cloud.agent.api.CheckRouterCommand: Status: FAULT (RTNETLINK answers: No
> such process)&Bumped: NO
> 2013-08-08 19:22:45,349 INFO
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RedundantRouterStatusMonitor-6:null) Redundant virtual router (name:
> r-38-VM, id: 38) just switch from BACKUP to FAULT
> 2013-08-08 19:22:46,781 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-13:null) Ping from 2
> 2013-08-08 19:22:47,125 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-12:null) Ping from 3
>