[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941531#comment-15941531
 ] 

Patrick commented on CLOUDSTACK-9385:
-------------------------------------

I just upgraded from 4.5.2 to 4.9.2 on Xen and am affected by the same issue.
A few additional clarifications:
It happened when the RvRs were recreated to apply the new SVM template. When
the RvRs are rebooted to install the new SVM version, the pair always ends up
with both routers in BACKUP state, whether I do a VR reboot, a network restart
with cleanup, a stop/start, etc.
To fix it, I had to find which of the two VRs was logging the error "Password
server failed with error code 1. Restarting it...", restart the password
service on it, and then restart the VR, after which it would gain its MASTER
state. From this point forward, the role switch between the two VRs goes
smoothly, until either VR is recreated. This is pretty ugly, so I'm switching
my RvRs to standalone to avoid this issue.
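
For anyone else hitting this, here is a rough sketch of the check I mean for
finding the affected VR. It only looks for the error message quoted above in
the last lines of the log. The function names, the LOG_FILE variable and the
50-line window are my own choices for illustration and are not part of
CloudStack.

```shell
#!/bin/bash
# Sketch only: report whether this VR shows the failure message from
# CLOUDSTACK-9385. LOG_FILE, FAIL_MSG and both function names are
# illustrative assumptions, not anything shipped with CloudStack.

LOG_FILE="${LOG_FILE:-/var/log/messages}"
FAIL_MSG="Password server failed with error code 1"

# Succeeds (exit 0) if the failure message appears in the last 50 log lines.
# A missing or unreadable log counts as "not failing".
passwd_server_failing() {
    local log="${1:-$LOG_FILE}"
    [ -r "$log" ] && tail -n 50 "$log" | grep -qF "$FAIL_MSG"
}

# Prints "failing" or "ok" for the given log file (default: $LOG_FILE).
check_passwd_server() {
    if passwd_server_failing "$1"; then
        echo "failing"
    else
        echo "ok"
    fi
}
```

Source the file and run `check_passwd_server` on each VR of the pair. On the one
that prints "failing", `service cloud-passwd-srvr restart` followed by a VR
restart is what worked for me, as described above.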

> Password Server is not running on RvR
> -------------------------------------
>
>                 Key: CLOUDSTACK-9385
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9385
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.)
>          Components: ISO, SystemVM
>    Affects Versions: 4.6.0, 4.6.1, 4.6.2, 4.7.0, 4.7.1, 4.8.0
>            Reporter: dsclose
>
> NB: I have not tested this on VPC routers.
> The cloud-passwd-srvr service fails on redundant virtual routers. This 
> appears to only concern redundant virtual routers. Standalone routers launch 
> the password server successfully, as per this bash session:
> {code:title=Standalone Router}
> root@r-3775-VM:~# ps aux | grep passwd | grep -v grep
> root      2257  0.0  0.5   9244  1328 ?        S    14:27   0:00 /bin/bash /opt/cloud/bin/passwd_server_ip 10.1.1.1 dummy
> root      2259  0.0  3.2  37276  8128 ?        S    14:27   0:00 python /opt/cloud/bin/passwd_server_ip.py 10.1.1.1
> root@r-3775-VM:~# netstat -tnlp | grep 2259
> tcp        0      0 10.1.1.1:8080           0.0.0.0:*               LISTEN      2259/python
> {code}
> However, redundant virtual routers do not exhibit this behaviour. Instead, 
> the password server wrapper script is started with "None" in place of the IP 
> argument, and no matching process is bound to any port:
> {code:title=Master Redundant Virtual Router}
> root@r-3776-VM:~# ps aux | grep passwd | grep -v grep
> root      5152  0.0  0.2  17684  1516 ?        S    14:38   0:00 /bin/bash /opt/cloud/bin/passwd_server_ip None dummy
> root@r-3776-VM:~# netstat -ntlp | grep 5152
> root@r-3776-VM:~#
> {code}
> Further, an error message is being repeated in /var/log/messages:
> {code:title=/var/log/messages}
> May 24 14:53:07 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:11 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:14 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:17 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:20 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:23 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:26 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:29 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> {code}
> No process is bound to the password server port. Consequently, attempts to 
> request a password from the password server get rejected.
> Manually restarting the cloud-passwd-srvr resolves this issue immediately:
> {code:title=Master Redundant Virtual Router}
> root@r-3776-VM:~# service cloud-passwd-srvr restart
> Killed password server (pid=4874)
> iptables: Bad rule (does a matching rule exist in that chain?).
> Removed cloud-passwd-srvr iptables rules
> Stopped password server (pid=5152)
> iptables: Bad rule (does a matching rule exist in that chain?).
> Removed cloud-passwd-srvr iptables rules
> Added cloud-passwd-srvr iptables rules
> root@r-3776-VM:~# nohup: appending output to `nohup.out'
> root@r-3776-VM:~# ps aux | grep passwd | grep -v grep
> root     15776  0.0  0.3  19436  1576 pts/1    S    15:05   0:00 /bin/bash /opt/cloud/bin/passwd_server_ip 10.1.1.250
> root     15780  0.2  1.6  45484  8304 pts/1    S    15:05   0:00 python /opt/cloud/bin/passwd_server_ip.py 10.1.1.250
> root     15781  0.0  0.3  19436  1572 pts/1    S    15:05   0:00 /bin/bash /opt/cloud/bin/passwd_server_ip 10.1.1.1
> root     15782  0.2  1.6  49692  8396 pts/1    S    15:05   0:00 python /opt/cloud/bin/passwd_server_ip.py 10.1.1.1
> root@r-3776-VM:~# netstat -ntlp | grep 15780
> tcp        0      0 10.1.1.250:8080         0.0.0.0:*               LISTEN      15780/python
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)