[ https://issues.apache.org/jira/browse/CLOUDSTACK-9385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15305344#comment-15305344 ]
ASF GitHub Bot commented on CLOUDSTACK-9385:
--------------------------------------------

Github user dsclose commented on the pull request:

    https://github.com/apache/cloudstack/pull/1568#issuecomment-222308460

    **NB**: This PR should be considered controversial because it restarts the password server after each keepalived transition, rather than stopping the service on transitions to backup or fault states. This decision was taken in anticipation of CLOUDSTACK-9035, which suggests that the password server should be running on backup redundant routers. However, this may be ill-advised. I'm happy to incorporate any feedback into this PR.

> Password Server is not running on RvR
> -------------------------------------
>
>                 Key: CLOUDSTACK-9385
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9385
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.)
>          Components: ISO, SystemVM
>    Affects Versions: 4.6.0, 4.6.1, 4.6.2, 4.7.0, 4.7.1, 4.8.0
>            Reporter: dsclose
>
> NB: I have not tested this on VPC routers.
> The cloud-passwd-srvr service fails on redundant virtual routers. This
> appears to only concern redundant virtual routers. Standalone routers launch
> the password server successfully, as per this bash session:
> {code:title=Standalone Router}
> root@r-3775-VM:~# ps aux | grep passwd | grep -v grep
> root 2257 0.0 0.5 9244 1328 ? S 14:27 0:00 /bin/bash /opt/cloud/bin/passwd_server_ip 10.1.1.1 dummy
> root 2259 0.0 3.2 37276 8128 ? S 14:27 0:00 python /opt/cloud/bin/passwd_server_ip.py 10.1.1.1
> root@r-3775-VM:~# netstat -tnlp | grep 2259
> tcp 0 0 10.1.1.1:8080 0.0.0.0:* LISTEN 2259/python
> {code}
> However, redundant virtual routers do not exhibit this behaviour. Instead,
> the password server process is running without an IP argument. No matching
> process is bound to any ports:
> {code:title=Master Redundant Virtual Router}
> root@r-3776-VM:~# ps aux | grep passwd | grep -v grep
> root 5152 0.0 0.2 17684 1516 ? S 14:38 0:00 /bin/bash /opt/cloud/bin/passwd_server_ip None dummy
> root@r-3776-VM:~# netstat -ntlp | grep 5152
> root@r-3776-VM:~#
> {code}
> Further, an error message is being repeated in /var/log/messages:
> {code:title=/var/log/messages}
> May 24 14:53:07 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:11 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:14 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:17 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:20 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:23 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:26 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:29 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> {code}
> No process is bound to the password server port. Consequently, attempts to
> request a password from the password server get rejected.
> Manually restarting the cloud-passwd-srvr resolves this issue immediately:
> {code:title=Master Redundant Virtual Router}
> root@r-3776-VM:~# service cloud-passwd-srvr restart
> Killed password server (pid=4874)
> iptables: Bad rule (does a matching rule exist in that chain?).
> Removed cloud-passwd-srvr iptables rules
> Stopped password server (pid=5152)
> iptables: Bad rule (does a matching rule exist in that chain?).
> Removed cloud-passwd-srvr iptables rules
> Added cloud-passwd-srvr iptables rules
> root@r-3776-VM:~# nohup: appending output to `nohup.out'
> root@r-3776-VM:~# ps aux | grep passwd | grep -v grep
> root 15776 0.0 0.3 19436 1576 pts/1 S 15:05 0:00 /bin/bash /opt/cloud/bin/passwd_server_ip 10.1.1.250
> root 15780 0.2 1.6 45484 8304 pts/1 S 15:05 0:00 python /opt/cloud/bin/passwd_server_ip.py 10.1.1.250
> root 15781 0.0 0.3 19436 1572 pts/1 S 15:05 0:00 /bin/bash /opt/cloud/bin/passwd_server_ip 10.1.1.1
> root 15782 0.2 1.6 49692 8396 pts/1 S 15:05 0:00 python /opt/cloud/bin/passwd_server_ip.py 10.1.1.1
> root@r-3776-VM:~# netstat -ntlp | grep 15780
> tcp 0 0 10.1.1.250:8080 0.0.0.0:* LISTEN 15780/python
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)