[ https://issues.apache.org/jira/browse/CLOUDSTACK-9385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15305346#comment-15305346 ]
dsclose commented on CLOUDSTACK-9385:
-------------------------------------
Hi [~weizhou], please can you elaborate on what the above code does and how it
is intended to be used?
As mentioned, the PR I submitted restarts the password server on any keepalived
transition in anticipation of CLOUDSTACK-9035. You indicate that we should
ignore CLOUDSTACK-9035 because it's a complicated issue. However, restarting
the password server after a keepalived transition seems like the simplest
solution available for CLOUDSTACK-9385.
Do you know of any reason why the password server should not be running on a
backup redundant router?
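
To make the proposal concrete, here is a minimal sketch of the idea, not the code from the PR. The function name `on_keepalived_transition` and the `RESTART_CMD` variable are hypothetical, and how keepalived would invoke such a hook is left out. The only command taken from this issue is `service cloud-passwd-srvr restart`.

```shell
#!/bin/sh
# Hypothetical hook: restart the password server on every keepalived state
# transition. RESTART_CMD is an assumption; override it for a dry run.
RESTART_CMD="${RESTART_CMD:-service cloud-passwd-srvr restart}"

on_keepalived_transition() {
    state="$1"   # state name supplied by the caller, e.g. MASTER or BACKUP
    case "$state" in
        MASTER|BACKUP|FAULT)
            echo "keepalived transition to $state: restarting password server"
            # Intentionally unquoted so the command and its arguments split.
            $RESTART_CMD
            ;;
        *)
            echo "unknown keepalived state: $state" >&2
            return 1
            ;;
    esac
}
```

Restarting on every recognised state, rather than only on promotion to master, is what keeps the sketch independent of how CLOUDSTACK-9035 is eventually resolved.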
> Password Server is not running on RvR
> -------------------------------------
>
> Key: CLOUDSTACK-9385
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9385
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public (Anyone can view this level - this is the default.)
> Components: ISO, SystemVM
> Affects Versions: 4.6.0, 4.6.1, 4.6.2, 4.7.0, 4.7.1, 4.8.0
> Reporter: dsclose
>
> NB: I have not tested this on VPC routers.
> The cloud-passwd-srvr service fails on redundant virtual routers. This
> appears to only concern redundant virtual routers. Standalone routers launch
> the password server successfully, as per this bash session:
> {code:title=Standalone Router}
> root@r-3775-VM:~# ps aux | grep passwd | grep -v grep
> root 2257 0.0 0.5 9244 1328 ? S 14:27 0:00 /bin/bash /opt/cloud/bin/passwd_server_ip 10.1.1.1 dummy
> root 2259 0.0 3.2 37276 8128 ? S 14:27 0:00 python /opt/cloud/bin/passwd_server_ip.py 10.1.1.1
> root@r-3775-VM:~# netstat -tnlp | grep 2259
> tcp 0 0 10.1.1.1:8080 0.0.0.0:* LISTEN 2259/python
> {code}
> However, redundant virtual routers do not exhibit this behaviour. Instead,
> the password server process is running without an IP argument. No matching
> process is bound to any ports:
> {code:title=Master Redundant Virtual Router}
> root@r-3776-VM:~# ps aux | grep passwd | grep -v grep
> root 5152 0.0 0.2 17684 1516 ? S 14:38 0:00 /bin/bash /opt/cloud/bin/passwd_server_ip None dummy
> root@r-3776-VM:~# netstat -ntlp | grep 5152
> root@r-3776-VM:~#
> {code}
> Further, an error message is being repeated in /var/log/messages:
> {code:title=/var/log/messages}
> May 24 14:53:07 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:11 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:14 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:17 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:20 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:23 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:26 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:29 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> {code}
> No process is bound to the password server port. Consequently, attempts to
> request a password from the password server get rejected.
> Manually restarting the cloud-passwd-srvr resolves this issue immediately:
> {code:title=Master Redundant Virtual Router}
> root@r-3776-VM:~# service cloud-passwd-srvr restart
> Killed password server (pid=4874)
> iptables: Bad rule (does a matching rule exist in that chain?).
> Removed cloud-passwd-srvr iptables rules
> Stopped password server (pid=5152)
> iptables: Bad rule (does a matching rule exist in that chain?).
> Removed cloud-passwd-srvr iptables rules
> Added cloud-passwd-srvr iptables rules
> root@r-3776-VM:~# nohup: appending output to `nohup.out'
> root@r-3776-VM:~# ps aux | grep passwd | grep -v grep
> root 15776 0.0 0.3 19436 1576 pts/1 S 15:05 0:00 /bin/bash /opt/cloud/bin/passwd_server_ip 10.1.1.250
> root 15780 0.2 1.6 45484 8304 pts/1 S 15:05 0:00 python /opt/cloud/bin/passwd_server_ip.py 10.1.1.250
> root 15781 0.0 0.3 19436 1572 pts/1 S 15:05 0:00 /bin/bash /opt/cloud/bin/passwd_server_ip 10.1.1.1
> root 15782 0.2 1.6 49692 8396 pts/1 S 15:05 0:00 python /opt/cloud/bin/passwd_server_ip.py 10.1.1.1
> root@r-3776-VM:~# netstat -ntlp | grep 15780
> tcp 0 0 10.1.1.250:8080 0.0.0.0:* LISTEN 15780/python
> {code}
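
The symptom in the quoted report (the wrapper process started with `None` in place of an IP address) could be detected with a small check like the one below. This is only a sketch: the function name `passwd_server_has_no_ip` is hypothetical, and it assumes the `ps aux` output has the same shape as the sessions quoted above.

```shell
#!/bin/sh
# Hypothetical check: reads ps output on stdin and succeeds (exit 0) if the
# password server wrapper was started with "None" instead of an IP address.
passwd_server_has_no_ip() {
    grep -q 'passwd_server_ip None'
}
```

On an affected router it might be used as `ps aux | passwd_server_has_no_ip && service cloud-passwd-srvr restart`, which applies the manual workaround from the report only when the symptom is present.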