Hi Sarbajit,

That output all looks fine, but it sounds as though the port mapping has failed, i.e. port 5060 on the Bono container hasn't been exposed as port 5060 on the host. I'm not sure why that would have failed.

Can you run nc -z -v 10.0.1.2 5060 inside the Bono container (this should work, but it's worth doing as a sanity check!).

Then can you run docker ps in your clearwater-docker checkout? The output should include a line that looks something like:

    0b4058027844  clearwaterdocker_bono  "/usr/bin/supervisord"  About a minute ago  Up About a minute  0.0.0.0:3478->3478/tcp, 0.0.0.0:3478->3478/udp, 0.0.0.0:5060->5060/tcp, 0.0.0.0:5062->5062/tcp, 0.0.0.0:5060->5060/udp, 5058/tcp, 0.0.0.0:42513->22/tcp  clearwaterdocker_bono_1

which should indicate that the port mapping is active.

If this all looks fine it might be worth also running nc -v -z 10.109.190.9 5060 on the host that's running the Bono container.
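Another quick way to check the mapping directly on the host that's running Bono is something like the following (a sketch only; the container name here is just an example and may differ in your deployment):

    # Show what host address/port (if anything) container port 5060 is published to:
    docker port clearwaterdocker_bono_1 5060

    # Or dump all of the container's port bindings:
    docker inspect --format '{{json .NetworkSettings.Ports}}' clearwaterdocker_bono_1

If docker port prints nothing for 5060, the mapping never got set up, which would explain the connection refused you're seeing.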
Thanks,
Graeme

________________________________
From: Clearwater [mailto:[email protected]] On Behalf Of Sarbajit Chatterjee
Sent: 21 September 2016 19:39
To: [email protected]
Subject: Re: [Project Clearwater] Deploy Clearwater in a Swarm cluster using docker-compose

Hi Graeme,

I'm using the following command to run the livetest:

    rake test[example.com] TESTS="Basic*" SIGNUP_CODE=secret PROXY=10.109.190.9 ELLIS=10.109.190.10

Here the PROXY IP is where the bono container is launched and the ELLIS IP is where the ellis container is launched.

The bono service seems to be running in the container:

    root@9837c4dab241:/# ps -eaf | grep bono
    root   122    1  0 Sep19 ?  00:00:10 /usr/share/clearwater/clearwater-cluster-manager/env/bin/python /usr/share/clearwater/bin/clearwater-cluster-manager --mgmt-local-ip=10.0.1.2 --sig-local-ip=10.0.1.2 --local-site=site1 --remote-site= --remote-cassandra-seeds= --signaling-namespace= --uuid=18c7daf3-a098-47ae-962f-a3d57c0cff6f --etcd-key=clearwater --etcd-cluster-key=bono --log-level=3 --log-directory=/var/log/clearwater-cluster-manager --pidfile=/var/run/clearwater-cluster-manager.pid
    root   124    1  0 Sep19 ?  00:00:00 /bin/bash /etc/init.d/bono run
    root   139  124  0 Sep19 ?  00:00:00 /bin/bash /usr/share/clearwater/bin/run-in-signaling-namespace start-stop-daemon --start --quiet --exec /usr/share/clearwater/bin/bono --chuid bono --chdir /etc/clearwater -- --domain=example.com --localhost=10.0.1.2,10.0.1.2 --alias=10.0.1.2 --pcscf=5060,5058 --webrtc-port=5062 --routing-proxy=scscf.sprout,5052,50,600 --ralf=ralf:10888 --sas=0.0.0.0,[email protected] --dns-server=127.0.0.11 --worker-threads=4 --analytics=/var/log/bono --log-file=/var/log/bono --log-level=2
    bono   140  139  0 Sep19 ?  00:11:15 /usr/share/clearwater/bin/bono --domain=example.com --localhost=10.0.1.2,10.0.1.2 --alias=10.0.1.2 --pcscf=5060,5058 --webrtc-port=5062 --routing-proxy=scscf.sprout,5052,50,600 --ralf=ralf:10888 --sas=0.0.0.0,[email protected] --dns-server=127.0.0.11 --worker-threads=4 --analytics=/var/log/bono --log-file=/var/log/bono --log-level=2
    root   322  293  0 17:48 ?  00:00:00 grep --color=auto bono
    root@9837c4dab241:/#

    root@9837c4dab241:/# netstat -planut | grep 5060
    tcp    0  0 10.0.1.2:5060   0.0.0.0:*   LISTEN   -
    udp    0  0 10.0.1.2:5060   0.0.0.0:*            -
    root@9837c4dab241:/#

But the connection to bono is failing from the livetest container, as you had predicted.
    root@40efba73deb5:~/clearwater-live-test# nc -v -z 10.109.190.9 5060
    nc: connect to 10.109.190.9 port 5060 (tcp) failed: Connection refused
    root@40efba73deb5:~/clearwater-live-test#

On checking the bono log, I see a series of errors like the ones below at the beginning of the log:

    19-09-2016 15:41:44.612 UTC Status utils.cpp:591: Log level set to 2
    19-09-2016 15:41:44.612 UTC Status main.cpp:1388: Access logging enabled to /var/log/bono
    19-09-2016 15:41:44.613 UTC Warning main.cpp:1435: SAS server option was invalid or not configured - SAS is disabled
    19-09-2016 15:41:44.613 UTC Warning main.cpp:1511: A registration expiry period should not be specified for P-CSCF
    19-09-2016 15:41:44.613 UTC Status snmp_agent.cpp:117: AgentX agent initialised
    19-09-2016 15:41:44.613 UTC Status load_monitor.cpp:105: Constructing LoadMonitor
    19-09-2016 15:41:44.613 UTC Status load_monitor.cpp:106: Target latency (usecs) : 100000
    19-09-2016 15:41:44.613 UTC Status load_monitor.cpp:107: Max bucket size : 1000
    19-09-2016 15:41:44.613 UTC Status load_monitor.cpp:108: Initial token fill rate/s: 100.000000
    19-09-2016 15:41:44.613 UTC Status load_monitor.cpp:109: Min token fill rate/s : 10.000000
    19-09-2016 15:41:44.613 UTC Status dnscachedresolver.cpp:144: Creating Cached Resolver using servers:
    19-09-2016 15:41:44.613 UTC Status dnscachedresolver.cpp:154: 127.0.0.11
    19-09-2016 15:41:44.613 UTC Status sipresolver.cpp:60: Created SIP resolver
    19-09-2016 15:41:44.637 UTC Status stack.cpp:419: Listening on port 5058
    19-09-2016 15:41:44.637 UTC Status stack.cpp:419: Listening on port 5060
    19-09-2016 15:41:44.638 UTC Status stack.cpp:855: Local host aliases:
    19-09-2016 15:41:44.638 UTC Status stack.cpp:862:  10.0.1.2
    19-09-2016 15:41:44.638 UTC Status stack.cpp:862:  172.18.0.2
    19-09-2016 15:41:44.638 UTC Status stack.cpp:862:  10.0.1.2
    19-09-2016 15:41:44.638 UTC Status stack.cpp:862:  10.0.1.2
    19-09-2016 15:41:44.638 UTC Status stack.cpp:862:
    19-09-2016 15:41:44.639 UTC Status httpresolver.cpp:52: Created HTTP resolver
    19-09-2016 15:41:44.641 UTC Status httpconnection.cpp:114: Configuring HTTP Connection
    19-09-2016 15:41:44.641 UTC Status httpconnection.cpp:115: Connection created for server ralf:10888
    19-09-2016 15:41:44.641 UTC Status httpconnection.cpp:116: Connection will use a response timeout of 500ms
    19-09-2016 15:41:44.642 UTC Status connection_pool.cpp:72: Creating connection pool to scscf.sprout:5052
    19-09-2016 15:41:44.642 UTC Status connection_pool.cpp:73: connections = 50, recycle time = 600 +/- 120 seconds
    19-09-2016 15:41:44.649 UTC Status bono.cpp:3314: Create list of PBXes
    19-09-2016 15:41:44.649 UTC Status pluginloader.cpp:63: Loading plug-ins from /usr/share/clearwater/sprout/plugins
    19-09-2016 15:41:44.649 UTC Status pluginloader.cpp:158: Finished loading plug-ins
    19-09-2016 15:41:44.652 UTC Warning (Net-SNMP): Warning: Failed to connect to the agentx master agent ([NIL]):
    19-09-2016 15:41:44.653 UTC Error pjsip: tcpc0x14f3df8 TCP connect() error: Connection refused [code=120111]
    19-09-2016 15:41:44.653 UTC Error pjsip: tcpc0x14f5c38 TCP connect() error: Connection refused [code=120111]

I have observed that in the bono container port 5060 is not listening on all interfaces, while port 5062 is listening on all interfaces.
    root@9837c4dab241:/var/log/bono# netstat -planut | grep LISTEN
    tcp    0  0 10.0.1.2:4000      0.0.0.0:*   LISTEN   -
    tcp    0  0 10.0.1.2:5058      0.0.0.0:*   LISTEN   -
    tcp    0  0 127.0.0.11:43395   0.0.0.0:*   LISTEN   -
    tcp    0  0 10.0.1.2:5060      0.0.0.0:*   LISTEN   -
    tcp    0  0 0.0.0.0:5062       0.0.0.0:*   LISTEN   -
    tcp    0  0 127.0.0.1:8080     0.0.0.0:*   LISTEN   -
    tcp    0  0 10.0.1.2:3478      0.0.0.0:*   LISTEN   -
    tcp    0  0 0.0.0.0:22         0.0.0.0:*   LISTEN   9/sshd
    tcp6   0  0 :::22              :::*        LISTEN   9/sshd
    root@9837c4dab241:/var/log/bono#

Is it right for port 5060 to be listening only on the local IP? Please help me to debug the issue.

Thanks,
Sarbajit

On Wed, Sep 21, 2016 at 10:01 PM, Graeme Robertson (projectclearwater.org) <[email protected]> wrote:

Hi Sarbajit,

I've had another look at this, and actually I think clearwater-live-test checks it can connect to Bono before it tries to provision numbers from Ellis, and it's actually that connection that's failing - apologies!

Can you do similar checks for the Bono container, i.e. connect to the Bono container and run ps -eaf | grep bono, and run nc -z -v <ip> 5060 from your live test container (where <ip> is the IP of your Bono)?

One other thought - what command are you using to run the tests? You'll need to set the PROXY option to your Bono IP and the ELLIS option to your Ellis IP.

Thanks,
Graeme

From: Clearwater [mailto:[email protected]] On Behalf Of Sarbajit Chatterjee
Sent: 21 September 2016 16:41
To: [email protected]
Subject: Re: [Project Clearwater] Deploy Clearwater in a Swarm cluster using docker-compose

Thanks Graeme for your reply. Here are the command outputs that you had asked for:

    root@e994b17b4563:/# ps -eaf | grep ellis
    root     177    1  0 Sep19 ?  00:00:10 /usr/share/clearwater/clearwater-cluster-manager/env/bin/python /usr/share/clearwater/bin/clearwater-cluster-manager --mgmt-local-ip=10.0.1.7 --sig-local-ip=10.0.1.7 --local-site=site1 --remote-site= --remote-cassandra-seeds= --signaling-namespace= --uuid=18c7daf3-a098-47ae-962f-a3d57c0cff6f --etcd-key=clearwater --etcd-cluster-key=ellis --log-level=3 --log-directory=/var/log/clearwater-cluster-manager --pidfile=/var/run/clearwater-cluster-manager.pid
    root     180    1  0 Sep19 ?  00:00:00 /bin/sh /etc/init.d/ellis run
    ellis    185  180  0 Sep19 ?  00:00:05 /usr/share/clearwater/ellis/env/bin/python -m metaswitch.ellis.main
    root     287  253  0 15:18 ?  00:00:00 grep --color=auto ellis
    root@e994b17b4563:/#

    root@e994b17b4563:/# ps -eaf | grep nginx
    root      179    1  0 Sep19 ?  00:00:00 nginx: master process /usr/sbin/nginx -g daemon off;
    www-data  186  179  0 Sep19 ?  00:00:16 nginx: worker process
    www-data  187  179  0 Sep19 ?  00:00:00 nginx: worker process
    www-data  188  179  0 Sep19 ?  00:00:16 nginx: worker process
    www-data  189  179  0 Sep19 ?  00:00:16 nginx: worker process
    root      289  253  0 15:19 ?  00:00:00 grep --color=auto nginx
    root@e994b17b4563:/#

    root@e994b17b4563:/# netstat -planut | grep nginx
    tcp6   0  0 :::80   :::*   LISTEN   179/nginx -g daemon
    root@e994b17b4563:/#

I think both ellis and nginx are running fine inside the container. I can also open the ellis login page from a web browser. I also checked the MySQL DB in the ellis container: I can see the livetest user entry in the users table and 1000 rows in the numbers table.
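In case anyone wants to reproduce that check, something like the following should work from inside the ellis container (a rough sketch; the database name and credentials here are assumptions and may differ in the ellis image, the table names are the ones mentioned above):

    # Count provisioned users and numbers (assumed DB name "ellis", assumed passwordless root login):
    mysql -u root ellis -e "SELECT COUNT(*) FROM users; SELECT COUNT(*) FROM numbers;"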
I can also connect to ellis (host IP 10.109.190.10) from my livetest container:

    root@40efba73deb5:~/clearwater-live-test# nc -v -z 10.109.190.10 80
    Connection to 10.109.190.10 80 port [tcp/http] succeeded!
    root@40efba73deb5:~/clearwater-live-test#

Is this happening because the Clearwater containers are spread across multiple hosts? What other areas should I check?

Thanks,
Sarbajit

On Wed, Sep 21, 2016 at 6:09 PM, Graeme Robertson (projectclearwater.org) <[email protected]> wrote:

Hi Sarbajit,

I don't think we've ever tried deploying Project Clearwater in a Docker Swarm cluster, but I don't see any reason why it couldn't work.

The tests are failing very early - they're not able to connect to Ellis on port 80. I can think of a couple of reasons for this: either Ellis isn't running or the Ellis port mapping hasn't worked for some reason.

Can you connect to the Ellis container and run ps -eaf | grep ellis and ps -eaf | grep nginx to confirm that NGINX and Ellis are running? Can you also run sudo netstat -planut | grep nginx or something equivalent to check that NGINX is listening on port 80? If there's a problem with either NGINX or Ellis we probably need to look in the logs at /var/log/nginx/ or /var/log/ellis/ on the Ellis container.

If however that all looks fine, then it sounds like the port mapping has failed for some reason. Can you run nc -z <ip> 80 from the box you're running the live tests on? This will scan for anything listening at <ip>:80 and will return successfully if it finds anything.

Thanks,
Graeme

________________________________
From: Clearwater [mailto:[email protected]] On Behalf Of Sarbajit Chatterjee
Sent: 20 September 2016 15:05
To: [email protected]
Subject: [Project Clearwater] Deploy Clearwater in a Swarm cluster using docker-compose

Hello,

I am following the instructions from https://github.com/Metaswitch/clearwater-docker. I can successfully deploy it on a single Docker node, but the compose file does not work with a Swarm cluster. I did try to modify the compose file like this:

    version: '2'
    services:
      etcd:
        image: quay.io/coreos/etcd:v2.2.5
        command: >
          -name etcd0
          -advertise-client-urls http://etcd:2379,http://etcd:4001
          -listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001
          -initial-advertise-peer-urls http://etcd:2380
          -listen-peer-urls http://0.0.0.0:2380
          -initial-cluster etcd0=http://etcd:2380
          -initial-cluster-state new
      bono:
        image: swarm-node:5000/clearwaterdocker_bono
        ports:
          - 22
          - "3478:3478"
          - "3478:3478/udp"
          - "5060:5060"
          - "5060:5060/udp"
          - "5062:5062"
      sprout:
        image: swarm-node:5000/clearwaterdocker_sprout
        networks:
          default:
            aliases:
              - scscf.sprout
              - icscf.sprout
        ports:
          - 22
      homestead:
        image: swarm-node:5000/clearwaterdocker_homestead
        ports:
          - 22
      homer:
        image: swarm-node:5000/clearwaterdocker_homer
        ports:
          - 22
      ralf:
        image: swarm-node:5000/clearwaterdocker_ralf
        ports:
          - 22
      ellis:
        image: swarm-node:5000/clearwaterdocker_ellis
        ports:
          - 22
          - "80:80"

where swarm-node:5000 is the local Docker registry that hosts the pre-built images of the Clearwater containers.

Even though the deployment succeeded, the clearwater-livetests are failing with the following error:

    Basic Registration (TCP) - Failed
      Errno::ECONNREFUSED thrown:
        - Connection refused - connect(2)
        - /usr/local/rvm/gems/ruby-1.9.3-p551/gems/quaff-0.7.3/lib/sources.rb:41:in `initialize'

Any suggestions on how I can deploy Clearwater on a Swarm cluster?
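One thing I was considering trying is pinning bono to a specific node with a classic (standalone) Swarm scheduling constraint, so that the published ports always end up on a known host. A rough sketch, assuming the constraint-as-environment-variable mechanism of standalone Swarm and a hypothetical node name:

      bono:
        image: swarm-node:5000/clearwaterdocker_bono
        environment:
          # Hypothetical node name - classic Swarm reads scheduling constraints
          # from environment variables of this form.
          - "constraint:node==swarm-node-1"
        ports:
          - 22
          - "3478:3478"
          - "3478:3478/udp"
          - "5060:5060"
          - "5060:5060/udp"
          - "5062:5062"

I would then point PROXY at that node's IP, but I'm not sure whether this is the recommended approach.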
Thanks,
Sarbajit
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org
