Hi Sarbajit,

That output all looks fine, but it sounds as though the port mapping has failed, i.e. port 5060 on the Bono container hasn't been exposed as port 5060 on the host. I'm not sure why that would have failed.

Can you run nc -z -v 10.0.1.2 5060 inside the Bono container (this should work, but it's worth doing as a sanity check!).

Then can you run docker ps in your clearwater-docker checkout? The output should include a line that looks something like:

    0b4058027844  clearwaterdocker_bono  "/usr/bin/supervisord"  About a minute ago  Up About a minute  0.0.0.0:3478->3478/tcp, 0.0.0.0:3478->3478/udp, 0.0.0.0:5060->5060/tcp, 0.0.0.0:5062->5062/tcp, 0.0.0.0:5060->5060/udp, 5058/tcp, 0.0.0.0:42513->22/tcp  clearwaterdocker_bono_1

which should indicate that the port mapping is active.

If this all looks fine it might be worth also running nc -v -z 10.109.190.9 5060 on the host that's running the Bono container.
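Another quick way to check the mapping directly on the host that's running Bono is something like the following (a sketch only; the container name here is just an example and may differ in your deployment):

    # Show what host address/port (if anything) container port 5060 is published to:
    docker port clearwaterdocker_bono_1 5060

    # Or dump all of the container's port bindings:
    docker inspect --format '{{json .NetworkSettings.Ports}}' clearwaterdocker_bono_1

If docker port prints nothing for 5060, the mapping never got set up, which would explain the connection refused you're seeing.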
Thanks,
Graeme

________________________________
From: Clearwater [mailto:[email protected]] On Behalf Of Sarbajit Chatterjee
Sent: 21 September 2016 19:39
To: [email protected]
Subject: Re: [Project Clearwater] Deploy Clearwater in a Swarm cluster using docker-compose

Hi Graeme,

I'm using the following command to run the livetest:

    rake test[example.com] TESTS="Basic*" SIGNUP_CODE=secret PROXY=10.109.190.9 ELLIS=10.109.190.10

Here the PROXY IP is where the bono container is launched and the ELLIS IP is where the ellis container is launched.

The bono service seems to be running in the container:

    root@9837c4dab241:/# ps -eaf | grep bono
    root   122    1  0 Sep19 ?  00:00:10 /usr/share/clearwater/clearwater-cluster-manager/env/bin/python /usr/share/clearwater/bin/clearwater-cluster-manager --mgmt-local-ip=10.0.1.2 --sig-local-ip=10.0.1.2 --local-site=site1 --remote-site= --remote-cassandra-seeds= --signaling-namespace= --uuid=18c7daf3-a098-47ae-962f-a3d57c0cff6f --etcd-key=clearwater --etcd-cluster-key=bono --log-level=3 --log-directory=/var/log/clearwater-cluster-manager --pidfile=/var/run/clearwater-cluster-manager.pid
    root   124    1  0 Sep19 ?  00:00:00 /bin/bash /etc/init.d/bono run
    root   139  124  0 Sep19 ?  00:00:00 /bin/bash /usr/share/clearwater/bin/run-in-signaling-namespace start-stop-daemon --start --quiet --exec /usr/share/clearwater/bin/bono --chuid bono --chdir /etc/clearwater -- --domain=example.com --localhost=10.0.1.2,10.0.1.2 --alias=10.0.1.2 --pcscf=5060,5058 --webrtc-port=5062 --routing-proxy=scscf.sprout,5052,50,600 --ralf=ralf:10888 --sas=0.0.0.0,[email protected] --dns-server=127.0.0.11 --worker-threads=4 --analytics=/var/log/bono --log-file=/var/log/bono --log-level=2
    bono   140  139  0 Sep19 ?  00:11:15 /usr/share/clearwater/bin/bono --domain=example.com --localhost=10.0.1.2,10.0.1.2 --alias=10.0.1.2 --pcscf=5060,5058 --webrtc-port=5062 --routing-proxy=scscf.sprout,5052,50,600 --ralf=ralf:10888 --sas=0.0.0.0,[email protected] --dns-server=127.0.0.11 --worker-threads=4 --analytics=/var/log/bono --log-file=/var/log/bono --log-level=2
    root   322  293  0 17:48 ?  00:00:00 grep --color=auto bono
    root@9837c4dab241:/#

    root@9837c4dab241:/# netstat -planut | grep 5060
    tcp    0  0 10.0.1.2:5060   0.0.0.0:*   LISTEN   -
    udp    0  0 10.0.1.2:5060   0.0.0.0:*            -
    root@9837c4dab241:/#

But the connection to bono is failing from the livetest container, as you had predicted.
    root@40efba73deb5:~/clearwater-live-test# nc -v -z 10.109.190.9 5060
    nc: connect to 10.109.190.9 port 5060 (tcp) failed: Connection refused
    root@40efba73deb5:~/clearwater-live-test#

On checking the bono log, I see a series of errors like the ones below at the beginning of the log:

    19-09-2016 15:41:44.612 UTC Status utils.cpp:591: Log level set to 2
    19-09-2016 15:41:44.612 UTC Status main.cpp:1388: Access logging enabled to /var/log/bono
    19-09-2016 15:41:44.613 UTC Warning main.cpp:1435: SAS server option was invalid or not configured - SAS is disabled
    19-09-2016 15:41:44.613 UTC Warning main.cpp:1511: A registration expiry period should not be specified for P-CSCF
    19-09-2016 15:41:44.613 UTC Status snmp_agent.cpp:117: AgentX agent initialised
    19-09-2016 15:41:44.613 UTC Status load_monitor.cpp:105: Constructing LoadMonitor
    19-09-2016 15:41:44.613 UTC Status load_monitor.cpp:106: Target latency (usecs) : 100000
    19-09-2016 15:41:44.613 UTC Status load_monitor.cpp:107: Max bucket size : 1000
    19-09-2016 15:41:44.613 UTC Status load_monitor.cpp:108: Initial token fill rate/s: 100.000000
    19-09-2016 15:41:44.613 UTC Status load_monitor.cpp:109: Min token fill rate/s : 10.000000
    19-09-2016 15:41:44.613 UTC Status dnscachedresolver.cpp:144: Creating Cached Resolver using servers:
    19-09-2016 15:41:44.613 UTC Status dnscachedresolver.cpp:154: 127.0.0.11
    19-09-2016 15:41:44.613 UTC Status sipresolver.cpp:60: Created SIP resolver
    19-09-2016 15:41:44.637 UTC Status stack.cpp:419: Listening on port 5058
    19-09-2016 15:41:44.637 UTC Status stack.cpp:419: Listening on port 5060
    19-09-2016 15:41:44.638 UTC Status stack.cpp:855: Local host aliases:
    19-09-2016 15:41:44.638 UTC Status stack.cpp:862:  10.0.1.2
    19-09-2016 15:41:44.638 UTC Status stack.cpp:862:  172.18.0.2
    19-09-2016 15:41:44.638 UTC Status stack.cpp:862:  10.0.1.2
    19-09-2016 15:41:44.638 UTC Status stack.cpp:862:  10.0.1.2
    19-09-2016 15:41:44.638 UTC Status stack.cpp:862:
    19-09-2016 15:41:44.639 UTC Status httpresolver.cpp:52: Created HTTP resolver
    19-09-2016 15:41:44.641 UTC Status httpconnection.cpp:114: Configuring HTTP Connection
    19-09-2016 15:41:44.641 UTC Status httpconnection.cpp:115: Connection created for server ralf:10888
    19-09-2016 15:41:44.641 UTC Status httpconnection.cpp:116: Connection will use a response timeout of 500ms
    19-09-2016 15:41:44.642 UTC Status connection_pool.cpp:72: Creating connection pool to scscf.sprout:5052
    19-09-2016 15:41:44.642 UTC Status connection_pool.cpp:73: connections = 50, recycle time = 600 +/- 120 seconds
    19-09-2016 15:41:44.649 UTC Status bono.cpp:3314: Create list of PBXes
    19-09-2016 15:41:44.649 UTC Status pluginloader.cpp:63: Loading plug-ins from /usr/share/clearwater/sprout/plugins
    19-09-2016 15:41:44.649 UTC Status pluginloader.cpp:158: Finished loading plug-ins
    19-09-2016 15:41:44.652 UTC Warning (Net-SNMP): Warning: Failed to connect to the agentx master agent ([NIL]):
    19-09-2016 15:41:44.653 UTC Error pjsip: tcpc0x14f3df8 TCP connect() error: Connection refused [code=120111]
    19-09-2016 15:41:44.653 UTC Error pjsip: tcpc0x14f5c38 TCP connect() error: Connection refused [code=120111]

I have observed that in the bono container port 5060 is not listening on all interfaces, while port 5062 is listening on all interfaces.
    root@9837c4dab241:/var/log/bono# netstat -planut | grep LISTEN
    tcp    0  0 10.0.1.2:4000      0.0.0.0:*   LISTEN   -
    tcp    0  0 10.0.1.2:5058      0.0.0.0:*   LISTEN   -
    tcp    0  0 127.0.0.11:43395   0.0.0.0:*   LISTEN   -
    tcp    0  0 10.0.1.2:5060      0.0.0.0:*   LISTEN   -
    tcp    0  0 0.0.0.0:5062       0.0.0.0:*   LISTEN   -
    tcp    0  0 127.0.0.1:8080     0.0.0.0:*   LISTEN   -
    tcp    0  0 10.0.1.2:3478      0.0.0.0:*   LISTEN   -
    tcp    0  0 0.0.0.0:22         0.0.0.0:*   LISTEN   9/sshd
    tcp6   0  0 :::22              :::*        LISTEN   9/sshd
    root@9837c4dab241:/var/log/bono#

Is it right for port 5060 to be listening only on the local IP? Please help me to debug the issue.

Thanks,
Sarbajit

On Wed, Sep 21, 2016 at 10:01 PM, Graeme Robertson (projectclearwater.org) <[email protected]> wrote:

Hi Sarbajit,

I've had another look at this, and actually I think clearwater-live-test checks it can connect to Bono before it tries to provision numbers from Ellis, and it's actually that connection that's failing - apologies!

Can you do similar checks for the Bono container, i.e. connect to the Bono container and run ps -eaf | grep bono, and run nc -z -v <ip> 5060 from your live test container (where <ip> is the IP of your Bono)?

One other thought - what command are you using to run the tests? You'll need to set the PROXY option to your Bono IP and the ELLIS option to your Ellis IP.

Thanks,
Graeme

From: Clearwater [mailto:[email protected]] On Behalf Of Sarbajit Chatterjee
Sent: 21 September 2016 16:41
To: [email protected]
Subject: Re: [Project Clearwater] Deploy Clearwater in a Swarm cluster using docker-compose

Thanks Graeme for your reply. Here are the command outputs that you had asked for:

    root@e994b17b4563:/# ps -eaf | grep ellis
    root     177    1  0 Sep19 ?  00:00:10 /usr/share/clearwater/clearwater-cluster-manager/env/bin/python /usr/share/clearwater/bin/clearwater-cluster-manager --mgmt-local-ip=10.0.1.7 --sig-local-ip=10.0.1.7 --local-site=site1 --remote-site= --remote-cassandra-seeds= --signaling-namespace= --uuid=18c7daf3-a098-47ae-962f-a3d57c0cff6f --etcd-key=clearwater --etcd-cluster-key=ellis --log-level=3 --log-directory=/var/log/clearwater-cluster-manager --pidfile=/var/run/clearwater-cluster-manager.pid
    root     180    1  0 Sep19 ?  00:00:00 /bin/sh /etc/init.d/ellis run
    ellis    185  180  0 Sep19 ?  00:00:05 /usr/share/clearwater/ellis/env/bin/python -m metaswitch.ellis.main
    root     287  253  0 15:18 ?  00:00:00 grep --color=auto ellis
    root@e994b17b4563:/#

    root@e994b17b4563:/# ps -eaf | grep nginx
    root      179    1  0 Sep19 ?  00:00:00 nginx: master process /usr/sbin/nginx -g daemon off;
    www-data  186  179  0 Sep19 ?  00:00:16 nginx: worker process
    www-data  187  179  0 Sep19 ?  00:00:00 nginx: worker process
    www-data  188  179  0 Sep19 ?  00:00:16 nginx: worker process
    www-data  189  179  0 Sep19 ?  00:00:16 nginx: worker process
    root      289  253  0 15:19 ?  00:00:00 grep --color=auto nginx
    root@e994b17b4563:/#

    root@e994b17b4563:/# netstat -planut | grep nginx
    tcp6   0  0 :::80   :::*   LISTEN   179/nginx -g daemon
    root@e994b17b4563:/#

I think both ellis and nginx are running fine inside the container. I can also open the ellis login page from a web browser. I also checked the MySQL DB in the ellis container: I can see the livetest user entry in the users table and 1000 rows in the numbers table.
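In case anyone wants to reproduce that check, something like the following should work from inside the ellis container (a rough sketch; the database name and credentials here are assumptions and may differ in the ellis image, the table names are the ones mentioned above):

    # Count provisioned users and numbers (assumed DB name "ellis", assumed passwordless root login):
    mysql -u root ellis -e "SELECT COUNT(*) FROM users; SELECT COUNT(*) FROM numbers;"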
I can also connect to ellis (host IP 10.109.190.10) from my livetest container:

    root@40efba73deb5:~/clearwater-live-test# nc -v -z 10.109.190.10 80
    Connection to 10.109.190.10 80 port [tcp/http] succeeded!
    root@40efba73deb5:~/clearwater-live-test#

Is this happening because the Clearwater containers are spread across multiple hosts? What other areas should I check?

Thanks,
Sarbajit

On Wed, Sep 21, 2016 at 6:09 PM, Graeme Robertson (projectclearwater.org) <[email protected]> wrote:

Hi Sarbajit,

I don't think we've ever tried deploying Project Clearwater in a Docker Swarm cluster, but I don't see any reason why it couldn't work.

The tests are failing very early - they're not able to connect to Ellis on port 80. I can think of a couple of reasons for this: either Ellis isn't running or the Ellis port mapping hasn't worked for some reason.

Can you connect to the Ellis container and run ps -eaf | grep ellis and ps -eaf | grep nginx to confirm that NGINX and Ellis are running? Can you also run sudo netstat -planut | grep nginx or something equivalent to check that NGINX is listening on port 80? If there's a problem with either NGINX or Ellis we probably need to look in the logs at /var/log/nginx/ or /var/log/ellis/ on the Ellis container.

If however that all looks fine, then it sounds like the port mapping has failed for some reason. Can you run nc -z <ip> 80 from the box you're running the live tests on? This will scan for anything listening at <ip>:80 and will return successfully if it finds anything.

Thanks,
Graeme

________________________________
From: Clearwater [mailto:[email protected]] On Behalf Of Sarbajit Chatterjee
Sent: 20 September 2016 15:05
To: [email protected]
Subject: [Project Clearwater] Deploy Clearwater in a Swarm cluster using docker-compose

Hello,

I am following the instructions from https://github.com/Metaswitch/clearwater-docker. I can successfully deploy it on a single Docker node, but the compose file does not work with a Swarm cluster. I did try to modify the compose file like this:

    version: '2'
    services:
      etcd:
        image: quay.io/coreos/etcd:v2.2.5
        command: >
          -name etcd0
          -advertise-client-urls http://etcd:2379,http://etcd:4001
          -listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001
          -initial-advertise-peer-urls http://etcd:2380
          -listen-peer-urls http://0.0.0.0:2380
          -initial-cluster etcd0=http://etcd:2380
          -initial-cluster-state new
      bono:
        image: swarm-node:5000/clearwaterdocker_bono
        ports:
          - 22
          - "3478:3478"
          - "3478:3478/udp"
          - "5060:5060"
          - "5060:5060/udp"
          - "5062:5062"
      sprout:
        image: swarm-node:5000/clearwaterdocker_sprout
        networks:
          default:
            aliases:
              - scscf.sprout
              - icscf.sprout
        ports:
          - 22
      homestead:
        image: swarm-node:5000/clearwaterdocker_homestead
        ports:
          - 22
      homer:
        image: swarm-node:5000/clearwaterdocker_homer
        ports:
          - 22
      ralf:
        image: swarm-node:5000/clearwaterdocker_ralf
        ports:
          - 22
      ellis:
        image: swarm-node:5000/clearwaterdocker_ellis
        ports:
          - 22
          - "80:80"

where swarm-node:5000 is the local Docker registry that hosts the pre-built images of the Clearwater containers.

Even though the deployment succeeded, the clearwater-livetests are failing with the following error:

    Basic Registration (TCP) - Failed
      Errno::ECONNREFUSED thrown:
        - Connection refused - connect(2)
        - /usr/local/rvm/gems/ruby-1.9.3-p551/gems/quaff-0.7.3/lib/sources.rb:41:in `initialize'

Any suggestions on how I can deploy Clearwater on a Swarm cluster?
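One thing I was considering trying is pinning bono to a specific node with a classic (standalone) Swarm scheduling constraint, so that the published ports always end up on a known host. A rough sketch, assuming the constraint-as-environment-variable mechanism of standalone Swarm and a hypothetical node name:

      bono:
        image: swarm-node:5000/clearwaterdocker_bono
        environment:
          # Hypothetical node name - classic Swarm reads scheduling constraints
          # from environment variables of this form.
          - "constraint:node==swarm-node-1"
        ports:
          - 22
          - "3478:3478"
          - "3478:3478/udp"
          - "5060:5060"
          - "5060:5060/udp"
          - "5062:5062"

I would then point PROXY at that node's IP, but I'm not sure whether this is the recommended approach.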
Thanks,
Sarbajit
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org
