Names resolve correctly.

I made two test with 50 calls. One with single deployment (one copy for
each component) and one with double deployment (two copies).

Hope they can help... I'm giving them a deep look right now.

https://app.box.com/s/irvmyhphgoqwm3avvd9m

https://app.box.com/s/mjxf73f1kmbc2ntxqi63

2014-11-28 20:52 GMT+01:00 Eleanor Merry <[email protected]>:

>  It looks like there are connection issues between the Sprouts and
> Homesteads. Can you check that the Sprout can resolve the Homestead
> address? The clearwater processes look in /etc/resolv.conf for DNS
> configuration. If your nodes don’t use localhost (try running nslookup on a
> hostname and see if the Server value is 127.0.0.1) you’ll need to update
> the dnsmasq configuration to point to this server rather than localhost.
> You can find details of how to do this at:
> https://github.com/Metaswitch/clearwater-docs/wiki/Clearwater-DNS-Usage#client-configuration
>
>
>
> Can you also please send me the debug logs for this call from Sprout and
> Homestead. You can turn on debug logging by creating/editing the file
> /etc/clearwater/user_settings, adding log_level=5 and then restarting the
> component (service sprout <component> - it's automatically restarted by
> monit). The logs are output in /var/log/<component>/*. You won’t want to
> have stress running though the node with debug logs on though – can you
> either try making a standard call on your deployment, or set count= in
> /etc/clearwater/config on your stress node to a low number.
>
>  The reported Cassandra errors are benign. They are caused by the
> Cassandra process not having quite started up when the Homestead process
> started. This meant Homestead couldn’t initialize its cache. After a couple
> of seconds, once Cassandra is up, Homestead can successfully initialize its
> cache and so starts up properly.
>
>
>
> Ellie
>
>
>
> *From:* Antonio Ken Iannillo [mailto:[email protected]]
> *Sent:* 28 November 2014 16:13
> *To:* [email protected]
> *Cc:* Eleanor Merry
>
> *Subject:* Re: [Clearwater] Connection Timeout on Large Scale Deployment
>
>
>
> I also found this error in sprout log
>
>
> 28-11-2014 16:04:07.905 UTC Error httpconnection.cpp:569:
> http://hs.cini-clearwater.com:8888/impu/sip%3A2010000140%40cini-clearwater.com/reg-data
> failed at server 192.168.1.113 : HTTP response code said error (22 400) :
> fatal
> 28-11-2014 16:04:07.906 UTC Error httpconnection.cpp:667: cURL failure
> with cURL error code 22 (see man 3 libcurl-errors) and HTTP error code 400
> 28-11-2014 16:04:07.906 UTC Error hssconnection.cpp:565: Could not get
> subscriber data from HSS
> 28-11-2014 16:04:08.128 UTC Error httpconnection.cpp:569:
> http://hs.cini-clearwater.com:8888/impu/sip%3A2010000030%40cini-clearwater.com/reg-data
> failed at server 192.168.1.113 : HTTP response code said error (22 400) :
> fatal
> 28-11-2014 16:04:08.128 UTC Error httpconnection.cpp:667: cURL failure
> with cURL error code 22 (see man 3 libcurl-errors) and HTTP error code 400
> 28-11-2014 16:04:08.128 UTC Error hssconnection.cpp:565: Could not get
> subscriber data from HSS
>
>
>
>
>
> 2014-11-28 16:21 GMT+01:00 Antonio Ken Iannillo <
> [email protected]>:
>
>
>
> ---------- Forwarded message ----------
> From: *Antonio Ken Iannillo* <[email protected]>
> Date: 2014-11-28 16:19 GMT+01:00
> Subject: Re: [Clearwater] Connection Timeout on Large Scale Deployment
> To: Eleanor Merry <[email protected]>
>
>  I solved that problem... I think internally clearwater nodes talks in
> some way directly with dnsmasq to resolve the name. I used a different
> approach and when i configured back dnsmasq all worked again... but i get
> another strange unexpected message.
>
>
>
>
> '.
> 2014-11-28      15:54:26:984    1417186466.984707: Aborting call on
> unexpected message for Call-Id '[email protected]': while expecting
> 'INVITE' (index 16), received 'SIP/2.0 480 Temporarily Unavailable
> Via: SIP/2.0/TCP 192.168.1.107:39003
> ;rport=39003;received=192.168.1.107;branch=z9hG4bK-2010000000-100-1.000000-1
> Record-Route: <
> sip:sprout.cini-clearwater.com:5054;transport=TCP;lr;service=scscf;billing-role=charge-term
> >
> Record-Route: <
> sip:sprout.cini-clearwater.com:5054;transport=TCP;lr;service=scscf;billing-role=charge-orig
> >
> Record-Route: <sip:192.168.1.121:5058;transport=TCP;lr>
> Record-Route: <sip:[email protected]:5060;transport=TCP;lr>
> Call-ID: 2010000000-1.000000///[email protected]
> From: <sip:[email protected]>;tag=1787SIPpTag001001234
> To: <sip:[email protected]
> >;tag=z9hG4bKPjdoa02taVAMk8fNAOlz8drD7MrbjflAFY
> CSeq: 5 INVITE
> Content-Length:  0
> '.
>
>
>
> I already followed the guide for troubleshooting clearwater and the only
> thing i found 'till now is in the homestead log during the start up.
>
>
>
> the error messages
>
> 28-11-2014 15:13:36.580 UTC Error cassandra_store.cpp:170: Cache caught
> TTransportException: connect() failed: Connection refused
> 28-11-2014 15:13:36.580 UTC Error main.cpp:472: Failed to initialize cache
> - rc 3
> 28-11-2014 15:13:38.042 UTC Error cassandra_store.cpp:170: Cache caught
> TTransportException: connect() failed: Connection refused
> 28-11-2014 15:13:38.042 UTC Error main.cpp:472: Failed to initialize cache
> - rc 3
>
>
>
> the full log is
>
> 28-11-2014 15:13:36.572 UTC Status main.cpp:418: Log level set to 2
> 28-11-2014 15:13:36.579 UTC Status main.cpp:438: Access logging enabled to
> /var/log/homestead
> 28-11-2014 15:13:36.579 UTC Status dnscachedresolver.cpp:90: Creating
> Cached Resolver using server 127.0.0.1
> 28-11-2014 15:13:36.579 UTC Status httpresolver.cpp:50: Created HTTP
> resolver
> 28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:143: Configuring
> store
> 28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:144:   Hostname:
>  localhost
> 28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:145:   Port:
>  9160
> 28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:146:   Threads:   10
> 28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:147:   Max Queue: 0
> 28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:162: Starting store
> 28-11-2014 15:13:36.580 UTC Error cassandra_store.cpp:170: Cache caught
> TTransportException: connect() failed: Connection refused
> 28-11-2014 15:13:36.580 UTC Error main.cpp:472: Failed to initialize cache
> - rc 3
> 28-11-2014 15:13:36.580 UTC Status cassandra_store.cpp:204: Stopping cache
> 28-11-2014 15:13:36.580 UTC Status cassandra_store.cpp:214: Waiting for
> cache to stop
> 28-11-2014 15:13:38.037 UTC Status main.cpp:418: Log level set to 2
> 28-11-2014 15:13:38.037 UTC Status main.cpp:438: Access logging enabled to
> /var/log/homestead
> 28-11-2014 15:13:38.038 UTC Status dnscachedresolver.cpp:90: Creating
> Cached Resolver using server 127.0.0.1
> 28-11-2014 15:13:38.038 UTC Status httpresolver.cpp:50: Created HTTP
> resolver
> 28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:143: Configuring
> store
> 28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:144:   Hostname:
>  localhost
> 28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:145:   Port:
>  9160
> 28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:146:   Threads:   10
> 28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:147:   Max Queue: 0
> 28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:162: Starting store
> 28-11-2014 15:13:38.042 UTC Error cassandra_store.cpp:170: Cache caught
> TTransportException: connect() failed: Connection refused
> 28-11-2014 15:13:38.042 UTC Error main.cpp:472: Failed to initialize cache
> - rc 3
> 28-11-2014 15:13:38.042 UTC Status cassandra_store.cpp:204: Stopping cache
> 28-11-2014 15:13:38.042 UTC Status cassandra_store.cpp:214: Waiting for
> cache to stop
> 28-11-2014 15:13:48.070 UTC Status main.cpp:418: Log level set to 2
> 28-11-2014 15:13:48.070 UTC Status main.cpp:438: Access logging enabled to
> /var/log/homestead
> 28-11-2014 15:13:48.071 UTC Status dnscachedresolver.cpp:90: Creating
> Cached Resolver using server 127.0.0.1
> 28-11-2014 15:13:48.071 UTC Status httpresolver.cpp:50: Created HTTP
> resolver
> 28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:143: Configuring
> store
> 28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:144:   Hostname:
>  localhost
> 28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:145:   Port:
>  9160
> 28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:146:   Threads:   10
> 28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:147:   Max Queue: 0
> 28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:162: Starting store
> 28-11-2014 15:13:48.073 UTC Status diameterstack.cpp:63: Initializing
> Diameter stack
> 28-11-2014 15:13:48.073 UTC Status diameterstack.cpp:307: Configuring
> Diameter stack from file /var/lib/homestead/homestead.conf
> 28-11-2014 15:13:48.280 UTC Status freeDiameter: All extensions loaded.
> 28-11-2014 15:13:48.280 UTC Status freeDiameter: freeDiameter
> configuration:
> 28-11-2014 15:13:48.280 UTC Status freeDiameter:   Default trace level
> .... : +3
> 28-11-2014 15:13:48.280 UTC Status freeDiameter:   Configuration file
> ..... : /var/lib/homestead/homestead.conf
> 28-11-2014 15:13:48.280 UTC Status freeDiameter:   Diameter Identity
> ...... : 192.168.1.113 (l:13)
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:   Diameter Realm
> ......... : hs.cini-clearwater.com (l:22)
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:   Tc Timer
> ............... : 30
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:   Tw Timer
> ............... : 30
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:   Local port
> ............. : 3868
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:   Local secure port
> ...... : 5658
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:   Number of SCTP streams
> . : 3
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:   Number of clients thr
> .. : 5
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:   Number of app threads
> .. : 4
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:   Local endpoints
> ........ : Default (use all available)
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:   Local applications
> ..... : (none)
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:   Flags : - IP
> ........... : Enabled
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:           - IPv6
> ......... : Enabled
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:           - Relay app
> .... : Enabled
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:           - TCP
> .......... : Enabled
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:           - SCTP
> ......... : Enabled
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:           - Pref. proto
> .. : SCTP
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:           - TLS method
> ... : Separate port
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:   TLS :   - Certificate
> .. : /var/lib/homestead/cert.pem
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:           - Private key
> .. : /var/lib/homestead/privkey.pem
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:           - CA (trust)
> ... : /var/lib/homestead/ca.pem (1 certs)
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:           - CRL
> .......... : (none)
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:           - Priority
> ..... : (default: 'NORMAL')
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:           - DH bits
> ...... : 1024
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:   Origin-State-Id
> ........ : 1417187628
> 28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions:
> '/usr/share/clearwater/homestead/lib/freeDiameter//dbg_monitor.fdx'[(no
> config file)], loaded
> 28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions:
> '/usr/share/clearwater/homestead/lib/freeDiameter//dict_nasreq.fdx'[(no
> config file)], loaded
> 28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions:
> '/usr/share/clearwater/homestead/lib/freeDiameter//dict_sip.fdx'[(no config
> file)], loaded
> 28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions:
> '/usr/share/clearwater/homestead/lib/freeDiameter//dict_dcca.fdx'[(no
> config file)], loaded
> 28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions:
> '/usr/share/clearwater/homestead/lib/freeDiameter//dict_dcca_3gpp.fdx'[(no
> config file)], loaded
> 28-11-2014 15:13:48.281 UTC Status freeDiameter:
> {signal:12}'dbg_monitor'->0x7faf1c2ebd70
> 28-11-2014 15:13:48.281 UTC Status diameterstack.cpp:432: Starting
> Diameter stack
> 28-11-2014 15:13:48.288 UTC Status freeDiameter: Local server address(es):
> 192.168.1.113{---L-}
> 28-11-2014 15:13:48.288 UTC Status handlers.cpp:76: Configuring
> HssCacheTask
> 28-11-2014 15:13:48.288 UTC Status handlers.cpp:77:   Dest-Realm:
> cini-clearwater.com
> 28-11-2014 15:13:48.288 UTC Status handlers.cpp:78:   Dest-Host:
> 28-11-2014 15:13:48.288 UTC Status handlers.cpp:79:   Server-Name:
> sip:sprout.cini-clearwater.com:5054;transport=TCP
> 28-11-2014 15:13:48.288 UTC Status httpstack.cpp:131: Configuring HTTP
> stack
> 28-11-2014 15:13:48.288 UTC Status httpstack.cpp:132:   Bind address:
> 192.168.1.113
> 28-11-2014 15:13:48.288 UTC Status httpstack.cpp:133:   Bind port:    8888
> 28-11-2014 15:13:48.288 UTC Status httpstack.cpp:134:   Num threads:  50
> 28-11-2014 15:13:48.547 UTC Status main.cpp:597: Start-up complete - wait
> for termination signal
>
>
>
>
>
>
>
>
>
>
>
> 2014-11-28 10:36 GMT+01:00 Antonio Ken Iannillo <[email protected]>:
>
> I checked the bono node and I found in its log:
>
> UTC Error connection_pool.cpp:217: Failed to resolve
> sprout.cini-clearwater.com to an IP address - Not found (PJ_ENOTFOUND)
>
>
>
> But I can ping all the other machines in the deployment with their
> symbolic name (even sprout.cini-clearwater.com) so I cant' understand
> this error.
>
>
>
> Any idea?
>
>
>
> Antonio Ken Iannillo
>
>
>
>
>
> 2014-11-27 19:53 GMT+01:00 Eleanor Merry <[email protected]>:
>
> Hi,
>
> The stress tests shouldn't time out like that. Can you see if either of
> your Bono's receives the REGISTERs? If they don't then can you check that
> the stress node can resolve the Bono address (on a sip stress node, this
> will be either the home_domain, or (one of the) bono_servers set in
> /etc/clearwater/config), and that the Bono nodes will accept traffic from
> the stress nodes.
>
> Thanks for the documentation pointer - I'll update this.
>
> Ellie
>
>
> -----Original Message-----
> From: [email protected] [mailto:
> [email protected]] On Behalf Of Antonio Ken
> Iannillo
> Sent: 26 November 2014 18:43
> To: [email protected]
> Subject: [Clearwater] Connection Timeout on Large Scale Deployment
>
> Hi all,
>
> I manually installed a larger scale deployment (2 copies of every service
> but ellis) and I run the stress sipp test.
>
> I got all timeouts, but I didn't get them with a simple deployment. Do you
> think it depends on a greater network delay or maybe i could have
> mis-configured something?
>
> In sipp error logs I got something like:
>
> '.
> > 2014-11-26      19:27:16:661    1417026436.661073: Aborting call on
> > unexpected message for Call-Id '[email protected]': while
> > expecting '401' (index 3), received 'SIP/2.0 408 Request Timeout
> > Via: SIP/2.0/TCP 192.168.1.107:5060
> > ;rport=24582;received=192.168.1.107;branch=z9hG4bK-9941-17-2-201000016
> > 6-
> > Call-ID: 2010000166///[email protected]
> > From: <sip:[email protected]>;tag=9941SIPpTag0017
> > To:
> > <sip:[email protected]>;tag=z9hG4bK-9941-17-2-2010000166-
> > CSeq: 1 REGISTER
> > Content-Length:  0
> > '.
>
>
> while in the sipp message log i have:
>
> Problem EAGAIN on socket  12 and poll_idx  is 2
> > Added first buffered message to socket 12 Problem EWOULDBLOCK on
> > socket  12 and poll_idx  is 2 Exit problem event on socket 12 Wrote
> > 702 of 702 bytes in an output buffer.
> > ----------------------------------------------- 2014-11-26
> > 19:27:11:27.944 TCP message received [363] bytes :
> > SIP/2.0 408 Request Timeout
> > Via: SIP/2.0/TCP 192.168.1.107:5060
> > ;rport=24578;received=192.168.1.107;branch=z9hG4bK-9941-23-2-201000015
> > 4-
> > Call-ID: 2010000154///[email protected]
> > From: <sip:[email protected]>;tag=9941SIPpTag0023
> > To:
> > <sip:[email protected]>;tag=z9hG4bK-9941-23-2-2010000154-
> > CSeq: 1 REGISTER
> > Content-Length:  0
> >
> > -----------------------------------------------
> > Unexpected TCP message received:
> > SIP/2.0 408 Request Timeout
> > Via: SIP/2.0/TCP 192.168.1.107:5060
> > ;rport=24578;received=192.168.1.107;branch=z9hG4bK-9941-23-2-201000015
> > 4-
> > Call-ID: 2010000154///[email protected]
> > From: <sip:[email protected]>;tag=9941SIPpTag0023
> > To:
> > <sip:[email protected]>;tag=z9hG4bK-9941-23-2-2010000154-
> > CSeq: 1 REGISTER
> > Content-Length:  0
>
>
> I tried to increase send and receive timeout in the sipp command but the
> results didn't change.
>
> P.S.: Feedback on documentation --> in the description of large scale
> deployment, while clustering homestead one should unistall both homestead
> and homestead-prov in order to inject both schemas in cassandra. In the doc
> only homestead is specified.
>
> _______________________________________________
> Clearwater mailing list
> [email protected]
> http://lists.projectclearwater.org/listinfo/clearwater
>
>
>
>
>
>
>
>
>
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/listinfo/clearwater

Reply via email to