It looks like there are connection issues between the Sprouts and Homesteads. 
Can you check that the Sprout can resolve the Homestead address? The clearwater 
processes look in /etc/resolv.conf for DNS configuration. If your nodes don’t 
use localhost (try running nslookup on a hostname and see if the Server value 
is 127.0.0.1) you’ll need to update the dnsmasq configuration to point to this 
server rather than localhost. You can find details of how to do this at: 
https://github.com/Metaswitch/clearwater-docs/wiki/Clearwater-DNS-Usage#client-configuration



Can you also please send me the debug logs for this call from Sprout and 
Homestead. You can turn on debug logging by creating/editing the file 
/etc/clearwater/user_settings, adding log_level=5 and then restarting the 
component (service sprout <component> - it's automatically restarted by monit). 
The logs are output in /var/log/<component>/*. You won’t want to have stress 
running though the node with debug logs on though – can you either try making a 
standard call on your deployment, or set count= in /etc/clearwater/config on 
your stress node to a low number.
The reported Cassandra errors are benign. They are caused by the Cassandra 
process not having quite started up when the Homestead process started. This 
meant Homestead couldn’t initialize its cache. After a couple of seconds, once 
Cassandra is up, Homestead can successfully initialize its cache and so starts 
up properly.

Ellie

From: Antonio Ken Iannillo [mailto:[email protected]]
Sent: 28 November 2014 16:13
To: [email protected]
Cc: Eleanor Merry
Subject: Re: [Clearwater] Connection Timeout on Large Scale Deployment

I also found this error in sprout log

28-11-2014 16:04:07.905 UTC Error httpconnection.cpp:569: 
http://hs.cini-clearwater.com:8888/impu/sip%3A2010000140%40cini-clearwater.com/reg-data
 failed at server 192.168.1.113 : HTTP response code said error (22 400) : fatal
28-11-2014 16:04:07.906 UTC Error httpconnection.cpp:667: cURL failure with 
cURL error code 22 (see man 3 libcurl-errors) and HTTP error code 400
28-11-2014 16:04:07.906 UTC Error hssconnection.cpp:565: Could not get 
subscriber data from HSS
28-11-2014 16:04:08.128 UTC Error httpconnection.cpp:569: 
http://hs.cini-clearwater.com:8888/impu/sip%3A2010000030%40cini-clearwater.com/reg-data
 failed at server 192.168.1.113 : HTTP response code said error (22 400) : fatal
28-11-2014 16:04:08.128 UTC Error httpconnection.cpp:667: cURL failure with 
cURL error code 22 (see man 3 libcurl-errors) and HTTP error code 400
28-11-2014 16:04:08.128 UTC Error hssconnection.cpp:565: Could not get 
subscriber data from HSS


2014-11-28 16:21 GMT+01:00 Antonio Ken Iannillo 
<[email protected]<mailto:[email protected]>>:

---------- Forwarded message ----------
From: Antonio Ken Iannillo <[email protected]<mailto:[email protected]>>
Date: 2014-11-28 16:19 GMT+01:00
Subject: Re: [Clearwater] Connection Timeout on Large Scale Deployment
To: Eleanor Merry 
<[email protected]<mailto:[email protected]>>

I solved that problem... I think internally clearwater nodes talks in some way 
directly with dnsmasq to resolve the name. I used a different approach and when 
i configured back dnsmasq all worked again... but i get another strange 
unexpected message.


'.
2014-11-28      15:54:26:984    1417186466.984707: Aborting call on unexpected 
message for Call-Id '[email protected]<mailto:[email protected]>': 
while expecting 'INVITE' (index 16), received 'SIP/2.0 480 Temporarily 
Unavailable
Via: SIP/2.0/TCP 
192.168.1.107:39003;rport=39003;received=192.168.1.107;branch=z9hG4bK-2010000000-100-1.000000-1
Record-Route: 
<sip:sprout.cini-clearwater.com:5054;transport=TCP;lr;service=scscf;billing-role=charge-term>
Record-Route: 
<sip:sprout.cini-clearwater.com:5054;transport=TCP;lr;service=scscf;billing-role=charge-orig>
Record-Route: <sip:192.168.1.121:5058;transport=TCP;lr>
Record-Route: <sip:[email protected]:5060;transport=TCP;lr>
Call-ID: 
2010000000-1.000000///[email protected]<mailto:[email protected]>
From: 
<sip:[email protected]<mailto:sip%[email protected]>>;tag=1787SIPpTag001001234
To: 
<sip:[email protected]<mailto:sip%[email protected]>>;tag=z9hG4bKPjdoa02taVAMk8fNAOlz8drD7MrbjflAFY
CSeq: 5 INVITE
Content-Length:  0
'.

I already followed the guide for troubleshooting clearwater and the only thing 
i found 'till now is in the homestead log during the start up.

the error messages
28-11-2014 15:13:36.580 UTC Error cassandra_store.cpp:170: Cache caught 
TTransportException: connect() failed: Connection refused
28-11-2014 15:13:36.580 UTC Error main.cpp:472: Failed to initialize cache - rc 
3
28-11-2014 15:13:38.042 UTC Error cassandra_store.cpp:170: Cache caught 
TTransportException: connect() failed: Connection refused
28-11-2014 15:13:38.042 UTC Error main.cpp:472: Failed to initialize cache - rc 
3

the full log is
28-11-2014 15:13:36.572 UTC Status main.cpp:418: Log level set to 2
28-11-2014 15:13:36.579 UTC Status main.cpp:438: Access logging enabled to 
/var/log/homestead
28-11-2014 15:13:36.579 UTC Status dnscachedresolver.cpp:90: Creating Cached 
Resolver using server 127.0.0.1
28-11-2014 15:13:36.579 UTC Status httpresolver.cpp:50: Created HTTP resolver
28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:143: Configuring store
28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:144:   Hostname:  
localhost
28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:145:   Port:      9160
28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:146:   Threads:   10
28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:147:   Max Queue: 0
28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:162: Starting store
28-11-2014 15:13:36.580 UTC Error cassandra_store.cpp:170: Cache caught 
TTransportException: connect() failed: Connection refused
28-11-2014 15:13:36.580 UTC Error main.cpp:472: Failed to initialize cache - rc 
3
28-11-2014 15:13:36.580 UTC Status cassandra_store.cpp:204: Stopping cache
28-11-2014 15:13:36.580 UTC Status cassandra_store.cpp:214: Waiting for cache 
to stop
28-11-2014 15:13:38.037 UTC Status main.cpp:418: Log level set to 2
28-11-2014 15:13:38.037 UTC Status main.cpp:438: Access logging enabled to 
/var/log/homestead
28-11-2014 15:13:38.038 UTC Status dnscachedresolver.cpp:90: Creating Cached 
Resolver using server 127.0.0.1
28-11-2014 15:13:38.038 UTC Status httpresolver.cpp:50: Created HTTP resolver
28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:143: Configuring store
28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:144:   Hostname:  
localhost
28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:145:   Port:      9160
28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:146:   Threads:   10
28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:147:   Max Queue: 0
28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:162: Starting store
28-11-2014 15:13:38.042 UTC Error cassandra_store.cpp:170: Cache caught 
TTransportException: connect() failed: Connection refused
28-11-2014 15:13:38.042 UTC Error main.cpp:472: Failed to initialize cache - rc 
3
28-11-2014 15:13:38.042 UTC Status cassandra_store.cpp:204: Stopping cache
28-11-2014 15:13:38.042 UTC Status cassandra_store.cpp:214: Waiting for cache 
to stop
28-11-2014 15:13:48.070 UTC Status main.cpp:418: Log level set to 2
28-11-2014 15:13:48.070 UTC Status main.cpp:438: Access logging enabled to 
/var/log/homestead
28-11-2014 15:13:48.071 UTC Status dnscachedresolver.cpp:90: Creating Cached 
Resolver using server 127.0.0.1
28-11-2014 15:13:48.071 UTC Status httpresolver.cpp:50: Created HTTP resolver
28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:143: Configuring store
28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:144:   Hostname:  
localhost
28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:145:   Port:      9160
28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:146:   Threads:   10
28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:147:   Max Queue: 0
28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:162: Starting store
28-11-2014 15:13:48.073 UTC Status diameterstack.cpp:63: Initializing Diameter 
stack
28-11-2014 15:13:48.073 UTC Status diameterstack.cpp:307: Configuring Diameter 
stack from file /var/lib/homestead/homestead.conf
28-11-2014 15:13:48.280 UTC Status freeDiameter: All extensions loaded.
28-11-2014 15:13:48.280 UTC Status freeDiameter: freeDiameter configuration:
28-11-2014 15:13:48.280 UTC Status freeDiameter:   Default trace level .... : +3
28-11-2014 15:13:48.280 UTC Status freeDiameter:   Configuration file ..... : 
/var/lib/homestead/homestead.conf
28-11-2014 15:13:48.280 UTC Status freeDiameter:   Diameter Identity ...... : 
192.168.1.113 (l:13)
28-11-2014 15:13:48.281 UTC Status freeDiameter:   Diameter Realm ......... : 
hs.cini-clearwater.com<http://hs.cini-clearwater.com> (l:22)
28-11-2014 15:13:48.281 UTC Status freeDiameter:   Tc Timer ............... : 30
28-11-2014 15:13:48.281 UTC Status freeDiameter:   Tw Timer ............... : 30
28-11-2014 15:13:48.281 UTC Status freeDiameter:   Local port ............. : 
3868
28-11-2014 15:13:48.281 UTC Status freeDiameter:   Local secure port ...... : 
5658
28-11-2014 15:13:48.281 UTC Status freeDiameter:   Number of SCTP streams . : 3
28-11-2014 15:13:48.281 UTC Status freeDiameter:   Number of clients thr .. : 5
28-11-2014 15:13:48.281 UTC Status freeDiameter:   Number of app threads .. : 4
28-11-2014 15:13:48.281 UTC Status freeDiameter:   Local endpoints ........ : 
Default (use all available)
28-11-2014 15:13:48.281 UTC Status freeDiameter:   Local applications ..... : 
(none)
28-11-2014 15:13:48.281 UTC Status freeDiameter:   Flags : - IP ........... : 
Enabled
28-11-2014 15:13:48.281 UTC Status freeDiameter:           - IPv6 ......... : 
Enabled
28-11-2014 15:13:48.281 UTC Status freeDiameter:           - Relay app .... : 
Enabled
28-11-2014 15:13:48.281 UTC Status freeDiameter:           - TCP .......... : 
Enabled
28-11-2014 15:13:48.281 UTC Status freeDiameter:           - SCTP ......... : 
Enabled
28-11-2014 15:13:48.281 UTC Status freeDiameter:           - Pref. proto .. : 
SCTP
28-11-2014 15:13:48.281 UTC Status freeDiameter:           - TLS method ... : 
Separate port
28-11-2014 15:13:48.281 UTC Status freeDiameter:   TLS :   - Certificate .. : 
/var/lib/homestead/cert.pem
28-11-2014 15:13:48.281 UTC Status freeDiameter:           - Private key .. : 
/var/lib/homestead/privkey.pem
28-11-2014 15:13:48.281 UTC Status freeDiameter:           - CA (trust) ... : 
/var/lib/homestead/ca.pem (1 certs)
28-11-2014 15:13:48.281 UTC Status freeDiameter:           - CRL .......... : 
(none)
28-11-2014 15:13:48.281 UTC Status freeDiameter:           - Priority ..... : 
(default: 'NORMAL')
28-11-2014 15:13:48.281 UTC Status freeDiameter:           - DH bits ...... : 
1024
28-11-2014 15:13:48.281 UTC Status freeDiameter:   Origin-State-Id ........ : 
1417187628
28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions: 
'/usr/share/clearwater/homestead/lib/freeDiameter//dbg_monitor.fdx'[(no config 
file)], loaded
28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions: 
'/usr/share/clearwater/homestead/lib/freeDiameter//dict_nasreq.fdx'[(no config 
file)], loaded
28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions: 
'/usr/share/clearwater/homestead/lib/freeDiameter//dict_sip.fdx'[(no config 
file)], loaded
28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions: 
'/usr/share/clearwater/homestead/lib/freeDiameter//dict_dcca.fdx'[(no config 
file)], loaded
28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions: 
'/usr/share/clearwater/homestead/lib/freeDiameter//dict_dcca_3gpp.fdx'[(no 
config file)], loaded
28-11-2014 15:13:48.281 UTC Status freeDiameter: 
{signal:12}'dbg_monitor'->0x7faf1c2ebd70
28-11-2014 15:13:48.281 UTC Status diameterstack.cpp:432: Starting Diameter 
stack
28-11-2014 15:13:48.288 UTC Status freeDiameter: Local server address(es): 
192.168.1.113{---L-}
28-11-2014 15:13:48.288 UTC Status handlers.cpp:76: Configuring HssCacheTask
28-11-2014 15:13:48.288 UTC Status handlers.cpp:77:   Dest-Realm:  
cini-clearwater.com<http://cini-clearwater.com>
28-11-2014 15:13:48.288 UTC Status handlers.cpp:78:   Dest-Host:
28-11-2014 15:13:48.288 UTC Status handlers.cpp:79:   Server-Name: 
sip:sprout.cini-clearwater.com:5054;transport=TCP
28-11-2014 15:13:48.288 UTC Status httpstack.cpp:131: Configuring HTTP stack
28-11-2014 15:13:48.288 UTC Status httpstack.cpp:132:   Bind address: 
192.168.1.113
28-11-2014 15:13:48.288 UTC Status httpstack.cpp:133:   Bind port:    8888
28-11-2014 15:13:48.288 UTC Status httpstack.cpp:134:   Num threads:  50
28-11-2014 15:13:48.547 UTC Status main.cpp:597: Start-up complete - wait for 
termination signal





2014-11-28 10:36 GMT+01:00 Antonio Ken Iannillo 
<[email protected]<mailto:[email protected]>>:
I checked the bono node and I found in its log:
UTC Error connection_pool.cpp:217: Failed to resolve 
sprout.cini-clearwater.com<http://sprout.cini-clearwater.com> to an IP address 
- Not found (PJ_ENOTFOUND)

But I can ping all the other machines in the deployment with their symbolic 
name (even sprout.cini-clearwater.com<http://sprout.cini-clearwater.com>) so I 
cant' understand this error.

Any idea?

Antonio Ken Iannillo


2014-11-27 19:53 GMT+01:00 Eleanor Merry 
<[email protected]<mailto:[email protected]>>:
Hi,

The stress tests shouldn't time out like that. Can you see if either of your 
Bono's receives the REGISTERs? If they don't then can you check that the stress 
node can resolve the Bono address (on a sip stress node, this will be either 
the home_domain, or (one of the) bono_servers set in /etc/clearwater/config), 
and that the Bono nodes will accept traffic from the stress nodes.

Thanks for the documentation pointer - I'll update this.

Ellie

-----Original Message-----
From: 
[email protected]<mailto:[email protected]>
 
[mailto:[email protected]<mailto:[email protected]>]
 On Behalf Of Antonio Ken Iannillo
Sent: 26 November 2014 18:43
To: 
[email protected]<mailto:[email protected]>
Subject: [Clearwater] Connection Timeout on Large Scale Deployment

Hi all,

I manually installed a larger scale deployment (2 copies of every service but 
ellis) and I run the stress sipp test.

I got all timeouts, but I didn't get them with a simple deployment. Do you 
think it depends on a greater network delay or maybe i could have 
mis-configured something?

In sipp error logs I got something like:

'.
> 2014-11-26      19:27:16:661    1417026436.661073: Aborting call on
> unexpected message for Call-Id 
> '[email protected]<mailto:[email protected]>': while
> expecting '401' (index 3), received 'SIP/2.0 408 Request Timeout
> Via: SIP/2.0/TCP 192.168.1.107:5060<http://192.168.1.107:5060>
> ;rport=24582;received=192.168.1.107;branch=z9hG4bK-9941-17-2-201000016
> 6-
> Call-ID: 2010000166///[email protected]<mailto:[email protected]>
> From: 
> <sip:[email protected]<mailto:sip%[email protected]>>;tag=9941SIPpTag0017
> To:
> <sip:[email protected]<mailto:sip%[email protected]>>;tag=z9hG4bK-9941-17-2-2010000166-
> CSeq: 1 REGISTER
> Content-Length:  0
> '.


while in the sipp message log i have:

Problem EAGAIN on socket  12 and poll_idx  is 2
> Added first buffered message to socket 12 Problem EWOULDBLOCK on
> socket  12 and poll_idx  is 2 Exit problem event on socket 12 Wrote
> 702 of 702 bytes in an output buffer.
> ----------------------------------------------- 2014-11-26
> 19:27:11:27.944 TCP message received [363] bytes :
> SIP/2.0 408 Request Timeout
> Via: SIP/2.0/TCP 192.168.1.107:5060<http://192.168.1.107:5060>
> ;rport=24578;received=192.168.1.107;branch=z9hG4bK-9941-23-2-201000015
> 4-
> Call-ID: 2010000154///[email protected]<mailto:[email protected]>
> From: 
> <sip:[email protected]<mailto:sip%[email protected]>>;tag=9941SIPpTag0023
> To:
> <sip:[email protected]<mailto:sip%[email protected]>>;tag=z9hG4bK-9941-23-2-2010000154-
> CSeq: 1 REGISTER
> Content-Length:  0
>
> -----------------------------------------------
> Unexpected TCP message received:
> SIP/2.0 408 Request Timeout
> Via: SIP/2.0/TCP 192.168.1.107:5060<http://192.168.1.107:5060>
> ;rport=24578;received=192.168.1.107;branch=z9hG4bK-9941-23-2-201000015
> 4-
> Call-ID: 2010000154///[email protected]<mailto:[email protected]>
> From: 
> <sip:[email protected]<mailto:sip%[email protected]>>;tag=9941SIPpTag0023
> To:
> <sip:[email protected]<mailto:sip%[email protected]>>;tag=z9hG4bK-9941-23-2-2010000154-
> CSeq: 1 REGISTER
> Content-Length:  0


I tried to increase send and receive timeout in the sipp command but the 
results didn't change.

P.S.: Feedback on documentation --> in the description of large scale 
deployment, while clustering homestead one should unistall both homestead and 
homestead-prov in order to inject both schemas in cassandra. In the doc only 
homestead is specified.
_______________________________________________
Clearwater mailing list
[email protected]<mailto:[email protected]>
http://lists.projectclearwater.org/listinfo/clearwater




_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/listinfo/clearwater

Reply via email to