---------- Forwarded message ---------- From: Antonio Ken Iannillo <[email protected]> Date: 2014-11-28 16:19 GMT+01:00 Subject: Re: [Clearwater] Connection Timeout on Large Scale Deployment To: Eleanor Merry <[email protected]>
I solved that problem... I think internally clearwater nodes talks in some way directly with dnsmasq to resolve the name. I used a different approach and when i configured back dnsmasq all worked again... but i get another strange unexpected message. > '. > 2014-11-28 15:54:26:984 1417186466.984707: Aborting call on > unexpected message for Call-Id '[email protected]': while expecting > 'INVITE' (index 16), received 'SIP/2.0 480 Temporarily Unavailable > Via: SIP/2.0/TCP 192.168.1.107:39003 > ;rport=39003;received=192.168.1.107;branch=z9hG4bK-2010000000-100-1.000000-1 > Record-Route: <sip:sprout.cini-clearwater.com:5054 > ;transport=TCP;lr;service=scscf;billing-role=charge-term> > Record-Route: <sip:sprout.cini-clearwater.com:5054 > ;transport=TCP;lr;service=scscf;billing-role=charge-orig> > Record-Route: <sip:192.168.1.121:5058;transport=TCP;lr> > Record-Route: <sip:[email protected]:5060;transport=TCP;lr> > Call-ID: 2010000000-1.000000///[email protected] > From: <sip:[email protected]>;tag=1787SIPpTag001001234 > To: <sip:[email protected] > >;tag=z9hG4bKPjdoa02taVAMk8fNAOlz8drD7MrbjflAFY > CSeq: 5 INVITE > Content-Length: 0 > '. I already followed the guide for troubleshooting clearwater and the only thing i found 'till now is in the homestead log during the start up. the error messages > 28-11-2014 15:13:36.580 UTC Error cassandra_store.cpp:170: Cache caught > TTransportException: connect() failed: Connection refused > 28-11-2014 15:13:36.580 UTC Error main.cpp:472: Failed to initialize cache > - rc 3 > 28-11-2014 15:13:38.042 UTC Error cassandra_store.cpp:170: Cache caught > TTransportException: connect() failed: Connection refused > 28-11-2014 15:13:38.042 UTC Error main.cpp:472: Failed to initialize cache > - rc 3 > the full log is > 28-11-2014 15:13:36.572 UTC Status main.cpp:418: Log level set to 2 > 28-11-2014 15:13:36.579 UTC Status main.cpp:438: Access logging enabled to > /var/log/homestead > 28-11-2014 15:13:36.579 UTC Status dnscachedresolver.cpp:90: Creating > Cached Resolver using server 127.0.0.1 > 28-11-2014 15:13:36.579 UTC Status httpresolver.cpp:50: Created HTTP > resolver > 28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:143: Configuring > store > 28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:144: Hostname: > localhost > 28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:145: Port: > 9160 > 28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:146: Threads: 10 > 28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:147: Max Queue: 0 > 28-11-2014 15:13:36.579 UTC Status cassandra_store.cpp:162: Starting store > 28-11-2014 15:13:36.580 UTC Error cassandra_store.cpp:170: Cache caught > TTransportException: connect() failed: Connection refused > 28-11-2014 15:13:36.580 UTC Error main.cpp:472: Failed to initialize cache > - rc 3 > 28-11-2014 15:13:36.580 UTC Status cassandra_store.cpp:204: Stopping cache > 28-11-2014 15:13:36.580 UTC Status cassandra_store.cpp:214: Waiting for > cache to stop > 28-11-2014 15:13:38.037 UTC Status main.cpp:418: Log level set to 2 > 28-11-2014 15:13:38.037 UTC Status main.cpp:438: Access logging enabled to > /var/log/homestead > 28-11-2014 15:13:38.038 UTC Status dnscachedresolver.cpp:90: Creating > Cached Resolver using server 127.0.0.1 > 28-11-2014 15:13:38.038 UTC Status httpresolver.cpp:50: Created HTTP > resolver > 28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:143: Configuring > store > 28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:144: Hostname: > localhost > 28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:145: Port: > 9160 > 28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:146: Threads: 10 > 28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:147: Max Queue: 0 > 28-11-2014 15:13:38.038 UTC Status cassandra_store.cpp:162: Starting store > 28-11-2014 15:13:38.042 UTC Error cassandra_store.cpp:170: Cache caught > TTransportException: connect() failed: Connection refused > 28-11-2014 15:13:38.042 UTC Error main.cpp:472: Failed to initialize cache > - rc 3 > 28-11-2014 15:13:38.042 UTC Status cassandra_store.cpp:204: Stopping cache > 28-11-2014 15:13:38.042 UTC Status cassandra_store.cpp:214: Waiting for > cache to stop > 28-11-2014 15:13:48.070 UTC Status main.cpp:418: Log level set to 2 > 28-11-2014 15:13:48.070 UTC Status main.cpp:438: Access logging enabled to > /var/log/homestead > 28-11-2014 15:13:48.071 UTC Status dnscachedresolver.cpp:90: Creating > Cached Resolver using server 127.0.0.1 > 28-11-2014 15:13:48.071 UTC Status httpresolver.cpp:50: Created HTTP > resolver > 28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:143: Configuring > store > 28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:144: Hostname: > localhost > 28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:145: Port: > 9160 > 28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:146: Threads: 10 > 28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:147: Max Queue: 0 > 28-11-2014 15:13:48.071 UTC Status cassandra_store.cpp:162: Starting store > 28-11-2014 15:13:48.073 UTC Status diameterstack.cpp:63: Initializing > Diameter stack > 28-11-2014 15:13:48.073 UTC Status diameterstack.cpp:307: Configuring > Diameter stack from file /var/lib/homestead/homestead.conf > 28-11-2014 15:13:48.280 UTC Status freeDiameter: All extensions loaded. > 28-11-2014 15:13:48.280 UTC Status freeDiameter: freeDiameter > configuration: > 28-11-2014 15:13:48.280 UTC Status freeDiameter: Default trace level > .... : +3 > 28-11-2014 15:13:48.280 UTC Status freeDiameter: Configuration file > ..... : /var/lib/homestead/homestead.conf > 28-11-2014 15:13:48.280 UTC Status freeDiameter: Diameter Identity > ...... : 192.168.1.113 (l:13) > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Diameter Realm > ......... : hs.cini-clearwater.com (l:22) > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Tc Timer > ............... : 30 > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Tw Timer > ............... : 30 > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Local port > ............. : 3868 > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Local secure port > ...... : 5658 > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Number of SCTP streams > . : 3 > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Number of clients thr > .. : 5 > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Number of app threads > .. : 4 > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Local endpoints > ........ : Default (use all available) > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Local applications > ..... : (none) > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Flags : - IP > ........... : Enabled > 28-11-2014 15:13:48.281 UTC Status freeDiameter: - IPv6 > ......... : Enabled > 28-11-2014 15:13:48.281 UTC Status freeDiameter: - Relay app > .... : Enabled > 28-11-2014 15:13:48.281 UTC Status freeDiameter: - TCP > .......... : Enabled > 28-11-2014 15:13:48.281 UTC Status freeDiameter: - SCTP > ......... : Enabled > 28-11-2014 15:13:48.281 UTC Status freeDiameter: - Pref. proto > .. : SCTP > 28-11-2014 15:13:48.281 UTC Status freeDiameter: - TLS method > ... : Separate port > 28-11-2014 15:13:48.281 UTC Status freeDiameter: TLS : - Certificate > .. : /var/lib/homestead/cert.pem > 28-11-2014 15:13:48.281 UTC Status freeDiameter: - Private key > .. : /var/lib/homestead/privkey.pem > 28-11-2014 15:13:48.281 UTC Status freeDiameter: - CA (trust) > ... : /var/lib/homestead/ca.pem (1 certs) > 28-11-2014 15:13:48.281 UTC Status freeDiameter: - CRL > .......... : (none) > 28-11-2014 15:13:48.281 UTC Status freeDiameter: - Priority > ..... : (default: 'NORMAL') > 28-11-2014 15:13:48.281 UTC Status freeDiameter: - DH bits > ...... : 1024 > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Origin-State-Id > ........ : 1417187628 > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions: > '/usr/share/clearwater/homestead/lib/freeDiameter//dbg_monitor.fdx'[(no > config file)], loaded > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions: > '/usr/share/clearwater/homestead/lib/freeDiameter//dict_nasreq.fdx'[(no > config file)], loaded > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions: > '/usr/share/clearwater/homestead/lib/freeDiameter//dict_sip.fdx'[(no config > file)], loaded > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions: > '/usr/share/clearwater/homestead/lib/freeDiameter//dict_dcca.fdx'[(no > config file)], loaded > 28-11-2014 15:13:48.281 UTC Status freeDiameter: Loaded extensions: > '/usr/share/clearwater/homestead/lib/freeDiameter//dict_dcca_3gpp.fdx'[(no > config file)], loaded > 28-11-2014 15:13:48.281 UTC Status freeDiameter: > {signal:12}'dbg_monitor'->0x7faf1c2ebd70 > 28-11-2014 15:13:48.281 UTC Status diameterstack.cpp:432: Starting > Diameter stack > 28-11-2014 15:13:48.288 UTC Status freeDiameter: Local server address(es): > 192.168.1.113{---L-} > 28-11-2014 15:13:48.288 UTC Status handlers.cpp:76: Configuring > HssCacheTask > 28-11-2014 15:13:48.288 UTC Status handlers.cpp:77: Dest-Realm: > cini-clearwater.com > 28-11-2014 15:13:48.288 UTC Status handlers.cpp:78: Dest-Host: > 28-11-2014 15:13:48.288 UTC Status handlers.cpp:79: Server-Name: > sip:sprout.cini-clearwater.com:5054;transport=TCP > 28-11-2014 15:13:48.288 UTC Status httpstack.cpp:131: Configuring HTTP > stack > 28-11-2014 15:13:48.288 UTC Status httpstack.cpp:132: Bind address: > 192.168.1.113 > 28-11-2014 15:13:48.288 UTC Status httpstack.cpp:133: Bind port: 8888 > 28-11-2014 15:13:48.288 UTC Status httpstack.cpp:134: Num threads: 50 > 28-11-2014 15:13:48.547 UTC Status main.cpp:597: Start-up complete - wait > for termination signal 2014-11-28 10:36 GMT+01:00 Antonio Ken Iannillo <[email protected]>: > I checked the bono node and I found in its log: > >> UTC Error connection_pool.cpp:217: Failed to resolve >> sprout.cini-clearwater.com to an IP address - Not found (PJ_ENOTFOUND) > > > But I can ping all the other machines in the deployment with their > symbolic name (even sprout.cini-clearwater.com) so I cant' understand > this error. > > Any idea? > > Antonio Ken Iannillo > > > 2014-11-27 19:53 GMT+01:00 Eleanor Merry <[email protected]>: > >> Hi, >> >> The stress tests shouldn't time out like that. Can you see if either of >> your Bono's receives the REGISTERs? If they don't then can you check that >> the stress node can resolve the Bono address (on a sip stress node, this >> will be either the home_domain, or (one of the) bono_servers set in >> /etc/clearwater/config), and that the Bono nodes will accept traffic from >> the stress nodes. >> >> Thanks for the documentation pointer - I'll update this. >> >> Ellie >> >> -----Original Message----- >> From: [email protected] [mailto: >> [email protected]] On Behalf Of Antonio Ken >> Iannillo >> Sent: 26 November 2014 18:43 >> To: [email protected] >> Subject: [Clearwater] Connection Timeout on Large Scale Deployment >> >> Hi all, >> >> I manually installed a larger scale deployment (2 copies of every service >> but ellis) and I run the stress sipp test. >> >> I got all timeouts, but I didn't get them with a simple deployment. Do >> you think it depends on a greater network delay or maybe i could have >> mis-configured something? >> >> In sipp error logs I got something like: >> >> '. >> > 2014-11-26 19:27:16:661 1417026436.661073: Aborting call on >> > unexpected message for Call-Id '[email protected]': while >> > expecting '401' (index 3), received 'SIP/2.0 408 Request Timeout >> > Via: SIP/2.0/TCP 192.168.1.107:5060 >> > ;rport=24582;received=192.168.1.107;branch=z9hG4bK-9941-17-2-201000016 >> > 6- >> > Call-ID: 2010000166///[email protected] >> > From: <sip:[email protected]>;tag=9941SIPpTag0017 >> > To: >> > <sip:[email protected]>;tag=z9hG4bK-9941-17-2-2010000166- >> > CSeq: 1 REGISTER >> > Content-Length: 0 >> > '. >> >> >> while in the sipp message log i have: >> >> Problem EAGAIN on socket 12 and poll_idx is 2 >> > Added first buffered message to socket 12 Problem EWOULDBLOCK on >> > socket 12 and poll_idx is 2 Exit problem event on socket 12 Wrote >> > 702 of 702 bytes in an output buffer. >> > ----------------------------------------------- 2014-11-26 >> > 19:27:11:27.944 TCP message received [363] bytes : >> > SIP/2.0 408 Request Timeout >> > Via: SIP/2.0/TCP 192.168.1.107:5060 >> > ;rport=24578;received=192.168.1.107;branch=z9hG4bK-9941-23-2-201000015 >> > 4- >> > Call-ID: 2010000154///[email protected] >> > From: <sip:[email protected]>;tag=9941SIPpTag0023 >> > To: >> > <sip:[email protected]>;tag=z9hG4bK-9941-23-2-2010000154- >> > CSeq: 1 REGISTER >> > Content-Length: 0 >> > >> > ----------------------------------------------- >> > Unexpected TCP message received: >> > SIP/2.0 408 Request Timeout >> > Via: SIP/2.0/TCP 192.168.1.107:5060 >> > ;rport=24578;received=192.168.1.107;branch=z9hG4bK-9941-23-2-201000015 >> > 4- >> > Call-ID: 2010000154///[email protected] >> > From: <sip:[email protected]>;tag=9941SIPpTag0023 >> > To: >> > <sip:[email protected]>;tag=z9hG4bK-9941-23-2-2010000154- >> > CSeq: 1 REGISTER >> > Content-Length: 0 >> >> >> I tried to increase send and receive timeout in the sipp command but the >> results didn't change. >> >> P.S.: Feedback on documentation --> in the description of large scale >> deployment, while clustering homestead one should unistall both homestead >> and homestead-prov in order to inject both schemas in cassandra. In the doc >> only homestead is specified. >> _______________________________________________ >> Clearwater mailing list >> [email protected] >> http://lists.projectclearwater.org/listinfo/clearwater >> > > _______________________________________________ Clearwater mailing list [email protected] http://lists.projectclearwater.org/listinfo/clearwater
