Hi Richard, 

It sounds like you could be hitting this issue: 
https://github.com/Metaswitch/clearwater-infrastructure/issues/96, where the 
diags collection script causes the Sprout node to restart continually under 
load. (The issue is worse in a test deployment with a single Sprout than in a 
deployment with multiple Sprouts and a more realistic load ramp-up.)

Can you please try disabling the clearwater-diags-monitor for now (service 
clearwater-diags-monitor stop)?

Ellie

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of ???
Sent: 14 December 2014 18:13
To: [email protected]
Subject: [Clearwater] Please Help Me on Issues about Clearwater Sip Stress Test!

Dear Clearwater Developers:

I'm running the Clearwater SIP stress test against a small Clearwater cluster 
and have run into some problems under load. Could you please help me 
understand them?

My test environment consists of 4 VMs interconnected by a LAN, running Bono, 
Sprout, Homestead and Homer respectively. Another VM runs the 
clearwater-sip-stress package and generates SIP traffic to the Clearwater 
cluster. I provisioned the Homestead and Homer nodes with 200000 subscribers. 
The VMs running Bono and Sprout are configured with 2 cores and 4GB RAM. The 
VMs running Homestead and Homer are configured with 4 cores and 8GB RAM 
because I noticed that the Cassandra service consumes a lot of memory and CPU.

When the SIP stress node generates traffic for 80000 or 90000 subscribers, 
following the steps in the Clearwater documentation, the whole cluster runs 
smoothly.

But when the SIP stress node generates traffic for 100000 subscribers, the 
cluster can't operate normally. First, I see many errors in 
call_load2_xxx_errors.log (where xxx is the current sipp pid). Second, the 
Sprout node constantly restarts itself because of signal 6 or signal 11.

My question is this: according to the performance page on the official 
website, it seems that each Sprout node can handle 551k subscribers and 500k 
BHCA, which is far more than I am seeing. Is the behaviour I encounter when I 
run the stress test for 100000 subscribers caused by the Sprout node being 
overloaded, or by some other error?
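
For scale, the figures quoted above can be put side by side with a little 
arithmetic. This is only a rough sketch: it converts the quoted BHCA figure to 
call attempts per second and expresses the 100000-subscriber test as a 
fraction of the quoted subscriber capacity. It assumes nothing about the call 
rate the stress tool actually generates, and the variable names are 
illustrative.

```python
# Figures quoted in the message (taken from the project's performance page).
SUBSCRIBERS_PER_SPROUT = 551_000
BHCA_PER_SPROUT = 500_000      # busy-hour call attempts
TEST_SUBSCRIBERS = 100_000     # load at which the problems appear

SECONDS_PER_HOUR = 3600

# Busy-hour call attempts spread evenly over one hour.
calls_per_second = BHCA_PER_SPROUT / SECONDS_PER_HOUR

# How much of the quoted subscriber capacity the failing test represents.
fraction_of_quoted_capacity = TEST_SUBSCRIBERS / SUBSCRIBERS_PER_SPROUT

print(f"{calls_per_second:.1f} call attempts/s at the quoted BHCA")
print(f"{fraction_of_quoted_capacity:.0%} of the quoted subscriber capacity")
```

On these numbers the failing test is well below the quoted per-node capacity, 
which is consistent with suspecting something other than plain overload.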

Thank you very much!
Richard Duan

*This is the Sprout log file from just before signal 11 occurs:*

14-12-2014 16:16:36.224 UTC Warning scscfsproutlet.cpp:1472: Cannot determine 
charging role as no Route header, assume originating
14-12-2014 16:16:36.353 UTC Warning scscfsproutlet.cpp:1472: Cannot determine 
charging role as no Route header, assume originating
14-12-2014 16:16:36.682 UTC Warning scscfsproutlet.cpp:1472: Cannot determine 
charging role as no Route header, assume originating
14-12-2014 16:16:36.714 UTC Warning scscfsproutlet.cpp:1472: Cannot determine 
charging role as no Route header, assume originating
14-12-2014 16:16:36.717 UTC Warning scscfsproutlet.cpp:1472: Cannot determine 
charging role as no Route header, assume originating
14-12-2014 16:16:36.720 UTC Warning scscfsproutlet.cpp:1472: Cannot determine 
charging role as no Route header, assume originating
14-12-2014 16:16:36.802 UTC Warning scscfsproutlet.cpp:1472: Cannot determine 
charging role as no Route header, assume originating
14-12-2014 16:16:36.846 UTC Warning scscfsproutlet.cpp:1472: Cannot determine 
charging role as no Route header, assume originating
14-12-2014 16:16:37.372 UTC Warning scscfsproutlet.cpp:1472: Cannot determine 
charging role as no Route header, assume originating
14-12-2014 16:16:38.044 UTC Warning scscfsproutlet.cpp:1472: Cannot determine 
charging role as no Route header, assume originating
14-12-2014 16:16:57.299 UTC Error httpconnection.cpp:579:
http://hs.cw.t:8888/impu/sip%3A2010075097%40cw.t/reg-data?private_id=2010075097%40cw.t
failed at server 192.168.126.41 : Timeout was reached (28) : fatal
14-12-2014 16:25:28.642 UTC Error httpconnection.cpp:579:
http://homer.cw.t:7888/org.etsi.ngn.simservs/users/sip%3A2010023992%40cw.t/simservs.xml
failed at server 192.168.126.51 : Timeout was reached (28) : fatal
14-12-2014 16:25:28.704 UTC Error httpconnection.cpp:579:
http://homer.cw.t:7888/org.etsi.ngn.simservs/users/sip%3A2010000650%40cw.t/simservs.xml
failed at server 192.168.126.51 : Timeout was reached (28) : fatal
14-12-2014 16:25:29.643 UTC Error httpconnection.cpp:579:
http://homer.cw.t:7888/org.etsi.ngn.simservs/users/sip%3A2010023992%40cw.t/simservs.xml
failed at server 192.168.126.51 : Timeout was reached (28) : fatal
14-12-2014 16:25:29.643 UTC Error httpconnection.cpp:692: cURL failure with 
cURL error code 28 (see man 3 libcurl-errors) and HTTP error code 500
14-12-2014 16:25:29.705 UTC Error httpconnection.cpp:579:
http://homer.cw.t:7888/org.etsi.ngn.simservs/users/sip%3A2010000650%40cw.t/simservs.xml
failed at server 192.168.126.51 : Timeout was reached (28) : fatal
14-12-2014 16:25:29.705 UTC Error httpconnection.cpp:692: cURL failure with 
cURL error code 28 (see man 3 libcurl-errors) and HTTP error code 500
14-12-2014 16:25:29.857 UTC Error httpconnection.cpp:579:
http://hs.cw.t:8888/impi/2010087933%40cw.t/av?impu=sip%3A2010087933%40cw.t
failed at server 192.168.126.41 : Timeout was reached (28) : fatal
14-12-2014 16:25:29.900 UTC Error httpconnection.cpp:579:
http://hs.cw.t:8888/impu/sip%3A2010063922%40cw.t/reg-data failed at server
192.168.126.41 : Timeout was reached (28) : fatal
14-12-2014 16:25:30.322 UTC Error httpconnection.cpp:579:
http://homer.cw.t:7888/org.etsi.ngn.simservs/users/sip%3A2010038720%40cw.t/simservs.xml
failed at server 192.168.126.51 : Timeout was reached (28) : fatal
14-12-2014 16:25:30.647 UTC Error httpconnection.cpp:579:
http://homer.cw.t:7888/org.etsi.ngn.simservs/users/sip%3A2010023993%40cw.t/simservs.xml
failed at server 192.168.126.51 : Timeout was reached (28) : fatal
14-12-2014 16:25:30.710 UTC Error httpconnection.cpp:579:
http://homer.cw.t:7888/org.etsi.ngn.simservs/users/sip%3A2010000651%40cw.t/simservs.xml
failed at server 192.168.126.51 : Timeout was reached (28) : fatal
14-12-2014 16:25:30.901 UTC Error httpconnection.cpp:579:
http://hs.cw.t:8888/impu/sip%3A2010063922%40cw.t/reg-data failed at server
192.168.126.41 : Timeout was reached (28) : fatal
14-12-2014 16:25:30.901 UTC Error httpconnection.cpp:692: cURL failure with 
cURL error code 28 (see man 3 libcurl-errors) and HTTP error code 500
14-12-2014 16:25:30.905 UTC Error hssconnection.cpp:589: Could not get 
subscriber data from HSS
14-12-2014 16:25:31.323 UTC Error httpconnection.cpp:579:
http://homer.cw.t:7888/org.etsi.ngn.simservs/users/sip%3A2010038720%40cw.t/simservs.xml
failed at server 192.168.126.51 : Timeout was reached (28) : fatal
14-12-2014 16:25:31.324 UTC Error httpconnection.cpp:692: cURL failure with 
cURL error code 28 (see man 3 libcurl-errors) and HTTP error code 500
14-12-2014 16:25:31.538 UTC Error httpconnection.cpp:579:
http://hs.cw.t:8888/impu/sip%3A2010059772%40cw.t/reg-data failed at server
192.168.126.41 : Timeout was reached (28) : fatal
14-12-2014 16:25:31.648 UTC Error httpconnection.cpp:579:
http://homer.cw.t:7888/org.etsi.ngn.simservs/users/sip%3A2010023993%40cw.t/simservs.xml
failed at server 192.168.126.51 : Timeout was reached (28) : fatal
14-12-2014 16:25:31.648 UTC Error httpconnection.cpp:692: cURL failure with 
cURL error code 28 (see man 3 libcurl-errors) and HTTP error code 500

Signal 11 caught

Basic stack dump:
/usr/share/clearwater/bin/sprout(_ZN6Logger9backtraceEPKc+0x6d)[0x4a0b1d]
/usr/share/clearwater/bin/sprout(_ZN3Log9backtraceEPKcz+0x10d)[0x55281d]
/usr/share/clearwater/bin/sprout(_Z17exception_handleri+0x29)[0x5989d9]
/lib/x86_64-linux-gnu/libc.so.6(+0x36150)[0x7fe417dd6150]
/usr/share/clearwater/bin/sprout(_ZN16SproutletWrapper11rx_responseEP13pjsip_tx_datai+0x5d)[0x592a0d]
/usr/share/clearwater/bin/sprout(_ZN14SproutletProxy6UASTsx11tx_responseEP16SproutletWrapperP13pjsip_tx_data+0xdb)[0x592c1b]
/usr/share/clearwater/bin/sprout(_ZN16SproutletWrapper15process_actionsEb+0xa0)[0x5921c0]
/usr/share/clearwater/bin/sprout(_ZN14SproutletProxy6UASTsx22on_new_client_responseEPN10BasicProxy6UACTsxEP13pjsip_tx_data+0xe6)[0x593b56]
/usr/share/clearwater/bin/sprout(_ZN10BasicProxy6UACTsx12on_tsx_stateEP11pjsip_event+0x68f)[0x56de5f]
/usr/share/clearwater/bin/sprout[0x5dbe64]
/usr/share/clearwater/bin/sprout[0x5df5d7]
/usr/share/clearwater/bin/sprout(pjsip_tsx_recv_msg+0xb1)[0x5dd08e]
/usr/share/clearwater/bin/sprout[0x5db3ae]
/usr/share/clearwater/bin/sprout(pjsip_endpt_process_rx_data+0x23b)[0x5c552c]
/usr/share/clearwater/bin/sprout[0x4a27d1]
/usr/share/clearwater/bin/sprout[0x5fa558]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7fe418978e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fe417e9431d]

Advanced stack dump (requires gdb):
sh: 1: /usr/bin/gdb: not found

gdb failed with return code 32512
14-12-2014 16:25:43.247 UTC Status main.cpp:1082: Access logging enabled to 
/var/log/sprout
14-12-2014 16:25:43.247 UTC Status main.cpp:1086: Log level set to 2
14-12-2014 16:25:43.247 UTC Status load_monitor.cpp:93: Constructing LoadMonitor
14-12-2014 16:25:43.247 UTC Status load_monitor.cpp:94:    Target latency (usecs)   : 100000
14-12-2014 16:25:43.247 UTC Status load_monitor.cpp:95:    Max bucket size          : 20
14-12-2014 16:25:43.247 UTC Status load_monitor.cpp:96:    Initial token fill rate/s: 10.000000
14-12-2014 16:25:43.247 UTC Status load_monitor.cpp:97:    Min token fill rate/s    : 10.000000
14-12-2014 16:25:43.247 UTC Status dnscachedresolver.cpp:90: Creating Cached 
Resolver using server 127.0.0.1
14-12-2014 16:25:43.247 UTC Status sipresolver.cpp:59: Created SIP resolver
14-12-2014 16:25:43.284 UTC Status stack.cpp:690: Listening on port 5054
14-12-2014 16:25:43.287 UTC Status stack.cpp:1094: Local host aliases:
14-12-2014 16:25:43.287 UTC Status stack.cpp:1099:  192.168.124.51
14-12-2014 16:25:43.287 UTC Status stack.cpp:1099:  sprout.cw.t
14-12-2014 16:25:43.287 UTC Status stack.cpp:1099:  192.168.124.51
14-12-2014 16:25:43.288 UTC Status httpresolver.cpp:50: Created HTTP resolver
14-12-2014 16:25:43.288 UTC Status main.cpp:1295: Creating connection to HSS 
hs.cw.t:8888
14-12-2014 16:25:43.291 UTC Status httpconnection.cpp:119: HttpConnection for 
server hs.cw.t:8888
14-12-2014 16:25:43.291 UTC Status httpconnection.cpp:120: Response
timeout: 500
14-12-2014 16:25:43.292 UTC Status main.cpp:1334: Creating connection to 
Chronos 192.168.124.51:7253 using 127.0.0.1:9888 as the callback URI
14-12-2014 16:25:43.292 UTC Status httpconnection.cpp:149: HttpConnection for 
server 192.168.124.51:7253
14-12-2014 16:25:43.292 UTC Status httpconnection.cpp:150: Response
timeout: 500
14-12-2014 16:25:43.292 UTC Status main.cpp:1404: Using memcached compatible 
store with ASCII protocol
14-12-2014 16:25:43.292 UTC Status memcachedstore.cpp:784: Reloading memcached 
configuration from /etc/clearwater/cluster_settings file
14-12-2014 16:25:43.292 UTC Status memcachedstore.cpp:801:  servers=
192.168.124.51:11211
14-12-2014 16:25:43.292 UTC Status memcachedstore.cpp:140: Updating memcached 
store configuration
14-12-2014 16:25:43.292 UTC Status memcachedstore.cpp:161: Finished preparing 
new view, so flag that workers should switch to it
14-12-2014 16:25:43.292 UTC Status main.cpp:1459: Initialise S-CSCF 
authentication module
14-12-2014 16:25:43.292 UTC Status pluginloader.cpp:63: Loading plug-ins from 
/usr/share/clearwater/sprout/plugins
14-12-2014 16:25:43.292 UTC Status pluginloader.cpp:82: Attempt to load plug-in 
/usr/share/clearwater/sprout/plugins/sprout_bgcf.so
14-12-2014 16:25:43.302 UTC Status bgcfservice.cpp:71: No BGCF configuration 
(file ./bgcf.json does not exist)
14-12-2014 16:25:43.302 UTC Status pluginloader.cpp:106: Loaded sproutlet bgcf 
using API version 1
14-12-2014 16:25:43.302 UTC Status pluginloader.cpp:82: Attempt to load plug-in 
/usr/share/clearwater/sprout/plugins/sprout_mmtel_as.so
14-12-2014 16:25:43.303 UTC Status mmtelasplugin.cpp:87: Creating connection to 
XDMS homer.cw.t:7888
14-12-2014 16:25:43.303 UTC Status httpconnection.cpp:119: HttpConnection for 
server homer.cw.t:7888
14-12-2014 16:25:43.303 UTC Status httpconnection.cpp:120: Response
timeout: 500
14-12-2014 16:25:43.303 UTC Status pluginloader.cpp:106: Loaded sproutlet mmtel 
using API version 1
14-12-2014 16:25:43.303 UTC Status pluginloader.cpp:82: Attempt to load plug-in 
/usr/share/clearwater/sprout/plugins/sprout_scscf.so
14-12-2014 16:25:43.304 UTC Status pluginloader.cpp:106: Loaded sproutlet scscf 
using API version 1
14-12-2014 16:25:43.304 UTC Status pluginloader.cpp:82: Attempt to load plug-in 
/usr/share/clearwater/sprout/plugins/sprout_icscf.so
14-12-2014 16:25:43.305 UTC Status pluginloader.cpp:137: Finished loading 
plug-ins
14-12-2014 16:25:43.312 UTC Status httpstack.cpp:131: Configuring HTTP stack
14-12-2014 16:25:43.312 UTC Status httpstack.cpp:132:   Bind address:
192.168.124.51
14-12-2014 16:25:43.312 UTC Status httpstack.cpp:133:   Bind port:    9888
14-12-2014 16:25:43.312 UTC Status httpstack.cpp:134:   Num threads:  1
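
The timeout errors in the log excerpt before the crash come from two backend 
addresses (192.168.126.41 for hs.cw.t and 192.168.126.51 for homer.cw.t). A 
quick way to see which backend is timing out most is to tally them. The 
following is a minimal sketch, not part of Clearwater: the function names are 
made up for illustration, and it assumes log records start with a timestamp of 
the form shown above and may be wrapped across several lines, as they are in 
this mail.

```python
import re
from collections import Counter

# A record starts with a timestamp such as "14-12-2014 16:25:29.643 UTC".
RECORD_START = re.compile(r"^\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} UTC ")
# Matches "failed at server <address> : Timeout was reached (28)".
TIMEOUT = re.compile(r"failed at server\s+(\S+) : Timeout was reached \(28\)")


def join_records(lines):
    """Re-join log records that were wrapped across several lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)          # start of a new record
        else:
            records[-1] += " " + line     # continuation of the previous one
    return records


def timeouts_per_server(lines):
    """Count timeout errors (cURL code 28) per backend server address."""
    counts = Counter()
    for record in join_records(lines):
        match = TIMEOUT.search(record)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three records copied from the log excerpt in this message.
sample = """\
14-12-2014 16:16:57.299 UTC Error httpconnection.cpp:579:
http://hs.cw.t:8888/impu/sip%3A2010075097%40cw.t/reg-data?private_id=2010075097%40cw.t
failed at server 192.168.126.41 : Timeout was reached (28) : fatal
14-12-2014 16:25:28.642 UTC Error httpconnection.cpp:579:
http://homer.cw.t:7888/org.etsi.ngn.simservs/users/sip%3A2010023992%40cw.t/simservs.xml
failed at server 192.168.126.51 : Timeout was reached (28) : fatal
14-12-2014 16:25:30.901 UTC Error httpconnection.cpp:579:
http://hs.cw.t:8888/impu/sip%3A2010063922%40cw.t/reg-data failed at server
192.168.126.41 : Timeout was reached (28) : fatal
""".splitlines()

counts = timeouts_per_server(sample)
for server, count in counts.most_common():
    print(server, count)
```

To run it against a real log, pass the lines of the file instead of the 
sample, for example `timeouts_per_server(open(path))` with `path` pointing at 
the Sprout log of interest.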
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/listinfo/clearwater