> > K Eccles edited comment on XX-6619 at 4/15/10 9:01 AM: > ------------------------------------------------------- > > Looking at the Master data... > I don't see anything that explains what happened... No > interesting errors or warnings and nothing unusual at the > time that the memory use jumped. > > Some anomalies > 1- sipXalarms.log has these two messages > "2010-03-19T13:49:13.697656Z":1:ALARM:WARNING:firstsoak.qantom > .int:sipXsupervisor::SPX00002:"Process SIPRegistrar' stopped > unexpectedly. Attempting to restart the process." > "2010-03-19T13:49:26.607265Z":2:ALARM:WARNING:firstsoak.qantom > .int:sipXsupervisor::SPX00002:"Process SIPRegistrar' stopped > unexpectedly. Attempting to restart the process." > > 2- sipXproxy has a short time where these messages appear: > "2010-03-20T02:01:12.188697Z":6820700:SIP:WARNING:firstsoak.qa > ntom.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipTransaction::ha > ndleChildIncoming duplicate 2xx response received on UAC for > INVITE with no ACK" > --- 10 messages in a short period in sipXproxy.log.3 (others > ..4/2/1 not checked) > > "2010-03-20T02:01:14.092780Z":6820907:SIP:WARNING:firstsoak.qa > ntom.int:SipClientTcp-18:B64FAB90:SipXProxy:"SipUserAgent::dis > patch received response without transaction" > --- 3 messages in a short period in sipXproxy.log.3 (others > ..4/2/1 not checked) > > 3- sipXproxy starts getting these messages at the time shown, > which is 6+ hours after the memory jump: > "2010-03-20T22:32:31.655949Z":18276165:KERNEL:NOTICE:firstsoak > .qantom.int:SipClientUdp-8:B7A6BB90:SipXProxy:"OsMsgQShared::d > oSendCore message queue 'SipXProxyCseObserver-10' is over > half full - count = 1001, max = 2000"
Queues that suddenly fill up and never deplete are usually a sign of a deadlock of some sort. After queue fillups have been observed, can the Qantom guys attach GDB to the running sipXproxy process and get a backtrace for all the threads so that we can identify if there is a deadlock indeed? > > and the queue continues to fill until > "2010-03-20T22:33:10.060852Z":18283820:KERNEL:NOTICE:firstsoak > .qantom.int:SipClientUdp-8:B7A6BB90:SipXProxy:"OsMsgQShared::d > oSendCore message queue 'SipXProxyCseObserver-10' is over > half full - count = 1999, max = 2000" > "2010-03-20T22:33:10.066048Z":18283822:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers send failed with status 12 > (numMsgs = 2000, maxMsgs = 2000)" > "2010-03-20T22:33:10.066068Z":18283823:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers send failed to queue named > 'SipXProxyCseObserver-10'" > "2010-03-20T22:33:10.066092Z":18283824:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers observerQueue 0x8330934, > observerData (nil), SIP method 'BYE', wantsRequests 1, > wantsResponses 0, wantsIncoming 1, wantsOutGoing 0, eventName > '', SipSession (nil)" > "2010-03-20T22:33:10.066140Z":18283825:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers failed message is: BYE > sip:[email protected]:6605;transport=UDP SIP/2.0\r\nVia: > SIP/2.0/UDP > 176.25.20.25:6505;branch=z9hG4bK-2712-588736-10\r\nFrom: > BASIC Caller6 > <sip:[email protected]>;tag=AbCdEf-588736-2712-FeDcBa\r\nTo: > BASIC Callee6 > <sip:[email protected]>;tag=fEdCbA-588736-2700-AbCcDf\r\nRout > e: > <sip:176.25.20.251:5060;lr;sipXecs-CallDest=AL;sipXecs-rs=%2Aa > uth%7E.%2Afrom%7EQWJDZEVmLTU4ODczNi0yNzEyLUZlRGNCYQ%60%60%21ac > 381cdccd0413cbc1d0a7f24c6a6d6d>\r\nCall-Id: > [email protected]\r\ncseq: 3 BYE\r\nContact: > <sip:[email protected]:6505>\r\nUser-Agent: SIPp > v3.1-TLS-PCAP\r\nAccept-Language: en\r\nProxy-Authorization: > Digest > username=\"206\",realm=\"ha2.qantom.int\",cnonce=\"105b7886\", > nc=00000001,qop=auth,uri=\"sip:176.25.20.251:5060\",nonce=\"b6 > d96799ab2f2970be3cb3a84bcb96744ba54d23\",response=\"1843d7fb39 > 5f69585bc8058 > 901daf0ff\",algorithm=MD5\r\nMax-Forwards: > 70\r\nContent-Length: 0\r\nDate: Sat, 20 Mar 2010 22:33:10 GMT" > "2010-03-20T22:33:10.143832Z":18283829:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers send failed with status 12 > (numMsgs = 2000, maxMsgs = 2000)" > "2010-03-20T22:33:10.143846Z":18283830:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers send failed to queue named > 'SipXProxyCseObserver-10'" > "2010-03-20T22:33:10.143865Z":18283831:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers observerQueue 0x8330934, > observerData (nil), SIP method 'INVITE', wantsRequests 1, > wantsResponses 0, wantsIncoming 1, wantsOutGoing 0, eventName > '', SipSession (nil)" > "2010-03-20T22:33:10.144034Z":18283832:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers failed message is: INVITE > sip:[email protected] SIP/2.0\r\nFrom: AUTH Caller9 > <sip:[email protected]>;tag=AbCdEf-235489-2754-FeDcBa\r\nTo: > AUTH Callee9 <sip:[email protected]>\r\nCall-Id: > [email protected]\r\ncseq: 1 INVITE\r\nVia: > SIP/2.0/UDP > 176.25.20.25:6705;branch=z9hG4bK-2754-235489-0\r\nContact: > <sip:[email protected]:6705>\r\nAllow: INVITE, ACK, BYE, > CANCEL, OPTIONS, INFO, MESSAGE, SUBSCRIBE, NOTIFY, PRACK, > UPDATE, REFER\r\nUser-Agent: SIPp > v3.1-TLS-PCAP\r\nAccept-Language: en\r\nSupported: > 100rel,replaces\r\nAllow-Events: > talk,hold,conference\r\nMax-Forwards: 70\r\nContent-Type: > application/sdp\r\nContent-Length: 229\r\nDate: Sat, 20 Mar > 2010 22:33:10 GMT\r\n\r\nv=0\r\no=- 1234567890 1234567890 IN > IP4 176.25.20.25\r\ns=AUTH\r\nt=0 0\r\nc=IN IP4 > 176.25.20.25\r\nm=audio 16700 RTP/AVP 0 8 101\r\na=rtpmap:0 PC > MU/8000\r\na=rtpmap:8 PCMA/8000\r\na=rtpmap:101 > telephone-event/8000\r\na=fmtp:101 0-11\r\na=sendrecv" > "2010-03-20T22:33:10.193905Z":18283834:KERNEL:NOTICE:firstsoak > .qantom.int:SipRouter-11:B6DDFB90:SipXProxy:"OsMsgQShared::doS > endCore message queue 'SipXProxyCseObserver-10' is over half > full - count = 1998, max = 2000" > > 4- sipXproxy also has many instances of the following errors. > These errors appear very early and throughout the testing, > so I don't think they are related to this problem. Each of > these messages appears hundreds of thousands of time in the > proxy logs. > > a- > KERNEL:ERR:firstsoak.qantom.int:SipClientTcp:SipXProxy:"OsSock > et::write 181 (176.25.20.25:6605 :-1) send returned -1, > errno=32 'Broken pipe'" > > b- > SIP:ERR:firstsoak.qantom.int:SipClientUdp:SipXProxy:"Url::pars > eString error - expected end of url or field parameter > ';name=' at offset 62 in 'AUTH Callee5 > <sip:[email protected]>;tag=fEdCbA-call_number]-2731-AbCcDf'" > > c- > SIP:WARNING:firstsoak.qantom.int:SipSrvLookupThread:SipXProxy: > "DNS query for name 'ha2.qantom.int', type = 1 (A): returned error" > > d- > KERNEL:ERR:firstsoak.qantom.int:SipClientTcp:SipXProxy:"OsMsgQ > Shared::doSendCore message send failed for queue > 'SipTcpServer-3' - no room, ret = 9" > > e- > SIP:WARNING:firstsoak.qantom.int:SipSrvLookupThread:SipXProxy: > "DNS query for name '_sip._tls.ha2.qantom.int', type = 33 > (SRV): returned error" > > f- > HTTP:WARNING:firstsoak.qantom.int:SipClientUdp:SipXProxy:"Http > Message::getBytes content-length: 175 wrong setting to: 177" > > g- > SIP:WARNING:firstsoak.qantom.int:SipRouter:SipXProxy:"SipUserA > gent::send INVITE request matches existing transaction" > > h- > SIP:ERR:firstsoak.qantom.int:SipClientTcp:SipXProxy:"SipClient > WriteBuffer[SipClientTcp-219073]::writeMore OsSocket::write() > returned -1, errno = 32" > > i- > SIP:ERR:firstsoak.qantom.int:SipUserAgent:SipXProxy:"SipUserAg > ent::handleMessage SIP message timeout expired with no > matching transaction" > > Issue XX-8063 was opened to clean up logging messages. > > > > was (Author: keccles): > Looking at the Master data... > I don't see anything that explains what happened... No > interesting errors or warnings and nothing unusual at the > time that the memory use jumped. > > Some anomalies > 1- sipXalarms.log has these two messages > "2010-03-19T13:49:13.697656Z":1:ALARM:WARNING:firstsoak.qantom > .int:sipXsupervisor::SPX00002:"Process SIPRegistrar' stopped > unexpectedly. Attempting to restart the process." > "2010-03-19T13:49:26.607265Z":2:ALARM:WARNING:firstsoak.qantom > .int:sipXsupervisor::SPX00002:"Process SIPRegistrar' stopped > unexpectedly. Attempting to restart the process." > > 2- sipXproxy has a short time where these messages appear: > "2010-03-20T02:01:12.188697Z":6820700:SIP:WARNING:firstsoak.qa > ntom.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipTransaction::ha > ndleChildIncoming duplicate 2xx response received on UAC for > INVITE with no ACK" > --- 10 messages in a short period in sipXproxy.log.3 (others > ..4/2/1 not checked) > > "2010-03-20T02:01:14.092780Z":6820907:SIP:WARNING:firstsoak.qa > ntom.int:SipClientTcp-18:B64FAB90:SipXProxy:"SipUserAgent::dis > patch received response without transaction" > --- 3 messages in a short period in sipXproxy.log.3 (others > ..4/2/1 not checked) > > 3- sipXproxy starts getting these messages at the time shown, > which is 6+ hours after the memory jump: > "2010-03-20T22:32:31.655949Z":18276165:KERNEL:NOTICE:firstsoak > .qantom.int:SipClientUdp-8:B7A6BB90:SipXProxy:"OsMsgQShared::d > oSendCore message queue 'SipXProxyCseObserver-10' is over > half full - count = 1001, max = 2000" > > and the queue continues to fill until > "2010-03-20T22:33:10.060852Z":18283820:KERNEL:NOTICE:firstsoak > .qantom.int:SipClientUdp-8:B7A6BB90:SipXProxy:"OsMsgQShared::d > oSendCore message queue 'SipXProxyCseObserver-10' is over > half full - count = 1999, max = 2000" > "2010-03-20T22:33:10.066048Z":18283822:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers send failed with status 12 > (numMsgs = 2000, maxMsgs = 2000)" > "2010-03-20T22:33:10.066068Z":18283823:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers send failed to queue named > 'SipXProxyCseObserver-10'" > "2010-03-20T22:33:10.066092Z":18283824:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers observerQueue 0x8330934, > observerData (nil), SIP method 'BYE', wantsRequests 1, > wantsResponses 0, wantsIncoming 1, wantsOutGoing 0, eventName > '', SipSession (nil)" > "2010-03-20T22:33:10.066140Z":18283825:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers failed message is: BYE > sip:[email protected]:6605;transport=UDP SIP/2.0\r\nVia: > SIP/2.0/UDP > 176.25.20.25:6505;branch=z9hG4bK-2712-588736-10\r\nFrom: > BASIC Caller6 > <sip:[email protected]>;tag=AbCdEf-588736-2712-FeDcBa\r\nTo: > BASIC Callee6 > <sip:[email protected]>;tag=fEdCbA-588736-2700-AbCcDf\r\nRout > e: > <sip:176.25.20.251:5060;lr;sipXecs-CallDest=AL;sipXecs-rs=%2Aa > uth%7E.%2Afrom%7EQWJDZEVmLTU4ODczNi0yNzEyLUZlRGNCYQ%60%60%21ac > 381cdccd0413cbc1d0a7f24c6a6d6d>\r\nCall-Id: > [email protected]\r\ncseq: 3 BYE\r\nContact: > <sip:[email protected]:6505>\r\nUser-Agent: SIPp > v3.1-TLS-PCAP\r\nAccept-Language: en\r\nProxy-Authorization: > Digest > username=\"206\",realm=\"ha2.qantom.int\",cnonce=\"105b7886\", > nc=00000001,qop=auth,uri=\"sip:176.25.20.251:5060\",nonce=\"b6 > d96799ab2f2970be3cb3a84bcb96744ba54d23\",response=\"1843d7fb39 > 5f69585bc8058 > 901daf0ff\",algorithm=MD5\r\nMax-Forwards: > 70\r\nContent-Length: 0\r\nDate: Sat, 20 Mar 2010 22:33:10 GMT" > "2010-03-20T22:33:10.143832Z":18283829:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers send failed with status 12 > (numMsgs = 2000, maxMsgs = 2000)" > "2010-03-20T22:33:10.143846Z":18283830:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers send failed to queue named > 'SipXProxyCseObserver-10'" > "2010-03-20T22:33:10.143865Z":18283831:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers observerQueue 0x8330934, > observerData (nil), SIP method 'INVITE', wantsRequests 1, > wantsResponses 0, wantsIncoming 1, wantsOutGoing 0, eventName > '', SipSession (nil)" > "2010-03-20T22:33:10.144034Z":18283832:SIP:CRIT:firstsoak.qant > om.int:SipClientUdp-8:B7A6BB90:SipXProxy:"SipUserAgent::queueM > essageToInterestedObservers failed message is: INVITE > sip:[email protected] SIP/2.0\r\nFrom: AUTH Caller9 > <sip:[email protected]>;tag=AbCdEf-235489-2754-FeDcBa\r\nTo: > AUTH Callee9 <sip:[email protected]>\r\nCall-Id: > [email protected]\r\ncseq: 1 INVITE\r\nVia: > SIP/2.0/UDP > 176.25.20.25:6705;branch=z9hG4bK-2754-235489-0\r\nContact: > <sip:[email protected]:6705>\r\nAllow: INVITE, ACK, BYE, > CANCEL, OPTIONS, INFO, MESSAGE, SUBSCRIBE, NOTIFY, PRACK, > UPDATE, REFER\r\nUser-Agent: SIPp > v3.1-TLS-PCAP\r\nAccept-Language: en\r\nSupported: > 100rel,replaces\r\nAllow-Events: > talk,hold,conference\r\nMax-Forwards: 70\r\nContent-Type: > application/sdp\r\nContent-Length: 229\r\nDate: Sat, 20 Mar > 2010 22:33:10 GMT\r\n\r\nv=0\r\no=- 1234567890 1234567890 IN > IP4 176.25.20.25\r\ns=AUTH\r\nt=0 0\r\nc=IN IP4 > 176.25.20.25\r\nm=audio 16700 RTP/AVP 0 8 101\r\na=rtpmap:0 PC > MU/8000\r\na=rtpmap:8 PCMA/8000\r\na=rtpmap:101 > telephone-event/8000\r\na=fmtp:101 0-11\r\na=sendrecv" > "2010-03-20T22:33:10.193905Z":18283834:KERNEL:NOTICE:firstsoak > .qantom.int:SipRouter-11:B6DDFB90:SipXProxy:"OsMsgQShared::doS > endCore message queue 'SipXProxyCseObserver-10' is over half > full - count = 1998, max = 2000" > > 4- sipXproxy also has many instances of the following errors. > These errors appear very early and throughout the testing, > so I don't think they are related to this problem. Each of > these messages appears hundreds of thousands of time in the > proxy logs. > > a- > KERNEL:ERR:firstsoak.qantom.int:SipClientTcp:SipXProxy:"OsSock > et::write 181 (176.25.20.25:6605 :-1) send returned -1, > errno=32 'Broken pipe'" > > b- > SIP:ERR:firstsoak.qantom.int:SipClientUdp:SipXProxy:"Url::pars > eString error - expected end of url or field parameter > ';name=' at offset 62 in 'AUTH Callee5 > <sip:[email protected]>;tag=fEdCbA-call_number]-2731-AbCcDf'" > > c- > SIP:WARNING:firstsoak.qantom.int:SipSrvLookupThread:SipXProxy: > "DNS query for name 'ha2.qantom.int', type = 1 (A): returned error" > > d- > KERNEL:ERR:firstsoak.qantom.int:SipClientTcp:SipXProxy:"OsMsgQ > Shared::doSendCore message send failed for queue > 'SipTcpServer-3' - no room, ret = 9" > > e- > SIP:WARNING:firstsoak.qantom.int:SipSrvLookupThread:SipXProxy: > "DNS query for name '_sip._tls.ha2.qantom.int', type = 33 > (SRV): returned error" > > f- > HTTP:WARNING:firstsoak.qantom.int:SipClientUdp:SipXProxy:"Http > Message::getBytes content-length: 175 wrong setting to: 177" > > g- > SIP:WARNING:firstsoak.qantom.int:SipRouter:SipXProxy:"SipUserA > gent::send INVITE request matches existing transaction" > > h- > SIP:ERR:firstsoak.qantom.int:SipClientTcp:SipXProxy:"SipClient > WriteBuffer[SipClientTcp-219073]::writeMore OsSocket::write() > returned -1, errno = 32" > > i- > SIP:ERR:firstsoak.qantom.int:SipUserAgent:SipXProxy:"SipUserAg > ent::handleMessage SIP message timeout expired with no > matching transaction" > > > > > > Memory usage of sipXproxy increases and the SUT swaps > memory during soak testing. > > > ---------------------------------------------------------------------- > > ----------- > > > > Key: XX-6619 > > URL: http://track.sipfoundry.org/browse/XX-6619 > > Project: sipXecs > > Issue Type: Bug > > Components: sipXproxy > > Affects Versions: 4.1 > > Environment: sipxproxy 4.1.0-016475 2009-09-15T22:49:30 > > oem-centos5 Dell R-300 with 4GB RAM and 6GB Swap Area. > > Reporter: Prakash PM > > Assignee: Prakash PM > > Priority: Critical > > Fix For: 4.1.7 > > > > Attachments: 15_Mar-valgrind-test.zip, > > 17_Mar-valgrind-Report.zip, 19_Mar-valgrind.zip, > > Apr-5_72hour_Soak-Test_memMon_Reports.zip, > > Distributed_Soak_Test_Snapshot_without_logs.tar.gz, > > Mar19_DistribGraphs.zip, Mar19_MasterGraphs.zip, > > > Mar_19_Distributed_Snapshot_after_valgrind_test_at_NOTICE_level_loggin > > g.tar.gz, > > > Mar_19_Master_Snapshot_after_valgrind_test_at_NOTICE_level_logging.tar > > .gz, Master_Soak_Test_Snapshot_without_logs.tar.gz, > > > memMon_reports_of_72_Hour_HA_Comm_Soak_Test_at_NOTICE_level_logging.zi > > p, valgrind-report.txt, valgrind-report_DISTRIBUTED.txt, > > valgrind-report_MASTER.txt > > > > > > During 72 hour Comm Soak/Media Soak Test, memory usage of > sipXproxy was uniform at 20% till 32nd hour. After 32 hours > ((sunday, 4AM)) it started increasing and at the end of the > test it was at 57%. The SUT started swapping memory in the > 50th hour and it was 30% at the end of the test. > > The test completed with 99.9% results. But the SUBSCRIBE > test was failing with '408 Request Timeout' from 48th hour, > recovered in the 57th hour. > > -- > This message is automatically generated by JIRA. > - > If you think it was sent incorrectly contact one of the > administrators: http://track.sipfoundry.org/secure/Administrators.jspa > - > For more information on JIRA, see: > http://www.atlassian.com/software/jira > > > _______________________________________________ > sipxecs-commit mailing list [email protected] > List Archive: http://list.sipfoundry.org/archive/sipxecs-commit > Unsubscribe: > http://list.sipfoundry.org/mailman/listinfo/sipxecs-commit > sipXecs IP PBX -- http://www.sipfoundry.org/ > _______________________________________________ sipx-dev mailing list [email protected] List Archive: http://list.sipfoundry.org/archive/sipx-dev Unsubscribe: http://list.sipfoundry.org/mailman/listinfo/sipx-dev sipXecs IP PBX -- http://www.sipfoundry.org/
