Hi everybody,

regarding our TCP/TLS stability problems, we have now decided to run a test with kamailio 1.5.1. Nevertheless, it would be interesting to know whether there is a chance to get rid of these problems.
Is anybody using TLS?

Used modules: SNMP, MySQL

Summary of problems

Errors may be related to the following log file entries:

Jun 17 09:01:27 si-.... /usr/local/sbin/openser[13921]: WARNING:core:send2child: no free tcp receiver, connection passed to the leastbusy one (6)
Jun 17 08:54:52 si-.... /usr/local/sbin/openser[13921]: ERROR:core:tcpconn_new: shared memory allocation failure
Jun 17 08:54:52 si-... /usr/local/sbin/openser[13921]: ERROR:core:handle_new_connect: tcpconn_new failed, closing socket

And a few of these also (7613 times):

Jun 17 08:57:24 si-... /usr/local/sbin/openser[13880]: ERROR:core:tls_accept: some error in SSL:
Jun 17 08:57:24 si-... /usr/local/sbin/openser[13880]: ERROR:core:tls_print_errstack: error:1409C041:SSL routines:SSL3_SETUP_BUFFERS:malloc failure

Shared memory consumption

Shared memory usage is continuously increasing (shared memory is currently set to 1024).
PKG_MEM is 1 MB.

High CPU load for some openser processes

Normally, after some days we get a high CPU load (50-90%) for a small number of the openser processes.
It looks like an endless loop and requires a restart of openser.

There may be an endless loop in pass_fd.c:

    again:
        ret=sendmsg(unix_socket, &msg, 0);
        if (ret<0){
            if (errno==EINTR) goto again;
            LM_CRIT("sendmsg failed on %d: %s\n", unix_socket, strerror(errno));
        }

Any comments on that?

Best regards

Albert Munder

Robert Bosch GmbH
IT Systems Engineering (CI/ISE)
Postfach 30 02 20
70442 Stuttgart
GERMANY
www.bosch.com

Tel. +49 711 811-40562
Fax +49 711 811-5113333
albert.mun...@de.bosch.com

Robert Bosch GmbH, registered office: Stuttgart, registration court: Amtsgericht Stuttgart HRB 14000
Chairman of the Supervisory Board: Hermann Scholl; Board of Management: Franz Fehrenbach, Siegfried Dais; Bernd Bohr, Wolfgang Chur, Rudolf Colm, Gerhard Kümmel, Wolfgang Malchow, Peter Marks; Volkmar Denner, Peter Tyroller.

________________________________

From: Henning Westerholt [mailto:henning.westerh...@1und1.de]
Sent: Tuesday, 30.
June 2009 17:25
To: users@lists.kamailio.org
Cc: Munder Albert (CI/ISE)
Subject: Re: [Kamailio-Users] OpenSER stability problems in pilot project

On Tuesday, 30 June 2009, Munder Albert (CI/ISE) wrote:
> [..]
> We are running OpenSER in a pilot project and
> unfortunately have some stability problems.

Hello Albert,

> * Appr. 5000 subscriber accounts
> * Appr. 1200 simultaneously registered users
> * Signalling encrypted with TLS
> * Media data encrypted with SRTP
> * Clients: softphones and hardphones
> * Re-registration time for clients: 3600 sec

I don't have that much experience with TCP, but I don't think these numbers should be a problem in a setup like this.

> OpenSER configuration
> · Works as stateful SIP Proxy
> 1 mySQL database
> 2 Version 1.3.4.-TLS
> 3 Tcp_children: 100 --> is it recommended to increase this number?

That is quite a lot of children, but OK.

> 4 Udp_children: 20
> 5 Tcp_connection_timeout: 3600
> 6 Shared memory:
> · -m 512 when error occurred
> 1 Now set to 1024

How much PKG_MEM do you use? The default value?

> Problems
> * Shared memory consumption
> Shared memory usage is permanently increasing (about 50 MB per day)
> Application already crashed twice

This could be a memory leak. Which modules do you use? And do you use any proprietary modules? You could use the memory debugging to investigate this further:
http://www.kamailio.org/dokuwiki/doku.php/troubleshooting:memory

> First messages were, these, repeated thousands of times (5915 times):
> Jun 17 08:54:52 si-.... /usr/local/sbin/openser[13921]:
> ERROR:core:tcpconn_new: shared memory allocation failure Jun 17 08:54:52
> si-... /usr/local/sbin/openser[13921]: ERROR:core:handle_new_connect:
> tcpconn_new failed, closing socket And a few of these also (7613 times):
> Jun 17 08:57:24 si-... /usr/local/sbin/openser[13880]:
> ERROR:core:tls_accept: some error in SSL: Jun 17 08:57:24 si-...
> /usr/local/sbin/openser[13880]: ERROR:core:tls_print_errstack:
> error:1409C041:SSL routines:SSL3_SETUP_BUFFERS:malloc failure

These are caused by insufficient memory. I can't comment on the TCP and TLS errors. But before really starting to investigate this problem, would it be possible for you to use a more recent version, e.g. kamailio 1.5.1, for testing?

> * TCP errors, lost SIP messages
>
> Examples from error messages:
> 14.100 times in log file from 17.06.09
> Jun 17 04:03:15 si-... /usr/local/sbin/openser[13863]:
> ERROR:core:tcp_blocking_connect: poll error: flags 18 Jun 17 04:03:15
> si-... /usr/local/sbin/openser[13863]: ERROR:core:tcp_blocking_connect:
> failed to retrieve SO_ERROR (111) Connection refused Jun 17 04:03:15 si-...
> /usr/local/sbin/openser[13863]: ERROR:core:tcpconn_connect:
> tcp_blocking_connect failed Jun 17 04:03:15 si-...
> /usr/local/sbin/openser[13863]: ERROR:core:tcp_send: connect failed Jun 17
> 04:03:15 si-.. /usr/local/sbin/openser[13863]: ERROR:tm:msg_send: tcp_send
> failed Jun 17 04:03:15 si-... /usr/local/sbin/openser[13863]:
> ERROR:tm:t_forward_nonack: sending request failed
>
> Appears at least 20 000 times; and in the day of the last shared memory
> errors, it was 225.794 times in the log file (note that the number in
> parenthesis is usually 1 or 2, but on that day it has reached 6): Jun 17
> 09:01:27 si-.... /usr/local/sbin/openser[13921]: WARNING:core:send2child:
> no free tcp receiver, connection passed to the leastbusy one (6) Jun 17
> 09:01:27 si-... /usr/local/sbin/openser[13921]: WARNING:core:send2child: no
> free tcp receiver, connection passed to the leastbusy one (5)
>
> * Certificate validation problems
> TCP traffic is currently significantly increased by some ( appr. 70)
> clients which failed to validate the TLS certificate. Registration is
> repeated every 5 sec.
>
> Circa 30 thousand per day (on that day, it was 37.162 times in log)
> Jun 17 04:03:10 si-024lc008 /usr/local/sbin/openser[13801]:
> ERROR:core:tls_accept: some error in SSL: Jun 17 04:03:10 si-024lc008
> /usr/local/sbin/openser[13801]: ERROR:core:tls_print_errstack:
> error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca

Best regards,

Henning
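
A note on the `again:` loop in pass_fd.c quoted at the top of this message: as written, it repeats only while sendmsg() fails with EINTR, so on its own it would spin only if signals kept interrupting the call. Below is a minimal sketch of a bounded variant. The helper name `sendmsg_retry` and the cap `MAX_EINTR_RETRIES` are hypothetical and not taken from the OpenSER source; this is an illustration, not a patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical cap on EINTR retries; not taken from the OpenSER source. */
#define MAX_EINTR_RETRIES 16

/*
 * Hypothetical helper: call sendmsg() and retry only when the call was
 * interrupted by a signal (EINTR), at most MAX_EINTR_RETRIES times.
 * Returns the sendmsg() result; on failure errno is left as sendmsg() set it.
 */
ssize_t sendmsg_retry(int fd, const struct msghdr *msg, int flags)
{
    int retries = 0;
    ssize_t ret;

    assert(msg != NULL);

    for (;;) {
        ret = sendmsg(fd, msg, flags);
        if (ret >= 0)
            return ret;     /* message sent */
        if (errno != EINTR)
            return ret;     /* real error: the caller logs it */
        if (++retries > MAX_EINTR_RETRIES)
            return ret;     /* still EINTR: give up instead of spinning */
    }
}
```

A caller would log with LM_CRIT only when the helper returns a negative value, as the quoted snippet does. Attaching strace or a debugger to one of the busy processes would show whether it is really sitting in this sendmsg() call.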
_______________________________________________
Kamailio (OpenSER) - Users mailing list
Users@lists.kamailio.org
http://lists.kamailio.org/cgi-bin/mailman/listinfo/users
http://lists.openser-project.org/cgi-bin/mailman/listinfo/users