Re: [OpenSIPS-Devel] Improved socket error messages
Hi, On Jan 27, 2013, at 8:17 PM, Michael Schloh von Bennewitz wrote: Hello developers, While trying to debug a difficult to find problem (where ntop, tcpdump, and snoop were no help) I finally figured out where OpenSIPS was trying to send misdirected messages by hacking the sources. It might be useful to always accompany log text with network stack information. So instead of: Jan 02 18:18:02 hostname error opensips[1234]: ERROR:core:tcp_blocking_connect: failed to retrieve SO_ERROR (146) Connection refused ...we would see: Jan 02 18:18:02 hostname error opensips[1234]: ERROR:core:tcp_blocking_connect: addr (1.2.3.4) port (98765), failed to retrieve SO_ERROR (146) Connection refused ...which of course is much more useful. To integrate the improvement above, I hacked tcp_main.c like this: $ sed -e 's;\(LM_ERR(\)\(failed to retrieve SO_ERROR (%d) %s\\n,\);\1addr (%s) port (%d), \2 inet_ntoa(((struct sockaddr_in *)servaddr)-sin_addr), ntohs(((struct sockaddr_in *)servaddr)-sin_port),;' tcp_main.c ...or if you prefer here's a unified diff that does the same thing: diff -Nau tcp_main.c.orig tcp_main.c --- tcp_main.c.orig 2012-10-17 00:09:05.0 +0200 +++ tcp_main.c2013-01-27 16:44:47.803235857 +0100 @@ -340,7 +340,7 @@ getsockopt(fd, SOL_SOCKET, SO_ERROR, err, err_len); if ((err==0) (poll_err==0)) goto end; if (err!=EINPROGRESS err!=EALREADY){ - LM_ERR(failed to retrieve SO_ERROR (%d) %s\n, err, + LM_ERR(addr (%s) port (%d), failed to retrieve SO_ERROR (%d) %s\n, inet_ntoa(((struct sockaddr_in *)servaddr)-sin_addr), ntohs(((struct sockaddr_in *)servaddr)-sin_port), err, strerror(err)); goto error; } This hack applies cleanly to OpenSIPS 1.7.2 and 1.8.2 although I've tested only on 1.7.2 (because of the database access bugs in 1.8.*.) Unfortunately there are other socket related operations which write unhelpful log text, so this is not a complete solution. Improved logging messages are of course great :-) I have a few unfinished patches in this regard as well, so if you send over partial improvements I'll incorporate them. However, the patch you sent is not correct because it doesn't deal with IPv6 addresses, you need to check the sa_family field of the sockaddr struct before casting it to sockaddr_in or sockaddr_in6. Regards, -- Saúl Ibarra Corretgé AG Projects ___ Devel mailing list Devel@lists.opensips.org http://lists.opensips.org/cgi-bin/mailman/listinfo/devel
Re: [OpenSIPS-Devel] Improved socket error messages
On Jan 28, 2013, at 12:10 PM, Bogdan-Andrei Iancu wrote: Hi, Saul, should I fix and upload Michael's patch (to be part of 1.9) or you want to provide a larger patch (to covert this too, still for 1.9) ? You may fix and upload that one, since I won't have time to do it this week. It would be nice to have a generic function/macro which would print the given error string + IP (port) + error description with the given error severity , for example :-) Then we can use it in more places later on. Regards, -- Saúl Ibarra Corretgé AG Projects ___ Devel mailing list Devel@lists.opensips.org http://lists.opensips.org/cgi-bin/mailman/listinfo/devel
Re: [OpenSIPS-Devel] Improved socket error messages
Hi Michael, See an improved version of your patch here on commit #9666 : http://opensips.svn.sourceforge.net/opensips/?rev=9666view=rev If you consider adding that extra info in other places, please provide a patch. Thanks and regards, Bogdan-Andrei Iancu OpenSIPS Founder and Developer http://www.opensips-solutions.com On 01/27/2013 09:17 PM, Michael Schloh von Bennewitz wrote: Hello developers, While trying to debug a difficult to find problem (where ntop, tcpdump, and snoop were no help) I finally figured out where OpenSIPS was trying to send misdirected messages by hacking the sources. It might be useful to always accompany log text with network stack information. So instead of: Jan 02 18:18:02 hostnameerror opensips[1234]: ERROR:core:tcp_blocking_connect: failed to retrieve SO_ERROR (146) Connection refused ...we would see: Jan 02 18:18:02 hostnameerror opensips[1234]: ERROR:core:tcp_blocking_connect: addr (1.2.3.4) port (98765), failed to retrieve SO_ERROR (146) Connection refused ...which of course is much more useful. To integrate the improvement above, I hacked tcp_main.c like this: $ sed -e 's;\(LM_ERR(\)\(failed to retrieve SO_ERROR (%d) %s\\n,\);\1addr (%s) port (%d), \2 inet_ntoa(((struct sockaddr_in *)servaddr)-sin_addr), ntohs(((struct sockaddr_in *)servaddr)-sin_port),;' tcp_main.c ...or if you prefer here's a unified diff that does the same thing: diff -Nau tcp_main.c.orig tcp_main.c --- tcp_main.c.orig 2012-10-17 00:09:05.0 +0200 +++ tcp_main.c 2013-01-27 16:44:47.803235857 +0100 @@ -340,7 +340,7 @@ getsockopt(fd, SOL_SOCKET, SO_ERROR,err,err_len); if ((err==0) (poll_err==0)) goto end; if (err!=EINPROGRESS err!=EALREADY){ - LM_ERR(failed to retrieve SO_ERROR (%d) %s\n, err, + LM_ERR(addr (%s) port (%d), failed to retrieve SO_ERROR (%d) %s\n, inet_ntoa(((struct sockaddr_in *)servaddr)-sin_addr), ntohs(((struct sockaddr_in *)servaddr)-sin_port), err, strerror(err)); goto error; } This hack applies cleanly to OpenSIPS 1.7.2 and 1.8.2 although I've tested only on 1.7.2 (because of the database access bugs in 1.8.*.) Unfortunately there are other socket related operations which write unhelpful log text, so this is not a complete solution. Regards, Michael ___ Devel mailing list Devel@lists.opensips.org http://lists.opensips.org/cgi-bin/mailman/listinfo/devel ___ Devel mailing list Devel@lists.opensips.org http://lists.opensips.org/cgi-bin/mailman/listinfo/devel
Re: [OpenSIPS-Devel] Improved socket error messages
Hello Saul and Bogdan, On Mon., Jan. 28, 2013, Bogdan-Andrei Iancu wrote: On Jan 28, 2013, Michael Schloh von Bennewitz wrote: On Mon., Jan. 28, 2013, Saúl Ibarra Corretgé wrote: On Jan 27, 2013, at 8:17 PM, Michael Schloh von Bennewitz wrote: Jan 02 18:18:02 hostname error opensips[1234]: ERROR:core:tcp_blocking_connect: addr (1.2.3.4) port (98765), failed to retrieve SO_ERROR (146) Connection refused ...which of course is much more useful. [...] - LM_ERR(failed to retrieve SO_ERROR (%d) %s\n, err, + LM_ERR(addr (%s) port (%d), failed to retrieve SO_ERROR (%d) %s\n, inet_ntoa(((struct sockaddr_in *)servaddr)-sin_addr), ntohs(((struct sockaddr_in *)servaddr)-sin_port), err, strerror(err)); goto error; } Improved logging messages are of course great :-) I have a few unfinished patches in this regard as well, so if you send over partial improvements I'll incorporate them. However, the patch you sent is not correct because it doesn't deal with IPv6 addresses, you need to check the sa_family field of the sockaddr struct before casting it to sockaddr_in or sockaddr_in6. Wow, that's quite embarassing considering that I'm running IPv6 myself. Here are two equivalent corrected versions of the hack. You probably want the second which is easier to read. diff -Nau tcp_main.c.orig tcp_main.c --- tcp_main.c.orig 2013-01-27 19:35:31.898051715 +0100 +++ tcp_main.c2013-01-28 18:30:27.516978242 +0100 @@ -342,7 +342,8 @@ getsockopt(fd, SOL_SOCKET, SO_ERROR, err, err_len); if ((err==0) (poll_err==0)) goto end; if (err!=EINPROGRESS err!=EALREADY){ - LM_ERR(failed to retrieve SO_ERROR (%d) %s\n, err, + char erraddrstr[INET6_ADDRSTRLEN]; + LM_ERR(addr (%s) port (%d), failed to retrieve SO_ERROR (%d) %s\n, servaddr-sa_family == AF_INET ? inet_ntoa(((struct sockaddr_in *)servaddr)-sin_addr) : inet_ntop(AF_INET6, ((struct sockaddr_in6 *)servaddr)-sin6_addr, erraddrstr, sizeof(erraddrstr)), servaddr-sa_family == AF_INET ? ntohs(((struct sockaddr_in *)servaddr)-sin_port) : ntohs(((struct sockaddr_in6 *)servaddr)-sin6_port), err, strerror(err)); goto error; } diff -Nau tcp_main.c.orig tcp_main.c --- tcp_main.c.orig 2013-01-27 19:35:31.898051715 +0100 +++ tcp_main.c2013-01-28 18:35:23.761388145 +0100 @@ -342,8 +342,11 @@ getsockopt(fd, SOL_SOCKET, SO_ERROR, err, err_len); if ((err==0) (poll_err==0)) goto end; if (err!=EINPROGRESS err!=EALREADY){ - LM_ERR(failed to retrieve SO_ERROR (%d) %s\n, err, - strerror(err)); + char erraddrstr[INET6_ADDRSTRLEN]; + if (servaddr-sa_family == AF_INET) + LM_ERR(addr (%s) port (%d), failed to retrieve SO_ERROR (%d) %s\n, inet_ntoa(((struct sockaddr_in *)servaddr)-sin_addr), ntohs(((struct sockaddr_in *)servaddr)-sin_port), err, strerror(err)); See an improved version of your patch here on commit #9666 : http://opensips.svn.sourceforge.net/opensips/?rev=9666view=rev If you consider adding that extra info in other places, please provide a patch. Your correction is better than my correction so thanks alot for providing it and making the commit. I'll pull out my POSIX version of the correction [1] once 1.9.0 is out and my specs are updated. I'll keep an eye out for other places as well (surely there are many.) [1] http://scm.europalab.com/opkg/rev/a2c6460cfb16 Regards, Michael ___ Devel mailing list Devel@lists.opensips.org http://lists.opensips.org/cgi-bin/mailman/listinfo/devel
[OpenSIPS-Devel] Improved socket error messages
Hello developers, While trying to debug a difficult to find problem (where ntop, tcpdump, and snoop were no help) I finally figured out where OpenSIPS was trying to send misdirected messages by hacking the sources. It might be useful to always accompany log text with network stack information. So instead of: Jan 02 18:18:02 hostname error opensips[1234]: ERROR:core:tcp_blocking_connect: failed to retrieve SO_ERROR (146) Connection refused ...we would see: Jan 02 18:18:02 hostname error opensips[1234]: ERROR:core:tcp_blocking_connect: addr (1.2.3.4) port (98765), failed to retrieve SO_ERROR (146) Connection refused ...which of course is much more useful. To integrate the improvement above, I hacked tcp_main.c like this: $ sed -e 's;\(LM_ERR(\)\(failed to retrieve SO_ERROR (%d) %s\\n,\);\1addr (%s) port (%d), \2 inet_ntoa(((struct sockaddr_in *)servaddr)-sin_addr), ntohs(((struct sockaddr_in *)servaddr)-sin_port),;' tcp_main.c ...or if you prefer here's a unified diff that does the same thing: diff -Nau tcp_main.c.orig tcp_main.c --- tcp_main.c.orig 2012-10-17 00:09:05.0 +0200 +++ tcp_main.c 2013-01-27 16:44:47.803235857 +0100 @@ -340,7 +340,7 @@ getsockopt(fd, SOL_SOCKET, SO_ERROR, err, err_len); if ((err==0) (poll_err==0)) goto end; if (err!=EINPROGRESS err!=EALREADY){ - LM_ERR(failed to retrieve SO_ERROR (%d) %s\n, err, + LM_ERR(addr (%s) port (%d), failed to retrieve SO_ERROR (%d) %s\n, inet_ntoa(((struct sockaddr_in *)servaddr)-sin_addr), ntohs(((struct sockaddr_in *)servaddr)-sin_port), err, strerror(err)); goto error; } This hack applies cleanly to OpenSIPS 1.7.2 and 1.8.2 although I've tested only on 1.7.2 (because of the database access bugs in 1.8.*.) Unfortunately there are other socket related operations which write unhelpful log text, so this is not a complete solution. Regards, Michael ___ Devel mailing list Devel@lists.opensips.org http://lists.opensips.org/cgi-bin/mailman/listinfo/devel