Hi Andrew,

The second blocked process (doing the TLS/TCP stuff) surprisingly got stuck while waiting for a TCP fd from the TCP Main process.

You mentioned that the logs of the UDP worker (doing the TCP send) suddenly stopped. Around that time, do you see any errors from that process or from the TCP Main process?
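
As background on the "waiting for a TCP fd" state described above, here is a minimal sketch of how one process can hand an open file descriptor to another over a Unix socket, and how the receiver can bound its wait instead of blocking forever. This is an illustration only, not OpenSIPS code: the names request_fd and demo, the 0.2 s timeout, and the use of a pipe as a stand-in for a TCP connection are all assumptions made for the example. It needs Python 3.9+ on a Unix-like system.

```python
import os
import socket


def request_fd(chan, timeout):
    """Worker side (hypothetical helper): wait for one fd on `chan`.

    Returns the received fd, or None if nothing arrived within `timeout`
    seconds. Without the timeout this call would block indefinitely,
    which is the kind of stall discussed in this thread.
    """
    chan.settimeout(timeout)
    try:
        _data, fds, _flags, _addr = socket.recv_fds(chan, 64, 1)
    except socket.timeout:
        return None
    return fds[0] if fds else None


def demo():
    # Unix socket pair standing in for the "main" <-> worker channel.
    main_end, worker_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # A pipe stands in for the TCP connection whose fd gets passed.
    read_end, write_end = os.pipe()

    # Nothing has been sent yet, so the worker's wait times out.
    first = request_fd(worker_end, 0.2)

    # The "main" side now passes the fd; the worker receives its own copy.
    socket.send_fds(main_end, [b"fd"], [write_end])
    received = request_fd(worker_end, 0.2)

    # Prove the received fd refers to the same pipe.
    os.write(received, b"ping")
    data = os.read(read_end, 4)

    for fd in (read_end, write_end, received):
        os.close(fd)
    main_end.close()
    worker_end.close()
    return first, data


if __name__ == "__main__":
    print(demo())
```

The first call returns None because the sender has not answered yet; a receiver without such a timeout would simply sit in the receive call, which is what a backtrace of a stuck process would show.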

Regards,

Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
  https://www.opensips-solutions.com
OpenSIPS eBootcamp 2021
  https://opensips.org/training/OpenSIPS_eBootcamp_2021/

On 10/8/21 2:43 PM, Andrew Yager wrote:
Hi Bogdan-Andrei,

We have restarted since the last bt, but I have reproduced the problem
and attached a new one. Earlier today we also got another bt full on the
second blocked pid, but I didn't save it. In that case it was a UDP reply
from one of our upstream servers that had gone through mid_registrar
and was being relayed to a TCP endpoint. The TCP endpoint did have an
open file descriptor we could see, and it had sent and was blocked on
receive at the same point (I'm getting better at reading backtraces!
:D).

One thing I do note is that every example I have is a UDP message
received from an upstream server being relayed to a client on a
TCP/TLS connection via a UDP worker.

While we are using WolfSSL on this box, the other box where we see
the same behaviour (but I haven't taken backtraces yet) is running
OpenSSL on OpenSIPS 3.1.3, so it's not SSL library specific.

I'm going to see if I can get a backtrace from the 3.1.3 box shortly.

Andrew

On Fri, 8 Oct 2021 at 17:13, Bogdan-Andrei Iancu <[email protected]> wrote:
Hi Andrew,

OK, interesting progress here. So, the FIFO process blocks as it is
trying to send an IPC job to a UDP process which also appears to be
blocked.

Could you attach with GDB to that blocked UDP process too? (You have
its PID in the output of pt[x] in the first gdb session.)

Regards,

Bogdan-Andrei Iancu


On 10/8/21 1:43 AM, Andrew Yager wrote:
Interestingly, where I usually see a steady stream of messages from
a process in the debug log, they appear to stop for this PID at
3:47am, and that process seems blocked on a tcp/tls send:



