abalashov created an issue (kamailio/kamailio#4603)
The documentation for `evapi_relay()` says:
> The function is passing the task to evapi dispatcher process, therefore the
> SIP worker process is not blocked
However, I am not sure this is actually true. The client sockets on the EVAPI
dispatcher are never set to nonblocking, and I have encountered instances in
production where an EVAPI consumer that does not read fast enough causes a
total SIP worker stall. While it is true that the TCP sending is done by the
EVAPI dispatcher process, a stalled dispatcher eventually exerts backpressure
on the `socketpair` pipe from the SIP worker process and, in due time, seems
to stall the SIP worker as well.
As far as I can tell, the main problem, as I mentioned, is that client sockets
connected to the dispatcher aren't set `O_NONBLOCK` -- only the server
listening socket is. Accordingly, `write()`s to client sockets in
`evapi_dispatch_notify()` are, in fact, blocking. Is that right?
The secondary problem is that the notify `socketpair` between the worker and
the dispatcher process is also blocking, and so of course would have a finite
send buffer (what is it, 200-something kB on Linux?). Once that buffer is full,
if the dispatcher is stalled waiting on a blocking `write()` to a client, then
the worker's `write()` to the dispatcher end of the pipe (i.e.
`evapi_notify_sockets[1]`) will block, too.
If my understanding is correct, then the claim made above, that "the SIP worker
process is not blocked", represents a happy path through the code where the
dispatcher is healthy and can consume the pipe from a SIP worker fast enough.
Given a slow client, that may not be true. If so, I guess the suggested fix is
to make both the dispatcher pipe and the client socket writes nonblocking,
though I am not a competent judge of how to harness that into `libev` and so
forth. Client writes would need the usual `EAGAIN` handling and per-client
output queues, or some policy of dropping overflowing messages, and I'm not
sure how to best design that.
This issue is admittedly somewhat complex to reproduce, and I don't have a
backtrace or other artifacts handy. It does not seem confined to a historical
version of Kamailio; I analysed it against 6.1.