Hi Ben,
Regarding the "blocking rest_get()" problem: due to the specifics of the
libcurl API, there is
no direct access to the socket file descriptors during the initial TCP
connect handshake. Thus,
this phase will be synchronous (blocking), while all transfers (send
HTTP req, receive reply) are
done asynchronously, regardless of how the data gets send or arrives
(all at once, in chunks,
byte-by-byte, etc.).
So, indeed: on the specific test case that the remote TCP endpoint is
either unresponsive or it is
blocked by a firewall, the throughput of your OpenSIPS may be
significantly affected when talking HTTP
to such peers even if you perform async HTTP requests. Maybe this is
just my opinion, but I believe
that the average case is when the HTTP server/API is down and the TCP
connect attempt will get rejected.
See [1] for more on this topic.
Regarding the 5s vs. 8s timeouts, I will have to confirm the behavior
first and get back to you.
Async is heavily reliant on tm, so I wouldn't exclude the possibility
that the transaction timeout
overrides, in some cases, the async logic's timeout.
Best regards,
[1]:
https://serverfault.com/questions/521359/why-do-some-connections-time-out-and-others-get-refused
Liviu Chircu
OpenSIPS Developer
http://www.opensips-solutions.com
On 25.06.2019 00:33, Ben Newlin wrote:
Hello all,
During testing I have found behavior that I cannot explain based on
our current understanding of the operation of the async and
rest_client functionality. I have reproduced this behavior with a
fairly simple script configuration, the relevant piece of which is below:
children=8
loadmodule "rest_client.so"
modparam("rest_client", "connection_timeout", 5)
modparam("rest_client", "connect_poll_interval", 1)
modparam("rest_client", "curl_timeout", 5)
route {
xlog("L_ALERT", "before t_newtran()\r\n");
t_newtran();
xlog("L_ALERT", "after t_newtran()\r\n");
async(rest_post("http://0.0.0.0", "{}", "application/json",
"$var(body)", "$var(ctype)", "$var(rcode)"), test_resume);
xlog("L_ALERT", "after async\r\n");
}
route[test_resume] {
xlog("L_ALERT", "test_resume: $retcode\r\n");
t_reply("503", "Service Unavailable");
}
The expected operation is to immediately trigger t_newtran() which
sends a 100 response, then timeout on the rest_post call in 5 seconds
and respond with a 503. For individual calls this does occur, although
the timeout consistently takes several seconds too long, usually ~8s,
as can be seen in the logs from the above code:
Jun 24 19:10:56 [337] before t_newtran()
Jun 24 19:10:56 [337] after t_newtran()
Jun 24 19:11:04 [337] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 19:11:04 [337] test_resume: -2
The extra time is troubling, but the bigger concern is that when I run
multiple concurrent calls, I find that the 8 UDP children are still
all blocking against the rest_post call. Here are the logs for that test:
Jun 24 21:19:51 [337] before t_newtran()
Jun 24 21:19:51 [337] after t_newtran()
Jun 24 21:19:51 [338] before t_newtran()
Jun 24 21:19:51 [338] after t_newtran()
Jun 24 21:19:51 [335] before t_newtran()
Jun 24 21:19:51 [335] after t_newtran()
Jun 24 21:19:51 [333] before t_newtran()
Jun 24 21:19:51 [333] after t_newtran()
Jun 24 21:19:52 [331] before t_newtran()
Jun 24 21:19:52 [331] after t_newtran()
Jun 24 21:19:52 [336] before t_newtran()
Jun 24 21:19:52 [336] after t_newtran()
Jun 24 21:19:52 [332] before t_newtran()
Jun 24 21:19:52 [332] after t_newtran()
Jun 24 21:19:52 [334] before t_newtran()
Jun 24 21:19:52 [334] after t_newtran()
Jun 24 21:19:59 [337] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [337] test_resume: -2
Jun 24 21:19:59 [337] before t_newtran()
Jun 24 21:19:59 [337] after t_newtran()
Jun 24 21:19:59 [338] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [338] test_resume: -2
Jun 24 21:19:59 [338] before t_newtran()
Jun 24 21:19:59 [338] after t_newtran()
Jun 24 21:19:59 [335] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [335] test_resume: -2
Jun 24 21:19:59 [335] before t_newtran()
Jun 24 21:19:59 [335] after t_newtran()
Jun 24 21:19:59 [333] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [333] test_resume: -2
Jun 24 21:19:59 [333] before t_newtran()
Jun 24 21:19:59 [333] after t_newtran()
Jun 24 21:19:59 [331] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [331] test_resume: -2
Jun 24 21:19:59 [331] before t_newtran()
Jun 24 21:19:59 [331] after t_newtran()
Jun 24 21:19:59 [336] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [336] test_resume: -2
Jun 24 21:19:59 [336] before t_newtran()
Jun 24 21:19:59 [336] after t_newtran()
Jun 24 21:19:59 [332] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [332] test_resume: -2
Jun 24 21:19:59 [332] before t_newtran()
Jun 24 21:19:59 [332] after t_newtran()
Jun 24 21:20:00 [334] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:20:00 [334] test_resume: -2
Jun 24 21:20:00 [334] before t_newtran()
Jun 24 21:20:00 [334] after t_newtran()
Jun 24 21:20:07 [337] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [337] test_resume: -2
Jun 24 21:20:07 [338] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [338] test_resume: -2
Jun 24 21:20:07 [335] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [335] test_resume: -2
Jun 24 21:20:07 [333] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [333] test_resume: -2
Jun 24 21:20:07 [331] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [331] test_resume: -2
Jun 24 21:20:07 [336] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [336] test_resume: -2
Jun 24 21:20:07 [332] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [332] test_resume: -2
Jun 24 21:20:07 [334] ERROR:rest_client:start_async_http_req: connect
timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [334] test_resume: -2
You can clearly see that the first 8 Invites are processed
immediately, but then no new messages are processed until the rest
queries begin to timeout. As each query times out, freeing a child
process, a new inbound request is processed. Clearly the message
processing is being blocked on the rest queries, which is not supposed
to happen when using async. Am I missing something in my configuration
or am I not understanding how async is supposed to work somehow?
Ben Newlin
_______________________________________________
Users mailing list
[email protected]
http://lists.opensips.org/cgi-bin/mailman/listinfo/users
_______________________________________________
Users mailing list
[email protected]
http://lists.opensips.org/cgi-bin/mailman/listinfo/users