Liviu,

Thank you for the response and explanation, and for investigating the timing 
discrepancies. I must say, though, that this answer is very disappointing. We are 
not transferring large files or data sets during real-time call processing (I 
hope no one is!), so the transfer time is not really a concern; it is the 
connect time that can cause severe backlogs when the far end is not available, 
as in my test case. So, as it stands, the async functionality for rest_client is 
mostly useless to us, as it doesn't prevent the most common cause of congestion.

I can appreciate that libcurl does not provide access during the connect, but my 
question would be: why isn't the entire operation run asynchronously? Why must 
the connect be performed in the current process and only the transfer be done in 
another? My expectation when using any async function was that all execution of 
the underlying function would be performed in a separate process, not that the 
function could pick and choose which parts would block and which would not.

Are other async function implementations similar to this? For example, does an 
async_db_query establish the DB connection in the current thread in blocking 
mode and only perform the actual query as async?

I would also recommend updating the documentation for both async and 
rest_client specifically to make this limitation clear, as it was not at all 
clear to us.

Ben Newlin

From: Users <[email protected]> on behalf of Liviu Chircu 
<[email protected]>
Reply-To: OpenSIPS users mailing list <[email protected]>
Date: Tuesday, June 25, 2019 at 8:52 AM
To: "[email protected]" <[email protected]>
Subject: Re: [OpenSIPS-Users] Rest Client Async operation


Hi Ben,

Regarding the "blocking rest_get()" problem: due to the specifics of the 
libcurl API, there is no direct access to the socket file descriptors during 
the initial TCP connect handshake.  Thus, this phase will be synchronous 
(blocking), while all transfers (send HTTP request, receive reply) are done 
asynchronously, regardless of how the data gets sent or arrives (all at once, 
in chunks, byte-by-byte, etc.).
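
To make the fd-supervision point concrete, here is a minimal, generic sketch 
(plain Python sockets — not libcurl or OpenSIPS code) of what an event loop can 
do when it *does* have the socket descriptor during the handshake: start a 
non-blocking connect, then wait for writability with a timeout instead of 
blocking the worker. The loopback "server" socket is purely a stand-in for 
illustration.

```python
import errno
import selectors
import socket

# Generic sketch (NOT libcurl/OpenSIPS code): supervising a TCP connect
# from an event loop, which is only possible when the socket fd is
# exposed during the handshake phase.

srv = socket.socket()                       # stand-in for a reachable HTTP server
srv.bind(("127.0.0.1", 0))
srv.listen(1)

c = socket.socket()
c.setblocking(False)
rc = c.connect_ex(srv.getsockname())        # returns immediately, no blocking
assert rc in (0, errno.EINPROGRESS)

sel = selectors.DefaultSelector()
sel.register(c, selectors.EVENT_WRITE)      # writable == handshake finished
sel.select(timeout=5)                       # a worker could do other jobs here
err = c.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
print("connect result:", "OK" if err == 0 else errno.errorcode.get(err, err))
# → connect result: OK
```

Because the fd is available, the loop can service other work (or other 
transactions) while the handshake is pending — exactly the phase that stays 
synchronous when the API hides the descriptor.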

So, indeed: in the specific test case where the remote TCP endpoint is either 
unresponsive or blocked by a firewall, the throughput of your OpenSIPS may be 
significantly affected when talking HTTP to such peers, even if you perform 
async HTTP requests.  Maybe this is just my opinion, but I believe the average 
case is that the HTTP server/API is down and the TCP connect attempt gets 
rejected immediately.  See [1] for more on this topic.
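
That "average case" distinction can be sketched in a few lines of plain Python 
(purely illustrative, not OpenSIPS code): when the peer host is reachable but 
nothing listens on the port, the kernel receives a TCP RST and connect() fails 
almost instantly, so the worker barely stalls; it is only dropped/filtered SYNs 
that hang for the full connect timeout.

```python
import errno
import socket
import time

# Illustrative only: a reachable host with a closed port answers the SYN
# with a RST, so connect() fails immediately (ECONNREFUSED) rather than
# stalling for the full connect timeout.

srv = socket.socket()
srv.bind(("127.0.0.1", 0))                   # grab a free ephemeral port...
addr = srv.getsockname()
srv.close()                                  # ...then close it: port is now closed

t0 = time.monotonic()
rc = socket.socket().connect_ex(addr)        # blocking connect attempt
elapsed = time.monotonic() - t0
print(errno.errorcode.get(rc, rc), f"{elapsed:.1f}s")
# typically: ECONNREFUSED 0.0s
```

The firewall/unresponsive case from the test (SYNs silently dropped) is the one 
that pins a worker until the configured timeout expires.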

Regarding the 5s vs. 8s timeouts, I will have to confirm the behavior first and 
get back to you.
Async is heavily reliant on tm, so I wouldn't exclude the possibility that the 
transaction timeout
overrides, in some cases, the async logic's timeout.

Best regards,

[1]: 
https://serverfault.com/questions/521359/why-do-some-connections-time-out-and-others-get-refused

Liviu Chircu

OpenSIPS Developer

http://www.opensips-solutions.com
On 25.06.2019 00:33, Ben Newlin wrote:
Hello all,

During testing I have found behavior that I cannot explain based on our current 
understanding of the operation of the async and rest_client functionality. I 
have reproduced this behavior with a fairly simple script configuration, the 
relevant piece of which is below:

children=8

loadmodule "rest_client.so"
modparam("rest_client", "connection_timeout", 5)
modparam("rest_client", "connect_poll_interval", 1)
modparam("rest_client", "curl_timeout", 5)

route {
  xlog("L_ALERT", "before t_newtran()\r\n");
  t_newtran();
  xlog("L_ALERT", "after t_newtran()\r\n");
  async(rest_post("http://0.0.0.0", "{}", "application/json", 
"$var(body)", "$var(ctype)", "$var(rcode)"), test_resume);
  xlog("L_ALERT", "after async\r\n");
}
route[test_resume] {
  xlog("L_ALERT", "test_resume: $retcode\r\n");
  t_reply("503", "Service Unavailable");
}

The expected operation is to immediately trigger t_newtran() which sends a 100 
response, then timeout on the rest_post call in 5 seconds and respond with a 
503. For individual calls this does occur, although the timeout consistently 
takes several seconds too long, usually ~8s, as can be seen in the logs from 
the above code:

Jun 24 19:10:56 [337] before t_newtran()
Jun 24 19:10:56 [337] after t_newtran()
Jun 24 19:11:04 [337] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 19:11:04 [337] test_resume: -2

The extra time is troubling, but the bigger concern is that when I run multiple 
concurrent calls, I find that the 8 UDP children are still all blocking against 
the rest_post call. Here are the logs for that test:

Jun 24 21:19:51 [337] before t_newtran()
Jun 24 21:19:51 [337] after t_newtran()
Jun 24 21:19:51 [338] before t_newtran()
Jun 24 21:19:51 [338] after t_newtran()
Jun 24 21:19:51 [335] before t_newtran()
Jun 24 21:19:51 [335] after t_newtran()
Jun 24 21:19:51 [333] before t_newtran()
Jun 24 21:19:51 [333] after t_newtran()
Jun 24 21:19:52 [331] before t_newtran()
Jun 24 21:19:52 [331] after t_newtran()
Jun 24 21:19:52 [336] before t_newtran()
Jun 24 21:19:52 [336] after t_newtran()
Jun 24 21:19:52 [332] before t_newtran()
Jun 24 21:19:52 [332] after t_newtran()
Jun 24 21:19:52 [334] before t_newtran()
Jun 24 21:19:52 [334] after t_newtran()
Jun 24 21:19:59 [337] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [337] test_resume: -2
Jun 24 21:19:59 [337] before t_newtran()
Jun 24 21:19:59 [337] after t_newtran()
Jun 24 21:19:59 [338] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [338] test_resume: -2
Jun 24 21:19:59 [338] before t_newtran()
Jun 24 21:19:59 [338] after t_newtran()
Jun 24 21:19:59 [335] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [335] test_resume: -2
Jun 24 21:19:59 [335] before t_newtran()
Jun 24 21:19:59 [335] after t_newtran()
Jun 24 21:19:59 [333] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [333] test_resume: -2
Jun 24 21:19:59 [333] before t_newtran()
Jun 24 21:19:59 [333] after t_newtran()
Jun 24 21:19:59 [331] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [331] test_resume: -2
Jun 24 21:19:59 [331] before t_newtran()
Jun 24 21:19:59 [331] after t_newtran()
Jun 24 21:19:59 [336] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [336] test_resume: -2
Jun 24 21:19:59 [336] before t_newtran()
Jun 24 21:19:59 [336] after t_newtran()
Jun 24 21:19:59 [332] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:19:59 [332] test_resume: -2
Jun 24 21:19:59 [332] before t_newtran()
Jun 24 21:19:59 [332] after t_newtran()
Jun 24 21:20:00 [334] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:20:00 [334] test_resume: -2
Jun 24 21:20:00 [334] before t_newtran()
Jun 24 21:20:00 [334] after t_newtran()
Jun 24 21:20:07 [337] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [337] test_resume: -2
Jun 24 21:20:07 [338] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [338] test_resume: -2
Jun 24 21:20:07 [335] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [335] test_resume: -2
Jun 24 21:20:07 [333] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [333] test_resume: -2
Jun 24 21:20:07 [331] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [331] test_resume: -2
Jun 24 21:20:07 [336] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [336] test_resume: -2
Jun 24 21:20:07 [332] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [332] test_resume: -2
Jun 24 21:20:07 [334] ERROR:rest_client:start_async_http_req: connect timeout on http://0.0.0.0 (5s)
Jun 24 21:20:07 [334] test_resume: -2

You can clearly see that the first 8 INVITEs are processed immediately, but 
then no new messages are processed until the REST queries begin to time out. As 
each query times out, freeing a child process, a new inbound request is 
processed. Clearly the message processing is being blocked on the REST queries, 
which is not supposed to happen when using async. Am I missing something in my 
configuration, or am I misunderstanding how async is supposed to work?

Ben Newlin



_______________________________________________

Users mailing list

[email protected]

http://lists.opensips.org/cgi-bin/mailman/listinfo/users
