Hi Trey,

Have you seen this working on previous releases?

We expect to be able to create HTTP connections from Sprout to Homestead very quickly (i.e. in under 10ms), so our HTTP connection timeout is set to 50ms (see https://github.com/Metaswitch/cpp-common/blob/1477ab58ca02d9b342554f887af75836720efbb5/include/http_connection_pool.h#L68), and I think that's the timeout you're hitting. I suspect everything initially continues working after you add the latency because Sprout is using connections to Homestead that have already been set up, and you only run into the timeout problem when Sprout needs to set up new connections.
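To make that reasoning concrete, here is a minimal sketch of the model I have in mind. It is not Sprout code: the function names and the 1ms baseline round-trip time are hypothetical, and it assumes that setting up a new connection costs roughly one round trip and that the added delay is incurred once per round trip. Only the 50ms timeout comes from the header linked above.

```python
# Toy model of the behaviour described above (assumptions, not Sprout code).
CONNECT_TIMEOUT_MS = 50  # connection timeout cited in the email


def new_connection_succeeds(base_rtt_ms, added_delay_ms,
                            timeout_ms=CONNECT_TIMEOUT_MS):
    """Assume the TCP handshake needs about one round trip to complete."""
    handshake_ms = base_rtt_ms + added_delay_ms
    return handshake_ms < timeout_ms


def pooled_connection_usable():
    """An already-established connection needs no handshake, so the
    connection timeout does not apply to it in this model."""
    return True


if __name__ == "__main__":
    base_rtt_ms = 1  # hypothetical baseline RTT between Sprout and Homestead
    for added_delay_ms in (0, 50, 400):
        print(added_delay_ms,
              "new:", new_connection_succeeds(base_rtt_ms, added_delay_ms),
              "pooled:", pooled_connection_usable())
```

Under these assumptions, any added delay of 50ms or more is enough to make every new connection attempt time out, while requests on pooled connections keep working until those connections are replaced.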
Thanks,
Graeme

From: Clearwater [mailto:[email protected]] On Behalf Of Trey Ormsbee
Sent: 13 September 2016 16:55
To: [email protected]
Subject: [Project Clearwater] latency to homestead causes curl connection fatal errors

I am trying to simulate a higher latency connection to sprout, and I have inserted latency ranging from 50ms-1000ms on the sprout signaling adapter. I have an issue that occurs fairly quickly with homestead queries (even with as low as 50ms). Here is a snippet of the log:

12-09-2016 21:14:12.125 UTC Debug connection_pool.h:225: Request for connection to IP: xxx.xxx.xxx.51, port: 8888
12-09-2016 21:14:12.125 UTC Debug connection_pool.h:244: No existing connection in pool, create one
12-09-2016 21:14:12.126 UTC Debug http_connection_pool.cpp:56: Allocated CURL handle 0x7fd0d000ec00
12-09-2016 21:14:12.126 UTC Debug connection_pool.h:246: Created new connection 0x7fd0d0017a80
12-09-2016 21:14:12.126 UTC Debug httpclient.cpp:466: Sending HTTP request : http://homestead.example.com:8888/impi/%2B15557775555%40example.com/av?impu=sip%3A%2B15557775555%40example.com (trying xxx.xxx.xxx.51)
12-09-2016 21:14:12.177 UTC Error httpclient.cpp:487: http://homestead.example.com:8888/impi/%2B15557775555%40example.com/av?impu=sip%3A%2B15557775555%40example.com failed at server xxx.xxx.xxx.51 : Timeout was reached (28) : fatal
12-09-2016 21:14:12.177 UTC Debug baseresolver.cpp:498: Add xxx.xxx.xxx.51:8888 transport 6 to blacklist for 30 seconds, graylist for 30 seconds
12-09-2016 21:14:12.177 UTC Debug connection_pool.h:261: Release connection to IP: xxx.xxx.xxx.51, port: 8888 and destroy
12-09-2016 21:14:12.177 UTC Debug connection_pool.h:225: Request for connection to IP: xxx.xxx.xxx.52, port: 8888
12-09-2016 21:14:12.177 UTC Debug connection_pool.h:244: No existing connection in pool, create one
12-09-2016 21:14:12.178 UTC Debug http_connection_pool.cpp:56: Allocated CURL handle 0x7fd0d000ec00
12-09-2016 21:14:12.178 UTC Debug connection_pool.h:246: Created new connection 0x7fd0d0019ef0
12-09-2016 21:14:12.178 UTC Debug httpclient.cpp:466: Sending HTTP request : http://homestead.example.com:8888/impi/%2B15557775555%40example.com/av?impu=sip%3A%2B15557775555%40example.com (trying xxx.xxx.xxx.52)
12-09-2016 21:14:12.229 UTC Error httpclient.cpp:487: http://homestead.example.com:8888/impi/%2B15557775555%40example.com/av?impu=sip%3A%2B15557775555%40example.com failed at server xxx.xxx.xxx.52 : Timeout was reached (28) : fatal
12-09-2016 21:14:12.229 UTC Debug baseresolver.cpp:498: Add xxx.xxx.xxx.52:8888 transport 6 to blacklist for 30 seconds, graylist for 30 seconds
12-09-2016 21:14:12.229 UTC Debug connection_pool.h:261: Release connection to IP: xxx.xxx.xxx.52, port: 8888 and destroy
12-09-2016 21:14:12.229 UTC Debug communicationmonitor.cpp:82: Checking communication changes - successful attempts 0, failures 1
12-09-2016 21:14:12.229 UTC Error httpclient.cpp:623: cURL failure with cURL error code 28 (see man 3 libcurl-errors) and HTTP error code 500
12-09-2016 21:14:12.229 UTC Error hssconnection.cpp:149: Failed to get Authentication Vector for [email protected]
12-09-2016 21:14:12.229 UTC Debug authentication.cpp:638: Failed to get Authentication vector

This is happening on release-105 and is fairly easily recreated on a busy system by adding latency with tc. I used:

ip netns exec signaling tc qdisc add dev eth1 root netem delay 400ms 200ms

After 5 or 6 successful registrations, the above issue starts. Once that issue starts, the only way to correct it is to remove the latency.

Any ideas what might be causing this? Does anyone else experience the same issues? I found this old issue logged that seems to be a similar issue: https://github.com/Metaswitch/sprout/issues/144.

Thanks,
Trey
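
One way to check the log excerpt above against the 50ms connection timeout is to measure the gap between each "Sending HTTP request" line and the matching "Timeout was reached" line. The sketch below does that for the two attempts; the timestamps are copied from the log, while the helper names are hypothetical and the day-month-year reading of the date field is an assumption (it does not affect the result here).

```python
from datetime import datetime

# Assumed layout of the log timestamp: day-month-year, then time with milliseconds.
LOG_FORMAT = "%d-%m-%Y %H:%M:%S.%f"


def parse_ts(stamp):
    """Parse a timestamp such as '12-09-2016 21:14:12.126'."""
    return datetime.strptime(stamp, LOG_FORMAT)


def elapsed_ms(start, end):
    """Whole milliseconds between two log timestamps."""
    return round((parse_ts(end) - parse_ts(start)).total_seconds() * 1000)


# (request sent, timeout reported) for each server, taken from the log excerpt.
attempts = {
    "xxx.xxx.xxx.51": ("12-09-2016 21:14:12.126", "12-09-2016 21:14:12.177"),
    "xxx.xxx.xxx.52": ("12-09-2016 21:14:12.178", "12-09-2016 21:14:12.229"),
}

if __name__ == "__main__":
    for server, (sent, failed) in attempts.items():
        print(server, elapsed_ms(sent, failed), "ms")
```

Both attempts fail 51ms after the request is sent, which is consistent with a 50ms timeout firing on a freshly created connection.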
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org
