I have a PHP Thrift client and have tried both Python and Scala servers. About 1-3% of requests end up with timeout errors, either in the write or the read portion of the request. I can't find any reason for these errors. The servers have been up the entire time, and I haven't even begun to send them enough traffic to overload them. The read errors are especially confusing since the methods being called don't actually return anything.
While the read errors might be the most confusing, the write errors are the biggest problem, especially since I think the writes are actually going through, although I can't be 100% certain. Yes, I realized about 5 days ago that a simple fix might be to just increase the timeout. It didn't work, but it may have helped me stumble upon the real issue. I set the timeouts to 1 minute. When I did that, the scripts making the calls ended up freezing, which drove up CPU usage and the load on the machine. It appears that the connection stalled. When the timeout is low it's OK, because the stalled call just gets killed. Has anyone else experienced this?

Travis Beauvais
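
For reference, a Thrift method declared `void` still sends a reply unless it is declared `oneway`, so the client does read from the socket even when nothing is returned. The following is a minimal sketch, using plain Python sockets rather than Thrift, of how a client can report a timeout even though the server received the request. The function names (`run_slow_server`, `call_with_timeout`, `demo`), the payload, and the delay values are all hypothetical and chosen only for illustration; this is not the PHP client or either server from the post.

```python
import socket
import threading
import time


def run_slow_server(listener, received, delay):
    """Accept one connection, record the request, then reply late."""
    conn, _ = listener.accept()
    with conn:
        received.append(conn.recv(1024))  # the request arrives intact
        time.sleep(delay)                 # simulate a stalled handler
        try:
            conn.sendall(b"ok")
        except OSError:
            pass  # the client may already have given up and closed


def call_with_timeout(port, payload, timeout):
    """Send payload and wait for a reply; return 'ok' or 'timeout'."""
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as sock:
        sock.sendall(payload)
        try:
            sock.recv(1024)
            return "ok"
        except socket.timeout:
            return "timeout"


def demo(delay=0.5, timeout=0.1):
    """Run one slow server and one impatient client on localhost."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    received = []
    server = threading.Thread(
        target=run_slow_server, args=(listener, received, delay)
    )
    server.start()
    result = call_with_timeout(port, b"ping", timeout)
    server.join()
    listener.close()
    return result, received
```

Calling `demo()` has the client give up on the read while the server's `received` list still contains the request. If something similar is happening here, logging each call on the server side and comparing it against the client-side timeouts would show whether the timed-out requests are in fact being processed.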
