On Fri, Aug 28, 2015 at 11:40:18AM +0200, Lukas Tribus wrote:
> >> Ok, you may be hitting a bug. Can you provide haproxy -vv output?
> >>
> >
> >
> > What do you mean? I get the following warning when trying to use this
> > option in tcp backend/frontend:
>
> Yes I know (I didn't realize you are using tcp mode). I don't mean the
> warning is the bug, I mean the tcp mode is supposed to not cause any
> delays by default, if I'm not mistaken.

You're not mistaken, tcp_nodelay is unconditional in TCP mode and MSG_MORE
is not used there since we never know if more data follows. In fact there's
only one case where it can happen: when data wraps at the end of the
buffer and we want to send both parts together.

> You are running freebsd, so splicing (Linux) can't be an issue either.
> Is strace available on your OS (afaik 64bit freebsd doesn't have strace)?
>
> Can you try disabling kqueue [1], to see if the behavior changes? If
> not, try disabling poll as well [2]. That way haproxy falls back to
> select().
>
> Having all syscalls (strace) and tcpdumps from the front and backend
> traffic would be helpful. Especially interesting would be if haproxy sets
> TCP_NODELAY and MSG_MORE. It should set the former, but not the
> latter.

In the past I found on Linux (old versions) that forcing TCP_NODELAY could
actually end up with a higher latency than desired. This is due to the fact
that you're not supposed to send anything after an incomplete TCP PUSH
until it's been ACKed. I even saw this cause slowdowns on some proxies. But
something like 1 or 2 years ago, while I was discussing this on the HTTP WG
with the Chromium developers, I couldn't reproduce it anymore, which means
that the behaviour has changed, at least on Linux. I would not be surprised
if it still exists on other OSes. A tcpdump will definitely tell us if
that's the case, because we'll see that a new segment is emitted immediately
once the previous one gets ACKed.

There's nothing that can be done about this (except switching to another
stack or changing the application of course), because:
  - without TCP_NODELAY, you face Nagle and your data may wait up to 40ms
  - with TCP_NODELAY, you can be blocked waiting for that ACK.

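To make the two knobs concrete, here is a minimal sketch in C of what an
application that does know its message boundaries could do. This is not
haproxy's code; the helper names enable_nodelay(), batch_flags() and
send_batch() are made up for illustration. MSG_MORE is Linux-specific, so the
sketch assumes that on a platform without it (FreeBSD for example) we simply
pass no hint at all. How the stack then groups segments is up to the kernel.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* MSG_MORE is Linux-specific; where it is missing, fall back to no hint. */
#ifndef MSG_MORE
#define MSG_MORE 0
#endif

/* Hypothetical helper: disable Nagle on a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int enable_nodelay(int fd)
{
    int one = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}

/* Hypothetical helper: send() flags for message idx (0-based) out of count.
 * Every message except the last one is marked "more data follows". */
int batch_flags(size_t idx, size_t count)
{
    return (idx + 1 < count) ? MSG_MORE : 0;
}

/* Hypothetical helper: send count small messages as one batch.
 * Returns 0 on success, -1 on error or short write (no retry in this sketch). */
int send_batch(int fd, const char *const *msgs, const size_t *lens, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        ssize_t ret = send(fd, msgs[i], lens[i], batch_flags(i, count));
        if (ret < 0 || (size_t)ret != lens[i])
            return -1;
    }
    return 0;
}
```

A proxy forwarding bytes from an unknown source cannot use something like
send_batch() as is, since it never knows which message is the last one of a
batch.
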
In practice, any application should only send a push when it has nothing
more to send and is waiting for the other side to respond, so if the
application sends many small messages, only the last one of each batch
should have the PUSH flag set. I know it's not always easy to do, especially
when you forward data that comes from an uncontrolled source :-)

Regards,
Willy