Hi Willy,

Thanks for your detailed answer.
> Did you observe anything special about the CPU usage ? Was it lower
> than with 1.3 ? If so, it would indicate some additional delay somewhere.
> If it was higher, it could indicate that the Transfer-encoding parser
> takes too many cycles but my preliminary tests proved it to be quite
> efficient.

I did not notice anything special about CPU usage. It seems to be around
2-4% with both versions. When checking the munin graphs this morning, I
did however notice that the "connection resets received" counter from
"netstat -s" was increasing much more with 1.4. This led me to look at
the logs more closely, and there are a lot of new errors that look
something like this:

w.x.y.z:4004 [15/Mar/2010:09:50:51.190] fe_xxx be_yyy/upload-srvX 0/0/0/-1/62 502 391 - PR-- 9/6/6/3/0 0/0 "PUT /dav/filename.ext HTTP/1.1"

This only happens for a few of the PUT requests; most requests seem to
be proxied successfully. I will try to reproduce this in a more
controlled lab setup where I can sniff the HTTP headers to see what is
actually sent in the request.

> No, I've run POST requests (very similar to PUT), except that there
> was no Transfer-Encoding in the requests. It's interesting that you're
> doing that in the request, because Apache removed support for TE:chunked
> a few years ago because there was no user. Also, most of my POST tests
> were not performance related.

Interesting. We do use Apache for parts of this application on the
backend side, although PUT requests are handled by an in-house developed
Erlang application.

> A big part has changed: in previous versions, haproxy did not care
> at all about the payload. It only saw headers. Now with keepalive
> support, it has to find request/response bounds and as such must
> parse the transfer-encoding and content-lengths. However, transfer
> encoding is nice to components such as haproxy because it's very
> cheap. Haproxy reads a chunk size (one line), then forwards that
> many bytes, then reads a new chunk size, etc...
> So this is really a cheap operation. My tests have shown no issue at
> gigabit/s speeds with just a few bytes per chunk.
>
> I suspect that the application tries to use the chunked encoding
> to simulate a bidirectional access. In this case, it might be
> waiting for data pending in the kernel buffers which were sent by
> haproxy with the MSG_MORE flag, indicating that more data are
> following (and so you should observe a low CPU usage).
>
> Could you please do a small test: in src/stream_sock.c, please
> comment out line 616:
>
>  615     /* this flag has precedence over the rest */
>  616     // if (b->flags & BF_SEND_DONTWAIT)
>  617         send_flag &= ~MSG_MORE;
>
> It will unconditionally disable use of MSG_MORE. If this fixes the
> issue for you, I'll probably have to add an option to disable this
> packet merging for very specific applications.

I tried commenting out the line above as instructed, but it made no
noticeable change. As stated above, I will try to reproduce the problem
in a lab setup. This may be an issue with our application rather than
haproxy.

Best regards
Erik

-- 
Erik Gulliksson, [email protected]
System Administrator, Diino AB
http://www.diino.com

