Hi guys! I have a tough situation here and need help from the community.
We're using Haproxy as a load-balancer for pretty complex web application deployed in AWS EC2. One of ours http backends is a web-server on Tornado which uses NoSQL cluster for data storage. It serves requests on port 80 and Haproxy terminates SSL for incoming connections. There is main cycle of work for this backend: 1) We got an incoming PATCH request from client to Haproxy. 2) Haproxy redirects it to one of http backend node. 3) Backend node extracts all needed data and put it into NoSQL table. 4) After that backend replies with short response to client. Response contains confirmation of received data. Everything worked OK until we've decided to migrate to Amazon's DynamoDB. All local tests without load balancer passing fine, but if code is deployed on real instances in EC2 the following happens: As soon as backend node receives the request from Haproxy and connects to DynamoDB to put the data, Haproxy immediately replies error 502 to client (logs shows this happens roughly 100 msec BEFORE actual backend's reply). Backend successfully continues with processing of data and returns a reply with code 200 but session is closed already by Haproxy. Details of error 502 from Haproxy: client_ip=<redacted> client_port=32668 backend=http_data times=0/0/0/-1/91 http_status=502 bytes_uploaded=478 bytes_read=1807 termination_state=PH-- connections=1/0/0/0/0 queues=0/0 "PATCH /incoming_data HTTP/1.1" Termination state PH means error happened prematurely on headers processing so I've checked out the output of 'show errors' socket command and saw the following: Total events captured on [11/Oct/2017:07:27:22.352] : 1 [10/Oct/2017:18:34:51.923] backend http-data (#8): invalid response frontend http-ssl (#3), server incoming_data_bk1 (#1), event #1 src <redacted>:24565, session #12, session flags 0x000004ce HTTP msg state 26, msg flags 0x00000000, tx flags 0x08000000 HTTP chunk len 0 bytes, HTTP body len 0 bytes buffer flags 0x00008002, out 0 bytes, total 1453 bytes pending 1453 bytes, wrapping at 16384, error at position 0: 00000 \x17\x03\x03\x05\xA8\x83(\x13\xE5Ph\xA7\xE4\x93}\x04&\xA2\x83}e\xAE! 00023+ \x1E\xA1\x14|B\x15\xEE\x13\xEDx4\x8Bw"\xC0!\xB8\x97\xCE\xBF\x8E\xC5 00045+ \xC0\x8C\xF2G$\t\xCC\x97\xE8K[5\xB3\x95d\xD5c#l\x9AI\r2@\xD6\xC0<dBI\e 00076+ \x03/f\xD8a\xA2\xCAB\xA09\xFA0g\xA1I\x14\xFEM\x9F\xAE*x\xB7\x87\xCA And there is 100+ more lines of binary data. I've inspected the actual reply from web-server (it's logged before sending) and it was only 6 bytes with all proper headers. So that weird data which has been captured by Haproxy is definitely not the actual response. So I've began debugging and this is what I've found so far: 1) This issue appears ONLY if backend performs data exchange with DynamoDB. 2) It's not 100% reproducible - roughly one of ten requests succeed. Then I've started Haproxy in debug mode on foreground to see all info about all requests and responses on screen. After I've send the PATCH request Haproxy have stuck for about 40 seconds (literally) but after this pause it continued with proper response from backend! I have no idea what Haproxy did in that 40 seconds - there was no messages on screen about anything. After this 'freeze' the normal response from backend have appeared. All headers was in place and Content-Length and Content-Type was OK when response have transmitted. So it looks like Haproxy somehow intercepts data exchange between backend and DynamoDB and treat this data as reply. Does anyone saw something like this before? Any suggestions on how to debug/fix that? What might be the cause of that? I've checked this out with all main releases of Haproxy - 1.5, 1.6, 1.7 - all have the same behavior with that. Thanks!

