Public bug reported:
Binary package hint: apache2
This applies to Apache 2.1.5 and later.
In a web server-farm scenario that is fronted by hardware load-
balancers, in this case Juniper Redline aka DX, where the load-balancers
are configured to use TCP multiplexing (holding open and re-using HTTP
connections to the web servers) there exists the potential for random,
unexplained and untraceable connection failures.
From the end user's web browser, all that is visible is a
blank page. This happens so randomly that any reports from end users
would be dismissed as network glitches.
I've spent the last two weeks working with a large IT e-commerce
retailer. Their system administrator initially came to me with the
belief that something in the Linux kernel network stack was faulty. He
had already done extensive diagnostic work with the Juniper support
engineers and neither had been able to pin-point the cause of the
failure.
What they knew was the persistent connection between the DXs and the
web-servers would occasionally, and seemingly randomly, be RESET by the
server. Some web servers in some clusters were affected; others weren't.
When I examined the tcpdump capture taken on a web server it quickly
became evident that Linux was ignoring ACKs from the DX during the
initial handshake, was retrying the SYN ACK the default 5 times, and
then closing the half-open connection.
After a lot of work with custom-written tools that detected packets at
the PF_PACKET level (libpcap) and checked they were seen by the
netfilter/iptables layer, we decided to hack a custom kernel. I added
printk() statements into net/ipv4/tcp_minisocks.c::tcp_check_req() so
each cause of a dropped packet was logged, and moved the netfilter
NF_HOOK() used for the 'mangle table INPUT chain' from its usual
location in net/ipv4/ip_input.c::ip_local_deliver() to
net/ipv4/tcp_input.c::tcp_rcv_state_process() after "tcp_set_state(sk,
TCP_ESTABLISHED);" so that we could detect every handshake that
failed.
As a result we discovered that handshakes were having their ACK from the
client (in this case the Juniper DX) discarded because the listening
socket had the TCP_DEFER_ACCEPT option set (SO_ACCEPTFILTER on
BSD).
The server's SYN_RECEIVED timer would time-out, and the server would
resend the SYN ACK. The DX would reply with a duplicate ACK, which would
again be discarded.
This would repeat 5 times (the default number of SYN ACK retries). Each
timeout was double the previous one: 3, 6, 12, 24, 48 and 96 seconds
(the initial SYN ACK plus five retries), about 190 seconds in total. If a
request arrived from the DX *after* this, the DX received a RST from the
server because the socket had been closed due to the handshake failure.
This caused the end-user client to see a 'white page' (empty response).
If a request arrived from the DX *before* the retries and time-out
expired it would cause the connection to be ESTABLISHED and the request
would be handled.
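To make the arithmetic concrete, here is a minimal Python sketch of the retry schedule described above. The function names synack_timeouts and outcome are hypothetical, and the 3-second initial timeout and 5 retries are simply the values quoted in this report. It models the timing only, not the kernel code.

```python
def synack_timeouts(initial=3, retries=5):
    # Initial SYN ACK plus `retries` retransmissions, each wait double the last.
    return [initial * 2 ** i for i in range(retries + 1)]


def outcome(first_request_at, initial=3, retries=5):
    # Hypothetical helper: what the DX sees if its first request on the
    # connection arrives `first_request_at` seconds after the handshake began.
    deadline = sum(synack_timeouts(initial, retries))
    return "handled" if first_request_at < deadline else "reset"


print(synack_timeouts())       # [3, 6, 12, 24, 48, 96]
print(sum(synack_timeouts()))  # 189
print(outcome(60))             # handled
print(outcome(300))            # reset
```

This is why the problem only shows up under light traffic: a pre-opened connection has to sit idle for more than three minutes before its first request is met with a RST.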
The reason for the failures is the Juniper DX maintains a group (by
default 6) of persistent connections to each target host in a cluster of
servers. It creates these persistent connections *before* it has HTTP
requests for the target server. If the server is using Deferred Accept
(TCP_DEFER_ACCEPT) on listening sockets the connection will not be
promoted to ESTABLISHED until data is received.
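The behaviour can be observed from user space with a small experiment. The following is a rough sketch, assuming a Linux host where Python exposes socket.TCP_DEFER_ACCEPT. The function name demo_defer_accept, the 30-second defer value and the 0.5-second accept timeout are arbitrary choices for illustration and are not taken from Apache or the DX.

```python
import socket


def demo_defer_accept(defer_seconds=30):
    """Return (accepted_before_data, accepted_after_data), or None if unsupported."""
    if not hasattr(socket, "TCP_DEFER_ACCEPT"):
        return None  # the option is Linux-specific
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the kernel to hold back connections until the client sends data.
    srv.setsockopt(socket.IPPROTO_TCP, socket.TCP_DEFER_ACCEPT, defer_seconds)
    srv.bind(("127.0.0.1", 0))
    srv.listen(8)
    srv.settimeout(0.5)
    # Like the DX, connect first and send nothing yet.
    cli = socket.create_connection(srv.getsockname())
    try:
        try:
            conn, _ = srv.accept()
            conn.close()
            before = True
        except socket.timeout:
            before = False
        # Now send a request; the pending connection should become acceptable.
        cli.sendall(b"GET / HTTP/1.0\r\n\r\n")
        try:
            conn, _ = srv.accept()
            conn.close()
            after = True
        except socket.timeout:
            after = False
    finally:
        cli.close()
        srv.close()
    return before, after


print(demo_defer_accept())
```

On a kernel that honours the option one would expect the first accept() to time out and the second to succeed. Note that the client's connect() returns successfully in both cases, which is why the DX believes it holds a usable connection.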
It turns out that Apache made TCP_DEFER_ACCEPT the *default* for its
listening sockets in version 2.1.5. No explicit
AcceptFilter http data
rule is needed in the Apache configuration files to enable it. In fact, it takes
AcceptFilter http none
to disable TCP_DEFER_ACCEPT on its sockets.
Because the Juniper DX OS up to at least version 5.2.6 doesn't correctly
implement the HTTP protocol when using persistent connections, the
interaction between Apache 2.1.5+ and the DX persistent connections
brings about this issue when *traffic is light* - it won't happen if the
work load is medium or heavy.
The root cause of the failure, exacerbated by the change in Apache
2.1.5+ to using TCP_DEFER_ACCEPT, is that the Juniper DX OS opens a
connection to the HTTP server *but doesn't send a request*.
Unlike protocols such as telnet, HTTP expects the connection to be
accompanied by a request, so that the connection carries data. RFC 2616
(HTTP/1.1) section 1.4 states:
"...a connection may be used for one or more request/response
exchanges..."
The Juniper DX however creates a connection and in low-traffic
situations doesn't send "one or more request[s]..." causing the Linux
kernel network stack to time-out the socket.
The work-around is to disable TCP_DEFER_ACCEPT when deploying Apache
2.1.5+ behind load-balancing systems such as the Juniper Redline / DX by
adding to the Apache configuration:
AcceptFilter http none
** Affects: apache2 (Ubuntu)
Importance: Undecided
Assignee: TJ
Status: Confirmed
--
Unexplained random HTTP connection failures in hardware load-balanced
web-server farms
https://bugs.launchpad.net/bugs/134274