Hi Bertrand,
On Wed, Jun 15, 2016 at 08:43:45PM +0100, Bertrand Jacquin wrote:
> Hi,
>
> Here is series of two patches to support having HAProxy behind a L7
> NetScaler load balancer configured to insert real client IP. This is
> basically a NetScaler version of the PROXY protocol for the same purpose
> except that it does it by copying the client raw IP and TCP header into
> the first packet of data.
Thanks for doing this and for providing detailed documentation on the
protocol.
I found two minor glitches below :
> @@ -10525,16 +10537,18 @@ send-proxy
> connection established to this server. The PROXY protocol informs the other
> end about the layer 3/4 addresses of the incoming connection, so that it
> can
> know the client's address or the public address it accessed to, whatever
> the
> - upper layer protocol. For connections accepted by an "accept-proxy"
> listener,
> - the advertised address will be used. Only TCPv4 and TCPv6 address families
> - are supported. Other families such as Unix sockets, will report an UNKNOWN
> - family. Servers using this option can fully be chained to another instance
> of
> - haproxy listening with an "accept-proxy" setting. This setting must not be
> - used if the server isn't aware of the protocol. When health checks are sent
> - to the server, the PROXY protocol is automatically used when this option is
> - set, unless there is an explicit "port" or "addr" directive, in which case
> an
> - explicit "check-send-proxy" directive would also be needed to use the PROXY
> - protocol. See also the "accept-proxy" option of the "bind" keyword.
> + upper layer protocol. For connections accepted by an "accept-praoxy" or
s/praoxy/proxy/
> @@ -570,6 +573,13 @@ static inline const char *conn_err_code_str(struct
> connection *c)
> case CO_ER_PRX_NOT_HDR: return "Received something which does not
> look like a PROXY protocol header";
> case CO_ER_PRX_BAD_HDR: return "Received an invalid PROXY protocol
> header";
> case CO_ER_PRX_BAD_PROTO: return "Received an unhandled protocol in the
> PROXY protocol header";
> +
> + case CO_ER_CIP_EMPTY: return "Connection closed while waiting for
> NetScaler Client IP header";
> + case CO_ER_CIP_ABORT: return "Connection error while waiting for
> Netscaler Client IP header";
Please check if it should be "NetScaler" or "Netscaler", as we want to
be consistent and to respect product names. Maybe a "(tm)" or similar is
even required in the doc, I don't know.
> --- a/include/types/listener.h
> +++ b/include/types/listener.h
> @@ -92,6 +92,7 @@ enum li_state {
> #define LI_O_TCP_FO 0x0100 /* enable TCP Fast Open (linux >= 3.7) */
> #define LI_O_V6ONLY 0x0200 /* bind to IPv6 only on Linux >= 2.4.21 */
> #define LI_O_V4V6 0x0400 /* bind to IPv4/IPv6 on Linux >= 2.4.21 */
> +#define LI_O_ACC_CIP 0x0800 /* find the proxied address in the first
> request line */
I guess you copy-pasted the accept proxy line without changing the comment :-)
> diff --git a/src/connection.c b/src/connection.c
> index b926e9f513ed..9a9a4dc9cb15 100644
> --- a/src/connection.c
> +++ b/src/connection.c
> @@ -66,6 +66,10 @@ void conn_fd_handler(int fd)
> if (!conn_recv_proxy(conn, CO_FL_ACCEPT_PROXY))
> goto leave;
>
> + if (conn->flags & CO_FL_ACCEPT_CIP)
> + if (!conn_recv_netscaler_cip(conn, CO_FL_ACCEPT_CIP))
> + goto leave;
> +
Netscaler_cip is extracted after accept_proxy, which means that if both
are configured, we expect the proxy header to be placed outside. I don't
know if there are any use case with both (hardly seems to make sense).
I tend to think that a netscaler is mostly a local component and that
a haproxy proxy header may come from far away. Thus if you have the
following configuration :
[ haproxy1 ] ---- (internet) ---- [ netscaler ] ---- [ haproxy2 ] --- server
Then the original client's address will be transmitted by haproxy1 over the
internet to the netscaler, which will simply prepend its CIP header, so
haproxy2 will see :
[ CIP header ] [ haproxy header ] [ data ]
The other situation where a netscaler is placed before haproxy where both
would add their header seems much less likely to happen, because :
- haproxy could already strip the netscaler header
- haproxy may refrain from adding one if the source comes from a known
network (ie: the netscaler).
Thus I think it may make sense to move the processing of the CIP *before*
the PROXY protocol.
There is a problem below if a corrupted header arrives in small chunks. Let's
imagine that we only receive the 8 first bytes, that they contain the magic
and a short length (<= 8) :
> + do {
> + trash.len = recv(conn->t.sock.fd, trash.str, trash.size,
> MSG_PEEK);
=> trash.len=8
> + if (trash.len < 0) {
> + if (errno == EINTR)
> + continue;
> + if (errno == EAGAIN) {
> + fd_cant_recv(conn->t.sock.fd);
> + return 0;
> + }
> + goto recv_abort;
> + }
> + } while (0);
>
> + if (!trash.len) {
> + /* client shutdown */
> + conn->err_code = CO_ER_CIP_EMPTY;
> + goto fail;
> + }
> +
> + if (trash.len < 8)
> + goto missing;
OK we reach this point.
> + line = trash.str;
> +
> + cip_magic = ntohl(*(uint32_t *)line);
> + cip_len = ntohl(*(uint32_t *)(line+4));
Magic is OK.
cip_len = 0.
> + /* Decode a possible NetScaler Client IP request, fail early if it does
> not match */
> + if (cip_magic != objt_listener(conn->target)->bind_conf->ns_cip_magic)
> + goto bad_magic;
> +
> + if (trash.len < cip_len)
> + goto missing;
We reach this point.
> + line += 8;
> +
> + ip_v = ((char)*line & 0xf0) >> 4;
Here we read the garbage that remains in memory from a previous buffer.
BTW, there's no need for this (char) cast here.
> + if (ip_v == 4) {
Let's assume that we match this one because memory used to contain 0x4X.
> + struct ip *hdr_ip4;
> + struct tcphdr *hdr_tcp;
> +
> + hdr_ip4 = (struct ip *)line;
> +
> + if (trash.len < ntohs(hdr_ip4->ip_len)) {
> + goto missing;
OK, let's assume this memory location contained large enough bytes (a
single non-zero on the first one is enough).
> + } else if (hdr_ip4->ip_p != IPPROTO_TCP) {
> + /* The protocol does not include a TCP header */
> + conn->err_code = CO_ER_CIP_BAD_PROTO;
> + goto fail;
> + }
and that we have 6 here.
> + hdr_tcp = (struct tcphdr *)(line + (hdr_ip4->ip_hl * 4));
Here we're preparing a TCP header that is from 0 to 60 bytes away from
the line's beginning.
> + /* update the session's addresses and mark them set */
> + ((struct sockaddr_in *)&conn->addr.from)->sin_family = AF_INET;
> + ((struct sockaddr_in *)&conn->addr.from)->sin_addr.s_addr =
> hdr_ip4->ip_src.s_addr;
> + ((struct sockaddr_in *)&conn->addr.from)->sin_port =
> hdr_tcp->source;
> +
> + ((struct sockaddr_in *)&conn->addr.to)->sin_family = AF_INET;
> + ((struct sockaddr_in *)&conn->addr.to)->sin_addr.s_addr =
> hdr_ip4->ip_dst.s_addr;
> + ((struct sockaddr_in *)&conn->addr.to)->sin_port =
> hdr_tcp->dest;
> +
> + conn->flags |= CO_FL_ADDR_FROM_SET | CO_FL_ADDR_TO_SET;
We save what lies in memory there.
> + }
> + else if (ip_v == 6) {
(...)
> + }
> +
> + line += cip_len;
> + trash.len = line - trash.str;
trash.len is re-evaluated to match skipped bytes.
> + /* remove the NetScaler Client IP header from the request. For this
> + * we re-read the exact line at once. If we don't get the exact same
> + * result, we fail.
> + */
> + do {
> + int len2 = recv(conn->t.sock.fd, trash.str, trash.len, 0);
> + if (len2 < 0 && errno == EINTR)
> + continue;
> + if (len2 != trash.len)
> + goto recv_abort;
> + } while (0);
And we trim them. Thus it may get an incorrect number of bytes removed from
the stream. There's no risk of overflow since the values are very small
compared to the smallest possible size for a working buffer.
I think you should simply validate after "line += 8" that trash.len is at
least as large as the smallest valid IP header + 8 (28), which allows you
to parse it, and then when you get the length you need to skip from ip_len
you need to be sure that trash.len is large enough to cover 8 + ip_len + 20
(for TCP). Similar principle applies for IPv6.
So overall it looks pretty good already!
Thanks!
Willy