Hi All So, I enabled latest (brilliant) contribution from Olivier into my Kubernetes cluster and I discovered it did not work as expected. After digging into the issues, I found 3 bugs directly related to the way SRV records must be read and processed by HAProxy. It was clearly hard to spot them outside a real orchestrator :)
Please find in attachment 3 patches to fix them. Please note that I might have found an other bug, that I'll dig into later. When "scalling in" (reducing an app footprint in kubernetes), HAProxy considers some servers (pods in kubernetes) in error "no dns resolution". This is normal. What is not normal is that those servers never ever come back to live, even when I scale up again... Note that thanks to (Salut) Fred contribution about server-templates some time ago, we can do some very cool fancy configurations like the one below: (I have a headless service called 'red' in my kubernetes, it points to my 'red' application) backend red server-template red 20 _http._tcp.red.default.svc.cluster.local:8080 inter 1s resolvers kube check In one line, we can enable automatic "scalling follow-up" in HAProxy. Baptiste -- Baptiste Assmann <bassm...@haproxy.com>
From 3f50598f8c50b6f140477e9870b47c0ce1317658 Mon Sep 17 00:00:00 2001 From: Baptiste Assmann <bed...@gmail.com> Date: Fri, 11 Aug 2017 09:58:27 +0200 Subject: [PATCH 1/3] MINOR: dns: Update analisys of TRUNCATED response for SRV records First implementation of the DNS parser used to consider TRUNCATED responses as errors and triggered a failover to an other query type (usually A to AAAA or vice-versa). When we query for SRV records, a TRUNCATED response still contains valid records we can exploit, so we shouldn't trigger a failover in such case. Note that we had to move the maching against the flag later in the response parsing (actually, until we can read the query type....) --- src/dns.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/dns.c b/src/dns.c index 00f7b10..24f9781 100644 --- a/src/dns.c +++ b/src/dns.c @@ -1061,9 +1061,6 @@ int dns_validate_dns_response(unsigned char *resp, unsigned char *bufend, struct flags = reader[0] * 256 + reader[1]; - if (flags & DNS_FLAG_TRUNCATED) - return DNS_RESP_TRUNCATED; - if ((flags & DNS_FLAG_REPLYCODE) != DNS_RCODE_NO_ERROR) { if ((flags & DNS_FLAG_REPLYCODE) == DNS_RCODE_NX_DOMAIN) return DNS_RESP_NX_DOMAIN; @@ -1148,6 +1145,12 @@ int dns_validate_dns_response(unsigned char *resp, unsigned char *bufend, struct reader += 2; } + /* TRUNCATED flag must be checked after we could read the query type + * because a TRUNCATED SRV query type response can still be exploited + */ + if (dns_query->type != DNS_RTYPE_SRV && flags & DNS_FLAG_TRUNCATED) + return DNS_RESP_TRUNCATED; + /* now parsing response records */ nb_saved_records = 0; for (i = 0; i < dns_p->header.ancount; i++) { -- 2.7.4
From 3262d45e71729d7796cc73efd6317f022722ca15 Mon Sep 17 00:00:00 2001 From: Baptiste Assmann <bed...@gmail.com> Date: Fri, 11 Aug 2017 10:31:22 +0200 Subject: [PATCH 2/3] MINOR: dns: update record dname matching for SRV query types DNS response for SRV queries look like this: - query dname looks like '_http._tcp.red.default.svc.cluster.local' - answer record dname looks like '3336633266663038.red.default.svc.cluster.local.' Of course, it never matches... and it triggers many false positive in the current code (which is suitable for A/AAAA/CNAME). This patch simply ignores this dname matching in the case of SRV query type. --- src/dns.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/dns.c b/src/dns.c index 24f9781..9c604fc 100644 --- a/src/dns.c +++ b/src/dns.c @@ -1172,7 +1172,7 @@ int dns_validate_dns_response(unsigned char *resp, unsigned char *bufend, struct /* check if the current record dname is valid. * previous_dname points either to queried dname or last CNAME target */ - if (memcmp(previous_dname, tmpname, len) != 0) { + if (dns_query->type != DNS_RTYPE_SRV && memcmp(previous_dname, tmpname, len) != 0) { free_dns_answer_item(dns_answer_record); if (i == 0) { /* first record, means a mismatch issue between queried dname -- 2.7.4
From 20a60ab0c8afcdbc10870ee095d95dec70cf9483 Mon Sep 17 00:00:00 2001 From: Baptiste Assmann <bed...@gmail.com> Date: Fri, 11 Aug 2017 10:37:20 +0200 Subject: [PATCH 3/3] MINOR: dns: update dns response buffer reading pointer due to SRV record DNS SRV records uses "dns name compression" to store the target name. "dns compression" principle is simple. Let's take the name below: 3336633266663038.red.default.svc.cluster.local. It can be stored "as is" in the response or it can be compressed like this: 3336633266663038<POINTER> and <POINTER> would point to the string '.red.default.svc.cluster.local.' availble in the question section for example. This mechanism allows storing much more data in a single DNS response. This means the flag "record->data_len" which stores the size of the record (hence the whole string, uncompressed) can't be used to move the pointer forward when reading responses. We must use the "offset" integer which means the real number of bytes occupied by the target name. If we don't do that, we can properly read the first SRV record, then we loose alignment and we start reading unrelated data (still in the response) leading to a false negative error treated as an "invalid" response... --- src/dns.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/dns.c b/src/dns.c index 9c604fc..f54885d 100644 --- a/src/dns.c +++ b/src/dns.c @@ -1301,7 +1301,6 @@ int dns_validate_dns_response(unsigned char *resp, unsigned char *bufend, struct free_dns_answer_item(dns_answer_record); return DNS_RESP_INVALID; } - reader++; dns_answer_record->data_len = len; memcpy(dns_answer_record->target, tmpname, len); dns_answer_record->target[len] = 0; @@ -1323,7 +1322,10 @@ int dns_validate_dns_response(unsigned char *resp, unsigned char *bufend, struct nb_saved_records += 1; /* move forward dns_answer_record->data_len for analyzing next record in the response */ - reader += dns_answer_record->data_len; + if (dns_answer_record->type == DNS_RTYPE_SRV) + reader += offset; + else + reader += dns_answer_record->data_len; /* Lookup to see if we already had this entry */ -- 2.7.4