Re: Build failure of 1.6 and openssl 0.9.8
Le 16/10/2015 22:42, Willy Tarreau a écrit :
> Hi Christopher,
>
> Marcus (in CC) reported that 1.6 doesn't build anymore on SuSE 11 (which uses
> openssl 0.9.8). After some digging, we found that it is caused by the absence
> of EVP_PKEY_get_default_digest_nid(), which was introduced in 1.0.0 and which
> was brought in by this patch :
>
>     commit 7969a33a01c3a70e48cddf36ea5a66710bd7a995
>     Author: Christopher Faulet <cfau...@qualys.com>
>     Date:   Fri Oct 9 11:15:03 2015 +0200
>
>         MINOR: ssl: Add support for EC for the CA used to sign generated certificates
>
>         This is done by adding the EVP_PKEY_EC type to the supported types for the
>         CA private key when we get the message digest used to sign a generated
>         X509 certificate. So now we support DSA, RSA and EC private keys. And to
>         be sure, when the type of the private key is not directly supported, we
>         get its default message digest using the function
>         'EVP_PKEY_get_default_digest_nid'. We also use the key of the default
>         certificate instead of generating it, so we are sure to use the same key
>         type instead of always using an RSA key.
>
> Interestingly, not all 0.9.8 builds will see the same problem since SNI is not
> enabled by default; it requires a build option. This explains why on my old PC
> I didn't get this problem with the same version. I initially thought it would
> just be a matter of adding a #if on the openssl version, but it doesn't appear
> that easy given that the previous code was different, so I have no idea how to
> fix this. Do you have any idea ? Probably we can have a block of code instead
> of EVP_PKEY_... on older versions and that will be fine. I even wonder if EC
> was supported on 0.9.8. It's unfortunate that we managed to break things just
> a few days before the release with code that looked obviously right :-(
>
> Thanks for any insight.

Hi Willy,

Damned! I generated a huge amount of disturbance with my patches! Really sorry about that. Adding a #ifdef to check the OpenSSL version seems to be a good fix.
I don't know if there is a workaround to do the same as EVP_PKEY_get_default_digest_nid() on old OpenSSL versions. This function is used to get the default signature digest associated with the private key used to sign generated X509 certificates. It is called when the private key type differs from EVP_PKEY_RSA, EVP_PKEY_DSA and EVP_PKEY_EC. It should be enough for most cases (maybe all cases ?).

By the way, I attached a patch to fix the bug.

Regards,
-- Christopher Faulet

>From 76e79a8c8a98474f3caf701b75370f50729516b2 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@qualys.com>
Date: Mon, 19 Oct 2015 13:59:24 +0200
Subject: [PATCH 2/2] BUILD: ssl: fix build error introduced in commit 7969a3
 with OpenSSL < 1.0.0

The function 'EVP_PKEY_get_default_digest_nid()' was introduced in OpenSSL
1.0.0. So for older versions of OpenSSL compiled with SNI support, the
HAProxy compilation fails with the following error:

src/ssl_sock.c: In function 'ssl_sock_do_create_cert':
src/ssl_sock.c:1096:7: warning: implicit declaration of function 'EVP_PKEY_get_default_digest_nid'
    if (EVP_PKEY_get_default_digest_nid(capkey, &nid) <= 0)
[...]
src/ssl_sock.c:1096: undefined reference to `EVP_PKEY_get_default_digest_nid'
collect2: error: ld returned 1 exit status
Makefile:760: recipe for target 'haproxy' failed
make: *** [haproxy] Error 1

So we must add a #ifdef to check the OpenSSL version (>= 1.0.0) to use this
function. It is used to get the default signature digest associated with the
private key used to sign generated X509 certificates. It is called when the
private key type differs from EVP_PKEY_RSA, EVP_PKEY_DSA and EVP_PKEY_EC. It
should be enough for most cases.
---
 src/ssl_sock.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 35a3edf..7c82464 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -1091,12 +1091,16 @@ ssl_sock_do_create_cert(const char *servername, unsigned int serial,
 	else if (EVP_PKEY_type (capkey->type) == EVP_PKEY_EC)
 		digest = EVP_sha256();
 	else {
+#if (OPENSSL_VERSION_NUMBER >= 0x1000000fL)
 		int nid;
 
 		if (EVP_PKEY_get_default_digest_nid(capkey, &nid) <= 0)
 			goto mkcert_error;
 		if (!(digest = EVP_get_digestbynid(nid)))
 			goto mkcert_error;
+#else
+		goto mkcert_error;
+#endif
 	}
 
 	if (!(X509_sign(newcrt, capkey, digest)))
-- 
2.4.3
Re: haproxy 1.6.0 crashes
Hi Willy,

Le 16/10/2015 19:07, Willy Tarreau a écrit :
>> The SSL_CTX and SSL objects are reference-counted objects, so there is no
>> problem. When a SSL_CTX object is created, its refcount is set to 1. When a
>> SSL connection uses it, it is incremented, and when the connection is
>> closed, it is decremented. Of course, it is also decremented when
>> SSL_CTX_free is called. When, during a call to SSL_free or SSL_CTX_free,
>> the reference count reaches 0, the SSL_CTX object is freed. Note that
>> SSL_free and SSL_CTX_free can be called in any order.
>
> OK so the unused objects in the tree have a refcount of 1 while the used
> ones have 2 or more, thus the refcount is always valid. Good, that also
> means we must not test if the tree is null or not in ssl_sock_close(), we
> must always free the ssl_ctx as long as it was dynamically created, so that
> its refcount decreases, otherwise it keeps increasing upon every reuse.

No. Maybe my explanation was not really clear. The SSL_CTX refcount is not exposed; it is an internal parameter. So, it is not incremented when the SSL_CTX is pushed into the cache tree. The call to SSL_set_SSL_CTX increments the refcount and the call to SSL_free decrements it (when the connection is closed). And, of course, the call to SSL_CTX_free decrements it too. The SSL_CTX object is released when the refcount reaches 0.

For a SSL_CTX object, SSL_CTX_free must be called exactly once: when it is evicted from the cache tree (or when the tree is destroyed) _OR_ when the connection is closed if there is no cache tree. If we always release SSL_CTX objects when the SSL connection is closed, we will have dangling references for cached objects, leading to a segfault.

So, if SSL_CTX_free is called while a SSL connection uses the corresponding SSL_CTX object, there is no problem. Actually, this happens when a SSL_CTX object is evicted from the cache. There is no need to check if it is used by a connection or not. Not only is it not needed, but we must not.
>> We do not track any reference count on SSL_CTX, it is done internally by
>> openssl. The only thing we must do is to know if it is a generated
>> certificate
>
> I totally agree.
>
>> and to track if it is in the cache or not.
>
> And here I disagree for the reason explained above since this is already
> covered by the refcount.

The refcount is not incremented when a SSL_CTX object is pushed into the cache. There is no way to manually increment or decrement it. So, we must really know if the SSL_CTX object was cached or not when the SSL connection is closed.

>> Well, I'm not an openssl guru. It is possible to store and retrieve data
>> on a SSL_CTX object using the SSL_CTX_set_ex_data/SSL_CTX_get_ex_data
>> functions. But I don't know if it is good practice to use them. And I
>> don't know if this is expensive or not.
>
> That's also what Rémi suggested. I don't know how it's used, I'm seeing an
> index with it and that's already used for DH data, so I don't know how it
> mixes (if at all) with this. I'm not much concerned by the access cost in
> fact since we're supposed to access it once at session creation and once
> during the release. It's just that I don't understand how this works. Maybe
> the connection flag is simpler for now.

Well, using SSL_CTX_set_ex_data/SSL_CTX_get_ex_data seems to work. But I'm not a SSL expert, so maybe I missed something (and the bugs related to my recent patches show that this is not false modesty...). I sent 2 fixes for this bug [1][2]. If you want me to rework one of them, I will be happy to do it.

[1] https://www.mail-archive.com/haproxy@formilux.org/msg19962.html
[2] https://www.mail-archive.com/haproxy@formilux.org/msg19995.html

Regards
-- Christopher Faulet
Re: haproxy 1.6.0 crashes
Le 15/10/2015 10:51, Seri, Kim a écrit :
> Hi, all
>
> HAProxy 1.6.0 crashes in a multiple-certificates environment as below:
>
>     bind :443 ssl crt test.com.pem crt test2.com.pem ecdhe prime256v1
>
> but in a single-certificate environment, haproxy doesn't crash:
>
>     bind :443 ssl crt test.com.pem ecdhe prime256v1
>
> After applying commit d2cab92, haproxy seems to crash.

Hi,

I confirm the bug. Here is a very quick patch. Could you confirm that it works for you ?

-- Christopher Faulet

diff --git a/include/types/connection.h b/include/types/connection.h
index dfbff6a..070d779 100644
--- a/include/types/connection.h
+++ b/include/types/connection.h
@@ -122,7 +122,10 @@ enum {
 	/* This connection may not be shared between clients */
 	CO_FL_PRIVATE       = 0x1000,
 
-	/* unused : 0x2000, 0x4000 */
+	/* A dynamically generated SSL certificate was used for this connection */
+	CO_FL_DYN_SSL_CTX   = 0x2000,
+
+	/* unused : 0x4000 */
 
 	/* This last flag indicates that the transport layer is used (for instance
 	 * by logs) and must not be cleared yet. The last call to conn_xprt_close()
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 5319532..2829af8 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -1232,6 +1232,7 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, struct bind_conf *s)
 	ctx = ssl_sock_get_generated_cert(serial, s);
 	if (ctx) {
 		/* switch ctx */
+		conn->flags |= CO_FL_DYN_SSL_CTX;
 		SSL_set_SSL_CTX(ssl, ctx);
 		return SSL_TLSEXT_ERR_OK;
 	}
@@ -1271,6 +1272,9 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, struct bind_conf *s)
 	if (s->generate_certs && (ctx = ssl_sock_generate_certificate(servername, s, ssl))) {
 		/* switch ctx */
+		struct connection *conn = (struct connection *)SSL_get_app_data(ssl);
+
+		conn->flags |= CO_FL_DYN_SSL_CTX;
 		SSL_set_SSL_CTX(ssl, ctx);
 		return SSL_TLSEXT_ERR_OK;
 	}
@@ -3124,11 +3128,11 @@ static void ssl_sock_close(struct connection *conn) {
 
 	if (conn->xprt_ctx) {
 #ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME
-		if (!ssl_ctx_lru_tree && objt_listener(conn->target)) {
+		if ((conn->flags & CO_FL_DYN_SSL_CTX) && !ssl_ctx_lru_tree) {
 			SSL_CTX *ctx = SSL_get_SSL_CTX(conn->xprt_ctx);
-			if (ctx != objt_listener(conn->target)->bind_conf->default_ctx)
-				SSL_CTX_free(ctx);
+			SSL_CTX_free(ctx);
 		}
+		conn->flags &= ~CO_FL_DYN_SSL_CTX;
 #endif
 		SSL_free(conn->xprt_ctx);
 		conn->xprt_ctx = NULL;
Re: 1.6 segfaults
Le 15/10/2015 13:49, Joel Moss a écrit :
> So you may be right on the two certs on the same line bug. Just removed one
> of the certs and so far, so good. Can you verify?

FYI, I submitted a quick patch[1]. Could you check it ?

[1] https://www.mail-archive.com/haproxy@formilux.org/msg19948.html

-- Christopher Faulet
[PATCH] BUG: ssl: Fix conditions to release SSL_CTX when a SSL connection is closed
Hi,

Here is a proper patch to fix the recent bug reported on haproxy 1.6.0 when SNI is used.

Willy, I didn't wait for your reply, to speed up the code review. But if there is any problem with this patch, let me know.

Regards,
-- Christopher Faulet

>From c89e1256113aa36826b00706094ccde98490684d Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@qualys.com>
Date: Thu, 15 Oct 2015 15:29:34 +0200
Subject: [PATCH] BUG: ssl: Fix conditions to release SSL_CTX when a SSL
 connection is closed

When a SSL connection is closed, if a SSL certificate was generated and if
there is no cache to store it, it must be released. But we must be sure that
this is a generated certificate and not one of the default ones. This check
was buggy when multiple certificates were used on the same bind line. We
checked that the certificate is not the default one (ssl_ctx !=
bind_conf->default_ctx), but we should also check it against the SNI
certificates (bind_conf->sni_ctx and bind_conf->sni_wc_ctx).

This bug was introduced with commit d2cab92 and it leads to a segfault in
certain circumstances: a SNI certificate can be released when a connection is
closed and then reused for a new one.

This commit fixes the bug. Now, when a SSL certificate is generated, we set
the flag 'CO_FL_DYN_SSL_CTX' on the corresponding connection. So, when a
connection is closed, if this flag is set and if there is no cache to store
the generated SSL certificates, we release the certificate. This is also
faster than comparing a SSL certificate to the default ones.
---
 include/types/connection.h |  5 ++++-
 src/ssl_sock.c             | 10 +++++++---
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/include/types/connection.h b/include/types/connection.h
index dfbff6a..070d779 100644
--- a/include/types/connection.h
+++ b/include/types/connection.h
@@ -122,7 +122,10 @@ enum {
 	/* This connection may not be shared between clients */
 	CO_FL_PRIVATE       = 0x1000,
 
-	/* unused : 0x2000, 0x4000 */
+	/* A dynamically generated SSL certificate was used for this connection */
+	CO_FL_DYN_SSL_CTX   = 0x2000,
+
+	/* unused : 0x4000 */
 
 	/* This last flag indicates that the transport layer is used (for instance
 	 * by logs) and must not be cleared yet. The last call to conn_xprt_close()
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 5319532..2829af8 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -1232,6 +1232,7 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, struct bind_conf *s)
 	ctx = ssl_sock_get_generated_cert(serial, s);
 	if (ctx) {
 		/* switch ctx */
+		conn->flags |= CO_FL_DYN_SSL_CTX;
 		SSL_set_SSL_CTX(ssl, ctx);
 		return SSL_TLSEXT_ERR_OK;
 	}
@@ -1271,6 +1272,9 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, struct bind_conf *s)
 	if (s->generate_certs && (ctx = ssl_sock_generate_certificate(servername, s, ssl))) {
 		/* switch ctx */
+		struct connection *conn = (struct connection *)SSL_get_app_data(ssl);
+
+		conn->flags |= CO_FL_DYN_SSL_CTX;
 		SSL_set_SSL_CTX(ssl, ctx);
 		return SSL_TLSEXT_ERR_OK;
 	}
@@ -3124,11 +3128,11 @@ static void ssl_sock_close(struct connection *conn) {
 
 	if (conn->xprt_ctx) {
 #ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME
-		if (!ssl_ctx_lru_tree && objt_listener(conn->target)) {
+		if ((conn->flags & CO_FL_DYN_SSL_CTX) && !ssl_ctx_lru_tree) {
 			SSL_CTX *ctx = SSL_get_SSL_CTX(conn->xprt_ctx);
-			if (ctx != objt_listener(conn->target)->bind_conf->default_ctx)
-				SSL_CTX_free(ctx);
+			SSL_CTX_free(ctx);
 		}
+		conn->flags &= ~CO_FL_DYN_SSL_CTX;
#endif
 		SSL_free(conn->xprt_ctx);
 		conn->xprt_ctx = NULL;
-- 
2.4.3
Re: haproxy 1.6.0 crashes
Le 15/10/2015 14:45, Seri, Kim a écrit :
> Christopher Faulet <cfaulet@...> writes:
>> I confirm the bug. Here is a very quick patch. Could you confirm that it
>> works for you ?
>
> Hi,
>
> I can confirm this patch fixes the crash!!
>
> cf. because of my mail service, I've changed my e-mail
>
> Thanks a lot.

Great! Willy, is it OK with you if I add the CO_FL_DYN_SSL_CTX flag to track connections with a generated SSL certificate, or would you prefer I find another way to fix the bug ?

-- Christopher Faulet
[PATCH] MINOR: http: Add OPTIONS in supported http methods (found by find_http_meth)
Hi,

The 'OPTIONS' method was not in the list of supported HTTP methods, so find_http_meth returned HTTP_METH_OTHER instead of HTTP_METH_OPTIONS.

Regards
-- Christopher Faulet

>From ec4669273b06bee074389baa2d00ef0202dbea1c Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@qualys.com>
Date: Fri, 31 Jul 2015 14:26:57 +0200
Subject: [PATCH] MINOR: http: Add OPTIONS in supported http methods (found by
 find_http_meth)

---
 src/proto_http.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/proto_http.c b/src/proto_http.c
index 6d4a6b3..32b9063 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -458,6 +458,9 @@ const struct http_method_desc http_methods[26][3] = {
 	['H' - 'A'] = {
 		[0] = { .meth = HTTP_METH_HEAD, .len=4, .text="HEAD"},
 	},
+	['O' - 'A'] = {
+		[0] = { .meth = HTTP_METH_OPTIONS , .len=7, .text="OPTIONS" },
+	},
 	['P' - 'A'] = {
 		[0] = { .meth = HTTP_METH_POST, .len=4, .text="POST"},
 		[1] = { .meth = HTTP_METH_PUT , .len=3, .text="PUT" },
-- 
2.4.3
Minor SSL fixes
Hi,

Here are some SSL fixes. The last one is a fix for a bug reported in a previous thread[1].

[1] https://www.mail-archive.com/haproxy@formilux.org/msg19243.html

Regards,
-- Christopher Faulet

>From 4994166ef6be91e768607c67e431d5fc20fbde1e Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@qualys.com>
Date: Tue, 28 Jul 2015 16:03:47 +0200
Subject: [PATCH 1/3] BUG/MINOR: ssl: fix management of the cache where forged
 certificates are stored

First, the LRU cache must be initialized after the configuration parsing to
correctly set its size. Next, the function 'ssl_sock_set_generated_cert'
returns -1 when an error occurs (0 on success). In that case, the caller is
responsible for freeing the memory allocated for the certificate. Finally,
when a SSL certificate is generated by HAProxy but cannot be inserted in the
cache, it must be freed when the SSL connection is closed. This happens when
'tune.ssl.ssl-ctx-cache-size' is set to 0.

---
 include/proto/ssl_sock.h |  2 +-
 src/ssl_sock.c           | 38 ++++++++++++++++++--------------------
 2 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/include/proto/ssl_sock.h b/include/proto/ssl_sock.h
index c2156bb..1b6c081 100644
--- a/include/proto/ssl_sock.h
+++ b/include/proto/ssl_sock.h
@@ -72,7 +72,7 @@ int ssl_sock_load_global_dh_param_from_file(const char *filename);
 SSL_CTX *ssl_sock_create_cert(const char *servername, unsigned int serial, X509 *cacert, EVP_PKEY *capkey);
 SSL_CTX *ssl_sock_get_generated_cert(unsigned int serial, X509 *cacert);
-void ssl_sock_set_generated_cert(SSL_CTX *ctx, unsigned int serial, X509 *cacert);
+int ssl_sock_set_generated_cert(SSL_CTX *ctx, unsigned int serial, X509 *cacert);
 unsigned int ssl_sock_generated_cert_serial(const void *data, size_t len);
 
 #endif /* _PROTO_SSL_SOCK_H */
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 401ff67..deb658e 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -1127,8 +1127,9 @@ ssl_sock_get_generated_cert(unsigned int serial, X509 *cacert)
 	return NULL;
 }
 
-/* Set a certificate int the LRU cache used to store generated certificate. */
-void
+/* Set a certificate in the LRU cache used to store generated
+ * certificates. Return 0 on success, otherwise -1 */
+int
 ssl_sock_set_generated_cert(SSL_CTX *ssl_ctx, unsigned int serial, X509 *cacert)
 {
 	struct lru64 *lru = NULL;
 
 	if (ssl_ctx_lru_tree) {
 		lru = lru64_get(serial, ssl_ctx_lru_tree, cacert, 0);
 		if (!lru)
-			return;
+			return -1;
 		if (lru->domain && lru->data)
 			lru->free((SSL_CTX *)lru->data);
 		lru64_commit(lru, ssl_ctx, cacert, 0, (void (*)(void *))SSL_CTX_free);
+		return 0;
 	}
+	return -1;
 }
 
 /* Compute the serial that will be used to create/set/get a certificate. */
@@ -1164,13 +1167,13 @@ ssl_sock_generate_certificate(const char *servername, struct bind_conf *bind_conf,
 		lru = lru64_get(serial, ssl_ctx_lru_tree, cacert, 0);
 		if (lru && lru->domain)
 			ssl_ctx = (SSL_CTX *)lru->data;
-	}
-
-	if (!ssl_ctx) {
-		ssl_ctx = ssl_sock_create_cert(servername, serial, cacert, capkey);
-		if (lru)
+		if (!ssl_ctx && lru) {
+			ssl_ctx = ssl_sock_create_cert(servername, serial, cacert, capkey);
 			lru64_commit(lru, ssl_ctx, cacert, 0, (void (*)(void *))SSL_CTX_free);
+		}
 	}
+	else
+		ssl_ctx = ssl_sock_create_cert(servername, serial, cacert, capkey);
 	return ssl_ctx;
 }
@@ -2489,6 +2492,10 @@ ssl_sock_load_ca(struct bind_conf *bind_conf, struct proxy *px)
 	if (!bind_conf || !bind_conf->generate_certs)
 		return err;
 
+	if (global.tune.ssl_ctx_cache)
+		ssl_ctx_lru_tree = lru64_new(global.tune.ssl_ctx_cache);
+	ssl_ctx_lru_seed = (unsigned int)time(NULL);
+
 	if (!bind_conf->ca_sign_file) {
 		Alert("Proxy '%s': cannot enable certificate generation, "
 		      "no CA certificate File configured at [%s:%d].\n",
@@ -2838,7 +2845,6 @@ int ssl_sock_handshake(struct connection *conn, unsigned int flag)
 	}
 
 reneg_ok:
-
 	/* Handshake succeeded */
 	if (!SSL_session_reused(conn->xprt_ctx)) {
 		if (objt_server(conn->target)) {
@@ -3082,6 +3088,11 @@ static int ssl_sock_from_buf(struct connection *conn, struct buffer *buf, int fl
 static void ssl_sock_close(struct connection *conn) {
 
 	if (conn->xprt_ctx) {
+		if (!ssl_ctx_lru_tree && objt_listener(conn->target)) {
+			SSL_CTX *ctx = SSL_get_SSL_CTX(conn->xprt_ctx);
+			if (ctx != objt_listener(conn->target)->bind_conf->default_ctx)
+				SSL_CTX_free(ctx);
+		}
 		SSL_free(conn->xprt_ctx);
 		conn->xprt_ctx = NULL;
 		sslconns--;
@@ -5344,21 +5355,12 @@ static void __ssl_sock_init(void)
 #ifndef OPENSSL_NO_DH
 	ssl_dh_ptr_index = SSL_CTX_get_ex_new_index(0, NULL, NULL, NULL, NULL);
 #endif
-
-#ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME
-	/* Add a global parameter for the LRU cache size */
-	if (global.tune.ssl_ctx_cache)
Re: haproxy 1.6.0 crashes
Le 15/10/2015 16:55, Willy Tarreau a écrit :
> Hi Christopher,
>
> On Thu, Oct 15, 2015 at 03:22:52PM +0200, Christopher Faulet wrote:
>> Le 15/10/2015 14:45, Seri, Kim a écrit :
>>> I can confirm this patch fixes the crash!!
>>
>> Great! Willy, is it OK with you if I add the CO_FL_DYN_SSL_CTX flag to
>> track connections with a generated SSL certificate, or would you prefer I
>> find another way to fix the bug ?
>
> I'm still having doubts on the fix, because I feel like we're working around
> a design issue here. First, the problem is that it's unclear to me in which
> condition we may end up calling this code. How can it happen that we end up
> in this code with an empty LRU tree ? Can we generate certs without a cert
> cache ? Or can the cert cache be empty with some certs still in use ? If the
> latter, maybe instead we should keep a reference to the cache using the
> refcount so that we don't kill the entry as long as it's being used. Indeed,
> this is mostly a matter of being sure that we free an ssl_ctx that was
> allocated, so there should be other ways to do it than adding more SSL
> knowledge into the session. I'm not opposed to merging this fix as a quick
> one to fix the trouble for the affected users, but I'd prefer that we find a
> cleaner solution if possible.

Hi,

First, the LRU tree is only initialized when the SSL certs generation is configured on a bind line. So, in most cases, it is NULL (which is not the same thing as empty). When the SSL certs generation is used, if the cache is not NULL, such a certificate is pushed into the cache and there is no need to release it when the connection is closed. But the cache can be disabled in the configuration, so in that case we must free the generated certificate when the connection is closed. So here we really have a bug.
Here is the buggy part:

3125)	if (conn->xprt_ctx) {
3126) #ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME
3127)		if (!ssl_ctx_lru_tree && objt_listener(conn->target)) {
3128)			SSL_CTX *ctx = SSL_get_SSL_CTX(conn->xprt_ctx);
3129)			if (ctx != objt_listener(conn->target)->bind_conf->default_ctx)
3130)				SSL_CTX_free(ctx);
3131)		}
3132) #endif
3133)		SSL_free(conn->xprt_ctx);
3134)		conn->xprt_ctx = NULL;
3135)		sslconns--;
3136)	}

The check on line 3127 is not enough to determine whether this is a generated certificate or not. Because ssl_ctx_lru_tree is NULL, generated certificates, if any, must be freed. But here, ctx should also be compared to all SNI certificates, not only to default_ctx. Because of this bug, when a SNI certificate is used for a connection, it is erroneously freed when the connection is closed.

In my patch, I chose to use a specific flag on the connection instead of doing certificate comparisons. This seems to me easier to understand and more efficient. But it can be discussed; there are many other solutions, I guess.

Finally, we can of course discuss the design of this feature, no problem. I will be happy to find a more elegant way to handle it, if possible.

-- Christopher Faulet
Re: haproxy 1.6.0 crashes
Le 16/10/2015 10:38, Willy Tarreau a écrit :
> Thus this sparks a new question : when the cache is disabled, are we sure to
> always free the ssl_ctx on all error paths after it's generated ? Or are we
> certain that we always pass through ssl_sock_close() ?

That's a good question. Grepping for SSL_free, it should be good.

> The other problem I'm having is related to how we manage the LRU cache. Is
> there a risk that we kill some objects in this cache while they're still in
> use ?

The SSL_CTX and SSL objects are reference-counted, so there is no problem. When a SSL_CTX object is created, its refcount is set to 1. When a SSL connection uses it, it is incremented, and when the connection is closed, it is decremented. Of course, it is also decremented when SSL_CTX_free is called. When, during a call to SSL_free or SSL_CTX_free, the reference count reaches 0, the SSL_CTX object is freed. Note that SSL_free and SSL_CTX_free can be called in any order.

So, if SSL_CTX_free is called while a SSL connection uses the corresponding SSL_CTX object, there is no problem. Actually, this happens when a SSL_CTX object is evicted from the cache. There is no need to check whether it is used by a connection or not.

>> In my patch, I chose to use a specific flag on the connection instead of
>> doing certificate comparisons. This seems to me easier to understand and
>> more efficient. But it could be discussed; there are many other solutions,
>> I guess.
>
> I'm not disagreeing with your proposal, it may even end up being the only
> solution. I'm just having a problem with keeping this information outside of
> the ssl_ctx itself, while I think we have to deal with a refcount there due
> to it being present both in the tree and in the sessions which make use of
> it.
> Very likely we'll have the flag in the connection to indicate that the cert
> was generated and must be freed one way or another, but I'm still bothered
> by checking the lru_tree when doing this, because I fear that it means that
> we don't properly track the ssl_ctx's usage.

We do not track any reference count on SSL_CTX; it is done internally by openssl. The only thing we must do is to know whether it is a generated certificate, and to track whether it is in the cache or not.

>> Finally, we can of course discuss the design of this feature. There is no
>> problem. I will be happy to find a more elegant way to handle it, if it is
>> possible.
>
> Ideally we'd have the info in the ssl_ctx itself, but I remember that Emeric
> told me a while ago that we couldn't store anything in an ssl_ctx. Thus I
> can understand that we can't easily "tag" the ssl_ctx as being statically or
> dynamically allocated, which is why I understand the need for the flag on
> the connection as an alternative.

Well, I'm not an openssl guru. It is possible to store and retrieve data on a SSL_CTX object using the SSL_CTX_set_ex_data/SSL_CTX_get_ex_data functions. But I don't know if it is good practice to use them, and I don't know whether this is expensive or not. Functionally, I agree with you: it would be better to keep info about a SSL_CTX object inside the object itself. And, at the beginning, I considered using these functions, but I was not confident enough to do it. Maybe Emeric can enlighten us.

-- Christopher Faulet
Re: [PATCH] BUG: ssl: Fix conditions to release SSL_CTX when a SSL connection is closed
Le 15/10/2015 16:50, Christopher Faulet a écrit :
> Hi,
>
> Here is a proper patch to fix the recent bug reported on haproxy 1.6.0 when
> SNI is used.
>
> Willy, I didn't wait for your reply, to speed up the code review. But if
> there is any problem with this patch, let me know.
>
> Regards,

After our discussion on this bug, I reworked my patch to use the SSL_CTX_set_ex_data/SSL_CTX_get_ex_data functions. Willy, I let you choose the patch you prefer :)

PS: I did some checks and AFAIK it works. But a double-check wouldn't hurt...

-- Christopher Faulet

>From 171bb9ca0522c12d2f4f9a105c557e01c0011ecc Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@qualys.com>
Date: Thu, 15 Oct 2015 15:29:34 +0200
Subject: [PATCH] BUG: ssl: Fix conditions to release SSL_CTX object when a
 connection is closed

When a SSL connection is closed, if its associated SSL_CTX object was
generated and if it was not cached[1], then it must be freed[2]. But we must
check that it was generated, and this check is buggy when multiple
certificates are used on the same bind line. We check that the SSL_CTX object
is not the default one (ssl_ctx != bind_conf->default_ctx), but that is not
enough to determine whether it was generated or not. We should also check it
against the SNI certificates (bind_conf->sni_ctx and bind_conf->sni_wc_ctx).

This bug was introduced with commit d2cab92 and it leads to a segfault in
certain circumstances: a SNI certificate can be erroneously released when a
connection is closed.

This commit fixes the bug. Now, when a SSL_CTX object is generated, we mark
it using the SSL_CTX_set_ex_data function. Then, when the connection is
closed, we check for the presence of this mark using the SSL_CTX_get_ex_data
function. If the SSL_CTX object is marked and not cached (because
ssl_ctx_lru_tree is NULL), it is freed.
More information on this bug can be found on the HAProxy mailing-list[3].

[1] This happens when the cache used to store generated SSL_CTX objects does
    not exist (ssl_ctx_lru_tree == NULL) because the
    'tune.ssl.ssl-ctx-cache-size' option is set to 0. This cache is also NULL
    when the dynamic generation of SSL certificates is used on no listener.
[2] Cached SSL_CTX objects are released when the cache is destroyed (during
    HAProxy shutdown) or when one of them is evicted from the cache.
[3] https://www.mail-archive.com/haproxy@formilux.org/msg19937.html

---
 src/ssl_sock.c | 23 ++++++++++++++---------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 5319532..35a3edf 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -153,8 +153,10 @@ static char *x509v3_ext_values[X509V3_EXT_SIZE] = {
 };
 
 /* LRU cache to store generated certificate */
-static struct lru64_head *ssl_ctx_lru_tree = NULL;
-static unsigned int ssl_ctx_lru_seed = 0;
+static struct lru64_head *ssl_ctx_lru_tree  = NULL;
+static unsigned int       ssl_ctx_lru_seed  = 0;
+static int gen_ssl_ctx_ptr_index = -1;
+static int is_gen_ssl_ctx        = 1;
 #endif // SSL_CTRL_SET_TLSEXT_HOSTNAME
 
 #if (defined SSL_CTRL_SET_TLSEXT_STATUS_REQ_CB && !defined OPENSSL_NO_OCSP)
@@ -1128,6 +1130,9 @@ ssl_sock_do_create_cert(const char *servername, unsigned int serial,
 	}
 #endif
  end:
+	if (!SSL_CTX_set_ex_data(ssl_ctx, gen_ssl_ctx_ptr_index, &is_gen_ssl_ctx))
+		goto mkcert_error;
+
 	return ssl_ctx;
 
 mkcert_error:
@@ -3124,11 +3129,10 @@ static void ssl_sock_close(struct connection *conn) {
 
 	if (conn->xprt_ctx) {
 #ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME
-		if (!ssl_ctx_lru_tree && objt_listener(conn->target)) {
-			SSL_CTX *ctx = SSL_get_SSL_CTX(conn->xprt_ctx);
-			if (ctx != objt_listener(conn->target)->bind_conf->default_ctx)
-				SSL_CTX_free(ctx);
-		}
+		SSL_CTX *ctx  = SSL_get_SSL_CTX(conn->xprt_ctx);
+		int     *flag = SSL_CTX_get_ex_data(ctx, gen_ssl_ctx_ptr_index);
+		if (!ssl_ctx_lru_tree && flag != NULL && *flag == is_gen_ssl_ctx)
+			SSL_CTX_free(ctx);
 #endif
 		SSL_free(conn->xprt_ctx);
 		conn->xprt_ctx = NULL;
@@ -5392,6 +5396,11 @@ static void __ssl_sock_init(void)
 #ifndef OPENSSL_NO_DH
 	ssl_dh_ptr_index = SSL_CTX_get_ex_new_index(0, NULL, NULL, NULL, NULL);
 #endif
+
+#ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME
+	gen_ssl_ctx_ptr_index = SSL_CTX_get_ex_new_index(0, NULL, NULL, NULL, NULL);
+#endif
+
 }
 
 __attribute__((destructor))
-- 
2.4.3
Re: haproxy 1.6.0 crashes
Le 19/10/2015 17:01, Willy Tarreau a écrit :
> On Mon, Oct 19, 2015 at 03:06:44PM +0200, Christopher Faulet wrote:
>>> OK so the unused objects in the tree have a refcount of 1 while the used
>>> ones have 2 or more, thus the refcount is always valid. Good, that also
>>> means we must not test if the tree is null or not in ssl_sock_close(), we
>>> must always free the ssl_ctx as long as it was dynamically created, so
>>> that its refcount decreases, otherwise it keeps increasing upon every
>>> reuse.
>>
>> No. Maybe my explanation was not really clear. The SSL_CTX refcount is not
>> exposed; it is an internal parameter. So, it is not incremented when the
>> SSL_CTX is pushed into the cache tree. The call to SSL_set_SSL_CTX
>> increments the refcount and the call to SSL_free decrements it (when the
>> connection is closed). And, of course, the call to SSL_CTX_free decrements
>> it too. The SSL_CTX object is released when the refcount reaches 0. For a
>> SSL_CTX object, SSL_CTX_free must be called exactly once: when it is
>> evicted from the cache tree (or when the tree is destroyed) _OR_ when the
>> connection is closed if there is no cache tree. If we always release
>> SSL_CTX objects when the SSL connection is closed, we will have dangling
>> references for cached objects, leading to a segfault.
>
> OK, I understood the opposite, which is that we kept a refcount for each
> user (cache and/or sessions). But then how do we know that an SSL_CTX is
> still in use when we want to evict it from the cache and that we must not
> free it ? Is it just the fact that between the moment it's picked from the
> cache using ssl_sock_get_generated_cert() and the moment it's associated to
> a session using SSL_set_SSL_CTX() it's not possible to yield and destroy
> the cached object, so no race is possible here ? If so I'm fine with it for
> now (though it will become "fun" when we start to play with threads), I
> just want to be certain we're not overlooking this part as well.
This is not an issue because when we get (or create) a SSL_CTX object, it is associated to a session without any interruption. So it cannot be evicted from the cache in the middle. After this step, the refcount is >= 2, so if the SSL_CTX object is evicted from the cache, the refcount is decremented and the SSL_CTX is not released. It will be automatically released with the closure of the last SSL connection using it. But note that this only works in a non-threaded environment. Is there any plan to add thread support? If yes, this feature will not work.

Also that raises another point: if the issue is related to SSL_CTX_free() being called on static contexts, then to me it means that these contexts were not properly refcounted when assigned to the SSL. Don't you think that we should instead do something like the following to properly refcount any context attached to an SSL and ensure that the SSL_CTX_free() can always be performed regardless of parallel activities in the LRU tree or anything else?

 	/* Alloc a new SSL session ctx */
 	conn->xprt_ctx = SSL_new(objt_server(conn->target)->ssl_ctx.ctx);
+	SSL_set_SSL_CTX(conn->xprt_ctx, objt_server(conn->target)->ssl_ctx.ctx);

This last call will have no effect: because the SSL_CTX is the same, the function returns immediately. Note that if the context changes, the refcount of the old one is decremented. But there is no issue here, because the static contexts are only released when HAProxy is stopped. Here is the life cycle of static contexts:

- HAProxy is started; static contexts are initialized by calling SSL_CTX_new (refcount is set to 1).
- SSL connections use these contexts. SSL_new or SSL_set_SSL_CTX is called to assign a context to a SSL object; the refcount is incremented by 1 each time. When a SSL connection is closed, a call to SSL_free is done to release the SSL object, and the refcount of the associated context is decremented. So the refcount is always greater than or equal to 1.
- HAProxy is stopped; all connections are closed, and finally the static contexts are freed by calling SSL_CTX_free. The refcount is equal to 1, so when SSL_CTX_free is called, it reaches 0 and the context is freed.

The refcount is not incremented when a SSL_CTX object is pushed into the cache. There is no way to manually increment or decrement it. So we must really know whether the SSL_CTX object was cached or not when the SSL connection is closed.

I'm having an issue here as well, since the LRU's destroy callback is set to SSL_CTX_free. Thus we start with a non-null refcount. I'm sorry if I am not clear, but the problem I'm having could be described like this:
- two sets of entities can use a shared resource at any instant: cache and SSL sessions;
- each of them uses SSL_CTX_free() at release time to release the object;
- SSL_CTX_free() takes care of the refcount to know if it must free or not, which means that these two entities above are each responsi
Re: haproxy 1.6.0 crashes
Le 20/10/2015 14:07, Willy Tarreau a écrit : On Tue, Oct 20, 2015 at 01:59:52PM +0200, Willy Tarreau wrote:

Then my understanding is that we should instead proceed differently:
- the cert is generated. It gets a refcount = 1.
- we assign it to the SSL. Its refcount becomes two.
- we try to insert it into the tree. The tree will handle its freeing using SSL_CTX_free() during eviction.
- if we can't insert into the tree because the tree is disabled, then we have to call SSL_CTX_free() ourselves, and we'd rather do it immediately. It will more closely mimic the case where the cert is added to the tree and immediately evicted by concurrent activity on the cache.
- we never have to call SSL_CTX_free() during ssl_sock_close(), because the SSL session relies on openssl doing the right thing based on the refcount only.
- thus we never need to know how the cert was created, since the SSL_CTX_free() is either guaranteed or already done for generated certs, and this protects the other ones against any accidental call to SSL_CTX_free() without having to track where the cert comes from.

This patch does this, and based on my understanding of your explanations, it should do the right thing and be safe all the time. What's your opinion?

Yes, it should work, and it avoids keeping extra info on generated certificates. Good idea!

-- Christopher Faulet
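The ownership scheme agreed on above can be sketched with a toy refcount model. This is plain C, not the OpenSSL API: `toy_ctx`, `ctx_free()` and `generate_and_attach()` are made-up names standing in for SSL_CTX, SSL_CTX_free() and the generate/assign/cache sequence.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy refcount model of the scheme above (hypothetical names, not the
 * real OpenSSL API). */
struct toy_ctx { int refcount; };

/* models SSL_CTX_free(): drop one reference, free when it reaches 0 */
static int ctx_free(struct toy_ctx *c)
{
	if (--c->refcount == 0) {
		free(c);
		return 1;	/* really freed */
	}
	return 0;		/* still referenced elsewhere */
}

/* models cert generation + assignment to the SSL + optional cache insert */
static struct toy_ctx *generate_and_attach(int cache_enabled)
{
	struct toy_ctx *c = malloc(sizeof(*c));

	c->refcount = 1;	/* cert generated: refcount = 1 */
	c->refcount++;		/* assigned to the SSL: refcount = 2 */
	if (!cache_enabled)
		ctx_free(c);	/* no cache: drop our reference at once */
	/* else: the tree keeps the initial reference and calls ctx_free()
	 * on eviction; closing the SSL connection drops the other one */
	return c;
}
```

With the cache disabled, the context lives exactly as long as its connection; with the cache enabled, whichever happens last of {eviction, connection close} frees it, and ssl_sock_close() never needs to know where the context came from.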
Re: haproxy 1.6.0 crashes
Le 20/10/2015 14:41, Willy Tarreau a écrit : On Tue, Oct 20, 2015 at 02:14:37PM +0200, Christopher Faulet wrote: Le 20/10/2015 14:07, Willy Tarreau a écrit : On Tue, Oct 20, 2015 at 01:59:52PM +0200, Willy Tarreau wrote:

Then my understanding is that we should instead proceed differently:
- the cert is generated. It gets a refcount = 1.
- we assign it to the SSL. Its refcount becomes two.
- we try to insert it into the tree. The tree will handle its freeing using SSL_CTX_free() during eviction.
- if we can't insert into the tree because the tree is disabled, then we have to call SSL_CTX_free() ourselves, and we'd rather do it immediately. It will more closely mimic the case where the cert is added to the tree and immediately evicted by concurrent activity on the cache.
- we never have to call SSL_CTX_free() during ssl_sock_close(), because the SSL session relies on openssl doing the right thing based on the refcount only.
- thus we never need to know how the cert was created, since the SSL_CTX_free() is either guaranteed or already done for generated certs, and this protects the other ones against any accidental call to SSL_CTX_free() without having to track where the cert comes from.

This patch does this, and based on my understanding of your explanations, it should do the right thing and be safe all the time. What's your opinion?

Yes, it should work, and it avoids keeping extra info on generated certificates. Good idea!

Thanks. Do you have an easy reproducer for the issue with the certs? I tried a little bit but probably didn't test the proper sequence.

Of course. Here is a little config file:

    global
        tune.ssl.default-dh-param 2048
        daemon

    listen ssl_server
        mode tcp
        bind 127.0.0.1:4443 ssl crt srv1.test.com.pem crt srv2.test.com.pem
        timeout connect 5000
        timeout client 3
        timeout server 3
        server srv A.B.C.D:80

You just need to generate 2 SSL certificates with 2 CNs (here srv1.test.com and srv2.test.com).
Then, by doing SSL requests with the first CN, there is no problem. But with the second CN, it should segfault on the 2nd request.

    openssl s_client -connect 127.0.0.1:4443 -servername srv1.test.com   // OK
    openssl s_client -connect 127.0.0.1:4443 -servername srv1.test.com   // OK

But:

    openssl s_client -connect 127.0.0.1:4443 -servername srv2.test.com   // OK
    openssl s_client -connect 127.0.0.1:4443 -servername srv2.test.com   // KO

-- Christopher Faulet
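For completeness, one way to produce the two test certificates used by the reproducer above (a sketch; any self-signed pair with those CNs should do — note that haproxy's "crt" expects the certificate and its private key concatenated in a single .pem file):

```shell
# Generate two self-signed certificates whose CNs match the reproducer,
# then concatenate cert + key into the .pem files haproxy loads.
for cn in srv1.test.com srv2.test.com; do
    openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
        -subj "/CN=$cn" -keyout "$cn.key" -out "$cn.crt"
    cat "$cn.crt" "$cn.key" > "$cn.pem"
done
```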
Re: Minor SSL fixes
Le 09/10/2015 10:27, Willy Tarreau a écrit :

Hi Christopher, I applied the first two ones, but the last one seems to be doing a lot of stuff at the same time. It's not even clear to me whether it fixes something or improves something or does both, and the review is quite hard. Is it possible to cut it into functional parts? In practice we always want to do one patch per feature or per bug fix. If you don't think it can be easily cut by now, we can still blindly apply it, but I still don't feel at ease given the description, which seems to cover multiple aspects :-/

Hi Willy,

Thanks for your work and your feedback. I have split my patch in 3 parts.

-- Christopher Faulet

>From c9ceb5e9f575c012c3c63fc9f18f0416a01d7444 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@qualys.com>
Date: Fri, 9 Oct 2015 10:53:31 +0200
Subject: [PATCH 1/3] MINOR: ssl: Read the file used to generate certificates
 in any order

The file specified by the SSL option 'ca-sign-file' can now contain the CA
certificate used to dynamically generate certificates and its private key,
in any order.
---
 src/ssl_sock.c | 24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 0703bc4..54cd6a9 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -2508,43 +2508,39 @@ ssl_sock_load_ca(struct bind_conf *bind_conf, struct proxy *px)
 		Alert("Proxy '%s': cannot enable certificate generation, "
 		      "no CA certificate File configured at [%s:%d].\n",
 		      px->id, bind_conf->file, bind_conf->line);
-		err++;
-	}
-
-	if (err)
 		goto load_error;
+	}
 
 	/* read in the CA certificate */
 	if (!(fp = fopen(bind_conf->ca_sign_file, "r"))) {
 		Alert("Proxy '%s': Failed to read CA certificate file '%s' at [%s:%d].\n",
 		      px->id, bind_conf->ca_sign_file, bind_conf->file, bind_conf->line);
-		err++;
 		goto load_error;
 	}
 	if (!(cacert = PEM_read_X509(fp, NULL, NULL, NULL))) {
 		Alert("Proxy '%s': Failed to read CA certificate file '%s' at [%s:%d].\n",
 		      px->id, bind_conf->ca_sign_file, bind_conf->file, bind_conf->line);
-		fclose (fp);
-		err++;
-		goto load_error;
+		goto read_error;
 	}
+	rewind(fp);
 	if (!(capkey = PEM_read_PrivateKey(fp, NULL, NULL, bind_conf->ca_sign_pass))) {
 		Alert("Proxy '%s': Failed to read CA private key file '%s' at [%s:%d].\n",
 		      px->id, bind_conf->ca_sign_file, bind_conf->file, bind_conf->line);
-		fclose (fp);
-		err++;
-		goto load_error;
+		goto read_error;
 	}
-	fclose (fp);
+	fclose (fp);
 
 	bind_conf->ca_sign_cert = cacert;
 	bind_conf->ca_sign_pkey = capkey;
 	return err;
 
- load_error:
-	bind_conf->generate_certs = 0;
+ read_error:
+	fclose (fp);
 	if (capkey) EVP_PKEY_free(capkey);
 	if (cacert) X509_free(cacert);
+ load_error:
+	bind_conf->generate_certs = 0;
+	err++;
 	return err;
 }
-- 
2.4.3

>From 1bcf51472a1944c807c4089e61d4b3ed1fa585a0 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@qualys.com>
Date: Fri, 9 Oct 2015 11:15:03 +0200
Subject: [PATCH 2/3] MINOR: ssl: Add support for EC for the CA used to sign
 generated certificates

This is done by adding the EVP_PKEY_EC type to the supported types for the
CA private key when we get the message digest used to sign a generated X509
certificate. So now we support DSA, RSA and EC private keys. And to be safe,
when the type of the private key is not directly supported, we get its
default message digest using the function 'EVP_PKEY_get_default_digest_nid'.

We also use the key of the default certificate instead of generating one. So
we are sure to use the same key type instead of always using an RSA key.
---
 include/proto/ssl_sock.h |  6 +++---
 src/ssl_sock.c           | 61
 2 files changed, 39 insertions(+), 28 deletions(-)

diff --git a/include/proto/ssl_sock.h b/include/proto/ssl_sock.h
index b877580..24b4330 100644
--- a/include/proto/ssl_sock.h
+++ b/include/proto/ssl_sock.h
@@ -71,9 +71,9 @@ void tlskeys_finalize_config(void);
 int ssl_sock_load_global_dh_param_from_file(const char *filename);
 #endif
 
-SSL_CTX *ssl_sock_create_cert(const char *servername, unsigned int serial, X509 *cacert, EVP_PKEY *capkey);
-SSL_CTX *ssl_sock_get_generated_cert(unsigned int serial, X509 *cacert);
-int ssl_sock_set_generated_cert(SSL_CTX *ctx, unsigned int serial, X509 *cacert);
+SSL_CTX *ssl_sock_create_cert(struct connection *conn, const char *servername, unsigned int serial);
+SSL_CTX *ssl_sock_get_generated_cert(unsigned int serial, struct bind_conf *bind_conf);
+int ssl_sock_set_generated_cert(SSL_CTX *ctx, unsigned int serial, struct bind_conf *bind_conf);
 unsigned int ssl_sock_generated_cert_serial(const void *data, size_t len);
 
 #endif /* _PROTO_SSL_SOCK_H */
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 54cd6a9..03751a3 100644
--- a/src
Re: Minor SSL fixes
Le 09/10/2015 12:19, Willy Tarreau a écrit : On Fri, Oct 09, 2015 at 11:59:00AM +0200, Christopher Faulet wrote:

Thanks for the quick response. I've applied them. During a build attempt I noticed that your previous patch "BUG/MINOR: ssl: fix management..." broke the build here due to some missing #ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME that I had to reintroduce (openssl 0.9.8 here). I thought that an alternative could simply be to declare ssl_ctx_lru_tree outside of the ifdef, but I noticed there is a new test in ssl_sock_close() based on ssl_ctx_lru_tree being NULL, so I'm having doubts now about doing this, since I don't understand why this SSL_CTX_free() call, which depends on the ability to build a cert on the fly, only has to be called if the cert tree is not initialized :-/ I would appreciate it if you could double-check. Thanks!

You're right, I didn't check it on old versions of openssl. My bad. ssl_ctx_lru_tree could be defined outside the ifdef, but it is only used when the SNI extension is available, so there is no reason to initialize it if there is no SNI. Then, when SNI is available, the tree can be NULL if the cache of generated certificates is disabled (tune.ssl.ssl-ctx-cache-size == 0). So, in this situation, we need to free the certificate when the SSL connection is closed to avoid a memory leak.
We may want to generate SSL certificates dynamically without any cache.

-- Christopher Faulet
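The build fix discussed earlier in the thread boils down to a version guard around EVP_PKEY_get_default_digest_nid(), which only exists since OpenSSL 1.0.0. A minimal, self-contained illustration of the guard pattern — here OPENSSL_VERSION_NUMBER is faked so the sketch compiles without the OpenSSL headers, and `pick_digest_strategy` is a made-up helper, not haproxy code:

```c
#include <assert.h>
#include <string.h>

/* OPENSSL_VERSION_NUMBER encodes the version as 0xMNNFFPPS; releases of
 * the 1.0.0 branch and later compare >= 0x1000000fL. We fake the macro
 * here (pretending to be 0.9.8) so this sketch builds stand-alone. */
#ifndef OPENSSL_VERSION_NUMBER
#define OPENSSL_VERSION_NUMBER 0x0090800fL
#endif

static const char *pick_digest_strategy(void)
{
#if OPENSSL_VERSION_NUMBER >= 0x1000000fL
	/* >= 1.0.0: ask the private key for its default digest */
	return "EVP_PKEY_get_default_digest_nid";
#else
	/* < 1.0.0: the function does not exist; fall back to a
	 * fixed digest for unusual key types */
	return "fallback";
#endif
}
```

The same #if, placed around the EVP_PKEY_get_default_digest_nid() call in ssl_sock_do_create_cert(), is what the attached BUILD patch does.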
Re: certificate generation
Le 04/09/2015 23:32, Michael Rennecke a écrit :

Hello, is it possible with HAProxy to generate a certificate for each incoming hostname on the fly? I will use a sub-CA for HAProxy. I think generating the certificates on the fly is cooler than a certificate for each hostname. I found possibilities to generate the certificate, but this doesn't work :-(

    bind unix@/var/run/haproxy_ssl_ecc.sock accept-proxy ssl crt /etc/haproxy/ecc_star.rennecke.dyndns.dk.pem ca-sign-file /etc/haproxy/ecc_subca.pem ecdhe secp521r1 user nobody generate-certificates

ecc_subca.pem includes the sub-CA and the key. The key has no pass phrase. I will balance some other (fun) TLDs with haproxy - my small home automation project.

Hi Michael,

The "generate-certificates" option creates certificates on the fly, but only for clients using the TLS SNI extension to set the remote hostname. So be sure that your clients use it.

-- Christopher Faulet
Filters bugfixes
Hi,

Here are 2 patches fixing bugs in data filtering.

-- Christopher

>From bdfa4535d6fe1f0f5bca52a48d2e4205e41bbd81 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <christopher.fau...@capflam.org>
Date: Tue, 21 Jun 2016 10:44:32 +0200
Subject: [PATCH 1/2] BUG/MEDIUM: filters: Fix data filtering when data are
 modified

Filters can alter data during the parsing, i.e. when the http_data or
tcp_data callbacks are called. For now, the update must be done by hand. So
we must handle changes in the channel buffers, especially on the number of
input bytes pending (buf->i). In addition, a filter can choose to switch
channel buffers to do its updates. So, during data filtering, we must always
use the right buffer and not use a local variable to reference it.

Without this patch, filters cannot safely alter data during the data
parsing.
---
 src/filters.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/src/filters.c b/src/filters.c
index 139440d..7f8fae4 100644
--- a/src/filters.c
+++ b/src/filters.c
@@ -418,12 +418,11 @@ int
 flt_http_data(struct stream *s, struct http_msg *msg)
 {
 	struct filter *filter;
-	struct buffer *buf = msg->chn->buf;
 	unsigned int   buf_i;
-	int            ret = 0;
+	int            delta = 0, ret = 0;
 
 	/* Save buffer state */
-	buf_i = buf->i;
+	buf_i = msg->chn->buf->i;
 	list_for_each_entry(filter, &strm_flt(s)->filters, list) {
 		unsigned int *nxt;
@@ -440,9 +439,12 @@ flt_http_data(struct stream *s, struct http_msg *msg)
 		*nxt = msg->next;
 		if (FLT_OPS(filter)->http_data) {
+			unsigned int i = msg->chn->buf->i;
+
 			ret = FLT_OPS(filter)->http_data(s, filter, msg);
 			if (ret < 0)
 				break;
+			delta += (int)(msg->chn->buf->i - i);
 
 			/* Update the next offset of the current filter */
 			*nxt += ret;
@@ -450,18 +452,18 @@ flt_http_data(struct stream *s, struct http_msg *msg)
 			/* And set this value as the bound for the next
 			 * filter. It will not able to parse more data than this
 			 * one. */
-			buf->i = *nxt;
+			msg->chn->buf->i = *nxt;
 		}
 		else {
 			/* Consume all available data and update the next offset
 			 * of the current filter. buf->i is untouched here. */
-			ret = MIN(msg->chunk_len + msg->next, buf->i) - *nxt;
+			ret = MIN(msg->chunk_len + msg->next, msg->chn->buf->i) - *nxt;
 			*nxt += ret;
 		}
 	}
 
 	/* Restore the original buffer state */
-	buf->i = buf_i;
+	msg->chn->buf->i = buf_i + delta;
 
 	return ret;
 }
@@ -806,12 +808,11 @@ static int
 flt_data(struct stream *s, struct channel *chn)
 {
 	struct filter *filter;
-	struct buffer *buf = chn->buf;
 	unsigned int   buf_i;
-	int            ret = 0;
+	int            delta = 0, ret = 0;
 
 	/* Save buffer state */
-	buf_i = buf->i;
+	buf_i = chn->buf->i;
 	list_for_each_entry(filter, &strm_flt(s)->filters, list) {
 		unsigned int *nxt;
@@ -822,9 +823,12 @@ flt_data(struct stream *s, struct channel *chn)
 		nxt = &FLT_NXT(filter, chn);
 		if (FLT_OPS(filter)->tcp_data) {
+			unsigned int i = chn->buf->i;
+
 			ret = FLT_OPS(filter)->tcp_data(s, filter, chn);
 			if (ret < 0)
 				break;
+			delta += (int)(chn->buf->i - i);
 
 			/* Increase next offset of the current filter */
 			*nxt += ret;
@@ -832,11 +836,11 @@ flt_data(struct stream *s, struct channel *chn)
 			/* And set this value as the bound for the next
 			 * filter. It will not able to parse more data than the
 			 * current one. */
-			buf->i = *nxt;
+			chn->buf->i = *nxt;
 		}
 		else {
 			/* Consume all available data */
-			*nxt = buf->i;
+			*nxt = chn->buf->i;
 		}
 
 		/* Update  value to be sure to have the last one when we
@@ -846,7 +850,7 @@ flt_data(struct stream *s, struct channel *chn)
 	}
 
 	/* Restore the original buffer state */
-	chn->buf->i = buf_i;
+	chn->buf->i = buf_i + delta;
 
 	return ret;
 }
-- 
2.5.5

>From f7b54e739b1d2e56b0969fbd8ee9ba2f0e229639 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <christopher.fau...@capflam.org>
Date: Tue, 21 Jun 2016 11:04:34 +0200
Subject: [PATCH 2/2] BUG/MINOR: filters: Fix HTTP parsing when a filter loops
 on data forwarding

A filter can choose to loop on data forwarding. When this loop occurs in the
HTTP_MSG_ENDING state, http_forward_data callbacks are called twice because
of a goto on the wrong label. A filter can also choose to loop at the end of
a HTTP message, in the http_end callback. Here the goto is right, but the
label is not at the right place. We must be sure to update the msg->sov
value.
---
 src/proto_http.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/proto_http.c b/src/proto_http.c
index 78c1ac3..4a6e6d3 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -6851,7 +6851,7 @@ http_msg_forward_body(struct stream *s, struct http_msg *msg)
 		b_adv(chn->buf, ret);
 		msg->next -= ret
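The delta-tracking pattern in the first patch can be sketched in isolation. This is a toy model (plain C, not haproxy's real struct buffer/filter types): the input byte count is saved before the filters run, each filter may grow or shrink the pending data, and the accumulated delta is applied when the saved state is restored.

```c
#include <assert.h>

/* Toy model of the fix above: buf->i is saved, each filter may change it,
 * and the accumulated delta is applied on restore. */
struct toy_buf { unsigned int i; };    /* i = input bytes pending */

typedef int (*toy_filter)(struct toy_buf *);  /* returns parsed bytes, <0 on error */

static int run_filters(struct toy_buf *buf, toy_filter *filters, int nb)
{
	unsigned int buf_i = buf->i;   /* save buffer state */
	int delta = 0, ret = 0, f;

	for (f = 0; f < nb; f++) {
		unsigned int i = buf->i;

		ret = filters[f](buf);
		if (ret < 0)
			break;
		delta += (int)(buf->i - i);  /* bytes added/removed by this filter */
	}
	buf->i = buf_i + delta;        /* restore, adjusted by the delta */
	return ret;
}

/* example filter that injects 3 bytes of data into the buffer */
static int inject3(struct toy_buf *b)
{
	b->i += 3;
	return (int)b->i;
}
```

Without the `+ delta` on restore, the 3 injected bytes would be silently discarded when the original state is put back — which is exactly the bug the patch fixes.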
Re: [PATCH] MAJOR: ssl: add 'tcp-fallback' bind option for SSL listeners
Le 09/02/2016 09:04, Willy Tarreau a écrit :

Thanks for this. It looks clean enough to be merged. I'm a little bit concerned with the addition of conn->ssl_detection_exp, because we try to keep the connection struct as small as possible. But in this case there's no other place to store it. Thus I would change it to "exp" and ensure it can be used by whatever requires an expiration in the connection. That may be used in the future to wait for the proxy protocol header, or for an outgoing connection timeout, for example. That also makes me wonder how we currently deal with a timeout during the SSL handshake; I'll have to check this. Maybe something is redundant, I don't remember.

There's something interesting in your approach. When we started to implement SSL, I wondered if we could have a "tcp-request connection" rule such as "expect-ssl" or something like this, so that we could decide whether or not we'll use SSL based on some ACLs. One of the difficulties I didn't solve was the ability to specify the SSL parameters in the rule. With your model we could easily imagine having a bind option to disable the SSL upgrade by default, and only enable it once the correct rule has been matched in tcp-request connection.

Hi Willy,

Thanks! About the fallback timeout, maybe I missed something, but I didn't find any parameter to set a timeout on the SSL handshake. By the way, good point about the renaming of 'ssl_detection_exp' into 'exp'. It is always good to take a step back and think a minute about possible extensions :)

-- Christopher
[PATCH] MAJOR: ssl: add 'tcp-fallback' bind option for SSL listeners
>From a3b372da2463e98b13e016c9b56344757b0e94bc Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@qualys.com>
Date: Wed, 29 Jul 2015 16:01:57 +0200
Subject: [PATCH] MAJOR: ssl: add 'tcp-fallback' bind option for SSL listeners

This option can be used to fall back on TCP when a non-SSL connection is
established on a SSL listener. A delay must be defined to set the maximum
time to decide whether an incoming connection is a SSL connection or a TCP
connection. If the timeout is reached or if the SSL detection fails, we fall
back on TCP. By setting it to 0, haproxy will wait indefinitely for more
data to detect the connection type. If used, this option must be set on a
ssl bind line:

    bind 0.0.0.0:1234 ssl ... tcp-fallback 250ms

It is a good idea to set a timeout if you expect to handle
'server-initiated' protocols, like SMTP. Else the connection will be blocked
on the SSL detection. If you are sure to handle 'client-initiated' protocols
only, it is safe to set it to 0.

Internally, the detection is inverted. Each connection is started as a TCP
connection and we try to switch it to SSL by parsing the first few bytes
sent by the client to detect the SSL ClientHello message.
---
 doc/configuration.txt      |  12 +
 include/proto/connection.h |   1 +
 include/proto/ssl_sock.h   |   1 +
 include/types/connection.h |   5 ++-
 include/types/listener.h   |  29 +++
 src/connection.c           |   4 ++
 src/listener.c             |   7 ++-
 src/session.c              |   7 ++-
 src/ssl_sock.c             | 107 -
 9 files changed, 156 insertions(+), 17 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 7dd5744..89dcd0a 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -9960,6 +9960,18 @@ strict-sni
   a certificate. The default certificate is not used. See the "crt" option
   for more information.
 
+tcp-fallback <delay>
+  This setting is only available when support for OpenSSL was built in. It
+  enables a TCP fallback for SSL listeners when a non-SSL connection is
+  established or when the timeout is reached before the SSL ClientHello
+  message is received. By setting the delay to 0, haproxy will wait
+  indefinitely. It is a good idea to set a delay if you expect to handle
+  'server-initiated' protocols, like SMTP. Else the connection will be
+  blocked, waiting for a SSL ClientHello message that might never be
+  received. The drawback of using this option for such protocols is to add
+  latency to all non-SSL connections. If you are sure to handle
+  'client-initiated' protocols only, it is safe to set it to 0.
+
 tcp-ut <delay>
   Sets the TCP User Timeout for all incoming connections instantiated from
   this listening socket. This option is available on Linux since version
   2.6.37. It
diff --git a/include/proto/connection.h b/include/proto/connection.h
index 952f9ea..b63aef1 100644
--- a/include/proto/connection.h
+++ b/include/proto/connection.h
@@ -482,6 +482,7 @@ static inline void conn_init(struct connection *conn)
 	conn->target = NULL;
 	conn->proxy_netns = NULL;
 	LIST_INIT(&conn->list);
+	conn->ssl_detection_exp = 0;
 }
 
 /* Tries to allocate a new connection and initialized its main fields. The
diff --git a/include/proto/ssl_sock.h b/include/proto/ssl_sock.h
index cb9a1e9..71dd727 100644
--- a/include/proto/ssl_sock.h
+++ b/include/proto/ssl_sock.h
@@ -43,6 +43,7 @@ int ssl_sock_is_ssl(struct connection *conn)
 }
 
 int ssl_sock_handshake(struct connection *conn, unsigned int flag);
+int ssl_sock_detection(struct connection *conn, int flag);
 int ssl_sock_prepare_ctx(struct bind_conf *bind_conf, SSL_CTX *ctx, struct proxy *proxy);
 int ssl_sock_prepare_all_ctx(struct bind_conf *bind_conf, struct proxy *px);
 int ssl_sock_prepare_srv_ctx(struct server *srv, struct proxy *px);
diff --git a/include/types/connection.h b/include/types/connection.h
index dfbff6a..185c630 100644
--- a/include/types/connection.h
+++ b/include/types/connection.h
@@ -107,10 +107,10 @@ enum {
 	CO_FL_SEND_PROXY    = 0x0100,  /* send a valid PROXY protocol header */
 	CO_FL_SSL_WAIT_HS   = 0x0200,  /* wait for an SSL handshake to complete */
 	CO_FL_ACCEPT_PROXY  = 0x0400,  /* receive a valid PROXY protocol header */
-	/* unused : 0x0800 */
+	CO_FL_SSL_DETECTION = 0x0800,  /* try to detect ssl connection */
 
 	/* below we have all handshake flags grouped into one */
-	CO_FL_HANDSHAKE = CO_FL_SEND_PROXY | CO_FL_SSL_WAIT_HS | CO_FL_ACCEPT_PROXY,
+	CO_FL_HANDSHAKE = CO_FL_SEND_PROXY | CO_FL_SSL_WAIT_HS | CO_FL_ACCEPT_PROXY | CO_FL_SSL_DETECTION,
 
 	/* when any of these flags is set, polling is defined by socket-layer
 	 * operations, as opposed to data-layer. Transport is explicitly not
@@ -257,6 +257,7 @@ struct connection {
 	void *xprt_ctx;  /* general purpose pointer, initialized to NULL */
 	void *owner;     /* pointer to upper layer's entity (eg: str
[PATCH] BUG/MINOR: ssl: Be sure to use unique serial for regenerated certificates
>From 5d3a89943c9eb855837c0d606ae361825b6e2800 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@qualys.com>
Date: Thu, 12 Nov 2015 11:35:51 +0100
Subject: [PATCH] BUG/MINOR: ssl: Be sure to use unique serial for regenerated
 certificates

The serial number for a generated certificate was computed using the
requested servername, without any variable/random part. This is not a
problem as long as the certificate is not regenerated. But if the cache is
disabled, or when the certificate is evicted from the cache, we may need to
regenerate it. It is important not to reuse the same serial number for the
new certificate, else clients (especially browsers) trigger a warning
because 2 certificates issued by the same CA have the same serial number.

So now the serial is a static variable initialized with now_ms (the internal
date in milliseconds) and incremented at each new certificate generation.

(Ref MPS-2031)
---
 include/proto/ssl_sock.h |  8 ++++----
 src/ssl_sock.c           | 42 +++++++++++++++++++++---------------------
 2 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/include/proto/ssl_sock.h b/include/proto/ssl_sock.h
index 24b4330..cb9a1e9 100644
--- a/include/proto/ssl_sock.h
+++ b/include/proto/ssl_sock.h
@@ -71,10 +71,10 @@ void tlskeys_finalize_config(void);
 int ssl_sock_load_global_dh_param_from_file(const char *filename);
 #endif
 
-SSL_CTX *ssl_sock_create_cert(struct connection *conn, const char *servername, unsigned int serial);
-SSL_CTX *ssl_sock_get_generated_cert(unsigned int serial, struct bind_conf *bind_conf);
-int ssl_sock_set_generated_cert(SSL_CTX *ctx, unsigned int serial, struct bind_conf *bind_conf);
-unsigned int ssl_sock_generated_cert_serial(const void *data, size_t len);
+SSL_CTX *ssl_sock_create_cert(struct connection *conn, const char *servername, unsigned int key);
+SSL_CTX *ssl_sock_get_generated_cert(unsigned int key, struct bind_conf *bind_conf);
+int ssl_sock_set_generated_cert(SSL_CTX *ctx, unsigned int key, struct bind_conf *bind_conf);
+unsigned int ssl_sock_generated_cert_key(const void *data, size_t len);
 
 #endif /* _PROTO_SSL_SOCK_H */
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 5200069..5cec6a4 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -1118,9 +1118,10 @@ static DH *ssl_get_tmp_dh(SSL *ssl, int export, int keylen);
 
 /* Create a X509 certificate with the specified servername and serial. This
  * function returns a SSL_CTX object or NULL if an error occurs. */
 static SSL_CTX *
-ssl_sock_do_create_cert(const char *servername, unsigned int serial,
-                        struct bind_conf *bind_conf, SSL *ssl)
+ssl_sock_do_create_cert(const char *servername, struct bind_conf *bind_conf, SSL *ssl)
 {
+	static unsigned int serial = 0;
+
 	X509 *cacert     = bind_conf->ca_sign_cert;
 	EVP_PKEY *capkey = bind_conf->ca_sign_pkey;
 	SSL_CTX *ssl_ctx = NULL;
@@ -1143,7 +1144,9 @@ ssl_sock_do_create_cert(const char *servername, unsigned int serial,
 	 * number */
 	if (X509_set_version(newcrt, 2L) != 1)
 		goto mkcert_error;
-	ASN1_INTEGER_set(X509_get_serialNumber(newcrt), serial);
+	if (!serial)
+		serial = now_ms;
+	ASN1_INTEGER_set(X509_get_serialNumber(newcrt), serial++);
 
 	/* Set duration for the certificate */
 	if (!X509_gmtime_adj(X509_get_notBefore(newcrt), (long)-60*60*24) ||
@@ -1248,21 +1251,22 @@ ssl_sock_do_create_cert(const char *servername, unsigned int serial,
 }
 
 SSL_CTX *
-ssl_sock_create_cert(struct connection *conn, const char *servername, unsigned int serial)
+ssl_sock_create_cert(struct connection *conn, const char *servername, unsigned int key)
 {
 	struct bind_conf *bind_conf = objt_listener(conn->target)->bind_conf;
-	return ssl_sock_do_create_cert(servername, serial, bind_conf, conn->xprt_ctx);
+
+	return ssl_sock_do_create_cert(servername, bind_conf, conn->xprt_ctx);
 }
 
 /* Do a lookup for a certificate in the LRU cache used to store generated
  * certificates. */
 SSL_CTX *
-ssl_sock_get_generated_cert(unsigned int serial, struct bind_conf *bind_conf)
+ssl_sock_get_generated_cert(unsigned int key, struct bind_conf *bind_conf)
 {
 	struct lru64 *lru = NULL;
 
 	if (ssl_ctx_lru_tree) {
-		lru = lru64_lookup(serial, ssl_ctx_lru_tree, bind_conf->ca_sign_cert, 0);
+		lru = lru64_lookup(key, ssl_ctx_lru_tree, bind_conf->ca_sign_cert, 0);
 		if (lru && lru->domain)
 			return (SSL_CTX *)lru->data;
 	}
@@ -1272,12 +1276,12 @@ ssl_sock_get_generated_cert(unsigned int serial, struct bind_conf *bind_conf)
 
 /* Set a certificate int the LRU cache used to store generated
  * certificate. Return 0 on success, otherwise -1 */
 int
-ssl_sock_set_generated_cert(SSL_CTX *ssl_ctx, unsigned int serial, struct bind_conf *bind_conf)
+ssl_sock_set_generated_cert(SSL_CTX *ssl_ctx, unsigned int key, struct bind_conf *bind_conf)
 {
 	struct lru64 *lru = NULL;
 
 	if (ssl_ctx_lru_tree) {
-		lru = lru64_get(serial, ssl_ctx_lru_tree, bind_conf->ca_sign_cert, 0);
+		lru = lru64_get
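The serial scheme in the patch above is just a clock-seeded counter. A stand-alone sketch (plain C; `next_cert_serial` is a made-up name, and `time(NULL)` stands in for haproxy's now_ms):

```c
#include <time.h>

/* Sketch of the unique-serial scheme from the patch above: a static
 * counter is seeded once from the clock and incremented for every
 * generated certificate, so a regenerated cert never reuses the serial
 * of its evicted predecessor. */
static unsigned int next_cert_serial(void)
{
	static unsigned int serial = 0;

	if (!serial)
		serial = (unsigned int)time(NULL);  /* stands in for now_ms */
	return serial++;
}
```

The seed makes serials differ across haproxy restarts as well, which a plain zero-based counter would not.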
Re: [PATCH] MAJOR: ssl: add 'tcp-fallback' bind option for SSL listeners
Hi, I've slightly updated my patch to improve it and to fix some inconsistencies. First of all, now "ssl-upgrade" and "no-ssl-upgrade" actions can be used on "tcp-request content" rules _AND_ "tcp-request connection" rules, in a frontend _OR_ a backend definition. Then, these actions are now custom actions. I think this is cleaner this way. And finally, by default, no SSL upgrade is done when "defer-ssl-upgrade" option is used. So you need to use explicitly a "ssl-upgrade" rule to perform it. For a lack of finding the right place to do SSL upgrades when no "tcp-request" rule is defined, I've decided to change the default behavior. I've kept the "defer-ssl-upgrade" keyword, but now, "skip-ssl-upgrade" could be more appropriate. If you prefer, i can do the change. -- Christopher >From 05c439bfcb39ae297c1d1652f8e4fd72850ff775 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@qualys.com> Date: Thu, 3 Mar 2016 16:15:28 +0100 Subject: [PATCH] MAJOR: ssl: Allow the postpone of the SSL upgrade and a way to enable/disable it the bind option "defer-ssl-upgrade" has been added. It can be used iff "ssl" option is set. It postpone the SSL upgrade when evaluation of "tcp-request" rules. 2 actions have been added in "tcp-request connection/content" rules to conditionally enable/disable this upgrade: * ssl-upgrade * no-ssl-upgrade The "ssl-upgrade" is used to upgrade a TCP connection to a SSL connection and "no-ssl-upgrade" is used to disable it. This is only available for connections opened on a SSL listener configured with the "defer-ssl-upgrade" option and ignored for all others. When a "ssl-upgrade" rule matches, it invalidates "no-ssl-upgrade" rules that follow, and conversely. if neither the "ssl-upgrade" rules nor the "no-ssl-upgrade" rules matchn, no SSL upgrade is performed (this is the default behavior). 
Here are two examples:

    # Do the SSL connection upgrade only if the SNI field is set to "example.com"
    frontend l
        mode tcp
        bind *:1234 ssl crt /path/to/srv.pem defer-ssl-upgrade
        tcp-request inspect-delay 5s
        tcp-request content ssl-upgrade if { req.ssl_sni -i example.com }
        # by default no SSL upgrade is performed
        use_backend offloaded-ssl if ssl_fc
        default_backend raw-ssl

    # Accept SSL and non-SSL connections on the same listener
    frontend l
        mode tcp
        bind *:1234 ssl crt /path/to/srv.pem defer-ssl-upgrade
        tcp-request inspect-delay 5s
        tcp-request content ssl-upgrade if { req.ssl_ver gt 0 }
        use_backend ssl if ssl_fc
        default_backend tcp

This feature is based on the work of PiBa-NL <piba.nl@gmail.com>
---
 doc/configuration.txt      | 36 -
 include/proto/ssl_sock.h   |  1 +
 include/types/connection.h |  4 +-
 include/types/listener.h   | 25 ++--
 src/listener.c             |  7 +++-
 src/proto_tcp.c            | 99 --
 src/session.c              |  6 +++
 src/ssl_sock.c             | 71 +
 8 files changed, 230 insertions(+), 19 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 279d076..d490afa 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -8534,7 +8534,7 @@ tcp-request connection [{if | unless} ]
   accept the incoming connection. There is no specific limit to the number
   of rules which may be inserted.

-  Four types of actions are supported :
+  Following actions are supported :

     - accept :
       accepts the connection if the condition is true (when used with "if")
       or false (when used with "unless"). The first such rule executed ends
@@ -8646,6 +8646,19 @@ tcp-request connection [{if | unless} ]
       TCP reset doesn't pass the first router, though it's still delivered to
       local networks. Do not use it unless you fully understand how it works.

+    - "ssl-upgrade" :
+      When the SSL upgrade of a TCP connection was postponed, this action
+      performs it. This is only available for connections opened on an SSL
+      listener configured with the "defer-ssl-upgrade" option and ignored for
+      all others.
+      The next "ssl-upgrade" and "no-ssl-upgrade" rules will have no
+      effect. See also the bind option "defer-ssl-upgrade".
+
+    - "no-ssl-upgrade" :
+      This does the opposite of the previous action. It explicitly disables
+      any postponed SSL upgrade. The next "ssl-upgrade" and "no-ssl-upgrade"
+      rules will have no effect. See also the bind option "defer-ssl-upgrade".
+

 Note that the "if/unless" condition is optional. If no condition is
Re: [PATCH] MAJOR: ssl: add 'tcp-fallback' bind option for SSL listeners
Le 28/02/2016 23:32, PiBa-NL a écrit :
> Hi Christopher, Willy,
>
> I've created a patch that can be applied on top of your tcp-fallback patch
> to allow for 'conditional' offloading. It shows the 'ability' to have both
> offloading and pass-through for SSL depending on an SNI name or other ACL
> criteria.
>
> - I resorted rather heavily to changing flags and statuses to make it step
>   back to handshake negotiation. And although it seems to work, it should be
>   checked whether any of this could possibly cause hangs / memory leaks /
>   other trouble.
> - A task timer was added to fix excessive looping while waiting for the
>   tcp-fallback to happen in the case of an SMTP connection.
> - I'm 'peek'-ing to fill the data buffer so the FD stream keeps the
>   handshake data it had before, for the SSL object to read if ssl_upgrade()
>   is performed later. I couldn't find how to fill the SSL object otherwise.
>   I think SSL_set_rbio() would be the solution there, but that isn't
>   available in current OpenSSL releases.
>
> I hope you guys have the time and ability to check for problems introduced
> by my 'hackery'. Or just throw it in the bin and re-implement it using a
> better understanding of how the flow is supposed to go.
>
> Any feedback is appreciated :)

Hi guys,

Sorry for the delay, I was pretty busy. I've checked your patch. It is quite interesting. First of all, I think the "tcp fallback" option and "conditional SSL offloading" are redundant. Your approach is more flexible and generic, but the spirit is the same, so I'm tempted to trash my patch.

Then, about your changes, there are two or three things that can be discussed. You've added integers to the connection structure, and Willy has already expressed the need to keep it as small as possible, so it would be better to find another way. Using a BIO is probably a good alternative (though not the easiest one...). Also, your way of peeking at data seems to "penalize" all streams. The cost is not huge, of course, but it could be improved.
I'm also curious to see if it works with a complex "tcp-request content" ruleset. Finally, defining the "offloadssl" action on "tcp-response content" is useless and unused.

But I really like your idea, so I've reworked it. Let me know if it's OK for you. It uses a BIO to leave the connection structure untouched and to add no overhead when the feature is not used. I've done some tests, but not many, and I don't know whether the overhead imposed by the BIO is significant. I'm not an OpenSSL expert, so maybe there is a better way to do it. I've renamed the "offloadssl" action, using "ssl-upgrade" and "no-ssl-upgrade" instead. My patch is made against the HEAD of the master branch, ignoring my previous patch. Note that you need to postpone the SSL upgrade to use this feature; to do so, you must add "defer-ssl-upgrade" on the bind line.

Willy, if you agree, this new patch can replace my previous one. Of course, all remarks are welcome. I'll try to do more tests. I quickly checked it with OpenSSL 0.9.8zg and 1.0.2f.

-- Christopher

>From 07133669314724b2b4462e0eb3e8cd114f3fd3b9 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@qualys.com>
Date: Thu, 3 Mar 2016 16:15:28 +0100
Subject: [PATCH] MAJOR: ssl: Allow the SSL upgrade to be postponed and a way to enable/disable it

The bind option "defer-ssl-upgrade" has been added. It can be used only if the "ssl" option is set. It postpones the SSL upgrade until after the evaluation of all "tcp-request content" rules. Two actions have also been added to "tcp-request content" rules to conditionally enable/disable this upgrade:

  * ssl-upgrade
  * no-ssl-upgrade

"ssl-upgrade" is used to upgrade a TCP connection to an SSL connection and "no-ssl-upgrade" is used to disable it. This is only available for connections opened on an SSL listener configured with the "defer-ssl-upgrade" option and is ignored for all others. When an "ssl-upgrade" rule matches, it invalidates the "no-ssl-upgrade" rules that follow, and conversely.
When all rules have been evaluated, a postponed SSL upgrade is always performed when neither the "ssl-upgrade" rules nor the "no-ssl-upgrade" rules matched. Here are two examples:

    # Do the SSL connection upgrade only if the SNI field is set to "example.com"
    frontend l
        mode tcp
        bind *:1234 ssl crt /path/to/srv.pem defer-ssl-upgrade
        tcp-request inspect-delay 5s
        tcp-request content ssl-upgrade if { req.ssl_sni -i example.com }
        tcp-request content no-ssl-upgrade
        use_backend offloaded-ssl if ssl_fc
        default_backend raw-ssl

    # Accept SSL and non-SSL connections on the same listener
    frontend l
        mode tcp
        bind *:1234 ssl crt /path/to/srv.pem defer-ssl-upgrade
        tcp-request inspect-
[PATCH] BUG/MINOR: dumpstats: Fix the "Total bytes saved" counter in backends stats
Hi,

This is just a typo fix.

-- Christopher Faulet

>From bfc2be71794987fcfb0b5806607617431e23a65d Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@qualys.com>
Date: Thu, 28 Apr 2016 15:09:31 +0200
Subject: [PATCH] BUG/MINOR: dumpstats: Fix the "Total bytes saved" counter in backends stats

Instead of subtracting ST_F_COMP_OUT (Compression out) from ST_F_COMP_IN (Compression in) in the backends stats, ST_F_COMP_BYP (Compression bypass) was used.
---
 src/dumpstats.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/dumpstats.c b/src/dumpstats.c
index bb62c41..bfd5b5c 100644
--- a/src/dumpstats.c
+++ b/src/dumpstats.c
@@ -3821,7 +3821,7 @@ static int stats_dump_fields_html(struct chunk *out, const struct field *stats,
 	              U2H(stats[ST_F_COMP_OUT].u.u64),
 	              stats[ST_F_COMP_IN].u.u64 ? (int)(stats[ST_F_COMP_OUT].u.u64 * 100 / stats[ST_F_COMP_IN].u.u64) : 0,
 	              U2H(stats[ST_F_COMP_BYP].u.u64),
-	              U2H(stats[ST_F_COMP_IN].u.u64 - stats[ST_F_COMP_BYP].u.u64),
+	              U2H(stats[ST_F_COMP_IN].u.u64 - stats[ST_F_COMP_OUT].u.u64),
 	              stats[ST_F_BOUT].u.u64 ? (int)((stats[ST_F_COMP_IN].u.u64 - stats[ST_F_COMP_OUT].u.u64) * 100 / stats[ST_F_BOUT].u.u64) : 0,
 	              (stats[ST_F_COMP_IN].u.u64 || stats[ST_F_COMP_BYP].u.u64) ? "":"");
-- 2.5.5
Re: Gzip compression and transfer: chunked
Hi Vladimir,

Sorry for my late reply, I was pretty busy these last days. I investigated your problem a little. I've done some tests and carefully read the code. Everything seems to work as expected, and I was not able to reproduce what you experienced with HAProxy 1.7.2.

First, in HAProxy, except with a specific configuration, we never remove any "Transfer-Encoding" headers, and we never add any "Content-Length" headers. During HTTP header parsing, if the "Transfer-Encoding" header is found with the value "chunked", we remove all "Content-Length" headers, if any. This is always done, with or without the compression filter. Then, when compression is enabled, if the response payload must be compressed by HAProxy, we remove all "Content-Length" headers and add "Transfer-Encoding: chunked" if needed. Moreover, when the response is truncated, there is no "Content-Encoding" header, so I'm tempted to think that GZIP compression is not being applied to the response.

So, if there is a bug, it is well hidden (as always with chunked HTTP payloads ...*sigh*...). And it will be hard for me to hunt it down without more information about the exchanges between HAProxy and your backend. The best would be a full-packet network capture. But, if the bug is reproducible, a good start would be the output of the following command:

    curl -H header_1 -H header_2 ... -v --raw --compress -o raw_response your_backend_url

Be sure to set the same headers as for a request through HAProxy. If the response contains sensitive information, you can remove it; I only need the chunk sizes.

-- Christopher
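To make "I only need the chunk sizes" concrete: in a raw chunked body (as saved by `curl --raw`), each chunk is preceded by its size as a hexadecimal line, and a final size of 0 marks a complete response. Below is a simplified sketch of a parser for that framing; real chunked parsing must also handle chunk extensions and trailer headers, which this sketch ignores:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parse the chunk sizes out of a raw chunked body. Stores up to <max>
 * sizes into <sizes> and returns how many were found; a final stored
 * size of 0 means the response terminated cleanly (last-chunk seen). */
static int parse_chunk_sizes(const char *raw, long *sizes, int max)
{
	int n = 0;
	const char *p = raw;

	while (n < max) {
		char *end;
		long len = strtol(p, &end, 16);   /* chunk size is hexadecimal */

		if (end == p)
			break;                    /* not a chunk-size line */
		sizes[n++] = len;
		p = strstr(end, "\r\n");          /* skip the rest of the size line */
		if (!p)
			break;
		p += 2 + len;                     /* skip CRLF plus the chunk data */
		if (strncmp(p, "\r\n", 2) != 0)
			break;                    /* malformed chunk trailer */
		p += 2;
		if (len == 0)
			break;                    /* last-chunk reached */
	}
	return n;
}
```

For the canonical example body `4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n`, this yields the sizes 4, 5 and 0; a truncated response would end without the trailing 0.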
[PATCH] 2 fixes for replace-header rules
Willy,

Here are 2 patches to fix bugs in replace-header rules. The first one is similar to the one for redirect rules; it fixes an issue reported by Holger Just ("Strange behavior of sample fetches in http-response replace-header option"). The second one is a trivial fix :)

-- Christopher Faulet

>From 8c9496b9b568ec68312210af4a2cfcd3757c7230 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Wed, 8 Feb 2017 12:17:07 +0100
Subject: [PATCH 1/2] BUG/MEDIUM: http: Prevent replace-header from overwriting a buffer

This is the same fix as the one concerning the redirect rules (0d94576c). The buffer used to expand the argument must be protected to prevent it from being overwritten during the execution of build_logline() (the function used to expand the format string).

This patch should be backported to 1.7, 1.6 and 1.5. It relies on commit b686afd ("MINOR: chunks: implement a simple dynamic allocator for trash buffers") for the trash allocator, which has to be backported as well.
---
 src/proto_http.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/proto_http.c b/src/proto_http.c
index 80ba566..3d8005e 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -3419,13 +3419,22 @@ static int http_transform_header(struct stream *s, struct http_msg *msg,
                                  struct list *fmt, struct my_regex *re, int action)
 {
-	struct chunk *replace = get_trash_chunk();
+	struct chunk *replace;
+	int ret = -1;
+
+	replace = alloc_trash_chunk();
+	if (!replace)
+		goto leave;

 	replace->len = build_logline(s, replace->str, replace->size, fmt);
 	if (replace->len >= replace->size - 1)
-		return -1;
+		goto leave;
+
+	ret = http_transform_header_str(s, msg, name, name_len, replace->str, re, action);

-	return http_transform_header_str(s, msg, name, name_len, replace->str, re, action);
+  leave:
+	free_trash_chunk(replace);
+	return ret;
 }

 /* Executes the http-request rules for stream , proxy and
-- 2.9.3

>From a1b4dd296f063bf2010116aca01c80b0df1e022d Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Wed, 8 Feb 2017 12:41:31 +0100
Subject: [PATCH 2/2] BUG/MINOR: http: Return an error when a replace-header rule failed on the response

Historically, http-response rules couldn't produce errors generating HTTP responses during their evaluation. This possibility was "implicitly" added with http-response redirect rules (51d861a4). But, at the time, replace-header rules were kept untouched. When such a rule failed, the rules processing was just stopped (like for an accept rule). Conversely, when a replace-header rule fails on the request, it generates an HTTP response (400 Bad Request). With this patch, errors on replace-header rules are now handled in the same way for HTTP requests and HTTP responses.

This patch should be backported to 1.7 and 1.6.
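The core idea of the first patch: a chunk obtained from get_trash_chunk() is a shared scratch buffer that a nested call (here build_logline()) may reuse and overwrite, so the fix allocates a private chunk and releases it on every exit path through a single label. Here is a self-contained sketch of that pattern; `struct chunk`, `alloc_trash_chunk()` and `free_trash_chunk()` below are simplified stand-ins for haproxy's real helpers, and the string copy stands in for build_logline():

```c
#include <stdlib.h>
#include <string.h>

struct chunk {
	char *str;
	int size;
	int len;
};

/* Simplified stand-in: a private, heap-allocated scratch chunk. */
static struct chunk *alloc_trash_chunk(void)
{
	struct chunk *c = malloc(sizeof(*c));
	if (!c)
		return NULL;
	c->size = 64;
	c->len  = 0;
	c->str  = malloc(c->size);
	if (!c->str) {
		free(c);
		return NULL;
	}
	return c;
}

static void free_trash_chunk(struct chunk *c)
{
	if (c) {
		free(c->str);
		free(c);
	}
}

/* Expand <fmt> into a private chunk; returns 0 on success, -1 on
 * allocation failure or overflow. The single "leave" label guarantees
 * the chunk is freed on every path. */
static int transform_with_private_chunk(const char *fmt)
{
	struct chunk *replace;
	int ret = -1;

	replace = alloc_trash_chunk();
	if (!replace)
		goto leave;

	replace->len = (int)strlen(fmt);        /* stands in for build_logline() */
	if (replace->len >= replace->size - 1)
		goto leave;
	memcpy(replace->str, fmt, replace->len + 1);

	ret = 0;  /* the real code calls http_transform_header_str() here */
 leave:
	free_trash_chunk(replace);
	return ret;
}
```

The point of the rewrite is purely lifetime: the buffer now belongs to this call alone, so nothing build_logline() does internally can clobber it.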
---
 src/proto_http.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/proto_http.c b/src/proto_http.c
index 3d8005e..5ad2956 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -3823,7 +3823,7 @@ resume_execution:
 					       rule->arg.hdr_add.name_len,
 					       &rule->arg.hdr_add.fmt,
 					       &rule->arg.hdr_add.re, rule->action))
-				return HTTP_RULE_RES_STOP; /* note: we should report an error here */
+				return HTTP_RULE_RES_BADREQ;
 			break;

 		case ACT_HTTP_DEL_HDR:
-- 2.9.3
Re: Strange behavior of sample fetches in http-response replace-header option
Le 07/02/2017 à 16:41, Holger Just a écrit :

Hi all, I just checked and the issue is still present in current master. Could you maybe have a look at this issue? It smells a bit like it could potentially be connected to the issue discussed in the thread "Lua sample fetch logging ends up in response when doing http-request redirect". However, I couldn't reproduce my issue with `http-request redirect`, neither with the patch nor without, so it might also be a red herring.

Hi Holger,

You did well to reopen the issue. And you're right, this bug is similar to the one affecting redirect rules. I submitted a patch and it will be merged soon by Willy (see "[PATCH] 2 fixes for replace-header rules").

Thanks
-- Christopher Faulet
[PATCH] BUG/MEDIUM: filters: Do not truncate HTTP response when body length is undefined
Hi,

This patch fixes the bug reported by Kristjan Koppel and Brian Loss in the thread "Gzip compression and transfer: chunked". It should be backported to 1.7. Kristjan and Brian, thanks for your help.

-- Christopher Faulet

>From 23b39e87cce785437950552f5be0744b5768914a Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Wed, 8 Feb 2017 09:45:13 +0100
Subject: [PATCH] BUG/MEDIUM: filters: Do not truncate HTTP response when body length is undefined

Some users have experienced trouble using the compression filter when the HTTP response body length is undefined. They complained about receiving truncated responses. In fact, the bug can be triggered if there is at least one filter attached to the stream but none registered to analyze the HTTP response body. In this case, when the body length is undefined, data should be forwarded without any parsing. But, because of a wrong check, we were starting to parse them. Because this was not expected, the end of the response was not correctly detected and the response could be truncated. So now, we rely on the HAS_DATA_FILTERS macro instead of the HAS_FILTERS one to choose whether to parse the HTTP response body or not.

Furthermore, in http_response_forward_body, the test to not forward the server closure to the client has been updated to reflect the conditions listed in the associated comment. And finally, in http_msg_forward_body, when the body length is undefined, we continue parsing until the server closes the connection, without any check on filters. So filters can safely stop filtering data during their parsing.
This fix should be backported to 1.7.
---
 src/proto_http.c | 13 ++++++------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/proto_http.c b/src/proto_http.c
index 3490aa7..0e419e7 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -6820,7 +6820,7 @@ int http_process_res_common(struct stream *s, struct channel *rep, int an_bit, s
 	}

  skip_header_mangling:
-	if ((msg->flags & HTTP_MSGF_XFER_LEN) || HAS_FILTERS(s) ||
+	if ((msg->flags & HTTP_MSGF_XFER_LEN) || HAS_DATA_FILTERS(s, rep) ||
 	    (txn->flags & TX_CON_WANT_MSK) == TX_CON_WANT_TUN) {
 		rep->analysers &= ~AN_RES_FLT_XFER_DATA;
 		rep->analysers |= AN_RES_HTTP_XFER_BODY;
@@ -6971,8 +6971,8 @@ int http_response_forward_body(struct stream *s, struct channel *res, int an_bit
 	 * keep-alive is set on the client side or if there are filters
 	 * registered on the stream, we don't want to forward a close
 	 */
-	if ((msg->flags & HTTP_MSGF_TE_CHNK) || !msg->body_len ||
-	    HAS_FILTERS(s) ||
+	if ((msg->flags & HTTP_MSGF_TE_CHNK) || !(msg->flags & HTTP_MSGF_XFER_LEN) ||
+	    HAS_DATA_FILTERS(s, res) ||
 	    (txn->flags & TX_CON_WANT_MSK) == TX_CON_WANT_KAL ||
 	    (txn->flags & TX_CON_WANT_MSK) == TX_CON_WANT_SCL)
 		channel_dont_close(res);
@@ -7064,11 +7064,10 @@ http_msg_forward_body(struct stream *s, struct http_msg *msg)
 		goto missing_data_or_waiting;
 	}

-	if (!(msg->flags & HTTP_MSGF_XFER_LEN) && !(chn->flags & CF_SHUTR) &&
-	    HAS_DATA_FILTERS(s, chn)) {
-		/* The server still sending data that should be filtered */
+	/* The server still sending data that should be filtered */
+	if (!(msg->flags & HTTP_MSGF_XFER_LEN) && !(chn->flags & CF_SHUTR))
 		goto missing_data_or_waiting;
-	}
+
 	msg->msg_state = HTTP_MSG_ENDING;

 ending:
-- 2.9.3
Re: Gzip compression and transfer: chunked
Hi guys,

Could you check whether the attached patch fixes your bug, please? If I'm right, the bug is a premature close of the server connection when the content length cannot be determined (neither "Content-Length" nor "Transfer-Encoding" headers) while a filter is used (here, the compression). Vladimir, I don't know if this will fix your bug because I need more information to fully understand your problem; it could be an unrelated bug. But, with a bit of luck, it will work.

-- Christopher

diff --git a/src/proto_http.c b/src/proto_http.c
index dd5c731..5889566 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -6820,7 +6820,7 @@ int http_process_res_common(struct stream *s, struct channel *rep, int an_bit, s
 	}

  skip_header_mangling:
-	if ((msg->flags & HTTP_MSGF_XFER_LEN) || HAS_FILTERS(s) ||
+	if ((msg->flags & HTTP_MSGF_XFER_LEN) || HAS_DATA_FILTERS(s, rep) ||
 	    (txn->flags & TX_CON_WANT_MSK) == TX_CON_WANT_TUN) {
 		rep->analysers &= ~AN_RES_FLT_XFER_DATA;
 		rep->analysers |= AN_RES_HTTP_XFER_BODY;
@@ -6971,8 +6971,8 @@ int http_response_forward_body(struct stream *s, struct channel *res, int an_bit
 	 * keep-alive is set on the client side or if there are filters
 	 * registered on the stream, we don't want to forward a close
 	 */
-	if ((msg->flags & HTTP_MSGF_TE_CHNK) || !msg->body_len ||
-	    HAS_FILTERS(s) ||
+	if ((msg->flags & HTTP_MSGF_TE_CHNK) || !(msg->flags & HTTP_MSGF_XFER_LEN) ||
+	    HAS_DATA_FILTERS(s, res) ||
 	    (txn->flags & TX_CON_WANT_MSK) == TX_CON_WANT_KAL ||
 	    (txn->flags & TX_CON_WANT_MSK) == TX_CON_WANT_SCL)
 		channel_dont_close(res);
@@ -7064,8 +7064,7 @@ http_msg_forward_body(struct stream *s, struct http_msg *msg)
 		goto missing_data_or_waiting;
 	}

-	if (!(msg->flags & HTTP_MSGF_XFER_LEN) && !(chn->flags & CF_SHUTR) &&
-	    HAS_DATA_FILTERS(s, chn)) {
+	if (!(msg->flags & HTTP_MSGF_XFER_LEN) && !(chn->flags & CF_SHUTR)) {
 		/* The server still sending data that should be filtered */
 		goto missing_data_or_waiting;
 	}
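The subtle part of the change in http_response_forward_body() is the test for an undefined body length: `body_len` being zero cannot distinguish "empty body" from "length unknown", whereas the HTTP_MSGF_XFER_LEN flag can. A condensed sketch of the corrected "don't forward the close" condition — the flag values and the helper function here are illustrative, not haproxy's actual definitions:

```c
/* Illustrative flag values (haproxy defines its own). */
#define HTTP_MSGF_TE_CHNK  0x01U  /* Transfer-Encoding: chunked */
#define HTTP_MSGF_XFER_LEN 0x02U  /* body length is determinable */

/* Returns nonzero when the server-side close must NOT be forwarded to
 * the client yet: chunked encoding, undefined length, a data filter
 * still processing the body, or client-side keep-alive. */
static int must_keep_open(unsigned int flags, int has_data_filters,
                          int want_keepalive)
{
	return (flags & HTTP_MSGF_TE_CHNK) ||
	       !(flags & HTTP_MSGF_XFER_LEN) ||  /* undefined length: wait for server close */
	       has_data_filters ||
	       want_keepalive;
}
```

With the old `!msg->body_len` test, a response whose length simply hadn't been counted yet could be treated as "fully delimited" and the close forwarded too early, which is consistent with the truncation the reporters saw.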
Re: Gzip compression and transfer: chunked
Le 03/02/2017 à 14:36, Kristjan Koppel a écrit :

Hi! I seem to have run into the same (or at least a similar) problem as reported by Vladimir Mihailenco a little while ago. I'm running HAProxy v1.7.2 and my backend server is etcd v2.3.7. The client application is using HTTP/1.0 and I have compression enabled in HAProxy. With this configuration everything worked fine with HAProxy v1.6.10, but after upgrading to v1.7 the response is almost always cut off at about 4KB (let's say roughly 9 times out of 10, but it seems to be completely random). If I switch the client to HTTP/1.1 or comment out the compression lines in the HAProxy config, then everything is fine again.

Hi Kristjan and Brian,

Thanks for this information. It helped me find a bug. I don't know yet whether it is the same problem as the one Vladimir experienced. I'll try to send you a patch very soon. For now, I need to test all cases to be sure that everything is fixed.

-- Christopher
Re: Gzip compression and transfer: chunked
Le 23/01/2017 à 11:54, Vladimir Mihailenco a écrit :

Hi, I am using haproxy as a load balancer/reverse proxy for a Rails/Go application. I am upgrading from a working HAProxy 1.6 config to 1.7.2, and it looks like I need to change my existing config, because HAProxy 1.7 truncates responses from the Rails/Rack application.

With HAProxy 1.6 and compression enabled:
- I can load the full HTML (200kb)
- HTML is not compressed
- Transfer-Encoding: "chunked"
- no Content-Length header

With the same config on HAProxy 1.7:
- only the first 14kb are available
- no Transfer-Encoding
- Content-Length: 14359

With HAProxy 1.7 and compression disabled:
- the full HTML is available
- HTML is not compressed
- Transfer-Encoding: "chunked"
- no Content-Length header

Any recommendations? Should I disable compression in the Rails/Rack app?

Hi,

Could you share your configurations, both of them, please? And if possible, the request/response headers for all scenarios. The compression was rewritten in 1.7, so it is possible that something was broken. The headers returned by your backend could be useful too.

-- Christopher
Re: Gzip compression and transfer: chunked
Le 24/01/2017 à 10:55, Vladimir Mihailenco a écrit :

This is the config - https://gist.github.com/vmihailenco/9010ad37f5aeb800095a6b18909ae7d5. Backends don't have any options. I already tried to remove `http-reuse safe`, but it does not make any difference.

Haproxy 1.7 with compression (HTML not fully loaded) - https://gist.github.com/vmihailenco/05bda6e7a49b6f78cd2f749abb0cf5b3
Haproxy 1.7 without compression (HTML fully loaded) - https://gist.github.com/vmihailenco/d8732e53acac3769a85b59afd7336bab
Haproxy 1.7 with compression and Rails configured to set Content-Length via config.middleware.use Rack::ContentLength (HTML fully loaded) - https://gist.github.com/vmihailenco/13a809f486c4e1833ef813a019549180

Hi,

Thanks for the details. There are some things that puzzle me. I guess that when you disable the compression, it means you comment out the "compression" lines in the frontend section. In that case, we can see the response is chunked. Because it is untouched by HAProxy, this comes from your backend. There is no problem here.

But when the compression is enabled, there is a Content-Length header and no Transfer-Encoding header. That's really strange because HAProxy never adds a Content-Length header, at any place. So I'm tempted to think that it comes from your backend, but that contradicts my previous remark. And there is no Content-Encoding header, which means HAProxy did not compress the response.

So, to be sure, if possible, it would be helpful to have a tcpdump of the data exchanged between HAProxy and your backends (or something similar, as you prefer). Send them to me in private so as not to flood the ML. In the meantime, I will try to investigate.

With Haproxy 1.6 and enabled compression - i can load full HTML (200kb) - HTML is not compressed - Transfer-encoding: "chunked" - no Content-Length header

Just to be sure: there is a typo here, right? You meant "HTML is compressed"?

-- Christopher
Re: TLS-PSK: making a http(s) lookup call from inside haproxy code
Le 22/02/2017 à 16:02, thierry.fourn...@arpalert.org a écrit :

On Wed, 22 Feb 2017 15:43:36 +0100 Braňo Žarnovičan <zarnovi...@gmail.com> wrote:

Options: (a) implement the lookup call in C. I should be able to whip up a simple HTTP 1.0 request via low-level socket programming. However, I would like some more, fancier features like HTTPS, persistent connections, basic auth, timeout handling, etc. Even with the simple socket code I'm not sure how that will play with haproxy's event-driven nature. I would appreciate it if someone could point me to an example where haproxy is doing something similar already.

Hi, there is no way to easily implement an HTTP request from within haproxy. If you are looking for an example, you can look at the code of the SPOE, the stats page, the stats CLI or the Lua code for "core.socket". The idea is to create a client applet and use an internal proxy to handle the connection, the data exchange and the SSL. The HTTP client protocol must be implemented on our side.

I just took a quick look at Nenad's patch, but it seems impossible to do asynchronous processing in the PSK callback functions. If I'm right, you cannot loop on these callbacks while waiting for the completion of an external lookup. So you would have to do your HTTP request synchronously, and you certainly do not want to do that! Using an applet here, as Thierry suggested, will not help you.

(b) integrate it with Lua. Lua sounds like a better option for writing custom code for HAProxy. However, I'm afraid that I wouldn't be able to hook it into the TLS handshake itself (that stage is too early in the process). It seems that it's not a good use-case for Lua.

I confirm, you can't have a hook in the HTTPS layer, and you can't configure the HTTPS parameters. Maybe in a future version; for now, I'm waiting for some feedback about the actual process. Another way is to use the new SPOE protocol to forward some data to your own service, which will do the processing. Look at the example of an SPOE client in the directory "contrib/spoa_example".
The SPOE (Stream Processing Offload Engine), as its name says, must be used to offload processing on streams. So it cannot be used during the SSL handshake, because there is no stream yet at this stage. This is not a limitation of the SPOE itself, but of the filters API: there is no hook to handle TCP/SSL connection creation (not yet). BTW, in the thread about TLS-PSK support, it was suggested to use a map to handle identities. Once that is done, it will be possible to update the map dynamically.

-- Christopher Faulet
Re: Opinion about blog post of SPOE
Le 15/02/2017 à 02:43, Aleksandar Lazic a écrit :

Hi. Due to the fact that I like this SPOE I have started to write about it ;-) https://me2digital.online/2017/02/15/haproxy-1-7-feature-spoe/ Is the concept displayed correctly in the picture? Is the data (tcp, http-header, http-body) quoted when HAProxy sends it to the SPOA? I'm interested in creating a SPOA for https://modsecurity.org/. Does anyone else see a need for this?

Hi Aleks,

Nice to see someone interested in the SPOE and in implementing a service using it. Your picture is correct. The protocol used to communicate with agents (SPOP) is a binary protocol, so data is not quoted. As for ModSecurity, it is definitely a good candidate for a SPOA. Note that for now, it is not possible to send the whole HTTP request (headers + body). With the option "http-buffer-request", you could handle simple/small requests, but it will not work for chunked or too-large payloads.

For now, I'm working on SPOE improvements. The current version is an "experimental" one with many flaws and limited features. But payload filtering is definitely on the roadmap (without a deadline, however). This requires changes in the HTTP parser, so it is a bit tricky, and new sample fetches need to be added. So there is still a lot of work to do before you can implement a fully functional WAF. But I'm on it, and all help/suggestions/remarks are welcome :)

-- Christopher Faulet
Re: Opinion about blog post of SPOE
Le 16/02/2017 à 12:41, Aleksandar Lazic a écrit :

Do you think that there will also be big changes in the protocol?

No, not really. The protocol should remain mostly unchanged. In fact, except for new "capabilities", there are no big changes, and these capabilities will only influence how frames are exchanged between HAProxy and SPOAs.

With the current version, when a frame is sent to an agent, the connection is "locked", waiting for the acknowledgment from the agent, and cannot be used to handle another frame. So connections are underused and you need to keep more connections open than necessary. With the next version, it will be possible to multiplex frames on connections. It will also be possible to send fragmented frames to an agent when the payload size exceeds the frame size. However, it will not be possible for HAProxy to receive such frames for now.

-- Christopher Faulet
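For readers unfamiliar with the wire format: as described in haproxy's SPOE specification, each SPOP frame is sent as a 4-byte network-order length followed by the frame itself (type, flags, stream-id, frame-id, payload), and the per-frame stream-id/frame-id pair is what makes the planned multiplexing possible, since an ACK can then be matched to its NOTIFY without locking the connection. A rough sketch of the two lowest-level encodings, not the full codec:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Write the 4-byte big-endian (network order) frame-length prefix.
 * Returns the number of bytes written. */
static size_t spop_write_len(uint8_t *out, uint32_t frame_len)
{
	out[0] = (frame_len >> 24) & 0xff;
	out[1] = (frame_len >> 16) & 0xff;
	out[2] = (frame_len >> 8) & 0xff;
	out[3] = frame_len & 0xff;
	return 4;
}

/* Encode a small unsigned integer as a SPOP varint: per the spec,
 * values below 240 fit in a single byte (the multi-byte form for
 * larger values is omitted in this sketch). */
static size_t spop_write_small_varint(uint8_t *out, uint32_t v)
{
	assert(v < 240);
	out[0] = (uint8_t)v;
	return 1;
}
```

The sketch is based on the SPOE.txt specification shipped with haproxy; a real agent would also need the varint multi-byte form, the typed-data encoding for arguments, and the HELLO/NOTIFY/ACK handshake state machine.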
Re: [PATCH] MAJOR: filters: Add filters support
On 18/09/2016 04:17, Bertrand Jacquin wrote: > Hi Christopher and Willy, > > Today I noticed data corruption when haproxy is used for compression > offloading. I bisected twice, and it lead to this specific commit but > I'm not 100% confident this commit is the actual root cause. > > HTTP body coming from the nginx backend is consistent, but HTTP headers > are different depending on the setup I'm enabling. Data corruption only > happens with transfer encoding chunked. HTTP body coming then from > haproxy to curl can be randomly corrupted, I attached a diff > (v1.7-dev1-50-gd7c9196ae56e.Transfer-Encoding-chunked.diff) revealing an > unrelated blob like TLS structure in the middle of the javascript. For > example, you will find my x509 client certificate in there > > I'm also attaching HTTP headers from haproxy to nginx may that help. > > Note that I tested with zlib 1.2.8 and libslz 1.0.0, result remains the > same in both case. > > Here is the setup I am using: > > HA-Proxy version 1.7-dev4-41d5e3a 2016/08/14 > Copyright 2000-2016 Willy Tarreau> > Build options : > TARGET = linux2628 > CPU = generic > CC = armv7a-hardfloat-linux-gnueabi-gcc > CFLAGS = -march=native -O2 -pipe -fomit-frame-pointer > -fno-strict-aliasing > OPTIONS = USE_LIBCRYPT=1 USE_GETADDRINFO=1 USE_SLZ=1 USE_OPENSSL=1 > USE_PCRE=1 USE_PCRE_JIT=1 > > Default settings : > maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 > > Encrypted password support via crypt(3): yes > Built with libslz for stateless compression. 
> Compression algorithms supported : identity("identity"), > deflate("deflate"), raw-deflate("deflate"), gzip("gzip") > Built with OpenSSL version : OpenSSL 1.0.2h 3 May 2016 > Running on OpenSSL version : OpenSSL 1.0.2h 3 May 2016 > OpenSSL library supports TLS extensions : yes > OpenSSL library supports SNI : yes > OpenSSL library supports prefer-server-ciphers : yes > Built with PCRE version : 8.38 2015-11-23 > PCRE library supports JIT : yes > Built without Lua support > Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT > IP_FREEBIND > > Available polling systems : > epoll : pref=300, test result OK > poll : pref=200, test result OK >select : pref=150, test result OK > Total: 3 (3 usable), will use epoll. > > Available filters : > [COMP] compression > [TRACE] trace > Hi Bertrand, I will investigate. -- Christopher
Re: [PATCH] MAJOR: filters: Add filters support
Le 18/09/2016 à 04:17, Bertrand Jacquin a écrit :
> Today I noticed data corruption when haproxy is used for compression
> offloading. I bisected twice, and it led to this specific commit, but
> I'm not 100% confident this commit is the actual root cause.
>
> The HTTP body coming from the nginx backend is consistent, but the HTTP
> headers are different depending on the setup I'm enabling. Data corruption
> only happens with transfer encoding chunked. The HTTP body going from
> haproxy to curl can then be randomly corrupted; I attached a diff
> (v1.7-dev1-50-gd7c9196ae56e.Transfer-Encoding-chunked.diff) revealing an
> unrelated blob, like a TLS structure, in the middle of the javascript. For
> example, you will find my x509 client certificate in there.
>
> I'm also attaching the HTTP headers from haproxy to nginx in case that helps.
>
> Note that I tested with zlib 1.2.8 and libslz 1.0.0; the result remains the
> same in both cases.

Hi Bertrand,

I've done some tests. Unfortunately, I'm unable to reproduce the bug, so I need more information. First, I need to know how you hit it. Does it happen under load, or randomly when you do a single request? Then, do you still see the bug when you are not using SSL? Let me also know how often the bug appears. And finally, if you can share your HAProxy and Nginx configurations with me, that could help.

Thanks,
-- Christopher
[PATCH] New filter callbacks: attach, detach and stream_set_backend
Hi Willy, Here are some old pending patches I still had to submit. Thanks, -- Christopher >From a20ac171da06f16cb0bc1fb49cbf1939156cb2af Mon Sep 17 00:00:00 2001 From: Christopher Faulet <christopher.fau...@capflam.org> Date: Tue, 21 Jun 2016 11:42:37 +0200 Subject: [PATCH 1/3] MEDIUM: filters: Add attach/detach and stream_set_backend callbacks New callbacks have been added to handle creation and destruction of filter instances: * The 'attach' callback is called after a filter instance is created, when it is attached to a stream. This happens when the stream is started for filters defined on the stream's frontend and when the backend is set for filters declared on the stream's backend. It is possible to ignore the filter, if needed, by returning 0. This can be useful for conditional filtering. * The 'detach' callback is called when a filter instance is detached from a stream, before its destruction. This happens when the stream is stopped for filters defined on the stream's frontend and when the analysis ends for filters defined on the stream's backend. In addition, the callback 'stream_set_backend' has been added to know when a backend is set for a stream. It is only called when the frontend and the backend are not the same. And it is called for all filters attached to a stream (frontend and backend). Finally, the TRACE filter has been updated. --- include/types/filters.h | 34 +++--- src/filters.c | 27 --- src/flt_trace.c | 48 ++-- 3 files changed, 97 insertions(+), 12 deletions(-) diff --git a/include/types/filters.h b/include/types/filters.h index 62fcfb1..62b457d 100644 --- a/include/types/filters.h +++ b/include/types/filters.h @@ -64,14 +64,32 @@ struct flt_kw_list { * number of errors encountered. * * + * - attach : Called after a filter instance creation, when it is + * attached to a stream. This happens when the stream + * is started for filters defined on the stream's + * frontend and when the backend is set for filters + * declared on the stream's backend.
+ * Returns a negative value if an error occurs, 0 if + * the filter must be ignored for the stream, any other + * value otherwise. * - stream_start: Called when a stream is started. This callback will - * only be called for filters defined on a proxy with - * the frontend capability. + * only be called for filters defined on the stream's + * frontend. + * Returns a negative value if an error occurs, any + * other value otherwise. + * - stream_set_backend : Called when a backend is set for a stream. This + * callback will be called for all filters attached + * to a stream (frontend and backend). * Returns a negative value if an error occurs, any * other value otherwise. * - stream_stop : Called when a stream is stopped. This callback will - * only be called for filters defined on a proxy with - * the frontend capability. + * only be called for filters defined on the stream's + * frontend. + * - detach : Called when a filter instance is detached from a + * stream, before its destruction. This happens when + * the stream is stopped for filters defined on the + * stream's frontend and when the analysis ends for + * filters defined on the stream's backend. * * * - channel_start_analyze: Called when a filter starts to analyze a channel.
@@ -133,12 +151,14 @@ struct flt_ops { int (*init) (struct proxy *p, struct flt_conf *fconf); void (*deinit)(struct proxy *p, struct flt_conf *fconf); int (*check) (struct proxy *p, struct flt_conf *fconf); - /* * Stream callbacks */ - int (*stream_start) (struct stream *s, struct filter *f); - void (*stream_stop) (struct stream *s, struct filter *f); + int (*attach)(struct stream *s, struct filter *f); + int (*stream_start) (struct stream *s, struct filter *f); + int (*stream_set_backend)(struct stream *s, struct filter *f, struct proxy *be); + void (*stream_stop) (struct stream *s, struct filter *f); + void (*detach)(struct stream *s, struct filter *f); /* * Channel callbacks diff --git a/src/filters.c b/src/filters.c index 7f8fae4..a1b36ba 100644 --- a/src/filters.c +++ b/src/filters.c @@ -304,1
Re: [PATCH] MAJOR: filters: Add filters support
On 22/09/2016 at 04:05, Bertrand Jacquin wrote: > On Tue, Sep 20, 2016 at 08:16:09AM +0200, Willy Tarreau wrote: >> Hi Bertrand, >> >> On Tue, Sep 20, 2016 at 12:13:32AM +0100, Bertrand Jacquin wrote: >>>> And finally, if you can share with me your HA and >>>> Nginx configurations, this could help. >>> >>> I'm attaching a stripped-down version of haproxy/nginx/php-fpm on which I >>> can reproduce this issue. >> >> I think another thing would be extremely useful: a full-packet >> network capture between haproxy and nginx, so that we have the full headers, >> the chunk sizes (if any) and the response timing, which generally matters a >> lot. Ideally the dump in text (or binary) format in a distinct file using >> curl -i would be nice as well. > > Sure thing, you will find attached a tcpdump between haproxy and nginx > along with a curl output. > > I changed the initial configuration to use TCP instead of a UNIX socket for > capture purposes. I also removed the proxy protocol between haproxy and > nginx to get an easier output to parse. > Hi Bertrand, Thanks for all this information. It helped me find the bug. I attached a patch. Could you check whether it fixes your bug? Willy, if Bertrand confirms that his bug is gone, and if everything is OK for you, you can merge it. -- Christopher Faulet >From d756fba92d1c14ab80c2c962843ddd556c5135ea Mon Sep 17 00:00:00 2001 From: Christopher Faulet <christopher.fau...@capflam.org> Date: Thu, 22 Sep 2016 15:31:43 +0200 Subject: [PATCH] BUG: http/compression: Fix how chunked data are copied during the HTTP body parsing When compression is enabled on HTTP responses, the chunked data are copied into a temporary buffer during the HTTP body parsing and then compressed when everything is forwarded to the client. But the amount of data that could be copied was not correctly calculated. In most cases it worked, except in the edge case where the channel buffer was almost full.
--- src/flt_http_comp.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/src/flt_http_comp.c b/src/flt_http_comp.c index 9ddc858..249ccdf 100644 --- a/src/flt_http_comp.c +++ b/src/flt_http_comp.c @@ -177,15 +177,17 @@ comp_http_data(struct stream *s, struct filter *filter, struct http_msg *msg) } if (msg->flags & HTTP_MSGF_TE_CHNK) { - int block = bi_contig_data(buf); + int block; len = MIN(tmpbuf->size - buffer_len(tmpbuf), len); - if (len > block) { - memcpy(bi_end(tmpbuf), b_ptr(buf, *nxt), block); - memcpy(bi_end(tmpbuf)+block, buf->data, len - block); - } - else - memcpy(bi_end(tmpbuf), b_ptr(buf, *nxt), len); + + b_adv(buf, *nxt); + block = bi_contig_data(buf); + memcpy(bi_end(tmpbuf), bi_ptr(buf), block); + if (len > block) + memcpy(bi_end(tmpbuf)+block, buf->data, len-block); + b_rew(buf, *nxt); + tmpbuf->i += len; ret= len; } -- 2.7.4
Re: [PATCH] BUG: spoe: Fix parsing of SPOE actions in ACK frames
On 24/11/2016 at 19:50, Willy Tarreau wrote: Hi Christopher, On Thu, Nov 24, 2016 at 03:06:13PM +0100, Christopher Faulet wrote: >From 7ed3c2942d57ea2ddfc8973cce9cc1c94bca01da Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Thu, 24 Nov 2016 14:53:22 +0100 Subject: [PATCH] BUG: spoe: Fix parsing of SPOE actions in ACK frames For "SET-VAR" actions, data was not correctly parsed. The 'idx' variable was not correctly updated when the 3rd argument was parsed. Are you sure your patch is correct? I think it's bogus: diff --git a/src/flt_spoe.c b/src/flt_spoe.c index 0b722b6..8227140 100644 --- a/src/flt_spoe.c +++ b/src/flt_spoe.c @@ -2017,7 +2017,7 @@ process_spoe_actions(struct stream *s, struct spoe_context *ctx, goto skip; memset(&smp, 0, sizeof(smp)); smp_set_owner(&smp, s->be, s->sess, s, dir|SMP_OPT_FINAL); - if (decode_spoe_data(p+idx, p+size, &smp) == -1) + if ((idx += decode_spoe_data(p+idx, p+size, &smp)) == -1) goto skip; The only case it will work is when idx = 0 before decoding, which doesn't really look like the only case you're interested in. I guess you wanted to do this instead: - if (decode_spoe_data(p+idx, p+size, &smp) == -1) + ret = decode_spoe_data(p+idx, p+size, &smp); + if (ret == -1) goto skip; + idx += ret; Am I wrong? That's the reason why I hate assignments in "if" conditions: half of the time they are bogus, the other half they make the reader scratch his head wondering if it's bogus or intended :-) Ah, of course you're right! Too tired... New patch attached. Thanks. -- Christopher >From 1b46491bf93c76602122ea5814c42fdcfbf2d816 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Thu, 24 Nov 2016 14:53:22 +0100 Subject: [PATCH] BUG: spoe: Fix parsing of SPOE actions in ACK frames For "SET-VAR" actions, data was not correctly parsed.
The 'idx' variable was not correctly updated when the 3rd argument was parsed. --- src/flt_spoe.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/flt_spoe.c b/src/flt_spoe.c index 0b722b6..776848e 100644 --- a/src/flt_spoe.c +++ b/src/flt_spoe.c @@ -2017,8 +2017,10 @@ process_spoe_actions(struct stream *s, struct spoe_context *ctx, goto skip; memset(&smp, 0, sizeof(smp)); smp_set_owner(&smp, s->be, s->sess, s, dir|SMP_OPT_FINAL); -if (decode_spoe_data(p+idx, p+size, &smp) == -1) + +if ((i = decode_spoe_data(p+idx, p+size, &smp)) == -1) goto skip; +idx += i; SPOE_PRINTF(stderr, "%d.%06d [SPOE/%-15s] %s: stream=%p" " - set-var '%s.%s.%.*s'\n", -- 2.7.4
[PATCH] BUG: spoe: Fix parsing of SPOE actions in ACK frames
Hi, Here is a small bug fix for the SPOE filter. -- Christopher >From 7ed3c2942d57ea2ddfc8973cce9cc1c94bca01da Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Thu, 24 Nov 2016 14:53:22 +0100 Subject: [PATCH] BUG: spoe: Fix parsing of SPOE actions in ACK frames For "SET-VAR" actions, data was not correctly parsed. The 'idx' variable was not correctly updated when the 3rd argument was parsed. --- src/flt_spoe.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/flt_spoe.c b/src/flt_spoe.c index 0b722b6..8227140 100644 --- a/src/flt_spoe.c +++ b/src/flt_spoe.c @@ -2017,7 +2017,7 @@ process_spoe_actions(struct stream *s, struct spoe_context *ctx, goto skip; memset(&smp, 0, sizeof(smp)); smp_set_owner(&smp, s->be, s->sess, s, dir|SMP_OPT_FINAL); -if (decode_spoe_data(p+idx, p+size, &smp) == -1) +if ((idx += decode_spoe_data(p+idx, p+size, &smp)) == -1) goto skip; SPOE_PRINTF(stderr, "%d.%06d [SPOE/%-15s] %s: stream=%p" -- 2.7.4
Re: CONNECT method broken in 1.7?
On 26/11/2016 at 16:10, Willy Tarreau wrote: Hello, On Sat, Nov 26, 2016 at 04:19:08PM +0800, Live Dev wrote: Hello, this is actually my first time posting to the mailing list Welcome then :-) Really excited to hear about the 1.7 release and thrilled to try out the new features like SPOE and OpenSSL 1.1.0 support; kudos to Willy and all others who made this happen! However, in my use case (which utilizes CONNECT requests, alongside GET and POST), some initial testing shortly following an upgrade to 1.7 seems to suggest I'm hitting a bug with CONNECT requests. Hmmm you're perfectly right, I've just reproduced it. I could bisect it; it was brought in by the patch below. I'm very surprised that none of us had detected it yet, as this bug has been present for a while now. It probably means that few people experiment with development code in front of proxies (which probably makes sense): commit d7c9196ae56e8ee6babca07ec2ec98a4146bcfd1 Author: Christopher Faulet <cfau...@qualys.com> AuthorDate: Thu Apr 30 11:48:27 2015 +0200 Commit: Willy Tarreau <w...@1wt.eu> CommitDate: Tue Feb 9 14:53:15 2016 +0100 MAJOR: filters: Add filters support This patch adds the support of filters in HAProxy. The main idea is to have a way to "easily" extend HAProxy by adding some "modules", called filters, that will be able to change HAProxy behavior in a programmatic way. It will not be the funniest one to debug (at least for me), but it's very easy to reproduce. I guess it's related to the changes on the forwarding path to insert the data hooks. I'll check this on Monday with Chris who knows this part very well. We'll also have to check that Upgrade still works correctly. Hi, Here is the patch. It fixes the CONNECT requests; the bug affects only these requests. The HTTP connection upgrade is not affected (I've checked it with WebSockets).
-- Christopher >From 38da774c8158a3793a5261b816454206bd1f0ef1 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Mon, 28 Nov 2016 10:14:03 +0100 Subject: [PATCH] BUG/MEDIUM: http: Fix tunnel mode when the CONNECT method is used When a 2xx response to a CONNECT request is returned, the connection must be switched to tunnel mode immediately after the headers, and Transfer-Encoding and Content-Length headers must be ignored. So from the HTTP parser's point of view, there is no body. The bug comes from the fact that the flag HTTP_MSGF_XFER_LEN was not set on the response (this flag means that the body size can be determined; in our case, it can: it is 0). So, during data forwarding, the connection was never switched to tunnel mode and we were stuck in a state where we were waiting for the server to close the connection to end the response. Setting the flag HTTP_MSGF_XFER_LEN on the response fixes the bug. --- src/proto_http.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/proto_http.c b/src/proto_http.c index 04aaf90..c24c01e 100644 --- a/src/proto_http.c +++ b/src/proto_http.c @@ -6320,7 +6320,8 @@ int http_wait_for_response(struct stream *s, struct channel *rep, int an_bit) */ if (txn->meth == HTTP_METH_HEAD || (txn->status >= 100 && txn->status < 200) || - txn->status == 204 || txn->status == 304) { + txn->status == 204 || txn->status == 304 || + (txn->meth == HTTP_METH_CONNECT && txn->status == 200)) { msg->flags |= HTTP_MSGF_XFER_LEN; goto skip_content_length; } -- 2.7.4
Re: HAProxy 1.7 memory leak?
On 08/12/2016 at 09:35, rickytato rickytato wrote: I'm testing HAProxy 1.7.0 (I've been using 1.6.x for quite some time) but I noticed a strange thing: memory keeps growing. Could you provide the output of the "haproxy -vv" command and more information about your HAProxy configuration? -- Christopher
[PATCH] BUG/MINOR: stream: Fix how backend-specific analyzers are set on a stream
-- Christopher >From 73b8871a5e31004ec305a3eb2cd4747c4f569d5e Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Mon, 9 Jan 2017 16:33:19 +0100 Subject: [PATCH] BUG/MINOR: stream: Fix how backend-specific analyzers are set on a stream When the stream's backend was defined, the request's analyzers flag was always set to 0 if the stream had no listener. This bug was introduced with the filter API but never triggered (I think). Because of commit 5820a366, it is now possible to encounter it. For example, this happens when the trace filter is enabled on a SPOE backend. The fix is pretty trivial. This fix must be backported to 1.7. --- src/proxy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/proxy.c b/src/proxy.c index 2d9f06c..d85f6a9 100644 --- a/src/proxy.c +++ b/src/proxy.c @@ -1156,7 +1156,7 @@ int stream_set_backend(struct stream *s, struct proxy *be) * be more reliable to store the list of analysers that have been run, * but what we do here is OK for now. */ - s->req.analysers |= be->be_req_ana & (strm_li(s) ? ~strm_li(s)->analysers : 0); + s->req.analysers |= (strm_li(s) ? be->be_req_ana & ~strm_li(s)->analysers : be->be_req_ana); /* If the target backend requires HTTP processing, we have to allocate * the HTTP transaction and hdr_idx if we did not have one. -- 2.9.3
[PATCH] BUG/MINOR: Fix the sending function in Lua's cosocket
This bug was reported by Thierry. -- Christopher >From 62d2733d05feb49d070094667dc25b0e716ab940 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Mon, 19 Dec 2016 09:29:06 +0100 Subject: [PATCH] BUG/MINOR: Fix the sending function in Lua's cosocket This is a regression from the commit a73e59b6901a164d19b1145e8511602d9814f28f. When data are sent from a cosocket, the action is done in the context of the applet running a lua stack and not in the context of the applet owning the cosocket. So we must take care, explicitly, that this last applet has a buffer (the req buffer of the cosocket). This patch must be backported to 1.7 --- src/hlua.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/src/hlua.c b/src/hlua.c index f662cb0..17a6750 100644 --- a/src/hlua.c +++ b/src/hlua.c @@ -1889,14 +1889,19 @@ static int hlua_socket_write_yield(struct lua_State *L,int status, lua_KContext * the request buffer if its not required. */ if (socket->s->req.buf->size == 0) { - si_applet_cant_put(&socket->s->si[0]); - goto hlua_socket_write_yield_return; + appctx = hlua->task->context; + if (!channel_alloc_buffer(&socket->s->req, &appctx->buffer_wait)) + goto hlua_socket_write_yield_return; } /* Check for available space.
*/ len = buffer_total_space(socket->s->req.buf); - if (len <= 0) + if (len <= 0) { + appctx = objt_appctx(socket->s->si[0].end); + if (!hlua_com_new(&hlua->com, &appctx->ctx.hlua_cosocket.wake_on_write, hlua->task)) + WILL_LJMP(luaL_error(L, "out of memory")); goto hlua_socket_write_yield_return; + } /* send data */ if (len < send_len) @@ -1934,9 +1939,6 @@ static int hlua_socket_write_yield(struct lua_State *L,int status, lua_KContext return 1; hlua_socket_write_yield_return: - appctx = objt_appctx(socket->s->si[0].end); - if (!hlua_com_new(&hlua->com, &appctx->ctx.hlua_cosocket.wake_on_write, hlua->task)) - WILL_LJMP(luaL_error(L, "out of memory")); WILL_LJMP(hlua_yieldk(L, 0, 0, hlua_socket_write_yield, TICK_ETERNITY, 0)); return 0; } -- 2.9.3
Re: 100% cpu usage with compression in haproxy.cfg
On 30/03/2017 at 11:50, Christopher Faulet wrote: On 29/03/2017 at 13:23, Cyril Bonté wrote: From: "Cyril Bonté" <cyril.bo...@free.fr> To: nos...@mrietzler.de Cc: haproxy@formilux.org Sent: Wednesday, March 29, 2017 12:36:01 Subject: Re: 100% cpu usage with compression in haproxy.cfg Hi all, From: nos...@mrietzler.de To: haproxy@formilux.org Sent: Wednesday, March 29, 2017 11:52:10 Subject: 100% cpu usage with compression in haproxy.cfg I have upgraded to haproxy 1.7.4. When I add the compression algo & type configs in the frontend, haproxy's CPU usage rises to 100%. This also happens with 1.7.1. When I comment those two lines out, everything works like a charm, and compression still works. So I expect I don't need this config option at all - only when I want to disable or set some different algo/compression options. I've just hit the issue today on a "public lab" server and haproxy 1.8-dev (1.8-dev0-827385-303). I won't have time to investigate it more for the next hours, but I can reproduce it easily in HTTP/1.0 (in my logs, the issue occurred when a bot came with HTTP/1.0 requests). Let's take a simple PHP script (bug.php) : Requesting the web server without haproxy, the response is : $ echo -ne "GET /bug.php HTTP/1.0\r\nHost: localhost\r\n\r\n" | nc localhost 80 HTTP/1.1 200 OK Date: Wed, 29 Mar 2017 11:15:32 GMT Server: Apache/2.4.10 (Debian) Vary: Accept-Encoding Connection: close Content-Type: text/html; charset=UTF-8 XX... Here, we don't have any Content-Length, which will produce a 100% CPU loop when the request is made through haproxy. I'm adding Christopher Faulet to the thread; maybe those details will help. Hi all, Here are 2 patches that fix the bugs. I ran some tests and they seem to work without breaking other use cases. Could you confirm that they work for you? Thanks. With the patches this time ...
-- Christopher Faulet >From 3372129a9c69c8dd37f1ebf84612c07e1c6c21e6 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Thu, 30 Mar 2017 10:54:35 +0200 Subject: [PATCH 1/2] BUG/MEDIUM: http: Fix blocked HTTP/1.0 responses when compression is enabled When the compression filter is enabled, if an HTTP/1.0 response is received (no content-length and no transfer-encoding), no data are forwarded to the client because of a bug and the transaction is blocked indefinitely. The bug comes from the fact that we need to synchronize the end of the request and the response because of the compression filter. This inhibits the infinite forwarding of data. But for these responses, the compression is not activated, so the response body is not analyzed. This leads to a deadlock. The solution is to enable the analysis of the response body in all cases and handle this one to enable the infinite forwarding. All other cases should already be handled. This fix should be backported to 1.7. --- src/proto_http.c | 24 +++- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/src/proto_http.c b/src/proto_http.c index 56480da..487c0fc 100644 --- a/src/proto_http.c +++ b/src/proto_http.c @@ -6856,11 +6856,9 @@ int http_process_res_common(struct stream *s, struct channel *rep, int an_bit, s } skip_header_mangling: - if ((msg->flags & HTTP_MSGF_XFER_LEN) || HAS_DATA_FILTERS(s, rep) || - (txn->flags & TX_CON_WANT_MSK) == TX_CON_WANT_TUN) { - rep->analysers &= ~AN_RES_FLT_XFER_DATA; - rep->analysers |= AN_RES_HTTP_XFER_BODY; - } + /* Always enter in the body analyzer */ + rep->analysers &= ~AN_RES_FLT_XFER_DATA; + rep->analysers |= AN_RES_HTTP_XFER_BODY; /* if the user wants to log as soon as possible, without counting * bytes from the server, then this is the right moment.
We have @@ -7003,11 +7001,11 @@ int http_response_forward_body(struct stream *s, struct channel *res, int an_bit /* When TE: chunked is used, we need to get there again to parse * remaining chunks even if the server has closed, so we don't want to - * set CF_DONTCLOSE. Similarly, if the body length is undefined, if - * keep-alive is set on the client side or if there are filters - * registered on the stream, we don't want to forward a close + * set CF_DONTCLOSE. Similarly, if keep-alive is set on the client side + * or if there are filters registered on the stream, we don't want to + * forward a close */ - if ((msg->flags & HTTP_MSGF_TE_CHNK) || !(msg->flags & HTTP_MSGF_XFER_LEN) || + if ((msg->flags & HTTP_MSGF_TE_CHNK) || HAS_DATA_FILTERS(s, res) || (txn->flags & TX_CON_WANT_MSK) == TX_CON_WANT_KAL || (txn->flags & TX_CON_WANT_MSK) == TX_CON_WANT_SCL) @@ -7024,6 +7022,14 @@ int http_response_forward_body(struct stream *s, struct channel *res, int an_bit if ((msg->flags & HTTP_MSGF_TE_CHNK) || (msg->flags & HTTP_MS
Re: 100% cpu usage with compression in haproxy.cfg
On 29/03/2017 at 13:23, Cyril Bonté wrote: [...] Hi all, Here are 2 patches that fix the bugs. I ran some tests and they seem to work without breaking other use cases. Could you confirm that they work for you? Thanks -- Christopher Faulet
Re: 100% cpu usage with compression in haproxy.cfg
On 29/03/2017 at 13:23, Cyril Bonté wrote: [...] Hi, Thanks for this information. I found the bug. It is collateral damage introduced by commit e6006245. I'm on it. -- Christopher Faulet
Minor HTTP patches
Hi Willy, Following my recent patches on HTTP/1.0 responses without content-length when the compression filter is enabled, here are 2 small patches. The first one is a small code cleanup and the second one adds handy debug messages. Thanks, -- Christopher Faulet >From c998beb94b02be0f07adc950400c495f60776b91 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Thu, 30 Mar 2017 11:21:53 +0200 Subject: [PATCH 1/2] MINOR: http: remove useless check on HTTP_MSGF_XFER_LEN for the request The flag HTTP_MSGF_XFER_LEN is always set for an HTTP request because we always know the body length. So there is no need to check it. --- src/proto_http.c | 39 +++ 1 file changed, 15 insertions(+), 24 deletions(-) diff --git a/src/proto_http.c b/src/proto_http.c index 487c0fc..a00ea78 100644 --- a/src/proto_http.c +++ b/src/proto_http.c @@ -3122,7 +3122,7 @@ int http_wait_for_request(struct stream *s, struct channel *req, int an_bit) /* set TE_CHNK and XFER_LEN only if "chunked" is seen last */ while (http_find_header2("Transfer-Encoding", 17, req->buf->p, &txn->hdr_idx, &ctx)) { if (ctx.vlen == 7 && strncasecmp(ctx.line + ctx.val, "chunked", 7) == 0) - msg->flags |= (HTTP_MSGF_TE_CHNK | HTTP_MSGF_XFER_LEN); + msg->flags |= HTTP_MSGF_TE_CHNK; else if (msg->flags & HTTP_MSGF_TE_CHNK) { /* chunked not last, return badreq */ goto return_bad_req; @@ -3158,7 +3158,7 @@ int http_wait_for_request(struct stream *s, struct channel *req, int an_bit) goto return_bad_req; /* already specified, was different */ } - msg->flags |= HTTP_MSGF_CNT_LEN | HTTP_MSGF_XFER_LEN; + msg->flags |= HTTP_MSGF_CNT_LEN; msg->body_len = msg->chunk_len = cl; } @@ -4256,8 +4256,7 @@ static int http_apply_redirect_rule(struct redirect_rule *rule, struct stream *s /* let's log the request time */ s->logs.tv_request = now; - if ((req->flags & HTTP_MSGF_XFER_LEN) && - ((!(req->flags & HTTP_MSGF_TE_CHNK) && !req->body_len) || (req->msg_state ==
HTTP_MSG_DONE)) && + if (((!(req->flags & HTTP_MSGF_TE_CHNK) && !req->body_len) || (req->msg_state == HTTP_MSG_DONE)) && ((txn->flags & TX_CON_WANT_MSK) == TX_CON_WANT_SCL || (txn->flags & TX_CON_WANT_MSK) == TX_CON_WANT_KAL)) { /* keep-alive possible */ @@ -4847,22 +4846,20 @@ int http_process_request(struct stream *s, struct channel *req, int an_bit) req->analysers |= AN_REQ_HTTP_BODY; } - if (msg->flags & HTTP_MSGF_XFER_LEN) { - req->analysers &= ~AN_REQ_FLT_XFER_DATA; - req->analysers |= AN_REQ_HTTP_XFER_BODY; + req->analysers &= ~AN_REQ_FLT_XFER_DATA; + req->analysers |= AN_REQ_HTTP_XFER_BODY; #ifdef TCP_QUICKACK - /* We expect some data from the client. Unless we know for sure - * we already have a full request, we have to re-enable quick-ack - * in case we previously disabled it, otherwise we might cause - * the client to delay further data. - */ - if ((sess->listener->options & LI_O_NOQUICKACK) && - cli_conn && conn_ctrl_ready(cli_conn) && - ((msg->flags & HTTP_MSGF_TE_CHNK) || - (msg->body_len > req->buf->i - txn->req.eoh - 2))) - setsockopt(cli_conn->t.sock.fd, IPPROTO_TCP, TCP_QUICKACK, , sizeof(one)); + /* We expect some data from the client. Unless we know for sure + * we already have a full request, we have to re-enable quick-ack + * in case we previously disabled it, otherwise we might cause + * the client to delay further data. + */ + if ((sess->listener->options & LI_O_NOQUICKACK) && + cli_conn && conn_ctrl_ready(cli_conn) && + ((msg->flags & HTTP_MSGF_TE_CHNK) || + (msg->body_len > req->buf->i - txn->req.eoh - 2))) + setsockopt(cli_conn->t.sock.fd, IPPROTO_TCP, TCP_QUICKACK, , sizeof(one)); #endif - } /* * OK, that's finished for the headers. We have done what we * @@ -4871,12 +4868,6 @@ int http_process_request(struct stream *s, struct channel *req, int an_bit) req->analyse_exp = TICK_ETERNITY; req->analysers &= ~an_bit; - /* if the server closes the connection, we want to immediately react - * and close the socket to save packets and syscalls. 
- */ - if (!(req->analysers & AN_REQ_HTTP_XFER_BODY)) - s->si[1].flags |= SI_FL_NOHALF; - s->logs.tv_request = now; /* OK let's go on with the BODY now */ return 1; -- 2.9.3 >From 776963e253b229f8e4749d8c4e22bd01dea6c7aa Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Thu, 30 Mar 2017 11:33:44 +0200 Subject: [PATCH 2/2] MINOR: http: Add debug mess
[PATCH] BUG/MEDIUM: buffers: Fix how input/output data are injected into buffers
Willy, I tagged this patch as a bug, but I haven't found a way to hit it so far. It can be backported or not, as you wish. -- Christopher Faulet >From 4ffdfbed993eaeb6c777c148e1eb6a712bfc9e18 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Wed, 29 Mar 2017 11:58:28 +0200 Subject: [PATCH] BUG/MEDIUM: buffers: Fix how input/output data are injected into buffers The function buffer_contig_space is buggy and could lead to pernicious bugs (never hit until now, AFAIK). This function should return the number of bytes that can be written into the buffer at once (without wrapping). First, this function is used to inject input data (bi_putblk) and to inject output data (bo_putblk and bo_inject). But there is no context, so it cannot decide where the contiguous space should be placed. For input data, it should be after bi_end(buf) (ie, buf->p + buf->i, modulo wrapping). For output data, it should be after bo_end(buf) (ie, buf->p), and input data are assumed to not exist (else there is no space at all). Then, considering we need to inject input data, this function does not always return the right value. And when we need to inject output data, we must be sure to have no input data at all (buf->i == 0), else the result can also be wrong (but this is the caller's responsibility, so everything should be fine here). The buffer can be in 3 different states:

 1) no wrapping

              <--- o ----><----- i ---->
     +----+-----------+--------------+------+
     |    |ooooooooooo|iiiiiiiiiiiiii|      |
     +----+-----------+--------------+------+
                      ^              ^      ^
                      p              l      r

 2) input wrapping

     ...-- i --->          <--- o ---> <--- i ---...
     +-----------+--------+-----------+-----------+
     |iiiiiiiiiii|        |ooooooooooo|iiiiiiiiiii|
     +-----------+--------+-----------+-----------+
                 ^                    ^           ^
                 l                    p           r

 3) output wrapping

     ...-- o ---> <--- i --->          <--- o ---...
     +-----------+-----------+--------+-----------+
     |ooooooooooo|iiiiiiiiiii|        |ooooooooooo|
     +-----------+-----------+--------+-----------+
                 ^           ^        ^
                 p           l        r

buffer_contig_space returns (r - l). Cases 1 and 3 are correctly handled. But for the second case, r is wrong: it points to the buffer's end (buf->data + buf->size), when it should be bo_ptr(buf) (ie, buf->p - buf->o). To fix the bug, the function has been split.
Now, bi_contig_space and bo_contig_space should be used to know the contiguous space available to insert, respectively, input data and output data. For bo_contig_space, input data are assumed not to exist. And the right version is used, depending on what we want to do. In addition, to clarify the buffer's API, buffer_realign does not return a value anymore, so it has the same API as buffer_slow_realign. This patch can be backported in 1.7, 1.6 and 1.5.
---
 include/common/buffer.h | 60 +++--
 src/channel.c           |  6 ++---
 2 files changed, 41 insertions(+), 25 deletions(-)

diff --git a/include/common/buffer.h b/include/common/buffer.h
index ce3eb40..3a6dfd7 100644
--- a/include/common/buffer.h
+++ b/include/common/buffer.h
@@ -156,6 +156,41 @@ static inline int bo_contig_data(const struct buffer *b)
 	return b->o;
 }
 
+/* Return the amount of bytes that can be written into the input area at once
+ * including reserved space which may be overwritten (this is the caller
+ * responsibility to know if the reserved space is protected or not).
+*/
+static inline int bi_contig_space(const struct buffer *b)
+{
+	const char *left, *right;
+
+	left  = bi_end(b);
+	right = bo_ptr(b);
+
+	if (left >= right)
+		right = b->data + b->size;
+
+	return (right - left);
+}
+
+/* Return the amount of bytes that can be written into the output area at once
+ * including reserved space which may be overwritten (this is the caller
+ * responsibility to know if the reserved space is protected or not). Input data
+ * are assumed to not exist.
+*/
+static inline int bo_contig_space(const struct buffer *b)
+{
+	const char *left, *right;
+
+	left  = bo_end(b);
+	right = bo_ptr(b);
+
+	if (left >= right)
+		right = b->data + b->size;
+
+	return (right - left);
+}
+
 /* Return the buffer's length in bytes by summing the input and the output */
 static inline int buffer_len(const struct buffer *buf)
 {
@@ -226,21 +261,6 @@ static inline int buffer_contig_area(const struct buffer *buf, const char *start
 	return count;
 }
 
-/* Return the amount of bytes that can be written into the buffer at once,
- * including reserved space
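The wrapping arithmetic behind the split can be illustrated with a standalone model. The sketch below is not HAProxy's actual code: it models the buffer with integer offsets (the real struct buffer uses pointers into a data[] array, and the real helpers are bi_end/bo_ptr/bi_contig_space/bo_contig_space), but the computation is the same.

```c
#include <stddef.h>

/* Standalone model of HAProxy's ring buffer: p is the offset of the first
 * input byte (output data sit just before it, input data just after it),
 * i and o are the input/output lengths. Mirrors the semantics of
 * struct buffer, but is NOT the real haproxy code. */
struct buf_model {
    size_t size; /* total storage size */
    size_t p;    /* offset of the first input byte */
    size_t i;    /* input data length */
    size_t o;    /* output data length */
};

/* one past the last input byte, wrapping at 'size' (like bi_end()) */
static size_t mbi_end(const struct buf_model *b)
{
    return (b->p + b->i) % b->size;
}

/* first output byte, wrapping backward (like bo_ptr()) */
static size_t mbo_ptr(const struct buf_model *b)
{
    return (b->p + b->size - b->o) % b->size;
}

/* contiguous room to append INPUT data: from the end of the input up to
 * the start of the output data, or up to the end of storage when the free
 * area does not wrap -- the fixed bi_contig_space() logic */
static size_t mbi_contig_space(const struct buf_model *b)
{
    size_t left = mbi_end(b), right = mbo_ptr(b);

    if (left >= right)
        right = b->size;
    return right - left;
}

/* same thing for OUTPUT data; input data are assumed absent, so the free
 * area starts right at p (bo_end()) -- the fixed bo_contig_space() logic */
static size_t mbo_contig_space(const struct buf_model *b)
{
    size_t left = b->p, right = mbo_ptr(b);

    if (left >= right)
        right = b->size;
    return right - left;
}
```

In the "input wrapping" state (case 2 above), e.g. size=16, p=10, o=4, i=8, the input ends at offset 2 and the output starts at offset 6: only 4 bytes can be written without clobbering the output, while the old single function would have reported free space all the way to the end of the storage (14 bytes).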
[PATCH] BUG/MINOR: http: Fix conditions to clean up a txn and to handle the next request
Willy, Another fix (with some cleanup in other patches). The first one (and probably the second one) can be backported, but I don't know if this is mandatory. It is really tricky to find conditions where it could be a problem. Thanks -- Christopher Faulet

>From 0cdf21fc9678ce44eddf10efb4fdaf737e2115b3 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Tue, 28 Mar 2017 11:51:33 +0200
Subject: [PATCH 1/4] BUG/MINOR: http: Fix conditions to clean up a txn and to handle the next request

To finish an HTTP transaction and to start the new one, we check, among other things, that there is enough space in the response buffer to possibly inject a message while parsing the next request. Because these messages can reach the maximum buffer size, it is mandatory to have an empty response buffer. The remaining input data are trimmed during the txn cleanup (in http_reset_txn), so we just need to check that the output data have been flushed. The current implementation depends on channel_congested, which only checks that the reserved area is available. That is of course not good enough. There are other tests on the response buffer in http_wait_for_request, but the conditions to move on are almost the same. So, we can imagine scenarios where some output data remaining in the response buffer during the request parsing prevent any message injection. To fix this bug, we just wait until the output data have been flushed before cleaning up the HTTP txn (ie. s->res.buf->o == 0). In addition, in http_reset_txn we realign the response buffer (note the buffer is empty at this step). Thanks to these changes, there is no more need to set CF_EXPECT_MORE on the response channel in http_end_txn_clean_session. And, more importantly, there is no more need to check the response buffer state in http_wait_for_request. This removes a workaround on response analysers to handle HTTP pipelining. This patch can be backported in 1.7, 1.6 and 1.5.
---
 src/proto_http.c | 48 +++-
 src/stream.c     | 12
 2 files changed, 7 insertions(+), 53 deletions(-)

diff --git a/src/proto_http.c b/src/proto_http.c
index 2d97fbe..24d034a 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -2649,29 +2649,6 @@ int http_wait_for_request(struct stream *s, struct channel *req, int an_bit)
 			buffer_slow_realign(req->buf);
 		}
 
-		/* Note that we have the same problem with the response ; we
-		 * may want to send a redirect, error or anything which requires
-		 * some spare space. So we'll ensure that we have at least
-		 * maxrewrite bytes available in the response buffer before
-		 * processing that one. This will only affect pipelined
-		 * keep-alive requests.
-		 */
-		if ((txn->flags & TX_NOT_FIRST) &&
-		    unlikely(!channel_is_rewritable(&s->res) ||
-			     bi_end(s->res.buf) < b_ptr(s->res.buf, txn->rsp.next) ||
-			     bi_end(s->res.buf) > s->res.buf->data + s->res.buf->size - global.tune.maxrewrite)) {
-			if (s->res.buf->o) {
-				if (s->res.flags & (CF_SHUTW|CF_SHUTW_NOW|CF_WRITE_ERROR|CF_WRITE_TIMEOUT))
-					goto failed_keep_alive;
-				/* don't let a connection request be initiated */
-				channel_dont_connect(req);
-				s->res.flags &= ~CF_EXPECT_MORE; /* speed up sending a previous response */
-				s->res.flags |= CF_WAKE_WRITE;
-				s->res.analysers |= an_bit; /* wake us up once it changes */
-				return 0;
-			}
-		}
-
 		if (likely(msg->next < req->buf->i)) /* some unparsed data are available */
 			http_msg_analyzer(msg, &txn->hdr_idx);
 	}
@@ -5292,20 +5269,6 @@ void http_end_txn_clean_session(struct stream *s)
 		s->res.flags |= CF_NEVER_WAIT;
 	}
 
-	/* if the request buffer is not empty, it means we're
-	 * about to process another request, so send pending
-	 * data with MSG_MORE to merge TCP packets when possible.
-	 * Just don't do this if the buffer is close to be full,
-	 * because the request will wait for it to flush a little
-	 * bit before proceeding.
-	 */
-	if (s->req.buf->i) {
-		if (s->res.buf->o &&
-		    !buffer_full(s->res.buf, global.tune.maxrewrite) &&
-		    bi_end(s->res.buf) <= s->res.buf->data + s->res.buf->size - global.tune.maxrewrite)
-			s->res.flags |= CF_EXPECT_MORE;
-	}
-
 	/* we're removing the analysers, we MUST re-enable events detection.
 	 * We don't enable close on the response channel since it's either
 	 * already closed, or in keep-alive with an idle connection handler.
@@ -5686,13 +5649,13 @@ int http_resync_states(struct stream *s)
 	 * possibly killing the server connection and reinitialize
 	 * a fresh-new transaction, but only once we're sure there's
 	 * enough room in the request and response buffer to process
-	 * another request. The request buffer must not hold any
-	 * pending output data and the request buffer must not have
-	 * output
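The condition change at the heart of this patch is small. As a hedged sketch (mock types, not HAProxy's real channel/buffer structures), the new rule looks like:

```c
#include <stdbool.h>

/* Mock of the relevant channel state; HAProxy's real struct channel and
 * struct buffer carry much more than this. */
struct chn_model {
    unsigned int i; /* input bytes not yet parsed/forwarded */
    unsigned int o; /* output bytes not yet flushed to the client */
};

/* New rule from the patch: the transaction may only be cleaned up, and the
 * next pipelined request handled, once the response output is fully
 * flushed. Input data are trimmed by the cleanup itself (http_reset_txn),
 * so they do not matter here. */
static bool txn_may_be_cleaned_up(const struct chn_model *res)
{
    return res->o == 0;
}
```

The point of the simplification is that this single check replaces the previous heuristic based on channel_congested and the maxrewrite reserve.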
Re: [PATCH] BUG/MEDIUM: buffers: Fix how input/output data are injected into buffers
Le 31/03/2017 à 14:26, Willy Tarreau a écrit : On Fri, Mar 31, 2017 at 11:29:43AM +0200, Christopher Faulet wrote: Willy, I tagged this patch as a bug, but I haven't found a way to hit it for now. It can be backported or not, as you wish.

Thanks Christopher. I don't know either how to trigger it, since the only problematic case I've found is the one where the input wraps, which doesn't happen while we're processing data. However, I agree that leaving such a bug behind us is scary, and a future fix might rely on this code working correctly, so I'd rather backport the fix anyway. I checked your solution and for me it works fine in all situations. By the way, just FYI, there aren't 3 cases to consider for a buffer, but at least 5; here are the 2 additional ones (which your patch properly handles):

- the input may not wrap but end exactly at the end of the buffer, making buf->p+buf->i == buf->data+buf->size, but bi_end() == buf->data. This is a common error case when computing input lengths.

- the output data may end at the end of the buffer and the input be placed at the beginning, causing buf->p to equal buf->data. Similarly, it's a common error case when computing output data lengths.

These situations cause trouble when not using the proper arithmetic: either all the computations are made without wrapping, or all are made with wrapping. Any mix of the two causes issues.

Yes, of course. I implicitly considered them as special cases of the wrapping ones. From the moment you use the pointers (bi_ptr/bi_end and bo_ptr/bo_end), it is easier. But it never hurts to mention it :) Thanks -- Christopher Faulet
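The two extra boundary states Willy lists are exactly the ones that normalizing positions with a modulo handles. A small sketch, using integer offsets instead of HAProxy's real pointer-based API (the helper names here are made up for the example):

```c
#include <stddef.h>

/* Offset-based stand-ins for the normalizations done by bi_end() and
 * bo_ptr() in HAProxy; not the real code. */

/* like bi_end(): p + i, wrapped at the storage size */
static size_t wrap_fwd(size_t off, size_t len, size_t size)
{
    return (off + len) % size;
}

/* like bo_ptr(): p - o, wrapped backward */
static size_t wrap_back(size_t off, size_t len, size_t size)
{
    return (off + size - len) % size;
}
```

With consistent wrapping, "input ends exactly at the buffer's end" yields offset 0 rather than an out-of-range position, and "output ends at the buffer's end with p == data" resolves to a position in the buffer's tail; mixing wrapped and unwrapped arithmetic is what produces the off-by-buffer-size errors described above.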
Re: [PATCH] BUG/MINOR: http: fix typo in http_apply_redirect_rule
Le 21/03/2017 à 07:43, Willy Tarreau a écrit : On Mon, Mar 20, 2017 at 11:08:20AM +0100, Christopher Faulet wrote: Hi, Here is a little patch fixing a bug. It should be backported in 1.7. Thanks Christopher. Are you sure it's *this* minor ? I suspect that not having it could break keep-alive processing on redirect rules, am I wrong ?

Well, maybe you're right. In almost all cases, this bug has no effect. After a more careful check, I found that it leads to a memory leak if a redirect is done by an http-response rule when HTTP compression is enabled. So, I set it to MAJOR. Thanks -- Christopher Faulet

>From 853d3c93b64cb218f707106f8feb0480bef861fc Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Fri, 10 Mar 2017 13:52:30 +0100
Subject: [PATCH] BUG/MAJOR: http: fix typo in http_apply_redirect_rule

Because of this typo, AN_RES_FLT_END was never called when a redirect rule is applied on a keep-alive connection. In almost all cases, this bug has no effect. But it leads to a memory leak if a redirect is done by an http-response rule when HTTP compression is enabled.

This patch should be backported in 1.7.
---
 src/proto_http.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/proto_http.c b/src/proto_http.c
index 6d55e6a..0cafdb4 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -4279,7 +4279,7 @@ static int http_apply_redirect_rule(struct redirect_rule *rule, struct stream *s
 		req->next -= req->sov;
 		req->sov = 0;
 		s->req.analysers = AN_REQ_HTTP_XFER_BODY | (s->req.analysers & AN_REQ_FLT_END);
-		s->res.analysers = AN_RES_HTTP_XFER_BODY | (s->req.analysers & AN_RES_FLT_END);
+		s->res.analysers = AN_RES_HTTP_XFER_BODY | (s->res.analysers & AN_RES_FLT_END);
 		req->msg_state = HTTP_MSG_CLOSED;
 		res->msg_state = HTTP_MSG_DONE;
 		/* Trim any possible response */
-- 
2.9.3
[PATCH] BUG/MINOR: http: fix typo in http_apply_redirect_rule
Hi, Here is a little patch fixing a bug. It should be backported in 1.7. Thanks, -- Christopher Faulet

>From 11fea91789bf795d8924d37a4f50c100f2cd3805 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Fri, 10 Mar 2017 13:52:30 +0100
Subject: [PATCH] BUG/MINOR: http: fix typo in http_apply_redirect_rule

Because of this typo, AN_RES_FLT_END was never called when a redirect rule is applied on a keep-alive connection.

This patch should be backported in 1.7.
---
 src/proto_http.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/proto_http.c b/src/proto_http.c
index 6d55e6a..0cafdb4 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -4279,7 +4279,7 @@ static int http_apply_redirect_rule(struct redirect_rule *rule, struct stream *s
 		req->next -= req->sov;
 		req->sov = 0;
 		s->req.analysers = AN_REQ_HTTP_XFER_BODY | (s->req.analysers & AN_REQ_FLT_END);
-		s->res.analysers = AN_RES_HTTP_XFER_BODY | (s->req.analysers & AN_RES_FLT_END);
+		s->res.analysers = AN_RES_HTTP_XFER_BODY | (s->res.analysers & AN_RES_FLT_END);
 		req->msg_state = HTTP_MSG_CLOSED;
 		res->msg_state = HTTP_MSG_DONE;
 		/* Trim any possible response */
-- 
2.9.3
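To see why the one-character typo matters: AN_RES_FLT_END is a response-side bit, so masking the request-side analyser mask with it always yields 0 and the bit is silently dropped. A sketch with made-up flag values (the real ones live in HAProxy's stream headers and differ):

```c
/* Made-up analyser flag values, for illustration only */
#define AN_RES_HTTP_XFER_BODY 0x0100u
#define AN_RES_FLT_END        0x0200u

/* the typo: the mask is taken from the REQUEST analysers, where the
 * response-side AN_RES_FLT_END bit is never set */
static unsigned int res_an_buggy(unsigned int req_an, unsigned int res_an)
{
    (void)res_an;
    return AN_RES_HTTP_XFER_BODY | (req_an & AN_RES_FLT_END);
}

/* the fix: the mask is taken from the RESPONSE analysers */
static unsigned int res_an_fixed(unsigned int req_an, unsigned int res_an)
{
    (void)req_an;
    return AN_RES_HTTP_XFER_BODY | (res_an & AN_RES_FLT_END);
}
```

With the buggy version, the filter's end-of-analysis hook never runs on the response channel, which is how the compression-related leak arises.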
Re: WebSocket + compression + timeout tunnel broken in v1.7?
Le 17/03/2017 à 11:57, Kristjan Koppel a écrit : Hi! I upgraded a HAProxy instance from v1.6.10 to v1.7.3 with no changes to configuration and I noticed that now clients connecting to a WebSocket backend are getting disconnected after being idle for "timeout client" time rather than the much longer "timeout tunnel" time. As I understand it, "timeout tunnel" is supposed to override the "timeout client" setting for WebSocket sessions and this worked fine with v1.6 and earlier. Also, this HAProxy instance has compression enabled in the defaults section. If I remove the compression settings from there (or if I move them to other backends only), then "timeout tunnel" starts to work again. So, I guess this is a bug in HAProxy v1.7? I managed to reproduce this with the official HAProxy Docker images, a minimal haproxy.cfg and wscat (from the Debian package node-ws) as follows:

haproxy.cfg:

global
    stats socket /run/haproxy.sock mode 660 level admin
    stats timeout 30s
    daemon

defaults
    mode http
    timeout connect 5s
    timeout client 50s
    timeout server 60s
    timeout tunnel 15m
    compression algo gzip
    compression type text/html

listen ws_test
    bind :8000
    server wscat 172.17.0.1:8001

I started one wscat instance in listening mode for the backend:

$ wscat -l 8001

With HAProxy v1.6.11 running the above config I see the following:

$ date; wscat -c ws://172.17.0.2:8000; date
Fri Mar 17 12:10:59 EET 2017
connected (press CTRL+C to quit)
foo
< bar
disconnected
Fri Mar 17 12:26:03 EET 2017

State of the session about 30 seconds after starting the above command:

$ echo "show sess" | socat /run/haproxy.sock stdio
0x12253b0: proto=tcpv4 src=172.17.0.1:52046 fe=ws_test be=ws_test srv=wscat ts=08 age=31s calls=4 rq[f=8848202h,i=0,an=00h,rx=14m32s,wx=,ax=] rp[f=88048202h,i=0,an=00h,rx=14m32s,wx=,ax=] s0=[7,8h,fd=1,ex=] s1=[7,118h,fd=2,ex=] exp=14m29s

With HAProxy v1.7.3 running the same config I see the following:

$ date; wscat -c ws://172.17.0.3:8000; date
Fri Mar 17 12:29:39 EET 2017
connected (press
CTRL+C to quit) foo < bar disconnected Fri Mar 17 12:30:33 EET 2017 Obviously it takes me a bit of time to send the 2 strings manually, so total time is a bit more than the "timeout client" value here. State of the session about 30 seconds after starting the above command: $ echo "show sess" | socat /run/haproxy.sock stdio 0x22a4f10: proto=tcpv4 src=172.17.0.1:55996 fe=ws_test be=ws_test srv=wscat ts=04 age=33s calls=6 rq[f=8848000h,i=0,an=00h,rx=20s,wx=,ax=] rp[f=88048000h,i=0,an=00h,rx=30s,wx=,ax=] s0=[7,8h,fd=1,ex=] s1=[7,118h,fd=2,ex=] exp=20s If I remove the compression lines from the config and try again with HAProxy v1.7.3, then it's fine again: $ date; wscat -c ws://172.17.0.3:8000; date Fri Mar 17 12:35:22 EET 2017 connected (press CTRL+C to quit) foo < bar disconnected Fri Mar 17 12:50:26 EET 2017 State of the session about 30 seconds after starting the above command: $ echo "show sess" | socat /run/haproxy.sock stdio 0x1e64500: proto=tcpv4 src=172.17.0.1:56038 fe=ws_test be=ws_test srv=wscat ts=04 age=31s calls=5 rq[f=8848000h,i=0,an=00h,rx=14m33s,wx=,ax=] rp[f=88048000h,i=0,an=00h,rx=14m33s,wx=,ax=] s0=[7,8h,fd=1,ex=] s1=[7,118h,fd=2,ex=] exp=14m33s I'll be happy to provide any additional information if needed. Hi Kristjan, Thanks for this detailed report. You're right. This is a bug and it was fixed in 1.8. It will soon be backported in 1.7. If you're curious, the commit fixing the bug is e600624 ("BUG/MEDIUM: filters: Fix channels synchronization in flt_end_analyze"). You can safely apply it on 1.7.3. -- Christopher Faulet
Re: ModSecurity: First integration patches
Le 11/04/2017 à 10:49, Thierry Fournier a écrit : Hi list, I'm attaching one usage of HAProxy / SPOE: WAF offloading. These patches are a first version; they have some limitations described in the README file in the directory contrib/modsecurity. - Christopher, please check the "BUG/MINOR" patch, it is about spoe functions. - The example of the ModSecurity compilation can be improved. It is based on my local distro. Feedback is welcome.

Hi Thierry, Really nice! I'll take a look at it soon. Glad to see the first service that uses the SPOE! Good job. -- Christopher Faulet
Re: ModSecurity: First integration patches
Le 13/04/2017 à 12:53, Thierry Fournier a écrit : On 13 Apr 2017, at 12:28, Willy Tarreau <w...@1wt.eu> wrote: On Thu, Apr 13, 2017 at 12:21:20PM +0200, Thierry Fournier wrote: .) the patches apply only on haproxy 1.8 because some files do not exist on 1.7 (e.g. include/proto/spoe.h) Ok. I thought that SPOE was introduced in 1.7; obviously I'm wrong. No, it was indeed introduced in 1.7, but there were some improvements later (like pipelining etc). (...) .) How can the rule-set be reloaded? stop & start || gracefully?

I did not handle this part. Today, you must stop and start the process; a graceful reload doesn't exist yet, but I guess it could be implemented easily. You can ensure the availability of the SPOA ModSecurity using the properties of the HAProxy backend.

Actually that's a very good point. I think it would even be possible to ensure a graceful shutdown using disable-on-404 or using an agent, so that you can roll the restart over multiple WAF nodes.

Interesting. I'm thinking about a system which (on the SPOA side) stops the listeners and waits for the end of the requests currently being processed. This way, the SPOA doesn't accept new requests, and HAProxy sends requests to the other process. Another way is to use the CLI and set one spoa/modsec in graceful mode. Adding a special check is the best way, but the daemon speaks SPOP and not HTTP. Maybe a thread which listens on a specific port dedicated to this function? Or improving SPOP to request graceful mode in the agent-hello response message? (it seems that haproxy periodically sends haproxy-hello messages, but maybe I'm wrong)

The hello handshake is done only once, when a new connection with a SPOA is opened. But we could improve SPOP by adding a new frame type to handle admin tasks (like a graceful stop). This way, for a specific connection, it would be possible to wait for the last ACK frames without sending new frames to the SPOA. Then stopping the SPOA listeners to let the SPOP health check fail should do the trick, I guess.
-- Christopher Faulet
Re: ModSecurity: First integration patches
Le 18/04/2017 à 14:40, Willy Tarreau a écrit : On Tue, Apr 18, 2017 at 12:15:20PM +0200, Christopher Faulet wrote: I finally took the time to review your patches, mainly the second one, about the sample fetch. I think it would be a pity to introduce such a complex sample fetch. All parts, except the HTTP headers, are already available in dedicated sample fetches. It would be better to only add a sample fetch to get the HTTP headers (req.hdrs and res.hdrs, something like that). Because a sample fetch cannot return a list, we can probably encode it into a binary buffer using a \0 as separator, something like: hdr_name1\0hdr_value1\0hdr_name2\0hdr_value2\0... This way, the sample fetch does not depend on the SPOE and can be used in another context. And concerning your SPOA, it will be quite easy to parse.

Actually I'm not quite sure about this one; such an encoding could in fact make it harder to process, while most use cases will either want to log it (thus copy the whole string as-is) or decode it as received (and might have to rebuild the initial header block with CRLF in between).

My first idea was to avoid parsing the headers in the SPOA. Because that work is already done by HAProxy, it would be nice not to redo it. Maybe the sample fetch could take an argument to choose the format (raw or encoded). The format is open; \0 may not be the best separator. It is just a trick to deal with a list, like req.hdr_names.

However I agree that having something like req.hdrs / res.hdrs could be useful instead of having a whole block containing the entire buffer. We already have "req.body" for the body as a binary block, so it totally makes sense to have "req.hdrs" for the same purpose, and let agents deal with either one or the other or both. -- Christopher Faulet
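The \0-separated encoding discussed above is easy to produce and to parse. Here is a hedged sketch of what such an encoder could look like; the function name and exact layout are assumptions from the discussion, not HAProxy's actual req.hdrs implementation:

```c
#include <string.h>
#include <stddef.h>

/* Sketch of the proposed encoding (an assumption, not HAProxy's actual
 * format): each header name and each header value is copied verbatim and
 * terminated by '\0'. Returns the encoded length, or 0 if the output
 * buffer is too small. */
static size_t encode_hdrs(char *out, size_t outlen,
                          const char **names, const char **values,
                          size_t nb_hdrs)
{
    size_t pos = 0;

    for (size_t h = 0; h < nb_hdrs; h++) {
        size_t nlen = strlen(names[h]) + 1;  /* include the '\0' */
        size_t vlen = strlen(values[h]) + 1;

        if (pos + nlen + vlen > outlen)
            return 0;
        memcpy(out + pos, names[h], nlen);
        pos += nlen;
        memcpy(out + pos, values[h], vlen);
        pos += vlen;
    }
    return pos;
}
```

On the agent side, the decoder is a simple loop over strlen-delimited fields, which is the "quite easy to parse" property mentioned above; the drawback Willy raises is that rebuilding the original CRLF-separated header block requires walking the whole buffer again.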
Re: HTTP Basic Authorisation requests failing with HAProxy 1.7.2
Le 07/03/2017 à 20:51, Jon Simpson a écrit : On 7 March 2017 at 08:48:37, Christopher Faulet (cfau...@haproxy.com <mailto:cfau...@haproxy.com>) wrote: Thanks for this info. By checking how you named your trace filters, I guess you use the HTTP compression. If I'm right, I think I found a bug with the help of your logs. The bug is triggered when your HTTP server replies before the end of the request. I guess it always sends its 401 responses just after reading the request headers. So it is a timing issue from the HAProxy point of view. This should work, of course. If I'm right, the bug only appears when a filter is used. So a temporary workaround, while waiting for my fix, is to disable the HTTP compression. I'll try to fix it very soon, I just need to talk with Willy to do it the right way. I will keep you informed. -- Christopher Faulet

Hi Christopher, Yes - you're right, in our normal configuration the compression filter is enabled. I was able to continue replicating the issue even without the compression filter when testing/generating those logs, but that was with the trace filter enabled - so there was always a filter. When I remove all filters (both trace & compression) the issue doesn't reproduce in my test environment any more, so I think your theory is correct. Thanks for tracking it down - much appreciated :)

Hi Jon, Here is a patch that should fix your bug. It was trickier than expected. Could you confirm that it fixes your bug? -- Christopher Faulet

>From 0e47d5db9cca60d3633fd4e1183948298a3fe07e Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Fri, 10 Mar 2017 11:52:44 +0100
Subject: [PATCH] BUG/MEDIUM: filters: Fix channels synchronization in flt_end_analyze

When a filter is used, there are 2 channel analyzers surrounding all the others, flt_start_analyze and flt_end_analyze. This is the right place to acquire and release the resources used by filters, when needed.
In addition, the last one is used to synchronize both channels, especially for HTTP streams: for an HTTP transaction, we must wait until the analysis is finished on both channels before restarting it for the next one. But this part was buggy, leading to unexpected behaviours. First, depending on which channel ends first, the request or the response could be switched into a "forward forever" mode. Then, the HTTP transaction could be cleaned up too early, while processing was still in progress on a channel. To fix the bug, the flag CF_FLT_ANALYZE has been added. It is set on the channels in flt_start_analyze and is kept as long as at least one filter is still analyzing the channel. So, we can trigger the channel synchronization once this flag has been removed on both channels. In addition, the flag TX_WAIT_CLEANUP has been added on the transaction to know whether the transaction must be cleaned up during the channel synchronization. This way, we are sure to reset everything once all the processing is finished. This patch should be backported in 1.7.
---
 include/types/channel.h    |  3 ++-
 include/types/proto_http.h |  7 ++-
 src/filters.c              | 52 --
 src/proto_http.c           | 19 +++--
 4 files changed, 48 insertions(+), 33 deletions(-)

diff --git a/include/types/channel.h b/include/types/channel.h
index 76f1ca0..03bb4e2 100644
--- a/include/types/channel.h
+++ b/include/types/channel.h
@@ -116,7 +116,8 @@
 #define CF_NEVER_WAIT     0x0800 /* never wait for sending data (permanent) */
 #define CF_WAKE_ONCE      0x1000 /* pretend there is activity on this channel (one-shoot) */
-/* unused: 0x2000, 0x4000 */
+#define CF_FLT_ANALYZE    0x2000 /* at least one filter is still analyzing this channel */
+/* unused: 0x4000 */
 #define CF_ISRESP         0x8000 /* 0 = request channel, 1 = response channel */
 
 /* Masks which define input events for stream analysers */
diff --git a/include/types/proto_http.h b/include/types/proto_http.h
index 66f7397..9987c33 100644
--- a/include/types/proto_http.h
+++ b/include/types/proto_http.h
@@ -68,7 +68,12 @@
 #define TX_CACHE_COOK   0x2000 /* a cookie in the response is cacheable */
 #define TX_CACHE_SHIFT  12     /* bit shift */
 
-/* Unused: 0x4000, 0x8000, 0x1, 0x2, 0x8 */
+/* Unused: 0x4000, 0x8000 */
+
+#define TX_WAIT_CLEANUP 0x001  /* this transaction is waiting for a clean up */
+
+/* Unused: 0x2, 0x8 */
 
 /* indicate how we *want* the connection to behave, regardless of what is in
  * the headers. We have 4 possible values right now :
diff --git a/src/filters.c b/src/filters.c
index 9ec794a..fa63991 100644
--- a/src/filters.c
+++ b/src/filters.c
@@ -693,6 +693,9 @@
 flt_start_analyze(struct stream *s, struct channel *chn, unsigned int an_bit)
 	/* If this function is called, this means there is at least one filter,
	 * so we do not need to chec
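The synchronization scheme described in the commit message can be sketched with a standalone mock. The flag values and structures below are stand-ins (HAProxy's real ones differ), but the logic is the one described: the transaction is only reset once both channels have finished their filter analysis and a cleanup was requested.

```c
#include <stdbool.h>

/* Hypothetical mock of the synchronization; real HAProxy flag values and
 * structures differ. */
#define CF_FLT_ANALYZE  0x2000u /* a filter is still analyzing this channel */
#define TX_WAIT_CLEANUP 0x0001u /* the txn is waiting for a clean up */

struct mock_stream {
    unsigned int req_flags; /* request channel flags */
    unsigned int res_flags; /* response channel flags */
    unsigned int txn_flags; /* transaction flags */
};

/* Called when the last filter finishes analyzing one channel; returns true
 * when the transaction may now be cleaned up. */
static bool flt_finish_channel(struct mock_stream *s, bool is_res)
{
    if (is_res)
        s->res_flags &= ~CF_FLT_ANALYZE;
    else
        s->req_flags &= ~CF_FLT_ANALYZE;

    /* cleanup only once BOTH channels are done AND it was requested */
    return !(s->req_flags & CF_FLT_ANALYZE) &&
           !(s->res_flags & CF_FLT_ANALYZE) &&
           (s->txn_flags & TX_WAIT_CLEANUP);
}
```

Whichever channel finishes first merely drops its own flag; only the second one to finish triggers the reset, which is what prevents the premature cleanup described in the bug report.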
Re: HTTP Basic Authorisation requests failing with HAProxy 1.7.2
Le 03/03/2017 à 15:44, Jon Simpson a écrit : Thanks for your response. I've attached the requested configuration and HAProxy trace logs below (tcpdump appeared to be much noisier). I have an input file which reproduces the 503 case, but not predictably, so I've included one log which generated the correct 401 response from HAProxy and one which returned a 503 response. The 503 response comes back every time with larger requests than this. I've stripped down the configuration to a minimum which reproduces the issue. The behaviour only seems to occur with proxy protocol enabled (accept-proxy). I initially tested sending requests directly to HAProxy and that works without the 503 responses. We run this HAProxy configuration behind Amazon ELB - a TCP load balancer with proxy protocol enabled, where the issue occurs (and it occurred in my test environment once proxy protocol was enabled and connections went through the TCP load balancer).

Hi Jon, Thanks for this info. By checking how you named your trace filters, I guess you use the HTTP compression. If I'm right, I think I found a bug with the help of your logs. The bug is triggered when your HTTP server replies before the end of the request. I guess it always sends its 401 responses just after reading the request headers. So it is a timing issue from the HAProxy point of view. This should work, of course. If I'm right, the bug only appears when a filter is used. So a temporary workaround, while waiting for my fix, is to disable the HTTP compression. I'll try to fix it very soon, I just need to talk with Willy to do it the right way. I will keep you informed. -- Christopher Faulet
Re: HTTP Basic Authorisation requests failing with HAProxy 1.7.2
Le 02/03/2017 à 16:46, Jon Simpson a écrit : Hi all, Further to my previous message to the list, I can now reliably reproduce the behaviour where certain requests using authentication headers receive 503 responses from HAProxy 1.7.3. We're making a request twice, once without basic auth (checking that basic auth is prompted on the URL) and then a second, identical, request but with the basic auth Authorization header set. The request body is textual file content, so these requests are larger than average GET/POST requests (Content-Length > 3500). We don't see this problem with smaller requests. The two requests happen on a single connection to HAProxy, as we have keepalive enabled. The first request is always sent to the backend successfully, but the second request receives a 503 status from HAProxy whenever the request size is over a Content-Length of 3500. Just below this size the 503s are sporadic, and at very small body sizes it doesn't happen at all - the request is sent to the backend successfully. There are no errors visible by catting the stats socket, and the second request/503 response is not present in the logs at all (using option httplog). We don't get this behaviour when keeping all other variables the same (client, request body & HAProxy configuration) using HAProxy 1.6.11. Thanks in advance for any help anyone can provide.

Hi, I quickly tried to reproduce your bug, without success. Please share your configuration so I can test in the same situation. It would also be helpful to have a tcpdump of the data exchanged between your client and HAProxy, and another one of the data exchanged between HAProxy and your webserver. If you cannot do a tcpdump, you can use the trace filter by adding the following line in your frontend/listen section:

filter trace

Then run HAProxy in foreground. -- Christopher Faulet
Re: ModSecurity: First integration patches
Le 12/04/2017 à 10:49, Christopher Faulet a écrit : Le 11/04/2017 à 10:49, Thierry Fournier a écrit : Hi list, I'm attaching one usage of HAProxy / SPOE: WAF offloading. These patches are a first version; they have some limitations described in the README file in the directory contrib/modsecurity. - Christopher, please check the "BUG/MINOR" patch, it is about spoe functions. - The example of the ModSecurity compilation can be improved. It is based on my local distro. Feedback is welcome.

Hi Thierry, Really nice! I'll take a look at it soon. Glad to see the first service that uses the SPOE! Good job.

Hi Thierry, I finally took the time to review your patches, mainly the second one, about the sample fetch. I think it would be a pity to introduce such a complex sample fetch. All parts, except the HTTP headers, are already available in dedicated sample fetches. It would be better to only add a sample fetch to get the HTTP headers (req.hdrs and res.hdrs, something like that). Because a sample fetch cannot return a list, we can probably encode it into a binary buffer using a \0 as separator, something like: hdr_name1\0hdr_value1\0hdr_name2\0hdr_value2\0... This way, the sample fetch does not depend on the SPOE and can be used in another context. And concerning your SPOA, it will be quite easy to parse. About the SPOA, it seems to be ok. The server part is based on the SPOA example, so it should be ok (or you can blame me for all the bugs :). For the mod_security part, I blindly trust you. -- Christopher Faulet
Re: [PATCH] MEDIUM: ssl: allow haproxy to start without default certificate
Le 28/07/2017 à 14:28, Emmanuel Hocdet a écrit : . fix generate_certificates issue: perhaps it's simpler to do:

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index c71c2e3..311d465 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -1587,7 +1587,7 @@ ssl_sock_do_create_cert(const char *servername, struct bind_conf *bind_conf, SSL
 	int key_type;
 
 	/* Get the private key of the defautl certificate and use it */
-	if (!(pkey = SSL_get_privatekey(ssl)))
+	if (!(pkey = SSL_CTX_get0_privatekey(bind_conf->default_ctx)))
 		goto mkcert_error;
 
 	/* Create the certificate */

. for the patch "allow haproxy to start without default certificate": default_ctx could be required when bind_conf.generate_certs is set.

SSL_CTX_get0_privatekey is only available in OpenSSL >= 1.0.2. So for previous versions, you need to create an SSL object with the default certificate and then extract the private key:

@@ -1637,7 +1639,17 @@ ssl_sock_do_create_cert(const char *servername, struct bind_conf *bind_conf, SSL
 	int key_type;
 
 	/* Get the private key of the defautl certificate and use it */
-	if (!(pkey = SSL_get_privatekey(ssl)))
+#if (OPENSSL_VERSION_NUMBER >= 0x10002000L && !defined LIBRESSL_VERSION_NUMBER)
+	pkey = SSL_CTX_get0_privatekey(bind_conf->default_ctx);
+#else
+	SSL *tmp_ssl = SSL_new(bind_conf->default_ctx);
+
+	if (tmp_ssl) {
+		pkey = SSL_get_privatekey(tmp_ssl);
+		SSL_free(tmp_ssl);
+	}
+#endif
+	if (!pkey)
 		goto mkcert_error;

This is the workaround I mentioned in my previous mail. That's acceptable, but my question remains: is the initial certificate still needed? Even if we allow haproxy to be started without a default certificate, we can probably remove initial_ctx. I just want to be sure I haven't missed something :) -- Christopher Faulet
Re: [PATCH] MEDIUM: ssl: allow haproxy to start without default certificate
Le 28/07/2017 à 16:35, Emmanuel Hocdet a écrit : Okay, compat… SSL_free should not be called until pkey is duplicated. For SSL_get_privatekey: "These functions retrieve certificate and key data from an SSL object. They return internal pointers that must not be freed by the application program."

Good catch.

Perhaps add the declaration in openssl-compat.h:

EVP_PKEY *SSL_CTX_get0_privatekey(const SSL_CTX *ctx)
{
	if (ctx->cert != NULL)
		return ctx->cert->key->privatekey;
	else
		return NULL;
}

Good idea, I'll do that.

initial_ctx is still needed; removing it could be painful. The case is the ssl params per certificate. Take a bind line with only a crt-list. crtlist: a.pem [alpn h2,http/1.1] b.pem. default_ctx is set from the first parsed certificate (and its configuration), so b.pem would inherit the "alpn" configuration from a.pem. To fix that: 1) clean all the ssl configuration inherited from default_ctx (does not work in all cases; much time spent testing with the various openssl versions); 2) change the definition of default_ctx to the first parsed certificate without its configuration (only the bind configuration) - I don't want to do that, this is an unexpected behavior; 3) force the usage of one crt on the bind line to set the default cert (before any crt-list) - it breaks old configurations (and I don't want a default cert). Other? I tried, failed, and fixed it by introducing initial_ctx to normalise the behavior in a clean manner.

Ok, thanks for your explanations. I'll keep initial_ctx then. I'll quickly propose a patch to fix the certificates generation. -- Christopher Faulet
[PATCH] BUG/MEDIUM: ssl: Fix regression about certificates generation
Willy, here is the patch fixing the certificates generation. Thanks -- Christopher Faulet

>From 4cfdaa09b218d784e7b814f70981f35d1a7811df Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@qualys.com>
Date: Fri, 28 Jul 2017 16:56:09 +0200
Subject: [PATCH] BUG/MEDIUM: ssl: Fix regression about certificates generation

Since the commit f6b37c67 ["BUG/MEDIUM: ssl: in bind line, ssl-options after 'crt' are ignored."], the certificates generation is broken. To generate a certificate, we retrieved the private key of the default certificate using the SSL object. But since the commit f6b37c67, the SSL object is created with a dummy certificate (initial_ctx). So to fix the bug, we directly use the default certificate in the bind_conf structure. We use the SSL_CTX_get0_privatekey function to do so. Because this function does not exist for OpenSSL < 1.0.2 and for LibreSSL, it has been added in openssl-compat.h with the right #ifdef.
---
 include/proto/openssl-compat.h | 13 +++++++++++++
 src/ssl_sock.c                 |  4 ++--
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/proto/openssl-compat.h b/include/proto/openssl-compat.h
index ea92072e..9b671095 100644
--- a/include/proto/openssl-compat.h
+++ b/include/proto/openssl-compat.h
@@ -89,6 +89,19 @@ static inline int SSL_SESSION_set1_id_context(SSL_SESSION *s, const unsigned cha
 }
 #endif
 
+#if (OPENSSL_VERSION_NUMBER < 0x10002000L) || defined(LIBRESSL_VERSION_NUMBER)
+/*
+ * Functions introduced in OpenSSL 1.0.2 and not yet present in LibreSSL
+ */
+static inline EVP_PKEY *SSL_CTX_get0_privatekey(const SSL_CTX *ctx)
+{
+	if (ctx->cert != NULL)
+		return ctx->cert->key->privatekey;
+	else
+		return NULL;
+}
+#endif
+
 #if (OPENSSL_VERSION_NUMBER < 0x1010000fL) || defined(LIBRESSL_VERSION_NUMBER)
 /*
  * Functions introduced in OpenSSL 1.1.0 and not yet present in LibreSSL
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index b4d4e14f..d81dd70c 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -1586,8 +1586,8 @@ ssl_sock_do_create_cert(const char
*servername, struct bind_conf *bind_conf, SSL unsigned int i; int key_type; - /* Get the private key of the defautl certificate and use it */ - if (!(pkey = SSL_get_privatekey(ssl))) + /* Get the private key of the default certificate and use it */ + if (!(pkey = SSL_CTX_get0_privatekey(bind_conf->default_ctx))) goto mkcert_error; /* Create the certificate */ -- 2.13.3
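The `#if (OPENSSL_VERSION_NUMBER < 0x10002000L)` guard in the patch above works because OpenSSL (before 3.0) encodes its version as a `0xMNNFFPPS` hex number — major, minor, fix, patch, status — so version gates are plain integer comparisons. A small standalone check of that arithmetic (the constants below are the standard release numbers, reproduced here for illustration):

```c
#include <assert.h>

/* Pre-3.0 OpenSSL version encoding: 0xMNNFFPPS
 * M = major, NN = minor, FF = fix, PP = patch, S = status. */
#define OPENSSL_V_0_9_8   0x0090800fL  /* 0.9.8 release */
#define OPENSSL_V_1_0_2   0x10002000L  /* 1.0.2 base, as used in the #if guard */
#define OPENSSL_V_1_1_0   0x1010000fL  /* 1.1.0 release */

/* Nonzero when the library is too old to provide
 * SSL_CTX_get0_privatekey (introduced in 1.0.2),
 * i.e. when the compat fallback must be compiled in. */
static int needs_get0_privatekey_compat(long version)
{
	return version < OPENSSL_V_1_0_2;
}
```

This is why SuSE 11's OpenSSL 0.9.8 (`0x0090800fL`) selects the fallback branch while 1.1.0 does not.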
Re: [PATCH] MEDIUM: ssl: allow haproxy to start without default certificate
On 28/07/2017 at 12:41, Emmanuel Hocdet wrote: A useless certificate should be provided with the haproxy configuration? It's definitely a workaround. It's legacy from pre-SNI. Not really. The default certificate is not useless. It is the certificate used when no other matches. Unless you use the "strict-sni" option, it should be chosen carefully, because everyone can fall back on it. And by default, "strict-sni" is disabled. But here the question is not whether having a default certificate is a good or a bad thing, but whether an SSL listener without any certificate is an error or not. I think it should be an error by default, because this is the current behavior and some users may want to rely on it to detect problems. So if we want to support it, it should be done using a keyword. Then, I agree that the feature could be nice, especially, as Willy said, if we add a way to load certificates on the fly. But I still think that having a listener rejecting all connections is a bit clumsy. It would be far better to parse its configuration but disable it. However, I don't see an easy way to do this automatically :) So, for now, maybe having a keyword to let HAProxy start is a good compromise. -- Christopher
Re: Seeing server termination_state SD after updating from 1.6.11 to 1.7.5
On 06/07/2017 at 18:56, Lukas Tribus wrote: Hi Christopher, On 30.06.2017 at 11:14 Christopher Faulet wrote: We are seeing this as well on 1.7.5. The problem seems to be intermittent--it doesn't happen very often when I hit a system with almost no load, but is happening very frequently on a high-load system. I don't believe it is causing any issues other than the log message. Hi, I already proposed a patch to fix this bug in another thread[1]. I have had no feedback yet. Could you check it please? [1] https://www.mail-archive.com/haproxy@formilux.org/msg26543.html Simon Ellefsene also reported this on discourse: https://discourse.haproxy.org/t/session-state-sdnn-http-status-code-200-when-upgrading-to-1-7/1409/3 Your patch seems to break responses for him (it seems only headers are returned): A request that would normally return around 25kb now returns ~300 bytes (just the headers, empty body), status code 200. Hi Lukas, Thanks. The proposed patch is wrong, I know. In fact, the original bug is harder to fix than I first supposed. I need more time to fully understand it. I know where the problem is, but for now, all my attempts to fix it break some other parts. It is a real mess and I need to check everything with many options (forceclose / http-server-close ...). I was so busy with this bug that I forgot to warn everyone. Thanks for keeping me informed :) I'll do my best to propose a patch very quickly. Then I'll probably need help to validate it on real traffic. Thanks again, -- Christopher Faulet
Re: 2x filter + keep-alive regressions (1.7 affected)
On 30/06/2017 at 22:09, Lukas Tribus wrote: Hello Christopher, William, Willy, et al.! Matt McDonagh reported a regression on discourse [1] in 1.7.6 that causes haproxy to ignore "timeout http-keep-alive" when going through filters (i.e. when compression is enabled) and also causes logging to be delayed. Because timeouts are ignored and the wrong timeouts strike, performance issues are likely (as we don't close idle sessions as expected, we hit maxconn sooner, etc.). The root cause is actually 2 different patches in 1.7 stable. This is getting a bit complex here, because in 1.7 both patches cause regressions, while in 1.8 only one of those 2 patches actually causes an issue. We are talking about:

2b553de BUG/MINOR: filters: Don't force the stream's wakeup when we wait in flt_end_analyze
c0c672a BUG/MINOR: http: Fix conditions to clean up a txn and to handle the next request

Repro config (I used netcat for a HTTP/1.1 request):

global
    maxconn 100
    log 10.0.0.4 syslog debug

defaults
    log global
    mode http
    option httplog
    option http-server-close
    #option http-keep-alive
    timeout connect 5s
    timeout client 10s
    timeout server 20s
    timeout http-keep-alive 1s

frontend myfrontend
    bind :80
    default_backend mybackend

backend mybackend
    compression algo gzip
    server www www.lan.ltri.eu:80

In v1.7.4 (everything fine): "mode http-server-close": as expected (logs immediately, closes idle keep-alive at "timeout http-keep-alive"). "mode http-keep-alive": as expected (logs immediately, closes idle keep-alive at "timeout http-keep-alive"). In v1.7.5 (due to b0c3fd3b BUG/MINOR: filters: Don't force the stream's wakeup when we wait in flt_end_analyze): "mode http-server-close": log stalled and keep-alive idle close at "timeout connect"; ignores "timeout http-keep-alive". "mode http-keep-alive": log stalled and keep-alive idle close at "timeout connect + client"; ignores "timeout http-keep-alive". In v1.7.6 (due to 73d071e BUG/MINOR: http: Fix conditions to clean up a txn and to handle the next request):
"mode http-server-close": log stalled and keep-alive idle close at "timeout client"; ignores "timeout http-keep-alive". "mode http-keep-alive": log stalled for "timeout server"; keep-alive stalled for "timeout server (then log) + timeout http-keep-alive". As for 1.8: patches have been applied as per Author-Date, while in 1.7 the patches have been applied in reverse order. "BUG/MINOR: http: Fix conditions to clean up a txn and to handle the next request" -> does not cause issues in 1.8. "BUG/MINOR: filters: Don't force the stream's wakeup when we wait in flt_end_analyze" -> has the same effect as in 1.7(.6). The 1.6/1.5 situation is unknown at this time. Let me know if something is unclear (I'm almost certain it is) ... cheers, lukas [1] http://discourse.haproxy.org/t/keep-alive-behaviour-change-since-1-7-6/1390

Hi guys, the attached patches should fix this bug. The real fix is in the last one, but all three must be backported to 1.7. I ran tests with Lukas's config and the http-keep-alive timeout is now respected. But because filters seem to have generated many bugs these last weeks, a double check is welcome :) Many thanks Lukas. -- Christopher Faulet

>From 6b6d17a120b1af9500d530403963adb4cf43d59b Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Thu, 6 Jul 2017 15:49:30 +0200
Subject: [PATCH 1/3] BUG/MINOR: stream: Don't forget to remove CF_WAKE_ONCE flag on response channel

This flag can be set on a channel to pretend there is activity on it. This is a way to wake up the corresponding stream and evaluate stream analyzers on the channel. It is correctly handled on both channels but removed only on the request channel. This patch is flagged as a bug but for now, CF_WAKE_ONCE is never set on the response channel.
--- src/stream.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/stream.c b/src/stream.c index 4e34f38..1aa5475 100644 --- a/src/stream.c +++ b/src/stream.c @@ -1881,6 +1881,7 @@ struct task *process_stream(struct task *t) rp_cons_last = si_f->state; rp_prod_last = si_b->state; + res->flags &= ~CF_WAKE_ONCE; rpf_last = res->flags; if ((res->flags ^ flags) & CF_MASK_STATIC) -- 2.9.4 >From 8868eedf682b978082c46179faffa49919735d61 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Thu, 6 Jul 2017 15:51:35 +0200 Subject: [PATCH 2/3] BUG/MINOR: http: Don't reset the transaction if there are still data to send To reset an HTTP transaction, we need to be sure all data were sent, for the request and the response. There are tests on request and response buffers for that in http_resync_states function. But the return
Re: Seeing server termination_state SD after updating from 1.6.11 to 1.7.5
On 19/07/2017 at 22:18, Christopher Faulet wrote: Because these last weeks there were several regressions on this part (the end of the HTTP transaction), I prefer to be careful this time. Every time I fixed a bug on one side, it broke something else on another one... And because I said that, it happened: I introduced a major bug. It is possible to have an infinite loop during states synchronization... Willy, I produced the patch for the upstream: 0001-BUG-MAJOR-http-Fix-possible-infinity-loop-in-http_sy.patch I also modified the patch for HAProxy 1.7. Sorry guys for this very silly bug! -- Christopher

>From 66cfddf44ef95ae6504910739711984ab5d89239 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Tue, 18 Jul 2017 11:18:46 +0200
Subject: [PATCH] BUG: Fix bug about wrong termination state on some requests introduced in 1.7.5

This patch contains the following fixes coming from upstream.

- * - BUG/MAJOR: http: Fix possible infinity loop in http_sync_(req|res)_state

With commit "MINOR: http: Switch requests/responses in TUNNEL mode only by checking txn flags", it is possible to have an infinite loop on the HTTP_MSG_CLOSING state.

- * - MINOR: http: Reorder/rewrite checks in http_resync_states

The previous patch removed the forced symmetry of the TUNNEL mode during the state synchronization. Here, we take care to remove the body analyzer only on the channel in TUNNEL mode. In fact, today, this change has no effect because both sides are switched at the same time. But this way, with some changes, it will be possible to keep a body analyzer on one side (to finish the states synchronization) with the other one in TUNNEL mode.

WARNING: This patch will be used to fix a bug. The fix will be committed in the next commit. So if the fix is backported, this one must be backported too.
- * - MINOR: http: Switch requests/responses in TUNNEL mode only by checking txn flags

Today, the only way to have a request or a response in the HTTP_MSG_TUNNEL state is to have the flag TX_CON_WANT_TUN set on the transaction. So this is a symmetric state: both the request and the response are switched to it at the same time. This can therefore be done by checking transaction flags alone instead of relying on the other side's state. This is the purpose of this patch. This way, if for any reason we need to switch only one side to TUNNEL mode, it will be possible. And to prepare asymmetric cases, we check channel flags in the DONE _AND_ TUNNEL states.

WARNING: This patch will be used to fix a bug. The fix will be committed in the next commit. So if the fix is backported, this one must be backported too.

- * - BUG/MEDIUM: http: Switch HTTP responses in TUNNEL mode when body length is undefined

When the body length of an HTTP response is undefined, the HTTP parser is blocked in the body parsing. Before HAProxy 1.7, in this case, because AN_RES_HTTP_XFER_BODY is never set, there is no visible effect. When the server closes its connection to terminate the response, HAProxy catches it as a normal closure. Since 1.7, we always set this analyzer to enter at least once in http_response_forward_body. But, in the present case, when the server connection is closed, http_response_forward_body is called one time too many. The response is correctly sent to the client, but an error is caught and logged with "SD--" flags. To reproduce the bug, you can use the configuration "tests/test-fsm.cfg". Tests 3 and 21 hit the bug. The idea to fix the bug is to switch the response to TUNNEL mode without switching the request. This is possible thanks to the previous patches. First, we need to detect responses with an undefined body length during states synchronization. Excluding tunnelled transactions, when the response length is undefined, TX_CON_WANT_CLO is always set on the transaction.
So, when states are synchronized, if TX_CON_WANT_CLO is set, the response is switched in TUNNEL mode and the request remains unchanged. Then, in http_msg_forward_body, we add a specific check to switch the response in DONE mode if the body length is undefined and if there is no data filter. This patch depends on following previous commits: * MINOR: http: Switch requests/responses in TUNNEL mode only by checking txn flags * MINOR: http: Reorder/rewrite checks in http_resync_states This patch must be backported in 1.7 with 2 previous ones. --- src/proto_http.c | 154 --- 1 file changed, 78 insertions(+), 76 deletions(-) diff --git a/src/proto_http.c b/src/proto_http.c index 94c8d639..1a36dc6f 100644 --- a/src/proto_http.c +++ b/src/proto_http.c @@ -5294,7 +5294,7 @@ int http_sync_req_state(struct stream *s) unsigned int old_flags = chn->flags; unsigned int old_state = txn->req.msg_state; - if (unlikely(txn->req.msg_state < HTTP_MSG_BODY)) + if (unlikely(txn->req.msg_state < HTTP_MSG_DONE))
[PATCH] Handle SMP_T_METH samples in smp_dup/smp_is_safe/smp_is_rw
Willy, here are some small patches with minor changes to samples. -- Christopher Faulet >From 364139ba3764294acbad413a4cdde94a6ea1289b Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Mon, 24 Jul 2017 16:24:39 +0200 Subject: [PATCH 3/3] MINOR: samples: Don't allocate memory for SMP_T_METH sample when method is known For known methods (GET, POST...), in samples, an enum is used instead of a chunk to reference the method. So there is no need to allocate memory when a variable is stored with this kind of sample. --- src/vars.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/vars.c b/src/vars.c index e5448e52..8cc08399 100644 --- a/src/vars.c +++ b/src/vars.c @@ -95,7 +95,7 @@ unsigned int var_clear(struct var *var) free(var->data.u.str.str); size += var->data.u.str.len; } - else if (var->data.type == SMP_T_METH) { + else if (var->data.type == SMP_T_METH && var->data.u.meth.meth == HTTP_METH_OTHER) { free(var->data.u.meth.str.str); size += var->data.u.meth.str.len; } @@ -309,7 +309,7 @@ static int sample_store(struct vars *vars, const char *name, struct sample *smp) free(var->data.u.str.str); var_accounting_diff(vars, smp->sess, smp->strm, -var->data.u.str.len); } - else if (var->data.type == SMP_T_METH) { + else if (var->data.type == SMP_T_METH && var->data.u.meth.meth == HTTP_METH_OTHER) { free(var->data.u.meth.str.str); var_accounting_diff(vars, smp->sess, smp->strm, -var->data.u.meth.str.len); } @@ -358,6 +358,10 @@ static int sample_store(struct vars *vars, const char *name, struct sample *smp) memcpy(var->data.u.str.str, smp->data.u.str.str, var->data.u.str.len); break; case SMP_T_METH: + var->data.u.meth.meth = smp->data.u.meth.meth; + if (smp->data.u.meth.meth != HTTP_METH_OTHER) + break; + if (!var_accounting_add(vars, smp->sess, smp->strm, smp->data.u.meth.str.len)) { var->data.type = SMP_T_BOOL; /* This type doesn't use additional memory. 
*/ return 0; @@ -368,7 +372,6 @@ static int sample_store(struct vars *vars, const char *name, struct sample *smp) var->data.type = SMP_T_BOOL; /* This type doesn't use additional memory. */ return 0; } - var->data.u.meth.meth = smp->data.u.meth.meth; var->data.u.meth.str.len = smp->data.u.meth.str.len; var->data.u.meth.str.size = smp->data.u.meth.str.len; memcpy(var->data.u.meth.str.str, smp->data.u.meth.str.str, var->data.u.meth.str.len); -- 2.13.3 >From 8d1d40f9d3a86fdc52f88b10320419f7e7decb45 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Mon, 24 Jul 2017 16:07:12 +0200 Subject: [PATCH 2/3] MINOR: samples: Handle the type SMP_T_METH in smp_is_safe and smp_is_rw For all known methods, samples are considered as safe and rewritable. For unknowns, we handle them like strings (SMP_T_STR). --- include/proto/sample.h | 10 ++ 1 file changed, 10 insertions(+) diff --git a/include/proto/sample.h b/include/proto/sample.h index 4319278a..94226d2d 100644 --- a/include/proto/sample.h +++ b/include/proto/sample.h @@ -86,6 +86,11 @@ static inline int smp_is_safe(struct sample *smp) { switch (smp->data.type) { + case SMP_T_METH: + if (smp->data.u.meth.meth != HTTP_METH_OTHER) + return 1; + /* Fall through */ + case SMP_T_STR: if ((smp->data.u.str.len < 0) || (smp->data.u.str.size && smp->data.u.str.len >= smp->data.u.str.size)) @@ -133,6 +138,11 @@ int smp_is_rw(struct sample *smp) return 0; switch (smp->data.type) { + case SMP_T_METH: + if (smp->data.u.meth.meth != HTTP_METH_OTHER) + return 1; + /* Fall through */ + case SMP_T_STR: if (!smp->data.u.str.size || smp->data.u.str.len < 0 || -- 2.13.3 >From b3a215635168a6f97d461bb8365b8a2daed531c6 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Mon, 24 Jul 2017 15:38:41 +0200 Subject: [PATCH 1/3] MINOR: samples: Handle the type SMP_T_METH when we duplicate a sample in smp_dup First, the type SMP_T_METH was not handled by smp_dup function. 
It was never called with this kind of sample, so it's not really a problem. But this could be useful in the future. For all known HTTP methods (GET, POST...), there is no extra space allocated for a sample of type SMP_T_METH. But for unknown methods, it uses a chunk. So, like for strings, we duplicate data, using a trash chunk. --- src/sample.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/sample.c b/src/sample.c index 20a59bea..28a5fcb2 100644 --- a/src/sample.c +++ b/src/sample.c @@ -658,6 +658,11 @@ int smp_dup(struct sample *smp) /* These type are not const. */ break; + case SMP_T_METH: + if (smp->data.u.meth.meth != HTTP_METH_OTHER) + break; + /* Fall through */ + case SMP_T_STR: tr
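The common idea behind these three patches — a sample of type SMP_T_METH owns heap memory only when the method is HTTP_METH_OTHER, otherwise it is just an enum — can be sketched with a simplified stand-alone model (the struct and function names below are illustrative, not HAProxy's real `struct sample` or `smp_dup`):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum meth { METH_GET, METH_POST, METH_OTHER };  /* reduced set */

struct meth_smp {
	enum meth meth;
	char *str;  /* owned only when meth == METH_OTHER */
};

/* Duplicate a method sample: known methods are just an enum and
 * need no allocation; unknown methods get their own string copy,
 * mirroring what smp_dup does with a trash chunk. Returns 1 on
 * success, 0 on allocation failure. */
static int meth_smp_dup(struct meth_smp *dst, const struct meth_smp *src)
{
	dst->meth = src->meth;
	dst->str  = NULL;
	if (src->meth != METH_OTHER)
		return 1;  /* nothing else to copy */
	dst->str = malloc(strlen(src->str) + 1);
	if (!dst->str)
		return 0;
	strcpy(dst->str, src->str);
	return 1;
}

/* Free only what the sample actually owns. */
static void meth_smp_free(struct meth_smp *s)
{
	if (s->meth == METH_OTHER)
		free(s->str);
	s->str = NULL;
}
```

The same ownership test (`meth == METH_OTHER` before `free`) is what the var_clear/sample_store hunks above add.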
Re: [PATCH] MEDIUM: ssl: allow haproxy to start without default certificate
On 27/07/2017 at 18:16, Emmanuel Hocdet wrote: Hi Willy, Emeric, can you consider this patch? I think it's safe and it does not depend on any OpenSSL version. (It's possible since patch f6b37c67.) Hi Manu, Since this commit, the certificates generation doesn't work anymore. I'm working on a patch; I'll send it today, I guess. To work, the certificates generation uses the default certificate to get its private key, and uses it for the generated certificates. Generating keys is pretty expensive, and the one from the default certificate is no worse than another. Since commit f6b37c67, when certificates generation is performed, instead of the default certificate (default_ctx), the SSL connection is attached to the initial one (initial_ctx). The latter does not have a private key, so the generation always fails. My first idea to fix this was to remove initial_ctx, because from checking all the recent changes it seems useless: its initialization is done at the same time as default_ctx (that was not the case when initial_ctx was introduced in commit f6b37c67). But it is definitely in conflict with your current patch, because without initial_ctx we need to have a default_ctx. So I can probably work around this problem. But before doing it, I prefer to know whether your patch will be accepted or not :) From my point of view, I can't see why anyone would want to start an SSL frontend without any certificate, because it will reject all connections. Simos explained his use-case with Letsencrypt. But it is only useful for the first generation of the first certificate. It is easy to comment out the SSL frontend at this step. Getting a certificate from Letsencrypt, or generating a default one by hand, is quick and trivial. To automate this part, you can have a default certificate, probably self-signed, and the strict-sni parameter on the bind line (you already proposed this solution). For me, this is not a workaround, this is the way to do it.
But that's just my opinion, I can understand the need :) So I'll let Willy and/or Emeric make the final decision. -- Christopher Faulet
Re: Passing SNI value ( ssl_fc_sni ) to backend's verifyhost.
On 25/07/2017 at 19:37, Kevin McArthur wrote: Hi Willy, I can't replicate your results here. I cloned from git and built the package with the debian/ubuntu build scripts from https://launchpad.net/~vbernat/+archive/ubuntu/haproxy-1.7 ... updating the changelog to add a 1.8-dev2 version and calling ./debian/rules binary to build the package. The git log shows:

commit 2ab88675ecbf960a6f33ffe9c6a27f264150b201
Author: Willy Tarreau <w...@1wt.eu>
Date: Wed Jul 5 18:23:03 2017 +0200
MINOR: ssl: compare server certificate names to the SNI on outgoing connections

Hi, There is a bug in this commit. I checked with openssl 1.0.2l and 1.1.0f and I observed the same behavior as Kevin. SSL_SESSION_get0_hostname seems to always return NULL when the server returns a default certificate. I tried to explain why in my commit log. Kevin, could you check the attached patch to confirm it works? -- Christopher Faulet

>From afe2d426c6aeb82aa11af842e8f75f54a2d9130d Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Wed, 26 Jul 2017 11:50:01 +0200
Subject: [PATCH] BUG/MINOR: ssl: Fix check against SNI during server certificate verification

This patch fixes the commit 2ab8867 ("MINOR: ssl: compare server certificate names to the SNI on outgoing connections"). When we check the certificate sent by a server, in the verify callback, we get the SNI from the session (SSL_SESSION object). In OpenSSL, the tlsext_hostname value for this session is copied from the SSL connection (SSL object). But the copy is done only if the "server_name" extension is found in the server hello message, which means the server has found a certificate matching the client's SNI. When the server returns a default certificate not matching the client's SNI, it doesn't set any "server_name" extension in the server hello message. So no SNI is set on the SSL session and SSL_SESSION_get0_hostname always returns NULL. To fix the problem, we get the SNI directly from the SSL connection.
It is always defined with the value set by the client. If the commit 2ab8867 is backported in 1.7 and/or 1.8, this one must be backported too. --- include/proto/openssl-compat.h | 5 - src/ssl_sock.c | 6 +- 2 files changed, 1 insertion(+), 10 deletions(-) diff --git a/include/proto/openssl-compat.h b/include/proto/openssl-compat.h index a1e75b47..ea92072e 100644 --- a/include/proto/openssl-compat.h +++ b/include/proto/openssl-compat.h @@ -94,11 +94,6 @@ static inline int SSL_SESSION_set1_id_context(SSL_SESSION *s, const unsigned cha * Functions introduced in OpenSSL 1.1.0 and not yet present in LibreSSL */ -static inline const char *SSL_SESSION_get0_hostname(const SSL_SESSION *sess) -{ - return sess->tlsext_hostname; -} - static inline const unsigned char *SSL_SESSION_get0_id_context(const SSL_SESSION *sess, unsigned int *sid_ctx_length) { *sid_ctx_length = sess->sid_ctx_length; diff --git a/src/ssl_sock.c b/src/ssl_sock.c index fa815715..42d27de9 100644 --- a/src/ssl_sock.c +++ b/src/ssl_sock.c @@ -3951,11 +3951,7 @@ static int ssl_sock_srv_verifycbk(int ok, X509_STORE_CTX *ctx) */ servername = objt_server(conn->target)->ssl_ctx.verify_host; if (!servername) { - SSL_SESSION *ssl_sess = SSL_get_session(conn->xprt_ctx); - if (!ssl_sess) - return ok; - - servername = SSL_SESSION_get0_hostname(ssl_sess); + servername = SSL_get_servername(conn->xprt_ctx, TLSEXT_NAMETYPE_host_name); if (!servername) return ok; } -- 2.13.3
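The fix boils down to a lookup-order rule for the name checked against the server certificate, which can be modeled in a few lines. This is a simplified sketch, not HAProxy's actual code: the real callback reads `verify_host` from the server's ssl_ctx and the SNI via `SSL_get_servername()`:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Order fixed by the patch: prefer the configured verifyhost,
 * otherwise the SNI stored on the SSL connection (which always
 * carries the client's value). The session hostname is never
 * consulted, since it is NULL whenever the server answered with a
 * default certificate. A NULL result means the caller skips the
 * name check entirely. */
static const char *name_to_verify(const char *verify_host,
                                  const char *conn_sni)
{
	return verify_host ? verify_host : conn_sni;
}
```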
Re: Seeing server termination_state SD after updating from 1.6.11 to 1.7.5
On 06/07/2017 at 21:18, Christopher Faulet wrote: On 06/07/2017 at 18:56, Lukas Tribus wrote: Hi Christopher, On 30.06.2017 at 11:14 Christopher Faulet wrote: We are seeing this as well on 1.7.5. The problem seems to be intermittent--it doesn't happen very often when I hit a system with almost no load, but is happening very frequently on a high-load system. I don't believe it is causing any issues other than the log message. Hi, I already proposed a patch to fix this bug in another thread[1]. I have had no feedback yet. Could you check it please? [1] https://www.mail-archive.com/haproxy@formilux.org/msg26543.html Simon Ellefsene also reported this on discourse: https://discourse.haproxy.org/t/session-state-sdnn-http-status-code-200-when-upgrading-to-1-7/1409/3 Your patch seems to break responses for him (it seems only headers are returned): A request that would normally return around 25kb now returns ~300 bytes (just the headers, empty body), status code 200. Hi Lukas, Thanks. The proposed patch is wrong, I know. In fact, the original bug is harder to fix than I first supposed. I need more time to fully understand it. I know where the problem is, but for now, all my attempts to fix it break some other parts. It is a real mess and I need to check everything with many options (forceclose / http-server-close ...). I was so busy with this bug that I forgot to warn everyone. Thanks for keeping me informed :) I'll do my best to propose a patch very quickly. Then I'll probably need help to validate it on real traffic. Hi all, We finally pushed fixes for this bug in 1.8. I attached a file with all the changes in a single patch for 1.7. It would be nice if anyone could check that it does not break anything. And for the guys who reported the bug, it would be good to know if it is fixed with this patch. Because these last weeks there were several regressions on this part (the end of the HTTP transaction), I prefer to be careful this time.
Every time I fixed a bug on one side, it broke something else on another one... Thanks -- Christopher Faulet

>From 8748ee8fe994a3d7facd21ec39d209e894c14510 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Tue, 18 Jul 2017 11:18:46 +0200
Subject: [PATCH] BUG: Fix bug about wrong termination state on some requests introduced in 1.7.5

This patch contains the following fixes coming from upstream.

- * - MINOR: http: Reorder/rewrite checks in http_resync_states

The previous patch removed the forced symmetry of the TUNNEL mode during the state synchronization. Here, we take care to remove the body analyzer only on the channel in TUNNEL mode. In fact, today, this change has no effect because both sides are switched at the same time. But this way, with some changes, it will be possible to keep a body analyzer on one side (to finish the states synchronization) with the other one in TUNNEL mode.

WARNING: This patch will be used to fix a bug. The fix will be committed in the next commit. So if the fix is backported, this one must be backported too.

- * - MINOR: http: Switch requests/responses in TUNNEL mode only by checking txn flags

Today, the only way to have a request or a response in the HTTP_MSG_TUNNEL state is to have the flag TX_CON_WANT_TUN set on the transaction. So this is a symmetric state: both the request and the response are switched to it at the same time. This can therefore be done by checking transaction flags alone instead of relying on the other side's state. This is the purpose of this patch. This way, if for any reason we need to switch only one side to TUNNEL mode, it will be possible. And to prepare asymmetric cases, we check channel flags in the DONE _AND_ TUNNEL states.

WARNING: This patch will be used to fix a bug. The fix will be committed in the next commit. So if the fix is backported, this one must be backported too.
- * - BUG/MEDIUM: http: Switch HTTP responses in TUNNEL mode when body length is undefined

When the body length of an HTTP response is undefined, the HTTP parser is blocked in the body parsing. Before HAProxy 1.7, in this case, because AN_RES_HTTP_XFER_BODY is never set, there is no visible effect. When the server closes its connection to terminate the response, HAProxy catches it as a normal closure. Since 1.7, we always set this analyzer to enter at least once in http_response_forward_body. But, in the present case, when the server connection is closed, http_response_forward_body is called one time too many. The response is correctly sent to the client, but an error is caught and logged with "SD--" flags. To reproduce the bug, you can use the configuration "tests/test-fsm.cfg". Tests 3 and 21 hit the bug. The idea to fix the bug is to switch the response to TUNNEL mode without switching the request. This is possible thanks to the previous patches. First, we need to detect responses with an undefined body length during stat
Re: Seeing server termination_state SD after updating from 1.6.11 to 1.7.5
On 29/06/2017 at 14:52, Brian Loss wrote: On Sat, Jun 03, 2017 at 00:19:19, Willy Tarreau wrote: On Fri, Jun 02, 2017 at 06:17:38PM, Bernard McCormack wrote: > I've done some more testing. After changing the frontend nginx to use 'proxy_http_version 1.1' the errors stopped occurring. I wonder if it has to do with how haproxy handles http/1.0 requests? No, because there is no fundamental difference between the two. However, I suspect that when you force nginx to 1.0 and it doesn't know the response, it only uses close mode, which triggers what looks like an haproxy bug. But when you enable 1.1, it uses chunked encoding and there's no such issue anymore. I'll have a look. Cheers, Willy We are seeing this as well on 1.7.5. The problem seems to be intermittent--it doesn't happen very often when I hit a system with almost no load, but is happening very frequently on a high-load system. I don't believe it is causing any issues other than the log message. Hi, I already proposed a patch to fix this bug in another thread[1]. I have had no feedback yet. Could you check it please? [1] https://www.mail-archive.com/haproxy@formilux.org/msg26543.html -- Christopher Faulet

diff --git a/src/proto_http.c b/src/proto_http.c
index e5f67e5..521743a 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -6958,14 +6958,6 @@ int http_response_forward_body(struct stream *s, struct channel *res, int an_bit
 	if ((msg->flags & HTTP_MSGF_TE_CHNK) || (msg->flags & HTTP_MSGF_COMPRESSING))
 		res->flags |= CF_EXPECT_MORE;
 
-	/* If there is neither content-length, nor transfer-encoding header
-	 * _AND_ there is no data filtering, we can safely forward all data
-	 * indefinitely.
*/ - if (!(msg->flags & HTTP_MSGF_XFER_LEN) && !HAS_DATA_FILTERS(s, res)) { - buffer_flush(res->buf); - channel_forward_forever(res); - } - /* the stream handler will take care of timeouts and errors */ return 0; @@ -7042,9 +7034,20 @@ http_msg_forward_body(struct stream *s, struct http_msg *msg) goto missing_data_or_waiting; } - /* The server still sending data that should be filtered */ - if (!(msg->flags & HTTP_MSGF_XFER_LEN) && !(chn->flags & CF_SHUTR)) - goto missing_data_or_waiting; + /* This check can only be true for a response. HTTP_MSGF_XFER_LEN is + * always set for a request. */ + if (!(msg->flags & HTTP_MSGF_XFER_LEN)) { + /* The server still sending data that should be filtered */ + if (!(chn->flags & CF_SHUTR) && HAS_DATA_FILTERS(s, chn)) + goto missing_data_or_waiting; + /* There is no data filtering (or no more data to filter) but at + * least one active filter on the stream. So force infinite + * forwarding here. */ + else if (HAS_FILTERS(s)) { + buffer_flush(chn->buf); + channel_forward_forever(chn); + } + } msg->msg_state = HTTP_MSG_ENDING;
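The undefined-body-length branch added to `http_msg_forward_body` above can be reduced to a small decision function over the flags involved. This is a simplified model: in HAProxy, `HAS_DATA_FILTERS` and `HAS_FILTERS` are macros over the stream and channel, not channel flags; here everything is folded into one bitmask for illustration:

```c
#include <assert.h>

#define HTTP_MSGF_XFER_LEN  0x01U  /* body length is known */
#define CF_SHUTR            0x02U  /* read side of the channel is closed */
#define F_HAS_DATA_FILTERS  0x04U  /* a filter still inspects the payload */
#define F_HAS_FILTERS       0x08U  /* at least one filter is registered */

enum body_action { ACT_PROCEED, ACT_WAIT, ACT_FORWARD_FOREVER };

/* Mirrors the patched logic: with an undefined body length, keep
 * waiting while the server still sends and a data filter needs the
 * payload; once no data filter needs it, force infinite forwarding
 * if any filter remains registered; otherwise proceed normally. */
static enum body_action undefined_len_action(unsigned flags)
{
	if (flags & HTTP_MSGF_XFER_LEN)
		return ACT_PROCEED;  /* length known: this branch does not apply */
	if (!(flags & CF_SHUTR) && (flags & F_HAS_DATA_FILTERS))
		return ACT_WAIT;
	if (flags & F_HAS_FILTERS)
		return ACT_FORWARD_FOREVER;
	return ACT_PROCEED;
}
```

Note how the old code forwarded forever whenever there was no data filtering, while the patched logic only does so when a filter is actually registered, letting the plain case reach HTTP_MSG_ENDING normally.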
Re: 1.7.6 redirect regression (commit 73d071ecc84e0f26ebe1b9576fffc1ed0357ef32)
Le 21/06/2017 à 07:27, Jarno Huuskonen a écrit : Hi, 1.7.6 gives me errors (in log) with redirect rules. Example config that produces 503 errors in logs and curl -v complains: < HTTP/1.1 301 Moved Permanently < Content-length: 0 < Location: https://127.0.0.1:8080/ < * Excess found in a non pipelined read: excess = 212 url = / * (zero-length body) * Connection #0 to host 127.0.0.1 left intact Example config: frontend test bind ipv4@127.0.0.1:8080 redirect scheme https code 301 if { dst_port 8080 } # this also gives 503 #http-request redirect scheme https code 301 if { dst_port 8080 } default_backend test_be2 backend test_be2 server wp1 ip.add.re.ss:80 id 1 (I have quite a few redirect rules in prod. config and all seem to produce 503 in logs). git bisect gives this commit as "bad": commit 73d071ecc84e0f26ebe1b9576fffc1ed0357ef32 BUG/MINOR: http: Fix conditions to clean up a txn and to handle the next req If I revert this commit then the example config gives 301 in log and curl doesn't complain about read. -Jarno Hi Jarno, This bug was fixed in 1.8 (see commit 9f724edbd8d1cf595d4177c3612607f395b4380e "BUG/MEDIUM: http: Drop the connection establishment when a redirect is performed"). I attached the patch. Could you quickly check if it fixes your bug (it should do so) ? It was not backported in 1.7 because we thought it only affected the 1.8. I will check with Willy. Thanks. -- Christopher Faulet commit 9f724edbd8d1cf595d4177c3612607f395b4380e Author: Christopher Faulet <cfau...@haproxy.com> Date: Thu Apr 20 14:16:13 2017 +0200 BUG/MEDIUM: http: Drop the connection establishment when a redirect is performed This bug occurs when a redirect rule is applied during the request analysis on a persistent connection, on a proxy without any server. This means, in a frontend section or in a listen/backend section with no "server" line. Because the transaction processing is shortened, no server can be selected to perform the connection. 
So if we try to establish it, this fails and a 503 error is returned, while a 3XX was already sent. So, in this case, HAProxy generates 2 replies and only the first one is expected. Here is the configuration snippet to easily reproduce the problem: listen www bind :8080 mode http timeout connect 5s timeout client 3s timeout server 6s redirect location / A simple HTTP/1.1 request without body will trigger the bug: $ telnet 0 8080 Trying 0.0.0.0... Connected to 0. Escape character is '^]'. GET / HTTP/1.1 HTTP/1.1 302 Found Cache-Control: no-cache Content-length: 0 Location: / HTTP/1.0 503 Service Unavailable Cache-Control: no-cache Connection: close Content-Type: text/html 503 Service Unavailable No server is available to handle this request. Connection closed by foreign host. [wt: only 1.8-dev is impacted though the bug is present in older ones] diff --git a/src/proto_http.c b/src/proto_http.c index 24d034a..6c940f1 100644 --- a/src/proto_http.c +++ b/src/proto_http.c @@ -4261,6 +4261,8 @@ static int http_apply_redirect_rule(struct redirect_rule *rule, struct stream *s /* Trim any possible response */ res->chn->buf->i = 0; res->next = res->sov = 0; + /* If not already done, don't perform any connection establishment */ + channel_dont_connect(req->chn); } else { /* keep-alive not possible */ if (unlikely(txn->flags & TX_USE_PX_CONN)) {
Re: SD Termination state after upgrade from 1.5.12 to 1.7.3
Le 16/06/2017 à 16:19, Christopher Faulet a écrit : Le 16/06/2017 à 13:29, Juan Pablo Mora a écrit : Linux version: Red Hat Enterprise Linux Server release 5.11 (Tikanga) Linux dpoweb08 2.6.18-417.el5 #1 SMP Sat Nov 19 14:54:59 EST 2016 x86_64 x86_64 x86_64 GNU/Linux HAProxy version: 1.7.5 Summary: After upgrading to HAProxy 1.7.5, requests with "SD" status in logs have started to appear in one webservice. There have been no changes to the webservice we invoke. The log shows: Jun 16 12:41:06 localhost.lognet.pre.logalty.es haproxy[17315]: 172.31.2.70:59365 [16/Jun/2017:12:41:06.890] HTTP_INTERNO BUS/BUS1 1/0/0/65/75 200 225565 390 - - SD-- 12/2/0/0/0 0/0 {|Keep-Alive|174|} {|close|} "POST /lgt/lgtbus/rest/private/receiverServiceLight/getSignedBinaryContent HTTP/1.1" The same POST in HAProxy 1.5.12 ends with “ 200 “ and ““. Hi, Bernard McCormack has already reported the same issue. I'm on it. I approximately identified the problem but I need more time to fully understand it and to find the best way to fix it. The bug occurs when we are waiting for the end of the response to synchronize the request and the response. The close on the server side is detected as an error when there is no content-length nor transfer-encoding header. But it is a bit tricky because it is partly a timing issue. Depending on when the connection close is caught, an error is triggered or not. So I'm on it. Stay tuned. Hi guys, I worked on this problem. Could you check if the attached patch fixes your bug ?
Thanks, -- Christopher Faulet diff --git a/src/proto_http.c b/src/proto_http.c index e5f67e5..521743a 100644 --- a/src/proto_http.c +++ b/src/proto_http.c @@ -6958,14 +6958,6 @@ int http_response_forward_body(struct stream *s, struct channel *res, int an_bit if ((msg->flags & HTTP_MSGF_TE_CHNK) || (msg->flags & HTTP_MSGF_COMPRESSING)) res->flags |= CF_EXPECT_MORE; - /* If there is neither content-length, nor transfer-encoding header - * _AND_ there is no data filtering, we can safely forward all data - * indefinitely. */ - if (!(msg->flags & HTTP_MSGF_XFER_LEN) && !HAS_DATA_FILTERS(s, res)) { - buffer_flush(res->buf); - channel_forward_forever(res); - } - /* the stream handler will take care of timeouts and errors */ return 0; @@ -7042,9 +7034,20 @@ http_msg_forward_body(struct stream *s, struct http_msg *msg) goto missing_data_or_waiting; } - /* The server still sending data that should be filtered */ - if (!(msg->flags & HTTP_MSGF_XFER_LEN) && !(chn->flags & CF_SHUTR)) - goto missing_data_or_waiting; + /* This check can only be true for a response. HTTP_MSGF_XFER_LEN is + * always set for a request. */ + if (!(msg->flags & HTTP_MSGF_XFER_LEN)) { + /* The server still sending data that should be filtered */ + if (!(chn->flags & CF_SHUTR) && HAS_DATA_FILTERS(s, chn)) + goto missing_data_or_waiting; + /* There is no data filtering (or no more data to filter) but at + * least one active filter on the stream. So force infinite + * forwarding here. */ + else if (HAS_FILTERS(s)) { + buffer_flush(chn->buf); + channel_forward_forever(chn); + } + } msg->msg_state = HTTP_MSG_ENDING;
Re: haproxy does not capture the complete request header host sometimes
Le 13/06/2017 à 14:16, Christopher Faulet a écrit : Le 13/06/2017 à 10:31, siclesang a écrit : haproxy balances by host, but often captures a part of the request header host or null, and requests are balanced to the default server. How can I debug it? Hi, I'll try to help you. Can you share your configuration please ? It could help to find a potential bug. Could you also provide a tcpdump of a buggy request ? And finally, could you upgrade your HAProxy to the latest 1.6 version (1.6.12) to be sure ? Hi, Just for the record. After some exchanges in private with siclesang, we found the bug in the configuration parser, caused by a too-high value for tune.http.maxhdr. Here is the explanation: Well, I think I found the problem. This is not a bug (not really). There is something I missed in your configuration. You set tune.http.maxhdr to 64000. I guess you kept this parameter during all your tests. This is an invalid value. It needs to be in the range [1, 32767]. This is mandatory to avoid an integer overflow: the size of the array where header offsets are stored is a signed short. To be fair, there is no check on this value during the configuration parsing. And the documentation does not specify any range for this parameter. I will post a fix very quickly to avoid errors. BTW, this is a really huge value. The default one is 101. You can legitimately increase this value. But there is no reason to have 64000 headers in an HTTP message. IMHO, 1000/2000 is already a very high limit. I attached a patch to improve the configuration parsing and to update the documentation. It can be backported in 1.7, 1.6 and 1.5. I finally marked this patch as a bug fix.
Thanks siclesang for your help, -- Christopher Faulet >From 5b00b9eeda0838a2ab86835c9e3b28b503889b24 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Wed, 21 Jun 2017 16:31:35 +0200 Subject: [PATCH] BUG/MINOR: cfgparse: Check if tune.http.maxhdr is in the range 1..32767 We cannot store more than 32K headers in the structure hdr_idx, because internally we use signed short integers. To avoid any bugs (due to an integer overflow), a check has been added on tune.http.maxhdr to be sure to not set a value greater than 32767 or lower than 1 (because it is nonsense to set this parameter to a value <= 0). The documentation has been updated accordingly. This patch can be backported in 1.7, 1.6 and 1.5. --- doc/configuration.txt | 6 +++--- src/cfgparse.c | 8 +++- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/doc/configuration.txt b/doc/configuration.txt index 49bfd85..082b857 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -1374,9 +1374,9 @@ tune.http.maxhdr are blocked with "502 Bad Gateway". The default value is 101, which is enough for all usages, considering that the widely deployed Apache server uses the same limit. It can be useful to push this limit further to temporarily allow - a buggy application to work by the time it gets fixed. Keep in mind that each - new header consumes 32bits of memory for each session, so don't push this - limit too high. + a buggy application to work by the time it gets fixed. The accepted range is + 1..32767. Keep in mind that each new header consumes 32bits of memory for + each session, so don't push this limit too high.
tune.idletimer Sets the duration after which haproxy will consider that an empty buffer is diff --git a/src/cfgparse.c b/src/cfgparse.c index 261a0eb..3706bca 100644 --- a/src/cfgparse.c +++ b/src/cfgparse.c @@ -916,7 +916,13 @@ int cfg_parse_global(const char *file, int linenum, char **args, int kwm) err_code |= ERR_ALERT | ERR_FATAL; goto out; } - global.tune.max_http_hdr = atol(args[1]); + global.tune.max_http_hdr = atoi(args[1]); + if (global.tune.max_http_hdr < 1 || global.tune.max_http_hdr > 32767) { + Alert("parsing [%s:%d] : '%s' expects a numeric value between 1 and 32767\n", + file, linenum, args[0]); + err_code |= ERR_ALERT | ERR_FATAL; + goto out; + } } else if (!strcmp(args[0], "tune.comp.maxlevel")) { if (alertif_too_many_args(1, file, linenum, args, &err_code)) -- 2.9.4
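The overflow this patch guards against is easy to demonstrate: hdr_idx stores header offsets in signed shorts, so any tune.http.maxhdr above SHRT_MAX (32767) could wrap. A minimal sketch of the accepted range (maxhdr_valid is an illustrative helper mirroring the patch, not HAProxy code):

```c
#include <assert.h>
#include <limits.h>

/* Mirror of the range the patch enforces: 1..32767. Anything larger
 * would overflow the signed short offsets used by struct hdr_idx. */
static int maxhdr_valid(long v)
{
    return (v >= 1 && v <= SHRT_MAX);
}
```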
Re: Issue while using Proxy protocol in TCP mode
Le 13/06/2017 à 10:07, Vijay Bais a écrit : Hello, I am using HAProxy version 1.5-dev25-a339395. Hi, This is a *very* old version and not a stable one. Please, first of all, upgrade your HAProxy to a stable version. For HAProxy 1.5, the last stable version is 1.5.19. I am facing issues while using the proxy protocol to preserve the source IP address despite reverse proxies, ref: https://www.haproxy.com/blog/preserve-source-ip-address-despite-reverse-proxies/ The haproxy log of the load balancer shows the below message intermittently: "Received something which does not look like a PROXY protocol header" This breaks my application often, but it works fine when the proxy protocol options are removed. Any help would be great. Thanks, Vijay B -- Christopher Faulet
Re: haproxy does not capture the complete request header host sometimes
Le 13/06/2017 à 10:31, siclesang a écrit : haproxy balances by host, but often captures a part of the request header host or null, and requests are balanced to the default server. How can I debug it? Hi, I'll try to help you. Can you share your configuration please ? It could help to find a potential bug. Could you also provide a tcpdump of a buggy request ? And finally, could you upgrade your HAProxy to the latest 1.6 version (1.6.12) to be sure ? -- Christopher
[PATCH] BUG/MINOR: acls: Set the right refflag when patterns are, loaded from a map
Hi, Here is a little patch to fix a little bug :) Thanks -- Christopher Faulet >From e11c7f0ffe159f1e77c2c2568dd5f217f67327ee Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Wed, 14 Jun 2017 14:41:33 +0200 Subject: [PATCH] BUG/MINOR: acls: Set the right refflag when patterns are loaded from a map For an ACL, we can load patterns from a map using the flag -M. For example: acl test hdr(host) -M -f hosts.map The file is parsed as a map and the ACL will be executed as expected. But the reference flag is wrong. It is set to PAT_REF_ACL. So the map will never be listed by a "show map" on the stat socket. Setting the reference flag to PAT_REF_ACL|PAT_REF_MAP fixes the bug. --- src/acl.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/acl.c b/src/acl.c index da62e6c..9b67a61 100644 --- a/src/acl.c +++ b/src/acl.c @@ -140,7 +140,7 @@ struct acl_expr *parse_acl_expr(const char **args, char **err, struct arg_list * __label__ out_return, out_free_expr; struct acl_expr *expr; struct acl_keyword *aclkw; - int patflags; + int refflags, patflags; const char *arg; struct sample_expr *smp = NULL; int idx = 0; @@ -441,6 +441,7 @@ struct acl_expr *parse_acl_expr(const char **args, char **err, struct arg_list * * -u : force the unique id of the acl * -- : everything after this is not an option */ + refflags = PAT_REF_ACL; patflags = 0; is_loaded = 0; unique_id = -1; @@ -470,7 +471,7 @@ struct acl_expr *parse_acl_expr(const char **args, char **err, struct arg_list * goto out_free_expr; } - if (!pattern_read_from_file(&expr->pat, PAT_REF_ACL, args[1], patflags, load_as_map, err, file, line)) + if (!pattern_read_from_file(&expr->pat, refflags, args[1], patflags, load_as_map, err, file, line)) goto out_free_expr; is_loaded = 1; args++; @@ -503,6 +504,7 @@ struct acl_expr *parse_acl_expr(const char **args, char **err, struct arg_list * args++; } else if (strcmp(*args, "-M") == 0) { + refflags |= PAT_REF_MAP; load_as_map = 1; } else if (strcmp(*args,
"--") == 0) { -- 2.9.4
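The effect of the one-line -M change can be sketched as a flag computation. The PAT_REF_* values below are illustrative stand-ins, not HAProxy's real constants:

```c
#include <assert.h>

#define PAT_REF_MAP 0x1   /* illustrative values, not HAProxy's */
#define PAT_REF_ACL 0x2

/* Patterns loaded for an ACL are always tagged PAT_REF_ACL; with the
 * -M flag they must also carry PAT_REF_MAP so "show map" on the stat
 * socket can list the file. */
static unsigned int acl_refflags(int load_as_map)
{
    unsigned int refflags = PAT_REF_ACL;

    if (load_as_map)
        refflags |= PAT_REF_MAP;
    return refflags;
}
```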
Re: [PATCH] BUG/MINOR: http/filters: Be sure to wait if a filter loops in HTTP_MSG_ENDING
Le 14/06/2017 à 16:47, Willy Tarreau a écrit : On Wed, Jun 14, 2017 at 03:43:19PM +0200, Christopher Faulet wrote: A filter can choose to loop when a HTTP message is in the state HTTP_MSG_ENDING. But the transaction is terminated with an error if the input is closed (CF_SHUTR set on the channel). At this step, we have received all data, so we can wait. So now, we also check the parser state before leaving. This fix only affects configs that use a filter that can wait in http_forward_data or http_end callbacks, when all data were parsed. Applied, thanks. Do you think this one could explain the abnormal "SD" termination flags reported 2 weeks ago by Bernard McCormack ? Hum, after a quick look, I don't think so. He doesn't use any filter (like HTTP compression). So the HTTP parser cannot be in the HTTP_MSG_ENDING state. But, I will check his issue more carefully to be sure. -- Christopher Faulet
[PATCH] BUG/MINOR: ssl: Be sure that SSLv3 connection methods exist for openssl < 1.1.0
Hi, HAProxy compilation fails if OpenSSL 1.0.2 is compiled without the support of SSLv3 methods (SSL3_server_method and SSL3_client_method). The manpage SSL_CTX_new(3) specifies that these functions are available if OPENSSL_NO_SSL3_METHOD is undefined. Here is a fix. Thanks, -- Christopher Faulet >From f8d90c49944a64b153091a6f524dd22db26b8c80 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Thu, 8 Jun 2017 22:18:52 +0200 Subject: [PATCH] BUG/MINOR: ssl: Be sure that SSLv3 connection methods exist for openssl < 1.1.0 For openssl 1.0.2, SSLv3_server_method and SSLv3_client_method are undefined if OPENSSL_NO_SSL3_METHOD is set. So we must add a check on this macro before using these functions. --- src/ssl_sock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/ssl_sock.c b/src/ssl_sock.c index af09cfb..3680515 100644 --- a/src/ssl_sock.c +++ b/src/ssl_sock.c @@ -1835,7 +1835,7 @@ typedef enum { SET_CLIENT, SET_SERVER } set_context_func; static void ctx_set_SSLv3_func(SSL_CTX *ctx, set_context_func c) { -#if SSL_OP_NO_SSLv3 +#if SSL_OP_NO_SSLv3 && !defined(OPENSSL_NO_SSL3_METHOD) c == SET_SERVER ? SSL_CTX_set_ssl_version(ctx, SSLv3_server_method()) : SSL_CTX_set_ssl_version(ctx, SSLv3_client_method()); #endif -- 2.9.4
[PATCH] BUG/MINOR: http/filters: Be sure to wait if a filter loops in HTTP_MSG_ENDING
This one is about filters. Thanks -- Christopher Faulet >From 5358c71aa67a5fe21f29063bc7f837073ef8d20d Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Fri, 31 Mar 2017 15:37:29 +0200 Subject: [PATCH] BUG/MINOR: http/filters: Be sure to wait if a filter loops in HTTP_MSG_ENDING A filter can choose to loop when a HTTP message is in the state HTTP_MSG_ENDING. But the transaction is terminated with an error if the input is closed (CF_SHUTR set on the channel). At this step, we have received all data, so we can wait. So now, we also check the parser state before leaving. This fix only affects configs that use a filter that can wait in http_forward_data or http_end callbacks, when all data were parsed. --- src/proto_http.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/proto_http.c b/src/proto_http.c index a72f302..46cb6ff 100644 --- a/src/proto_http.c +++ b/src/proto_http.c @@ -5803,7 +5803,7 @@ int http_request_forward_body(struct stream *s, struct channel *req, int an_bit) missing_data_or_waiting: /* stop waiting for data if the input is closed before the end */ - if (req->flags & CF_SHUTR) { + if (msg->msg_state < HTTP_MSG_ENDING && req->flags & CF_SHUTR) { if (!(s->flags & SF_ERR_MASK)) s->flags |= SF_ERR_CLICL; if (!(s->flags & SF_FINST_MASK)) { @@ -6962,7 +6962,7 @@ int http_response_forward_body(struct stream *s, struct channel *res, int an_bit * so we don't want to count this as a server abort. Otherwise it's a * server abort. */ - if (res->flags & CF_SHUTR) { + if (msg->msg_state < HTTP_MSG_ENDING && res->flags & CF_SHUTR) { if ((s->req.flags & (CF_SHUTR|CF_SHUTW)) == (CF_SHUTR|CF_SHUTW)) goto aborted_xfer; /* If we have some pending data, we continue the processing */ -- 2.9.4
Re: SD Termination state after upgrade from 1.5.12 to 1.7.3
Le 16/06/2017 à 13:29, Juan Pablo Mora a écrit : Linux version: Red Hat Enterprise Linux Server release 5.11 (Tikanga) Linux dpoweb08 2.6.18-417.el5 #1 SMP Sat Nov 19 14:54:59 EST 2016 x86_64 x86_64 x86_64 GNU/Linux HAProxy version: 1.7.5 Summary: After upgrading to HAProxy 1.7.5, requests with "SD" status in logs have started to appear in one webservice. There have been no changes to the webservice we invoke. The log shows: Jun 16 12:41:06 localhost.lognet.pre.logalty.es haproxy[17315]: 172.31.2.70:59365 [16/Jun/2017:12:41:06.890] HTTP_INTERNO BUS/BUS1 1/0/0/65/75 200 225565 390 - - SD-- 12/2/0/0/0 0/0 {|Keep-Alive|174|} {|close|} "POST /lgt/lgtbus/rest/private/receiverServiceLight/getSignedBinaryContent HTTP/1.1" The same POST in HAProxy 1.5.12 ends with “ 200 “ and ““. Hi, Bernard McCormack has already reported the same issue. I'm on it. I approximately identified the problem but I need more time to fully understand it and to find the best way to fix it. The bug occurs when we are waiting for the end of the response to synchronize the request and the response. The close on the server side is detected as an error when there is no content-length nor transfer-encoding header. But it is a bit tricky because it is partly a timing issue. Depending on when the connection close is caught, an error is triggered or not. So I'm on it. Stay tuned. -- Christopher Faulet
Re: Issue while using Proxy protocol in TCP mode
Le 13/06/2017 à 18:37, Vijay Bais a écrit : Yes, I agree it's an old and unstable version; will upgrade to a stable release. The client is also haproxy of same old version, which forwards the traffic using proxy protocol (similar to the link specified earlier). Ok, If the problem is still there with a stable release, be sure to have the "send-proxy" directive on your server line (the one which forwards the traffic to haproxy itself). If you have any doubt about your configuration, please, share it. -- Christopher Faulet
Re: Issue while using Proxy protocol in TCP mode
Le 14/06/2017 à 13:07, Vijay Bais a écrit : On Wed, Jun 14, 2017 at 3:06 PM, Christopher Faulet <cfau...@haproxy.com> wrote: Ok, If the problem is still there with a stable release, be sure to have the "send-proxy" directive on your server line (the one which forwards the traffic to haproxy itself). If you have any doubt about your configuration, please, share it. Below are snippets of the haproxy configuration of the reverse proxy and the load balancer: *On reverse proxy (1.2.3.4):* defaults log global mode tcp option tcplog listen revproxy :80 server load-balancer 5.6.7.8:80 send-proxy *On load balancer (5.6.7.8):* defaults log global option forwardfor mode http option httplog option httpclose frontend web bind 5.6.7.8:80 acl revproxy src 1.2.3.4/32 tcp-request connection expect-proxy layer4 if revproxy default_backend web-backend backend web-backend balance leastconn server node1 11.11.11.11:8080 check server node2 22.22.22.22:8080 check Let me know if anything seems incorrect here. Nothing strange here. In your first message, you said that the error is intermittent. Does it mean that some HTTP requests are correctly handled and others fail, all coming from 1.2.3.4 through the listener "revproxy" ? With this kind of configuration, all connections from 1.2.3.4 to 5.6.7.8:80 must use the PROXY protocol. Could you start your 2 instances of haproxy with the debug mode enabled (-d option), do some requests, and provide their outputs and the logs ? -- Christopher Faulet
Re: cppcheck finding
Le 15/09/2017 à 15:07, Илья Шипицин a écrit : and what about CI ? something like gitlab-ci, travis, jenkins ? I'll invest some efforts in that No CI. It would be useful to have one, but we have no time to work on it for now. Setting up a CI is not a big deal. The harder part is writing tests and setting up and maintaining the build/test matrix. -- Christopher Faulet
Re: cppcheck finding
Le 15/09/2017 à 08:36, Илья Шипицин a écrit : great, thanks for the feedback. there are a few things like that [src/flt_http_comp.c:926] -> [src/flt_http_comp.c:926]: (warning) Either the condition 'txn' is redundant or there is possible null pointer dereference: txn. [src/flt_spoe.c:2765] -> [src/flt_spoe.c:2766]: (warning) Either the condition 'ctx==NULL' is redundant or there is possible null pointer dereference: ctx. [src/haproxy.c:568]: (error) Common realloc mistake: 'next_argv' nulled but not freed upon failure [src/server.c:3993] -> [src/server.c:3995]: (warning) Either the condition 'nameserver' is redundant or there is possible null pointer dereference: nameserver. [src/ssl_sock.c:4403] -> [src/ssl_sock.c:4397]: (warning) Either the condition '!bind_conf' is redundant or there is possible null pointer dereference: bind_conf. Hi, You're right, there are bugs there. The worst one is in the compression filter. I attached patches to fix them. Willy, could you merge them please ? Some of them must be backported in 1.7. Thanks, -- Christopher Faulet >From 362bf07d61b06469bff839886d52db24daa2aa5e Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Fri, 15 Sep 2017 10:14:43 +0200 Subject: [PATCH 1/5] BUG/MEDIUM: compression: Fix check on txn in smp_fetch_res_comp_algo The check was totally messed up. In the worst case, it led to a crash, when the res.comp_algo sample fetch was retrieved on an uncompressed response (with the compression enabled). This patch must be backported in 1.7.
--- src/flt_http_comp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/flt_http_comp.c b/src/flt_http_comp.c index 64c669d54..4d5332832 100644 --- a/src/flt_http_comp.c +++ b/src/flt_http_comp.c @@ -923,7 +923,7 @@ smp_fetch_res_comp_algo(const struct arg *args, struct sample *smp, struct filter *filter; struct comp_state *st; - if (!(txn || !(txn->rsp.flags & HTTP_MSGF_COMPRESSING))) + if (!txn || !(txn->rsp.flags & HTTP_MSGF_COMPRESSING)) return 0; list_for_each_entry(filter, &strm_flt(smp->strm)->filters, list) { -- 2.13.5 >From 30bb79ce1a47666d5d9cd20636e97b7589e34dd7 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Fri, 15 Sep 2017 11:39:36 +0200 Subject: [PATCH 2/5] BUG/MINOR: compression: Check response headers before http-response rules eval This is required if we want to use res.comp or res.comp_algo sample fetches in http-response rules. This patch must be backported in 1.7. --- src/flt_http_comp.c | 30 +- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/src/flt_http_comp.c b/src/flt_http_comp.c index 4d5332832..0ed252895 100644 --- a/src/flt_http_comp.c +++ b/src/flt_http_comp.c @@ -103,6 +103,12 @@ comp_start_analyze(struct stream *s, struct filter *filter, struct channel *chn) st->initialized = 0; st->finished= 0; filter->ctx = st; + + /* Register post-analyzer on AN_RES_WAIT_HTTP because we need to + * analyze response headers before http-response rules execution + * to be sure we can use res.comp and res.comp_algo sample + * fetches */ + filter->post_analyzers |= AN_RES_WAIT_HTTP; } return 1; } @@ -135,7 +141,8 @@ comp_http_headers(struct stream *s, struct filter *filter, struct http_msg *msg) if (!(msg->chn->flags & CF_ISRESP)) select_compression_request_header(st, s, msg); else { - select_compression_response_header(st, s, msg); + /* Response headers have already been checked in + * comp_http_post_analyze callback.
*/ if (st->comp_algo) { register_data_filter(s, msg->chn, filter); st->hdrs_len = s->txn->rsp.sov; @@ -147,6 +154,26 @@ comp_http_headers(struct stream *s, struct filter *filter, struct http_msg *msg) } static int +comp_http_post_analyze(struct stream *s, struct filter *filter, + struct channel *chn, unsigned an_bit) +{ + struct http_txn *txn = s->txn; + struct http_msg *msg = >rsp; + struct comp_state *st = filter->ctx; + + if (an_bit != AN_RES_WAIT_HTTP) + goto end; + + if (!strm_fe(s)->comp && !s->be->comp) + goto end; + + select_compression_response_header(st, s, msg); + + end: + return 1; +} + +static int comp_http_data(struct stream *s, struct filter *filter, struct http_msg *msg) { struct comp_state *st = filter->ctx; @@ -768,6 +795,7 @@ struct flt_ops comp_ops = { .channel_start_analyze = comp_start_analyze, .channel_end_analyze = comp_end_analyze, + .channel_post_analyze = comp_http_post_analyze, .http_headers = comp_http_headers, .http_data = comp_http_data, -- 2.13.5 >From eef0b960798c3db93923f2c920f27bab6e53286e Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Fri, 15 Sep 2017 11:51:18 +0200 Subject: [PATCH 3/5] BUG/MINOR: spoe: Don't rely on SPOE ctx in debug message
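The first fix above is worth spelling out: `!(txn || !(txn->rsp.flags & HTTP_MSGF_COMPRESSING))` both dereferences txn when it is NULL and inverts the intended test. A standalone comparison of the two conditions (txn is reduced to a boolean so the broken form can be evaluated safely; the flag value is a stand-in):

```c
#include <assert.h>

#define HTTP_MSGF_COMPRESSING 0x1  /* stand-in flag value */

/* Broken guard from before the patch (txn as a boolean). */
static int old_guard(int txn_set, unsigned int flags)
{
    return !(txn_set || !(flags & HTTP_MSGF_COMPRESSING));
}

/* Fixed guard: bail out when there is no txn or no compression. */
static int new_guard(int txn_set, unsigned int flags)
{
    return !txn_set || !(flags & HTTP_MSGF_COMPRESSING);
}
```

The interesting divergence is the "txn present, response not compressed" case: the old guard returns 0 and lets the fetch continue on an uncompressed response, which is exactly the crash scenario the commit message describes.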
Re: Haproxy segfault error 4 in libc-2.24
Le 04/10/2017 à 05:12, Willy Tarreau a écrit : On Wed, Oct 04, 2017 at 05:07:07AM +0200, Willy Tarreau wrote: On Wed, Oct 04, 2017 at 04:40:53AM +0200, Willy Tarreau wrote: On Tue, Oct 03, 2017 at 06:57:45PM +0200, Marcus Ulbrich wrote: Hey Jarno, This seems to work stably! The idea for this acl was to prevent attackers testing for example MySQL injection by sleep command. "sleep" is in none of our URLs! Do you have an idea about an acl about this not crashing haproxy? I wouldn't be surprised if there was an issue for example with a %00 causing a length mismatch between two parts. At least now we have an idea where to look for the bug. It shouldn't take very long anymore to spot the problem. I'm seeing an issue in the url_dec sample converter : static int sample_conv_url_dec(const struct arg *args, struct sample *smp, void *private) { /* If the constant flag is set or if not size is avalaible at * the end of the buffer, copy the string in other buffer * before decoding. */ if (smp->flags & SMP_F_CONST || smp->data.u.str.size <= smp->data.u.str.len) { struct chunk *str = get_trash_chunk(); memcpy(str->str, smp->data.u.str.str, smp->data.u.str.len); Here it's missing a check to ensure that str.len <= str.size. Normally there is no known case where this could happen, but the preferred smp_dup() performs the check and truncates the input if needed. I don't know if there's a reason for this to ever happen. Hmmm in fact there's one theoretical case where it could happen, it's if the sample contains a full buffer (str.len == str.size), and then we can overflow the block by one byte when writing the trailing zero. But it's not possible in HTTP since the request is at least a few bytes long, and there's the reserve. smp->data.u.str.str = str->str; smp->data.u.str.size = str->size; smp->flags &= ~SMP_F_CONST; } So if you're interested, you can replace the block above by this : if (smp->flags & SMP_F_CONST || smp->data.u.str.size <= smp->data.u.str.
len) smp_dup(smp); If this stops crashing we'll have to try to figure out in which situation the case above may happen :-/ By the way if you manage to print the contents of "smp" in sample_conv_str2lower() it could help, at least to find smp->str.{len,size}. Hi, I found a bug in the url_dec sample converter that could lead to a segmentation fault. The function sample_conv_url_dec always succeeds even if url_decode fails. And the return value of url_decode is always used to set the decoded string length. url_decode returns -1 when an invalid character is found. So, when we try to decode an invalid URL, we end up manipulating an invalid sample, leading to undefined behavior. I fixed the bug but I'm going to send the patch separately to this thread to be sure everyone sees it. -- Christopher Faulet
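A reduced model of this bug class: a decoder that returns -1 on invalid input, and a caller that must propagate the failure instead of storing -1 as a length. url_decode_sketch below is a simplified stand-in for HAProxy's url_decode(), not the real function:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* In-place %XX decoder: returns the decoded length, or -1 on an
 * invalid escape sequence. Callers that ignore the -1 and use it as a
 * string length get exactly the kind of invalid sample described in
 * the thread above. */
static int url_decode_sketch(char *s)
{
    char *in = s, *out = s;

    while (*in) {
        if (*in == '%') {
            if (!isxdigit((unsigned char)in[1]) ||
                !isxdigit((unsigned char)in[2]))
                return -1;                 /* invalid: caller must check */
            int hi = in[1] <= '9' ? in[1] - '0'
                                  : tolower((unsigned char)in[1]) - 'a' + 10;
            int lo = in[2] <= '9' ? in[2] - '0'
                                  : tolower((unsigned char)in[2]) - 'a' + 10;
            *out++ = (char)((hi << 4) | lo);
            in += 3;
        } else if (*in == '+') {
            *out++ = ' ';                  /* form-encoding convention */
            in++;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
    return (int)(out - s);
}
```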
Re: [PATCH] BUG/MEDIUM: http: Return an error when url_dec sample converter failed
Le 05/10/2017 à 10:52, Christopher Faulet a écrit : Hi, Here is the patch that fixes the bug reported by Marcus (see "Haproxy segfault error 4 in libc-2.24"). Sorry, here is a new version of my patch. No reason to consider zero-length string as an error. -- Christopher Faulet >From 077217437a09e5d81216d377d9aff73dc1ce7122 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Thu, 5 Oct 2017 10:03:12 +0200 Subject: [PATCH] BUG/MEDIUM: http: Return an error when url_dec sample converter failed url_dec sample converter uses url_decode function to decode an URL. This function fails by returning -1 when an invalid character is found. But the sample converter never checked the return value and it used it as length for the decoded string. Because it always succeeded, the invalid sample (with a string length set to -1) could be used by other sample fetches or sample converters, leading to undefined behavior like segfault. The fix is pretty simple, url_dec sample converter just needs to return an error when url_decode fails. This patch must be backported in 1.7 and 1.6. --- src/proto_http.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/proto_http.c b/src/proto_http.c index fb5c0858e..40bd1c76d 100644 --- a/src/proto_http.c +++ b/src/proto_http.c @@ -12414,7 +12414,7 @@ static int sample_conv_url_dec(const struct arg *args, struct sample *smp, void /* Add final \0 required by url_decode(), and convert the input string. */ smp->data.u.str.str[smp->data.u.str.len] = '\0'; smp->data.u.str.len = url_decode(smp->data.u.str.str); - return 1; + return (smp->data.u.str.len >= 0); } static int smp_conv_req_capture(const struct arg *args, struct sample *smp, void *private) -- 2.13.6
[PATCH] BUG/MEDIUM: http: Return an error when url_dec sample converter failed
Hi, Here is the patch that fixes the bug reported by Marcus (see "Haproxy segfault error 4 in libc-2.24"). Thanks -- Christopher Faulet >From 077217437a09e5d81216d377d9aff73dc1ce7122 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Thu, 5 Oct 2017 10:03:12 +0200 Subject: [PATCH] BUG/MEDIUM: http: Return an error when url_dec sample converter failed url_dec sample converter uses url_decode function to decode an URL. This function fails by returning -1 when an invalid character is found. But the sample converter never checked the return value and it used it as length for the decoded string. Because it always succeeded, the invalid sample (with a string length set to -1) could be used by other sample fetches or sample converters, leading to undefined behavior like segfault. The fix is pretty simple, url_dec sample converter just needs to return an error when url_decode fails. This patch must be backported in 1.7 and 1.6. --- src/proto_http.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/proto_http.c b/src/proto_http.c index fb5c0858e..40bd1c76d 100644 --- a/src/proto_http.c +++ b/src/proto_http.c @@ -12414,7 +12414,7 @@ static int sample_conv_url_dec(const struct arg *args, struct sample *smp, void /* Add final \0 required by url_decode(), and convert the input string. */ smp->data.u.str.str[smp->data.u.str.len] = '\0'; smp->data.u.str.len = url_decode(smp->data.u.str.str); - return 1; + return (smp->data.u.str.len > 0); } static int smp_conv_req_capture(const struct arg *args, struct sample *smp, void *private) -- 2.13.6
Re: another cppcheck finding
Le 04/10/2017 à 11:54, Илья Шипицин a écrit : Hi, I'm working on the DNS part to make it thread-safe. In my patch set, among other things, I fixed this one. I will send everything to Willy in a few days. So don't bother with it. ... using ThreadSanitizer from Google? No. We plan to add support for threads in HAProxy 1.8. It requires many changes to make all parts thread-safe. DNS is one of them. Note that HAProxy is not currently multithreaded, so there is no thread-safety issue. -- Christopher Faulet
Re: another cppcheck finding
Le 04/10/2017 à 07:49, Илья Шипицин a écrit : 2017-10-04 9:15 GMT+05:00 Willy Tarreau <w...@1wt.eu>: Hi Ilya, [also CCing Baptiste] On Tue, Oct 03, 2017 at 05:25:17PM +0500, Илья Шипицин wrote: > [src/dns.c:2502]: (error) Memory leak: buffer > > > I do not see any "buffer" usage except conditional free. > should we just remove "buffer" from there ? I think you're referring to this part: struct dns_resolution *dns_alloc_resolution(void) { struct dns_resolution *resolution = NULL; char *buffer = NULL; resolution = calloc(1, sizeof(*resolution)); buffer = calloc(1, global.tune.bufsize); if (!resolution || !buffer) { free(buffer); free(resolution); return NULL; } LIST_INIT(&resolution->requester.wait); LIST_INIT(&resolution->requester.curr); return resolution; } And there's definitely a memory leak on the allocated buffer. Buffers used to be needed for resolution in the past but I think that's no longer the case, so I think that indeed the buffer can be completely removed (unless it should be assigned somewhere of course). It's great. I'll send a patch today Hi, I'm working on the DNS part to make it thread-safe. In my patch set, among other things, I fixed this one. I will send everything to Willy in a few days. So don't bother with it. Thanks -- Christopher Faulet
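For reference, here is a minimal sketch of the leak-free shape of that allocation pattern: the leak cppcheck flags comes from a buffer that is allocated but never stored, so on success nothing can ever free it. Either the allocation is removed entirely (as Willy suggests), or the buffer is kept reachable as below. All names here are illustrative, not HAProxy's actual structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative stand-in for struct dns_resolution. */
struct resolution_sketch {
    char *buffer;
};

/* Allocate the structure and its buffer; on any failure, release
 * everything already obtained so no partial allocation can leak. */
struct resolution_sketch *alloc_resolution_sketch(size_t bufsize)
{
    struct resolution_sketch *res = calloc(1, sizeof(*res));
    char *buffer = calloc(1, bufsize);

    if (!res || !buffer) {
        free(buffer);   /* free(NULL) is a no-op, so both paths are safe */
        free(res);
        return NULL;
    }
    res->buffer = buffer;   /* keep a reference so it can be freed later */
    return res;
}

void free_resolution_sketch(struct resolution_sketch *res)
{
    if (res)
        free(res->buffer);
    free(res);
}
```

In the original function the error path was correct; only the success path lost track of `buffer`, which is why the tool reports the leak at the `return resolution;` line.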
[PATCH] BUG/MEDIUM: http: Fix a regression bug when a HTTP response is in TUNNEL mode
Hi all, Finally I reworked my previous patch. This one should fix the bug without side effects (AFAIK). It fixes the slowdowns experienced on 1.7.9 for HTTP responses with an undefined body length when the compression is enabled. -- Christopher Faulet >From c42035858a58786c296ae3cf3c2420e4fe82aad0 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Tue, 29 Aug 2017 16:06:38 +0200 Subject: [PATCH] BUG/MEDIUM: http: Fix a regression bug when a HTTP response is in TUNNEL mode Unfortunately, a regression was introduced by commit 1486b0ab ("BUG/MEDIUM: http: Switch HTTP responses in TUNNEL mode when body length is undefined"). HTTP responses with an undefined body length are blocked until timeout when the compression is enabled. A first fix was made in commit 69744d92 ("BUG/MEDIUM: http: Fix blocked HTTP/1.0 responses when compression is enabled"), but the bug is still the same: we do not forward response data because we are waiting for the synchronization between the HTTP request and the response. To fix the bug, the conditions to infinitely forward channel data have been slightly relaxed. Now, it is done if there is no more analyzer registered on the channel, or if the _FLT_END analyzer is still there but without the CF_FLT_ANALYZE flag. This last condition is only possible when a channel is waiting for the end of the other side. So, fundamentally, it means that no one is analyzing the channel anymore. This is a transitional state during a sync phase. This patch must be backported to 1.7. --- src/stream.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/stream.c b/src/stream.c index 1985ed98a..d6c12299b 100644 --- a/src/stream.c +++ b/src/stream.c @@ -2046,7 +2046,7 @@ struct task *process_stream(struct task *t) * Note that we're checking CF_SHUTR_NOW as an indication of a possible * recent call to channel_abort().
*/ - if (unlikely(!req->analysers && + if (unlikely((!req->analysers || (req->analysers == AN_REQ_FLT_END && !(req->flags & CF_FLT_ANALYZE))) && !(req->flags & (CF_SHUTW|CF_SHUTR_NOW)) && (si_f->state >= SI_ST_EST) && (req->to_forward != CHN_INFINITE_FORWARD))) { @@ -2205,7 +2205,7 @@ struct task *process_stream(struct task *t) * Note that we're checking CF_SHUTR_NOW as an indication of a possible * recent call to channel_abort(). */ - if (unlikely(!res->analysers && + if (unlikely((!res->analysers || (res->analysers == AN_RES_FLT_END && !(res->flags & CF_FLT_ANALYZE))) && !(res->flags & (CF_SHUTW|CF_SHUTR_NOW)) && (si_b->state >= SI_ST_EST) && (res->to_forward != CHN_INFINITE_FORWARD))) { -- 2.13.5
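The relaxed test in the patch can be modeled as a small predicate. The flag values below are placeholders, not HAProxy's real AN_*/CF_* constants, and `may_forward_infinitely` is a hypothetical name for illustration only:

```c
#include <assert.h>

/* Illustrative flag values (HAProxy's real ones differ). */
#define AN_REQ_FLT_END_SK   0x01u   /* the _FLT_END analyzer bit */
#define CF_FLT_ANALYZE_SK   0x02u   /* "a filter is really analyzing" flag */

/* A channel may switch to infinite forward either when no analyzer is
 * registered at all, or when only _FLT_END remains without
 * CF_FLT_ANALYZE set, i.e. no one is analyzing the channel anymore. */
int may_forward_infinitely(unsigned int analysers, unsigned int flags)
{
    return !analysers ||
           (analysers == AN_REQ_FLT_END_SK && !(flags & CF_FLT_ANALYZE_SK));
}
```

Before the patch, only the `!analysers` half of this condition existed, which is why a channel stuck with just `_FLT_END` registered (the transitional sync state) was never forwarded.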
Re: HTTP/1.0 with compression enabled broken again in v1.7.9
Le 29/08/2017 à 10:13, Kristjan Koppel a écrit : Hi! It seems that a similar issue to the one I originally reported against v1.7.2 is now present in v1.7.9 (the same setup worked fine with v1.7.8): https://www.mail-archive.com/haproxy@formilux.org/msg24830.html After upgrading to v1.7.9, requests from a client using HTTP/1.0 with compression enabled in HAProxy now always hang. Usually they hang after receiving 3905 bytes of data, but sometimes also after 11145 bytes (the full response is much larger). After hanging for "timeout client" seconds the rest of the response is received and the request finishes successfully. Using HTTP/1.1 or turning off compression fixes the issue as before. My backend server is still the same etcd v2.3.7 and the client is a custom application using HTTP/1.0. The issue was confirmed with curl in HTTP/1.0 mode. I'll be happy to provide additional information if needed. Hi Kristjan, Damn! I was pretty sure I had tested this use case when I fixed the HTTP parser a few weeks ago. But clearly not, because the bug is back. Could you check if the attached patch fixes the bug? For my part, I will recheck everything to be sure. This will be my punishment. Thanks, -- Christopher Faulet >From 57e627243b02021b3913b33c8bd5c2ed92f82303 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Tue, 29 Aug 2017 16:06:38 +0200 Subject: [PATCH] BUG/MEDIUM: http: Fix a regression bug in http_resync_states Unfortunately, a regression in http_resync_states was introduced by commit 1486b0ab ("BUG/MEDIUM: http: Switch HTTP responses in TUNNEL mode when body length is undefined"). HTTP/1.0 responses with an undefined body length are blocked until timeout when the compression is enabled. A first fix was made in commit 69744d92 ("BUG/MEDIUM: http: Fix blocked HTTP/1.0 responses when compression is enabled").
To fix the bug, in http_resync_states, when one side is in the HTTP_MSG_TUNNEL state but the other one is still in the HTTP_MSG_DONE state, we remove the remaining analyzers except the corresponding _FLT_END one. This patch must be backported to 1.7. --- src/proto_http.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/src/proto_http.c b/src/proto_http.c index 439e57fce..c0e2e1115 100644 --- a/src/proto_http.c +++ b/src/proto_http.c @@ -5618,11 +5618,15 @@ void http_resync_states(struct stream *s) s->req.analysers &= AN_REQ_FLT_END; if (HAS_REQ_DATA_FILTERS(s)) s->req.analysers |= AN_REQ_FLT_XFER_DATA; + if (txn->rsp.msg_state == HTTP_MSG_DONE) + s->res.analysers &= AN_RES_FLT_END; } if (txn->rsp.msg_state == HTTP_MSG_TUNNEL) { s->res.analysers &= AN_RES_FLT_END; if (HAS_RSP_DATA_FILTERS(s)) s->res.analysers |= AN_RES_FLT_XFER_DATA; + if (txn->req.msg_state == HTTP_MSG_DONE) + s->req.analysers &= AN_REQ_FLT_END; } channel_auto_close(&s->req); channel_auto_read(&s->req); -- 2.13.5
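The analyser-mask manipulation in that hunk can be sketched as a pure function on a bitmask: clear everything down to `_FLT_END`, then re-add the data-transfer analyzer only when data filters are present. Flag values and names below are illustrative, not HAProxy's real ones.

```c
#include <assert.h>

/* Illustrative flag values (HAProxy's real ones differ). */
#define FLT_END_SK        0x01u   /* the _FLT_END analyzer */
#define FLT_XFER_DATA_SK  0x02u   /* filter data-forwarding analyzer */

/* When a channel switches to TUNNEL mode, drop all HTTP analyzers and
 * keep only what is needed to terminate the filters, optionally keeping
 * the data-forwarding analyzer when data filters are registered. */
unsigned int tunnel_analysers(unsigned int analysers, int has_data_filters)
{
    analysers &= FLT_END_SK;            /* drop all HTTP analyzers */
    if (has_data_filters)
        analysers |= FLT_XFER_DATA_SK;  /* keep forwarding through filters */
    return analysers;
}
```

The patch's addition is then simply: once the opposite side reaches DONE, apply the same clearing to it too, so neither channel keeps stale analyzers that block forwarding.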
Re: haproxy-1.8.0, sending a email-alert causes 100% cpu usage, FreeBSD 11.1
Le 28/11/2017 à 07:25, Willy Tarreau a écrit : Hi Pieter, On Mon, Nov 27, 2017 at 09:43:52PM +0100, PiBa-NL wrote: Hi List, I thought i 'reasonably' tested some of 1.8.0's options. Today i put it into 'production' on my secondary cluster node and notice it takes 100% cpu... G.. bad. This sounds like another case of recursive locking. I guess i should have tried such a thing last week. Don't worry, whatever the amount of tests you run, some bugs will always slip through. Anyhow below some gdb and console output. Very useful, I found it: process_chk_conn() takes the lock then calls connect_conn_chk() : 2114 HA_SPIN_LOCK(SERVER_LOCK, &check->server->lock); 2137 ret = connect_conn_chk(t); connect_conn_chk() then calls tcpcheck_main() : 1548 tcpcheck_main(check); And this one takes the lock again : 2598 HA_SPIN_LOCK(SERVER_LOCK, &check->server->lock); CCing Emeric as he's the one who covered the checks so he will know best how to fix it. In the meantime, if you don't need threads you can rebuild with "USE_THREAD=" to disable them, but I'd rather wait for a fix. Sorry about that, and thanks for the report. Hi Pieter, Here is a patch that should fix the deadlock. Could you confirm it fixes your bug? Emeric, this patch should be good, but take a look at it, just to be sure. Thanks -- Christopher Faulet >From 5fd4083becd141080ec8cf0923b222e0ae6119af Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Tue, 28 Nov 2017 10:06:29 +0100 Subject: [PATCH] BUG/MEDIUM: tcp-check: Don't lock the server in tcpcheck_main There was a deadlock in the tcpcheck_main function. The server's lock was already acquired by the caller (process_chk_conn or wake_srv_chk). This patch must be backported to 1.8.
--- src/checks.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/src/checks.c b/src/checks.c index ad25ac094..63747201e 100644 --- a/src/checks.c +++ b/src/checks.c @@ -2595,8 +2595,6 @@ static int tcpcheck_main(struct check *check) struct list *head = check->tcpcheck_rules; int retcode = 0; - HA_SPIN_LOCK(SERVER_LOCK, &check->server->lock); - /* here, we know that the check is complete or that it failed */ if (check->result != CHK_RES_UNKNOWN) goto out_end_tcpcheck; @@ -2637,7 +2635,7 @@ static int tcpcheck_main(struct check *check) if (s->proxy->timeout.check) t->expire = tick_first(t->expire, t_con); } - goto out_unlock; + goto out; } /* special case: option tcp-check with no rule, a connect is enough */ @@ -2732,7 +2730,7 @@ static int tcpcheck_main(struct check *check) chunk_appendf(&trash, " comment: '%s'", comment); set_server_check_status(check, HCHK_STATUS_SOCKERR, trash.str); check->current_step = NULL; -goto out_unlock; +goto out; } if (check->cs) @@ -2854,7 +2852,7 @@ static int tcpcheck_main(struct check *check) if (s->proxy->timeout.check) t->expire = tick_first(t->expire, t_con); } -goto out_unlock; +goto out; } } /* end 'connect' */ @@ -3059,7 +3057,7 @@ static int tcpcheck_main(struct check *check) if (&check->current_step->list != head && check->current_step->action == TCPCHK_ACT_EXPECT) __cs_want_recv(cs); - goto out_unlock; + goto out; out_end_tcpcheck: /* collect possible new errors */ @@ -3074,8 +3072,7 @@ static int tcpcheck_main(struct check *check) __cs_stop_both(cs); - out_unlock: - HA_SPIN_UNLOCK(SERVER_LOCK, &check->server->lock); + out: return retcode; } -- 2.13.6
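The bug is the classic self-deadlock on a non-recursive lock: the caller already held the server lock, and tcpcheck_main tried to take it again. The sketch below illustrates the convention the patch adopts, where the caller owns the lock for the whole check and the helper is documented as "called with the lock held". This is not HAProxy code; all names are illustrative.

```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for the server spinlock; pthread mutexes are non-recursive
 * by default, just like HAProxy's HA_SPIN_LOCK. */
static pthread_mutex_t srv_lock = PTHREAD_MUTEX_INITIALIZER;
static int check_result;

/* tcpcheck_main-style helper: caller MUST already hold srv_lock.
 * Re-acquiring srv_lock here (the bug) would self-deadlock, since a
 * non-recursive mutex cannot be locked twice by the same thread. */
static void run_check_locked(int value)
{
    check_result = value;
}

/* process_chk_conn-style caller: owns the lock for the whole check. */
int process_check(int value)
{
    pthread_mutex_lock(&srv_lock);
    run_check_locked(value);        /* called with the lock held */
    pthread_mutex_unlock(&srv_lock);
    return check_result;
}
```

This is also why the patch renames `out_unlock` to `out`: once the function no longer takes the lock, it must not release it either, or the caller's unlock would operate on an already-unlocked mutex.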
Re: 1.8.0 stuck in write(threads_sync_pipe[1], "S", 1)
Le 02/12/2017 à 08:23, Максим Куприянов a écrit : Hi! Tonight all of my haproxy 1.8.0 instances stopped answering. They didn't forward traffic and didn't even answer over the socket. They're compiled with threads, but threads are not enabled in their configs (no nbthread option). All of them are stuck in the same place: # strace -f -p 831919 Process 831919 attached write(2, "S", 1 Here's some debug stuff (from 1-threaded instance): (gdb) bt #0 0x7fef9bd2a330 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:81 #1 0x558dea62275b in thread_want_sync () at src/hathreads.c:74 #2 0x558dea58f548 in srv_register_update (srv=srv@entry=0x558ded691e30) at src/server.c:2596 #3 0x558dea5922f7 in server_recalc_eweight (sv=sv@entry=0x558ded691e30) at src/server.c:1151 #4 0x558dea5c3028 in server_warmup (t=0x558def513120) at src/checks.c:1448 #5 0x558dea619216 in process_runnable_tasks () at src/task.c:229 #6 0x558dea5cf237 in run_poll_loop () at src/haproxy.c:2326 #7 run_thread_poll_loop (data=) at src/haproxy.c:2375 #8 0x558dea53b6fe in main (argc=, argv=0x7ffe03a880d8) at src/haproxy.c:2910 Hi, Thanks for your detailed report. There is a bug in the sync-point when the same thread requests a synchronization many times. And it is easier to encounter this bug with only one thread. Could you check the attached patch? It should fix the bug. -- Christopher Faulet >From b8475f5bf9098b667fabada7b88de33c62b42c35 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Sat, 2 Dec 2017 09:53:24 +0100 Subject: [PATCH] BUG/MAJOR: thread: Be sure to request a sync between threads only once at a time The first thread requesting a synchronization is responsible for writing to the "sync" pipe to notify all others. But we must write to the pipe only once between two synchronizations to have exactly one character in the pipe. This is important because we only read 1 character in return when the last thread exits from the sync-point. Here there is a bug.
If two threads request a synchronization, only the first writes to the pipe. But if the same thread requests a synchronization several times before entering the sync-point (because, for instance, it detects many servers down), it writes as many characters into the pipe, and only one of them will be read. Repeating this bug many times will block HAProxy on the write because the pipe is full. To fix the bug, we just check whether the current thread has already requested a synchronization before trying to notify all others. The patch must be backported to 1.8. --- src/hathreads.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/hathreads.c b/src/hathreads.c index eb9cd3fce..50f3c7701 100644 --- a/src/hathreads.c +++ b/src/hathreads.c @@ -70,6 +70,8 @@ void thread_sync_enable(void) void thread_want_sync() { if (all_threads_mask) { + if (threads_want_sync & tid_bit) + return; if (HA_ATOMIC_OR(&threads_want_sync, tid_bit) == tid_bit) shut_your_big_mouth_gcc(write(threads_sync_pipe[1], "S", 1)); } -- 2.13.6
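The fixed logic can be sketched in isolation. One deliberate difference from HAProxy: GCC's `__atomic_fetch_or` builtin returns the previous mask, whereas HA_ATOMIC_OR returns the new one, so the "first requester" test compares against 0 here instead of tid_bit. The pipe write is replaced by a counter so the single-byte invariant is observable; all names are illustrative.

```c
#include <assert.h>

static unsigned long threads_want_sync_mask;  /* bit per requesting thread */
static int sync_bytes_written;                /* stands in for bytes in the pipe */

/* Request a synchronization for the thread identified by tid_bit.
 * Invariant after the fix: at most one wakeup byte is "written"
 * between two synchronizations, no matter how often this is called. */
void request_sync_sketch(unsigned long tid_bit)
{
    /* already pending for this thread: nothing to do (the fix) */
    if (threads_want_sync_mask & tid_bit)
        return;
    /* only the very first requester overall writes the wakeup byte */
    if (__atomic_fetch_or(&threads_want_sync_mask, tid_bit,
                          __ATOMIC_SEQ_CST) == 0)
        sync_bytes_written++;   /* stands in for write(pipe, "S", 1) */
}
```

Without the early-return guard, a thread detecting many servers down in one pass would increment the counter once per event, which is exactly how the real pipe filled up and blocked the process on `write()`.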
[PATCH] BUG/MEDIUM: mworker: Close log socket during a reload
Hi Willy, This patch should fix the following bug reported on discourse: https://discourse.haproxy.org/t/freeze-sockets-in-1-8-1-no-http-2/1912 I reproduced the bug described on discourse and my patch fixes it. But, I will ask for confirmation to be sure it is the same bug. -- Christopher Faulet >From a9a69d0a5cc9fcc7be84966fc5861cc17400a849 Mon Sep 17 00:00:00 2001 From: Christopher Faulet <cfau...@haproxy.com> Date: Mon, 18 Dec 2017 14:36:44 +0100 Subject: [PATCH] BUG/MEDIUM: mworker: Close log socket during a reload A log socket (UDP or UNIX) is opened by the master during its startup, when the first log message is sent. So, to prevent FD leaks, we must ensure we correctly close it during a reload. This patch must be backported to 1.8. --- include/proto/log.h | 3 ++- src/haproxy.c | 1 + src/log.c | 16 ++-- 3 files changed, 17 insertions(+), 3 deletions(-) diff --git a/include/proto/log.h b/include/proto/log.h index c92217c9d..a79dac610 100644 --- a/include/proto/log.h +++ b/include/proto/log.h @@ -51,9 +51,10 @@ extern THREAD_LOCAL char *logline_rfc5424; /* - * Initializes some log data. + * (de)Initializes some log data. */ void init_log(); +void deinit_log(); /* Initialize/Deinitialize log buffers used for syslog messages */ diff --git a/src/haproxy.c b/src/haproxy.c index ffd7ea05e..87569ca01 100644 --- a/src/haproxy.c +++ b/src/haproxy.c @@ -2783,6 +2783,7 @@ int main(int argc, char **argv) if (proc == global.nbproc) { if (global.mode & MODE_MWORKER) { mworker_cleanlisteners(); +deinit_log(); deinit_pollers(); if ((!(global.mode & MODE_QUIET) || (global.mode & MODE_VERBOSE)) && diff --git a/src/log.c b/src/log.c index 0b8467f33..3239e7ed6 100644 --- a/src/log.c +++ b/src/log.c @@ -232,6 +232,9 @@ THREAD_LOCAL char *logline_rfc5424 = NULL; * retrieve on the CLI.
*/ static THREAD_LOCAL char *startup_logs = NULL; +static THREAD_LOCAL int logfdunix = -1; /* syslog to AF_UNIX socket */ +static THREAD_LOCAL int logfdinet = -1; /* syslog to AF_INET socket */ + struct logformat_var_args { char *name; int mask; @@ -1099,8 +1102,6 @@ void __send_log(struct proxy *p, int level, char *message, size_t size, char *sd //.msg_iov = iovec, .msg_iovlen = NB_MSG_IOVEC_ELEMENTS }; - static THREAD_LOCAL int logfdunix = -1; /* syslog to AF_UNIX socket */ - static THREAD_LOCAL int logfdinet = -1; /* syslog to AF_INET socket */ static THREAD_LOCAL char *dataptr = NULL; int fac_level; struct list *logsrvs = NULL; @@ -1344,6 +1345,16 @@ void init_log() } } +void deinit_log() +{ + if (logfdunix >= 0) + close(logfdunix); + if (logfdinet >= 0) + close(logfdinet); + logfdunix = -1; + logfdinet = -1; +} + static int init_log_buffers_per_thread() { return init_log_buffers(); @@ -2420,6 +2431,7 @@ static void __log_init(void) { hap_register_per_thread_init(init_log_buffers_per_thread); hap_register_per_thread_deinit(deinit_log_buffers_per_thread); + hap_register_post_deinit(deinit_log); cli_register_kw(_kws); } /* -- 2.13.6
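The structural part of the patch is hoisting the lazily-opened descriptors from function-local statics (invisible to any cleanup code) to file scope so a deinit hook can close them before a reload. A minimal sketch of that pattern, with `/dev/null` standing in for the syslog socket and all names hypothetical:

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* File-scope instead of function-local static, so deinit can reach it. */
static int log_fd = -1;

/* __send_log-style function: opens the descriptor lazily on first use
 * and reuses it afterwards. */
int send_log_sketch(void)
{
    if (log_fd < 0)
        log_fd = open("/dev/null", O_WRONLY);  /* stand-in for the log socket */
    return log_fd;
}

/* deinit_log-style hook: run before a reload so the inherited master
 * process does not leak the descriptor. Resetting to -1 lets the next
 * log message reopen it cleanly. */
void deinit_log_sketch(void)
{
    if (log_fd >= 0)
        close(log_fd);
    log_fd = -1;
}
```

With the fd at function scope (the old code), each reload of the master would leave the previous socket open forever, which is the FD leak the patch describes.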
Re: 1.8.1 backend stays 'DOWN' when dns resolvers and http health checks are used
option httpchk GET /_check server phoenix phoenix:4000 resolvers docker init-addr libc,last,none check inter 1000 Hi, There have been some fixes since 1.8.1. One of them could fix your problem: http://git.haproxy.org/?p=haproxy-1.8.git;a=commit;h=80b92902 Could you check with the latest 1.8 source snapshot (http://www.haproxy.org/download/1.8/src/snapshot/haproxy-ss-LATEST.tar.gz)? Thanks -- Christopher Faulet
Re: Diagnose a PD-- status
Hi, Le 02/11/2017 à 17:16, Mildis a écrit : [WARNING] 305/144718 (21260) : HTTP compression failed: unexpected behavior of previous filters This warning is very suspicious. It should not happen. Could you share your configuration and the "haproxy -vv" output, please? -- Christopher Faulet