Re: [ANNOUNCE] haproxy-1.9-dev1
Hi.

On 03/08/2018 19:42, Aleksandar Lazic wrote:
> Hi.
>
> On 02/08/2018 19:23, Willy Tarreau wrote:
>> Hi,
>>
>> HAProxy 1.9-dev1 was released on 2018/08/02. It added 651 new commits
>> after version 1.9-dev0.
>
> Great news and work ;-)
>
> The image is also ready.
> https://hub.docker.com/r/me2digital/haproxy19/

As an attentive reader pointed out, CentOS ships an old SSL library.
Because of this I have now added OpenSSL 1.1.1-pre8 to this image, and
while I was at it I also updated the Lua version ;-)

I don't think a more on-the-edge setup is possible now, unless you
build it from git.

###
HA-Proxy version 1.9-dev1 2018/08/02
Copyright 2000-2018 Willy Tarreau

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -fno-strict-overflow -Wno-unused-label
  OPTIONS = USE_LINUX_SPLICE=1 USE_GETADDRINFO=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1 USE_PCRE_JIT=1 USE_TFO=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

-> Built with OpenSSL version : OpenSSL 1.1.1-pre8 (beta) 20 Jun 2018
-> Running on OpenSSL version : OpenSSL 1.1.1-pre8 (beta) 20 Jun 2018
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
-> OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.5
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.32 2012-11-30
PCRE library supports JIT : yes
Built with zlib version : 1.2.7
Running on zlib version : 1.2.7
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
        [SPOE] spoe
        [COMP] compression
        [TRACE] trace
###

FYI, the Dockerfile is here:
https://gitlab.com/aleks001/haproxy19-centos/blob/master/Dockerfile

Regards
Aleks
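To check from a script that an image really offers TLSv1.3, the `haproxy -vv` output quoted above can be parsed. This is only a minimal sketch in Python, assuming the output keeps the "OpenSSL library supports :" line in the form shown in this mail; the function name `supported_tls_versions` and the `SAMPLE` text are illustrative and not part of HAProxy or of the image.

```python
import re


def supported_tls_versions(vv_output):
    """Return the protocol names from the 'OpenSSL library supports :' line.

    Assumes the line format shown in the mail above; returns an empty
    list when no such line is found.
    """
    for line in vv_output.splitlines():
        line = line.strip()
        # The mail marks changed lines with "->"; drop that marker if present.
        if line.startswith("->"):
            line = line[2:].strip()
        # Only the bare "supports :" line matches; the "TLS extensions"
        # and "SNI" lines have extra words before the colon.
        match = re.match(r"OpenSSL library supports\s*:\s*(.*)$", line)
        if match:
            return match.group(1).split()
    return []


# Excerpt copied from the build output above, used as sample input.
SAMPLE = """\
-> Built with OpenSSL version : OpenSSL 1.1.1-pre8 (beta) 20 Jun 2018
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
-> OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
"""

if __name__ == "__main__":
    versions = supported_tls_versions(SAMPLE)
    print("TLSv1.3 available:", "TLSv1.3" in versions)
```

On a real system the input could come from running `haproxy -vv` (for example through `subprocess.run`) instead of the hard-coded sample.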
Re: [ANNOUNCE] haproxy-1.9-dev1
Hi.

On 02/08/2018 19:23, Willy Tarreau wrote:
> Hi,
>
> HAProxy 1.9-dev1 was released on 2018/08/02. It added 651 new commits
> after version 1.9-dev0.

Great news and work ;-)

The image is also ready.
https://hub.docker.com/r/me2digital/haproxy19/

###
HA-Proxy version 1.9-dev1 2018/08/02
Copyright 2000-2018 Willy Tarreau

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -fno-strict-overflow -Wno-unused-label
  OPTIONS = USE_LINUX_SPLICE=1 USE_GETADDRINFO=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1 USE_PCRE_JIT=1 USE_TFO=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2k-fips 26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k-fips 26 Jan 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Built with Lua version : Lua 5.3.4
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.32 2012-11-30
PCRE library supports JIT : yes
Built with zlib version : 1.2.7
Running on zlib version : 1.2.7
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
        [SPOE] spoe
        [COMP] compression
        [TRACE] trace
###

Regards
Aleks
Re: [ANNOUNCE] haproxy-1.9-dev1
Amazing work. Congrats, all.

Baptiste
[ANNOUNCE] haproxy-1.9-dev1
Hi,

HAProxy 1.9-dev1 was released on 2018/08/02. It added 651 new commits
after version 1.9-dev0.

Yes, I know what some of you are thinking: "what, 651 patches for a
first development release?". Last year, 1.8-dev1 was emitted with half
that in April, 4 months earlier. But back then we had only pushed fixes
and some new features to flush the pipe, and the 1.8-dev2 and -dev3 that
followed had even more patches once cumulated. Here, after 1.8, we've
had a longer trail of difficult bugs to deal with, and the 1.9 changes
were very low-level stuff that doesn't bring any functional value: they
were mostly rearchitectures of certain sensitive parts, aimed at
building the new features on top of them. So we could have emitted
useless and broken versions, but... I don't like to discourage our
users. Thus, 8 months after 1.9-dev0 was created, here comes the first
version really worth testing.

Those looking for eye-candy stuff will be a bit disappointed, I prefer
to warn. Among the ~300 patches that were not backported to 1.8.x
(hence that were not bug fixes), I can see:

  - a rework of our task scheduler. Now it scales much better with
    large thread counts. There are now 3 levels: a priority-aware one
    shared between all threads, a lockless priority-aware one per
    thread, and a per-thread list of already started tasks that can
    also be used for I/O. As a result, most of the scheduling work is
    performed without any lock, which scales way better. Another nice
    benefit of lock removal is that when haproxy has to coexist with
    another process on the same CPU, the impact on other threads is
    much lower since the threads are very rarely context-switched with
    a lock held.

  - the applets scheduler was killed and replaced by the new scheduler
    above. Not only could the previous applets scheduler use quite some
    CPU, it also didn't make use of priorities, so many applets could
    use a lot of CPU bandwidth. I noticed this already with the first
    attempt at implementing H2 using applets. Now that the task's nice
    value is respected, the CLI is much more responsive even under very
    high loads, and the stats page can be tuned to have less impact on
    the traffic. The same applies to peers and SPOE; we'll see whether
    they can benefit from either a boost or a reduced priority.

  - a new test suite was introduced, based on "varnishtest" from the
    Varnish cache. It was extended to support haproxy and we can now
    write test cases, which are placed into the reg-tests directory. It
    is very convenient because testing a proxy is a particularly
    complex task which depends on a lot of elements, and varnishtest
    makes it easier to write reproducible test patterns.

  - the buffers were completely changed (again). Buffers get redesigned
    every 5 years, it seems. I probably find it funny. No, I don't, in
    fact. With the introduction of the mux layer, we suffered a bit
    from the old design mixing input and output areas in the same
    buffer, as it didn't make any sense there and we had to arbitrarily
    use either side depending on the data direction, making it
    impossible to share code between the two sides. Now the buffers are
    much simpler and the code using them at the various layers was
    significantly simplified. It will even open the way to an easier
    evolution towards dynamic size buffers in the near future. We found
    some benefits, such as certain operations now being doable in zero
    copy, which was not possible previously. This has affected a huge
    number of areas in the code and will make it a bit more painful to
    backport fixes to 1.8, but it's not possible to keep a dead code
    base and expect it to evolve at the same time!

  - the chunks were replaced by the buffers. The API was not changed
    yet to avoid adding jokes to the current complexity, but this will
    be done on an opportunistic basis. This has already allowed us to
    remove some code that already existed in the buffers.

  - the file descriptor cache is now fully lockless. This is the second
    part of the important performance-oriented changes that happened. I
    remember observing a 40% performance gain on the connection rate on
    a 12-core machine compared to 1.8 just with this change. It was
    quite tricky and we didn't feel confident emitting a development
    release immediately afterwards, to be honest!

  - the CLI now supports a payload. This will be used to feed some data
    (maps, certs, anything) from external scripts. For now this payload
    is limited to a whole buffer, but it will be possible to extend
    this in the future.

  - the internal connection and mux API have started to evolve so that
    we can more easily place some protocol processing at the mux layer.
    These changes have just begun and we need to make them step by step
    because they have huge implications on the rest of the work being
    done in parallel. At the moment we have introduced an rx