Re: Haproxy 1.7.11 log problems
I updated haproxy to 1.7.12 but nothing changed.

> On Nov 20, 2019, at 15:38, Aleksandar Lazic wrote:
>
> On this page is a 1.7.12 listed, is this the repo which you use?
> https://repo.ius.io/6/x86_64/packages/h/
>
> Please can you try the 1.7.12.
>
> Do you know that EOL is next year?
> https://wiki.centos.org/Download
>
> Regards
> Aleks
>
> [...]
Re: [PATCH] MINOR: contrib/prometheus-exporter: allow to select the exported metrics
Hi Christopher,

On Wed, Nov 20, 2019 at 02:56:28PM +0100, Christopher Faulet wrote:
> Nice, thanks for your feedback. It is merged now. And I'm on the backports
> for the 2.0.

You apparently forgot to backport commit 0d1c2a65e8370a770d01 (MINOR: stats: Report max times in addition of the averages for sessions); the 2.0 tree does not build anymore because ST_F_QT_MAX is not defined.

-- William
RE: native prometheus exporter: retrieving check_status
>> My only fear for this point would be to make the code too complicated
>> and harder to maintain.
>
> And slow down the exporter execution. Moreover, everyone will have a
> different opinion on how to aggregate the stats. My first idea was to sum
> all the server counters. But Pierre's reply showed me that it's not what
> he expects.

I agree, it's probably too complex and opinionated. Let's see how it goes with server aggregations only, done on the Prometheus side, since it's a server-related field initially. If we identify issues/bottlenecks with output size, we'll reopen this thread.

-- Pierre
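The "aggregation done on the Prometheus side" that Pierre refers to would typically be expressed as PromQL aggregation over the per-server series. A sketch of the idea; the metric and label names below are assumptions following the exporter's haproxy_server_* naming and may differ from the actual exporter output:

```promql
# Backend-level totals rebuilt on the Prometheus side: sum the per-server
# counters over each backend (the "proxy" label), dropping the "server"
# label entirely.
sum by (proxy) (rate(haproxy_server_sessions_total[5m]))

# Same pattern for a gauge: number of servers currently reported up per
# backend (assumes a 0/1-valued per-server status series).
sum by (proxy) (haproxy_server_up)
```

This keeps the exporter simple (it only ever emits raw per-server series) while letting each user choose their own aggregation.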
RE: native prometheus exporter: retrieving check_status
> Ok, so it is a new kind of metric. I mean, not exposed by HAProxy. It would
> require an extra loop on all servers for each backend. It is probably
> doable for the check_status. For the code, I don't know, because it is not
> exclusive to HTTP checks; it is also used for SMTP and LDAP checks. In the
> end, I think a better idea would be to have a way to get specific metrics
> in each scope and let Prometheus handle the aggregation. This way, everyone
> is free to choose how to proceed while limiting the number of metrics
> exported.

Fair enough; as stated on the other thread with William, we'll see how it goes doing it this way. If we have issues related to output size, we'll start a new discussion. Thanks!

-- Pierre
Re: native prometheus exporter: retrieving check_status
On 19/11/2019 at 17:12, Pierre Cheynier wrote:
>>> * also for `check_status`, there is the case of L7STS and its associated
>>> values that are present in another field. Most probably it could benefit
>>> from a better representation in a prometheus output (thanks to labels)?
>>
>> We can also export the metric ST_F_CHECK_CODE. For the use of labels, I
>> have no idea. For now, the labels are static in the exporter. And I don't
>> know if it is pertinent to add dynamic info in labels. If so, what is your
>> idea? Add a "code" label associated to the check_status metric?
>
> Here again, my maybe-not-so-good idea was to keep the ability to retrieve
> all the underlying details at backend level, such as:
> * 100 servers are L7OK
> * 1 server is L4TOUT
> * 2 servers are L4CON
> * 2 servers are L7STS
> ** 1 due to an HTTP 429
> ** 1 due to an HTTP 503
> But this is maybe overkill in terms of complexity; we could maybe push more
> on our ability to retrieve the status of non-maint servers.

Ok, so it is a new kind of metric. I mean, not exposed by HAProxy. It would require an extra loop on all servers for each backend. It is probably doable for the check_status. For the code, I don't know, because it is not exclusive to HTTP checks; it is also used for SMTP and LDAP checks. In the end, I think a better idea would be to have a way to get specific metrics in each scope and let Prometheus handle the aggregation. This way, everyone is free to choose how to proceed while limiting the number of metrics exported.

-- Christopher Faulet
Re: native prometheus exporter: retrieving check_status
On 19/11/2019 at 16:48, William Dauchy wrote:
> On Tue, Nov 19, 2019 at 03:31:28PM +0100, Christopher Faulet wrote:
>>>> * also for `check_status`, there is the case of L7STS and its associated
>>>> values that are present in another field. Most probably it could benefit
>>>> from a better representation in a prometheus output (thanks to labels)?
>>>
>>> We can also export the metric ST_F_CHECK_CODE. For the use of labels, I
>>> have no idea. For now, the labels are static in the exporter. And I don't
>>> know if it is pertinent to add dynamic info in labels. If so, what is
>>> your idea? Add a "code" label associated to the check_status metric?
>
> we need to be very careful here indeed. It's not very clear in my mind how
> many values we are talking about, but labels trigger the creation of a new
> metric for each key/value pair. So it can quickly explode your memory on
> the scraping side. If there is a different metric for each label, it is
> probably not the right way to do it. However, I may be wrong, I'm not a
> Prometheus expert, far from it :)

I will probably start by exporting metrics as present in HAProxy, using a mapping to represent the check_status.

>>>> * what about getting some backend-level aggregation of server metrics,
>>>> such as the one that was previously mentioned, to avoid retrieving all
>>>> the server metrics but still be able to get some insights? I'm thinking
>>>> about an aggregation of some fields at backend level, which was not
>>>> previously done with the CSV output.
>>>
>>> It is feasible. But only counters may be aggregated. It may be enabled
>>> using a parameter in the query-string. However, it is probably pertinent
>>> only when the server metrics are filtered out. Because otherwise,
>>> Prometheus can handle the aggregation itself.
>
> My only fear for this point would be to make the code too complicated and
> harder to maintain.

And slow down the exporter execution. Moreover, everyone will have a different opinion on how to aggregate the stats. My first idea was to sum all the server counters. But Pierre's reply showed me that it's not what he expects.

-- Christopher Faulet
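The "mapping to represent the check_status" Christopher mentions could look like the sketch below: encoding HAProxy's textual check statuses as a numeric gauge value, the usual Prometheus pattern for enum-like states. The status strings are HAProxy's check-status names; the numeric codes are purely illustrative and are not the exporter's actual table:

```python
# Illustrative mapping from HAProxy textual check statuses to numeric
# codes suitable for a Prometheus gauge. The ordering/values here are an
# assumption for the sketch, not the exporter's real encoding.
CHECK_STATUS = {
    "UNK": 0, "INI": 1, "SOCKERR": 2,
    "L4OK": 3, "L4TOUT": 4, "L4CON": 5,
    "L6OK": 6, "L6TOUT": 7, "L6RSP": 8,
    "L7OK": 9, "L7OKC": 10, "L7TOUT": 11, "L7RSP": 12, "L7STS": 13,
}

def check_status_value(status: str) -> int:
    """Return the numeric code for a textual check status (0 if unknown)."""
    return CHECK_STATUS.get(status, 0)
```

A scraper would then expose one series per server with this numeric value, instead of one label pair per possible status, which avoids the cardinality explosion William warns about.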
Re: [PATCH] MINOR: contrib/prometheus-exporter: allow to select the exported metrics
On 20/11/2019 at 13:03, William Dauchy wrote:
> On Tue, Nov 19, 2019 at 04:35:47PM +0100, Christopher Faulet wrote:
>> Here are updated patches with the support for "scope" and "no-maint"
>> parameters. If this solution is good enough for you (and if it works :),
>> I will push it.
>
> $ curl "http://127.0.0.1:8080/metrics?scope=global&scope=frontend&scope=backend&scope=server"
> 151M
> $ curl "http://127.0.0.1:8080/metrics?scope=global&scope=frontend&scope=backend&scope=server&no-maint"
> 13.9M
>
> looks very useful from here :) I think you can push this last version!

Nice, thanks for your feedback. It is merged now. And I'm on the backports for the 2.0.

-- Christopher Faulet
Re: Haproxy 1.7.11 log problems
On this page is a 1.7.12 listed, is this the repo which you use?
https://repo.ius.io/6/x86_64/packages/h/

Please can you try the 1.7.12.

Do you know that EOL is next year?
https://wiki.centos.org/Download

Regards
Aleks

Nov 20, 2019 12:45:37 PM Alexander Kasantsev wrote:
> I'm on CentOS 6.10, the latest version for me is 1.7.11 from the ius repo
>
>> On Nov 20, 2019, at 14:17, Aleksandar Lazic wrote:
>> [...]
Re: [PATCH] MINOR: contrib/prometheus-exporter: allow to select the exported metrics
On Tue, Nov 19, 2019 at 04:35:47PM +0100, Christopher Faulet wrote:
> Here is updated patches with the support for "scope" and "no-maint"
> parameters. If this solution is good enough for you (and if it works :), I
> will push it.

$ curl "http://127.0.0.1:8080/metrics?scope=global&scope=frontend&scope=backend&scope=server"
151M
$ curl "http://127.0.0.1:8080/metrics?scope=global&scope=frontend&scope=backend&scope=server&no-maint"
13.9M

looks very useful from here :) I think you can push this last version!

-- William
Re: Haproxy 1.7.11 log problems
I'm on CentOS 6.10, the latest version for me is 1.7.11 from the ius repo.

> On Nov 20, 2019, at 14:17, Aleksandar Lazic wrote:
>
> Hi.
>
> Please can you use the latest 1.7, latest 1.8 or 2.0 and tell us if the
> problem still exists.
>
> Best regards
> Aleks
>
> [...]
Re: Haproxy 1.7.11 log problems
Hi.

Please can you use the latest 1.7, latest 1.8 or 2.0 and tell us if the problem still exists.

Best regards
Aleks

Nov 20, 2019 9:52:01 AM Alexander Kasantsev wrote:
> Good day everyone!
>
> I migrated from haproxy 1.5 to 1.7.11 and I have some trouble with logging.
> [...]
Re: master-worker no-exit-on-failure with SO_REUSEPORT and a port being already in use
On Wed, Nov 20, 2019 at 10:19:20AM +0100, Christian Ruppert wrote:
> Hi William,
>
> thanks for the patch. I'll test it later today. What I actually wanted to
> achieve is: https://cbonte.github.io/haproxy-dconv/2.0/management.html#4
>
> "Then HAProxy tries to bind to all listening ports. If some fatal errors
> happen (eg: address not present on the system, permission denied), the
> process quits with an error. If a socket binding fails because a port is
> already in use, then the process will first send a SIGTTOU signal to all
> the pids specified in the "-st" or "-sf" pid list. This is what is called
> the "pause" signal. It instructs all existing haproxy processes to
> temporarily stop listening to their ports so that the new process can try
> to bind again. During this time, the old process continues to process
> existing connections. If the binding still fails (because for example a
> port is shared with another daemon), then the new process sends a SIGTTIN
> signal to the old processes to instruct them to resume operations just as
> if nothing happened. The old processes will then restart listening to the
> ports and continue to accept connections. Note that this mechanism is
> system [...]"
>
> In my test case though it failed to do so.

Well, it only works with HAProxy processes, not with other processes. There is no mechanism to ask a process which is neither an haproxy process nor a process which uses SO_REUSEPORT. With HAProxy processes it will bind with SO_REUSEPORT, and will only use the SIGTTOU/SIGTTIN signals if it fails to do so.

This part of the documentation is for HAProxy without master-worker mode. In master-worker mode, once the master is launched successfully, it is never supposed to quit upon a reload (kill -USR2). During a reload in master-worker mode, the master will do a -sf. If the reload fails for any reason (bad configuration, unable to bind, etc.), the behavior is to keep the previous workers. It only tries to kill the workers if the reload succeeds. So this is the default behavior.

-- William Lallemand
Re: travis-ci: should we drop openssl-1.1.0 and replace it with 3.0 ?
On Tue, Nov 19, 2019 at 11:57:56PM +0100, Lukas Tribus wrote:
> Testing and implementing build fixes for APIs while they are under active
> development not only takes away precious dev time, it also causes our own
> code to be messed up with workarounds possibly only needed for specific
> openssl development code at one point in time.

This actually is a pretty valid point I hadn't thought about and which we have experienced already in the past. It's not rare that a change gets reverted in other projects, and wasting time working around it just to see it finally cancelled is not cool.

With all this said, I tend to see the CI as a way to lower the number of surprises. This means that the most relevant stuff to test there is what we can reasonably expect to encounter in the field. If some mainstream distros ship with specific openssl versions and they take care of the support themselves, it seems reasonable to keep these versions. That does not mean we have to test all combinations, as we can reasonably expect that testing a wide enough spectrum increases the likelihood that what lies between both extremities will also work.

So if 1.1.0 is still shipped and maintained in relevant distros, we can keep it.

Just my two cents,
Willy
Re: master-worker no-exit-on-failure with SO_REUSEPORT and a port being already in use
Hi William,

thanks for the patch. I'll test it later today. What I actually wanted to achieve is: https://cbonte.github.io/haproxy-dconv/2.0/management.html#4

"Then HAProxy tries to bind to all listening ports. If some fatal errors happen (eg: address not present on the system, permission denied), the process quits with an error. If a socket binding fails because a port is already in use, then the process will first send a SIGTTOU signal to all the pids specified in the "-st" or "-sf" pid list. This is what is called the "pause" signal. It instructs all existing haproxy processes to temporarily stop listening to their ports so that the new process can try to bind again. During this time, the old process continues to process existing connections. If the binding still fails (because for example a port is shared with another daemon), then the new process sends a SIGTTIN signal to the old processes to instruct them to resume operations just as if nothing happened. The old processes will then restart listening to the ports and continue to accept connections. Note that this mechanism is system [...]"

In my test case though it failed to do so.

On 2019-11-19 17:27, William Lallemand wrote:
> On Tue, Nov 19, 2019 at 04:19:26PM +0100, William Lallemand wrote:
>> I then add another bind for port 80, which is in use by squid already,
>> and try to reload HAProxy. It takes some time until it fails:
>>
>> Nov 19 14:39:21 894a0f616fec haproxy[2978]: [WARNING] 322/143921 (2978) : Reexecuting Master process
>> ...
>> Nov 19 14:39:28 894a0f616fec haproxy[2978]: [ALERT] 322/143922 (2978) : Starting frontend somefrontend: cannot bind socket [0.0.0.0:80]
>> ...
>> Nov 19 14:39:28 894a0f616fec systemd[1]: haproxy.service: Main process exited, code=exited, status=1/FAILURE
>>
>> The reload itself is still running (systemd) and will timeout after about
>> 90s. After that, because of the Restart=always, I guess, it ends up in a
>> restart loop.
>>
>> So I would have expected that the master process will fall back to the
>> old process and proceed with the old child until the problem has been
>> fixed.
>
> The patch in attachment fixes a bug where haproxy could reexecute itself
> in waitpid mode with -sf -1. I'm not sure this is your bug, but if this is
> the case you should see haproxy in waitpid mode, then the master exiting
> with the usage message in your logs.

-- Regards, Christian Ruppert
Haproxy 1.7.11 log problems
Good day everyone!

I migrated from haproxy 1.5 to 1.7.11 and I have some trouble with logging.

I have the following in my config file for logging:

capture request header Host len 200
capture request header Referer len 200
capture request header User-Agent len 200
capture request header Content-Type len 200
capture request header Cookie len 300
log-format %[capture.req.hdr(0),lower]\ %ci\ -\ [%t]\ \"%HM\ %HP\ %HV\"\ %ST\ \"%[capture.req.hdr(3)]\"\ %U\ \"%[capture.req.hdr(1)]\"\ \"%[capture.req.hdr(2)]\"\ \"%[capture.req.hdr(4)]\"\ %Tq\ \"%s\"\ 'NGINX-CACHE-- "-"'\ \"%ts\»

The log format is almost the same as Nginx's, but in some cases it works incorrectly. For example, this log output:

Nov 20 10:41:56 lb.loc haproxy[12633]: example.com 81.4.227.173 - [20/Nov/2019:10:41:56.095] "GET /piwik.php H" 200 "-" 2396 "https://example.com/" "Mozilla/5.0" "some.cookie data" 19 "vm06.lb.rsl.loc" NGINX-CACHE-- "-" "—"

The problem is that "GET /piwik.php H" should be "GET /piwik.php HTTP/1.1"; that's the %HV parameter in the log-format. Part of "HTTP/1.1" is randomly cut off: it may come out as "HT", "HTT" or "HTTP/1.".
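As a side note, the log-format above ends in `\"%ts\»`, where the `»` looks like a mail-client smart quote standing in for `\"`. Independently of the %HV truncation, the haproxy config parser also accepts the whole format as one quoted string, which avoids escaping every space and makes such mangling easier to spot. A sketch of the same format rewritten that way (not a fix for the truncation itself):

```haproxy
# Same capture slots as the original report.
capture request header Host         len 200
capture request header Referer      len 200
capture request header User-Agent   len 200
capture request header Content-Type len 200
capture request header Cookie       len 300

# One quoted string: spaces no longer need backslashes, only the literal
# double quotes do. Capture indexes match the declaration order above.
log-format "%[capture.req.hdr(0),lower] %ci - [%t] \"%HM %HP %HV\" %ST \"%[capture.req.hdr(3)]\" %U \"%[capture.req.hdr(1)]\" \"%[capture.req.hdr(2)]\" \"%[capture.req.hdr(4)]\" %Tq \"%s\" NGINX-CACHE-- \"-\" \"%ts\""
```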