Re: resolvers hold valid clarification
I dug deeper into the code and found out that server health checks also trigger DNS resolution. Before the change at https://github.com/haproxy/haproxy/commit/f50e1ac4442be41ed8b9b7372310d1d068b85b33, hold.valid was used to trigger the resolution. After the change it no longer triggers anything, so resolution is driven only by timeout resolve now. In other words, hold.valid does not have much meaning anymore.

When a server is in a layer 4 connection error state, tasks are inserted into the scheduler queue which, when executed, no longer do any work after the change. This is suboptimal, but I guess there are not many installations with servers in this state for a long time.

Can anyone confirm my understanding of this setting ?

On Wed, 22 Sep 2021 at 19:52, Michał Pasierb wrote:
> I would like to clarify how *hold valid* is used by resolvers. I have
> this configuration:
>
> resolvers mydns
>     nameserver dns1 192.168.122.202:53
>     accepted_payload_size 8192
>
>     timeout resolve 5s
>     timeout retry 2s
>     resolve_retries 3
>
>     hold other 30s
>     hold refused 120s
>     hold nx 30s
>     hold timeout 10s
>     hold valid 1h
>     hold obsolete 0s
>
> The *valid* setting is a bit confusing. I cannot find a good explanation
> of it in the documentation. From what I see in the code of version 2.0.25,
> it is used only when doing DNS queries in the tcp/http path:
>
> http-request do-resolve(txn.myip,mydns,ipv4) str(services.example.com)
>
> Only when requests arrive in parallel do I see fewer queries to the DNS
> servers than HTTP requests. When requests are done in sequence, I see the
> same count of DNS requests as HTTP requests. For example, when I send 3000
> requests to HAProxy with 3 clients in parallel, there are about 2600
> requests to the DNS servers.
>
> So it doesn't look like a proper cache to me. The whole HAProxy becomes
> unusable 10 seconds (hold timeout) after the DNS servers stop responding,
> because every server which is using a DNS SRV record is put into
> maintenance state due to a resolution error.
>
> Is this a proper assessment of the current state and is this what was
> intended ?
>
> Regards,
> Michal
resolvers hold valid clarification
I would like to clarify how *hold valid* is used by resolvers. I have this configuration:

resolvers mydns
    nameserver dns1 192.168.122.202:53
    accepted_payload_size 8192

    timeout resolve 5s
    timeout retry 2s
    resolve_retries 3

    hold other 30s
    hold refused 120s
    hold nx 30s
    hold timeout 10s
    hold valid 1h
    hold obsolete 0s

The *valid* setting is a bit confusing. I cannot find a good explanation of it in the documentation. From what I see in the code of version 2.0.25, it is used only when doing DNS queries in the tcp/http path:

http-request do-resolve(txn.myip,mydns,ipv4) str(services.example.com)

Only when requests arrive in parallel do I see fewer queries to the DNS servers than HTTP requests. When requests are done in sequence, I see the same count of DNS requests as HTTP requests. For example, when I send 3000 requests to HAProxy with 3 clients in parallel, there are about 2600 requests to the DNS servers.

So it doesn't look like a proper cache to me. The whole HAProxy becomes unusable 10 seconds (hold timeout) after the DNS servers stop responding, because every server which is using a DNS SRV record is put into maintenance state due to a resolution error.

Is this a proper assessment of the current state and is this what was intended ?

Regards,
Michal
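For context on how do-resolve fits into the tcp/http path: it is typically paired with set-dst so the resolved address is actually used for routing. A minimal hedged sketch (the frontend/backend names, bind address and placeholder server address are illustrative, not from the original post):

```
frontend fe_dynamic
    bind :8080
    # resolve the name on each request and store the result in txn.myip
    http-request do-resolve(txn.myip,mydns,ipv4) str(services.example.com)
    # route the connection to whatever address was resolved
    http-request set-dst var(txn.myip)
    default_backend be_dynamic

backend be_dynamic
    # 0.0.0.0 is a placeholder; the real destination comes from set-dst
    server dynamic 0.0.0.0:80
```

With this pattern every request can hit the resolver, which is why hold valid matters for how often queries actually reach the DNS servers.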
no log on http-keep-alive race
Hello,

I have a situation where HAProxy receives a request but doesn't log it. I would like to know if this is desired behaviour or an actual bug. To replicate this issue:

1) create a HAProxy config which doesn't disable http-keep-alive
2) on the backend server, set the http-keep-alive timeout to a low value of 2 seconds
3) run a script which sends requests in a loop and waits, say, 1.996 seconds between each iteration

From time to time, there will be no response from HAProxy. This is understandable. Connecting directly to the backend server, without using HAProxy, gives the same result. This is all desired. However, there is no log indicating that HAProxy was processing a request and that it failed on the server side.

This issue has already been discussed at https://discourse.haproxy.org/t/haproxy-1-7-11-intermittently-closes-connection-sends-empty-response-on-post-requests/3052/5

I understand this configuration is clearly wrong because the timeout on the server should be higher than on HAProxy. In this scenario, I'm responsible for the HAProxy configuration and somebody else is responsible for the configuration of the servers in the backends. I cannot know if every server is correctly configured, therefore I would like to enable logging of such requests. I have tried versions 1.7, 1.8 and 2.0 with different options like logasap. All behave the same.

Regards,
Michael
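The race in steps 1-3 can be simulated without HAProxy at all, using plain sockets: a keep-alive server with a very short idle timeout, and a client that sends its second request just after that timeout fires. This is a self-contained sketch (timings shrunk to fractions of a second for illustration), not the actual setup from the post:

```python
import socket
import threading
import time

# Stand-in for a backend whose keep-alive idle timeout is very low
# (0.3 s here, in place of the 2 s described above). After the idle
# timeout it closes the connection - exactly the race being reported.
def backend(server_sock):
    conn, _ = server_sock.accept()
    conn.settimeout(0.3)
    try:
        while conn.recv(4096):
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n"
                         b"Connection: keep-alive\r\n\r\nok")
    except socket.timeout:
        pass  # idle timeout expired -> server closes the keep-alive connection
    conn.close()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
threading.Thread(target=backend, args=(srv,), daemon=True).start()

cli = socket.socket()
cli.connect(srv.getsockname())
cli.sendall(b"GET / HTTP/1.1\r\nHost: x\r\n\r\n")
first = cli.recv(4096)        # first request: served normally
time.sleep(0.5)               # wait just past the server's idle timeout
cli.sendall(b"GET / HTTP/1.1\r\nHost: x\r\n\r\n")
try:
    second = cli.recv(4096)   # server already closed: EOF / reset
except ConnectionResetError:
    second = b""
print(first.split(b"\r\n")[0], second)
```

The second request goes out on a connection the server has already closed, so the client gets nothing back - the same empty response the original script observed through HAProxy.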
Re: confused by HAProxy log line
Hello,

I did not mention it but all servers in c_backend have a httpchk configured. There is nothing in the HAProxy logs indicating the servers or the backend were not available during the time of the request. Indeed, the stats page shows only 4 health checks did not pass on each server since HAProxy started, and I know they were triggered by application deployment to the servers. This is rock solid. The servers in the backend handle thousands of requests during a day and log lines like these occur about 50 times a day.

I tried to replicate the termination state in my lab but I failed. I got only CC--, CR-- and CD--, and the http status was always set correctly (not to -1). The original log line has CH-- which indicates HAProxy was waiting for response headers from a server, so (this implies?) it connected somewhere. So why is the server marked as <NOSRV> ?

Regards,
Michal

On Thu, Oct 11, 2018 at 1:00 PM Aleksandar Lazic wrote:
> On 11.10.2018 at 10:33, Michał Pasierb wrote:
> > Hello,
> >
> > I have a problem understanding a particular log line from HAProxy 1.7.11
> > in a production system. My clients report problems from time to time.
> > They make another request and all is OK. This is the log format used:
> >
> > log-format %tr\ %ci:%cp\ %ft\ %b/%s\ %TR/%Tw/%Tc/%Tr/%Ta\ %ST\ %B\ %CC\
> > %CS\ %tsc\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %hr\ %hs\ %{+Q}r\ %ID\ %U\
> > %[res.comp]
> >
> > and the problematic log line:
> >
> > 02/Oct/2018:09:55:14.997 :46007 http-in c_backend/<NOSRV>
> > 44/-1/-1/-1/44 -1 25 - - CHNN 143/36/2/0/3 0/0 {} {} "PUT /api/xyz/status
> > HTTP/1.1" 764 0
> >
> > I can see that c_backend was properly selected based on acls. I can see
> > that the client sent 764 bytes to HAProxy. The termination code is CH -
> > client aborted connection while HAProxy was waiting for headers from the
> > server. What server ? There is <NOSRV> and a connection time of -1.
> > There were 3 connection retries. HAProxy got 25 bytes from a server.
> >
> > The config contains:
> >
> > defaults
> >     retries 3
> >     option redispatch
> >
> > Yet the retries counter is not +3 so it seems the redispatch did not
> > take place.
> >
> > This is all confusing evidence. Can you explain what really happened ?
>
> None of the c_backend servers are reachable.
> I assume that the -1 value is the status_code.
>
> Cite from
> https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#8.2.3
>
> ###
>   - "status_code" is the HTTP status code returned to the client. This
>     status is generally set by the server, but it might also be set by
>     haproxy when the server cannot be reached or when its response is
>     blocked by haproxy.
>
> ...
>
>   - "server_name" is the name of the last server to which the connection
>     was sent, which might differ from the first one if there were
>     connection errors and a redispatch occurred. Note that this server
>     belongs to the backend which processed the request. If the request was
>     aborted before reaching a server, "<NOSRV>" is indicated instead of a
>     server name. If the request was intercepted by the stats subsystem,
>     "<STATS>" is indicated instead.
> ###
>
> I suggest to use some checks for the c_backend, for example
> https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#4-option%20httpchk
>
> > Thanks,
> > Michal
>
> Best regards
> Aleks
confused by HAProxy log line
Hello,

I have a problem understanding a particular log line from HAProxy 1.7.11 in a production system. My clients report problems from time to time. They make another request and all is OK. This is the log format used:

log-format %tr\ %ci:%cp\ %ft\ %b/%s\ %TR/%Tw/%Tc/%Tr/%Ta\ %ST\ %B\ %CC\ %CS\ %tsc\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %hr\ %hs\ %{+Q}r\ %ID\ %U\ %[res.comp]

and the problematic log line:

02/Oct/2018:09:55:14.997 :46007 http-in c_backend/<NOSRV> 44/-1/-1/-1/44 -1 25 - - CHNN 143/36/2/0/3 0/0 {} {} "PUT /api/xyz/status HTTP/1.1" 764 0

I can see that c_backend was properly selected based on acls. I can see that the client sent 764 bytes to HAProxy. The termination code is CH - client aborted connection while HAProxy was waiting for headers from the server. What server ? There is <NOSRV> and a connection time of -1. There were 3 connection retries. HAProxy got 25 bytes from a server.

The config contains:

defaults
    retries 3
    option redispatch

Yet the retries counter is not +3 so it seems the redispatch did not take place.

This is all confusing evidence. Can you explain what really happened ?

Thanks,
Michal
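For readers following along, the fields of that line can be picked apart mechanically according to the log-format above. A rough sketch of such a splitter (the sample line uses a made-up client address, since it was stripped from the quoted line; real lines may need a proper parser rather than whitespace splitting):

```python
# Rough field-splitter for the custom log-format shown above
# (%tr %ci:%cp %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc ...).
# The client address 10.0.0.1 below is a hypothetical placeholder.
line = ('02/Oct/2018:09:55:14.997 10.0.0.1:46007 http-in c_backend/<NOSRV> '
        '44/-1/-1/-1/44 -1 25 - - CHNN 143/36/2/0/3 0/0 {} {} '
        '"PUT /api/xyz/status HTTP/1.1" 764 0')
fields = line.split()
backend, _, server = fields[3].partition("/")   # %b/%s
timers = dict(zip(("TR", "Tw", "Tc", "Tr", "Ta"), fields[4].split("/")))
status = int(fields[5])           # %ST: -1 means no status was received
term = fields[9]                  # %tsc: first char = cause, second = state
print(backend, server, timers, status, term[:2])
```

Here Tw/Tc/Tr are all -1, meaning the session ended before those phases completed, which is exactly the tension with the CH termination code discussed in the thread.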
gzip compression and logging response size
Hi,

I would like to log the size of the response before and after compression. Is it possible ? I see this question was asked before at http://thread.gmane.org/gmane.comp.web.haproxy/10008 without any resolution.

Regards,
Michal
regression in 1.6.13 - wrong http status in logs and stats
Hi,

I would like to report a regression in HAProxy 1.6.13 after upgrading from 1.6.9 in production :(

Reproduce with config:

---
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    maxconn 4096
    chroot /usr/share/haproxy
    uid 99
    gid 99
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    option redispatch
    stats enable
    stats refresh 5s
    timeout connect 5s
    timeout client 30s
    timeout server 30s
    balance roundrobin
    option forwardfor

listen Statistics
    bind 192.168.122.202:
    mode http
    stats refresh 5s
    stats uri /

frontend http-in
    bind 192.168.122.202:9000
    acl is_root_path path /
    redirect location /my_custom_page if is_root_path

backend nomatch
    redirect location http://www.google.com
---

Send a request with curl -v 192.168.122.202:9000/ - the response is OK and has status code 302, but the logs and statistics have 503 instead:

192.168.122.202:37880 [12/Jul/2017:15:54:49.573] http-in http-in/<NOSRV> 0/-1/-1/-1/2 503 305 - - LR-- 0/0/0/0/0 0/0 "GET / HTTP/1.1"

git bisect shows this commit to be the culprit:

commit b12d699543adb84fa543297d12b64fce7ec94803
Author: Christopher Faulet
Date:   Tue Mar 28 11:51:33 2017 +0200

    BUG/MINOR: http: Fix conditions to clean up a txn and to handle the next request

I also tested 1.7.8 and 1.8-dev2 - they are OK. So it seems it is a backport issue.

Regards,
Michal
HTTP keep-alive reuse count & max-age
Hi,

Is it possible to influence how HAProxy handles HTTP keep-alives to backend servers ? I want it to close the TCP connection after x many HTTP requests. Also, a max-age time for a single TCP connection could be useful. These features are available in F5 LTM OneConnect (https://support.f5.com/kb/en-us/solutions/public/7000/200/sol7208.html) as *Maximum Reuse* and *Maximum Age*.

The goal is to move all TIME_WAITs to the HAProxy host instead of creating them on the backend servers.

Thanks,
Michal
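For anyone finding this thread later: HAProxy versions from 1.9 onward added a max-reuse server keyword that caps how many times an idle server-side connection may be reused, which covers the "Maximum Reuse" half of the question. A hedged sketch (names and addresses are made up, and this keyword did not exist at the time of this post; check your version's manual before relying on it):

```
backend b1
    # opt in to server-side connection reuse
    http-reuse safe
    # close the server-side connection after roughly 100 reuses
    server s1 192.168.0.10:8080 max-reuse 100
```

A max-age equivalent for a single server connection still has no direct counterpart, as far as this sketch assumes.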
cookie prefix & option httpclose
Hi,

I have a question about cookie prefix mode. I use a configuration like this:

backend b1
    balance roundrobin
    cookie JSESSIONID prefix
    server s1 192.168.122.1:8080 cookie c1

Now the docs at https://cbonte.github.io/haproxy-dconv/configuration-1.6.html#4-cookie say for prefix:

    Since all requests and responses are subject to being modified, this
    mode requires the HTTP close mode.

Behaviour in version 1.4 is consistent with the docs. The cookie was not prefixed for the second request within one http keep-alive session, so I needed to add "option httpclose" to modify all requests. Now I did the same setup with version 1.5 and found out that even without "option httpclose" both requests have the cookie prefixed.

My dilemma - are the docs inconsistent and this section should be removed, or am I missing something ?

Thanks,
Michal
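For readers unfamiliar with prefix mode, the transformation itself is simple: on responses HAProxy prepends the server's cookie value and a '~' delimiter to the application cookie, and on requests it strips that prefix again so the backend only ever sees the original value. A toy illustration (the function names are made up for this sketch):

```python
# Toy model of HAProxy's cookie "prefix" mode; the '~' delimiter matches
# what HAProxy inserts, but these helpers are illustrative only.
def prefix_cookie(app_cookie_value, server_cookie):
    # Response path: prepend "<server cookie>~" to the application value,
    # so the client stores both pieces in one cookie.
    return f"{server_cookie}~{app_cookie_value}"

def strip_prefix(client_cookie_value):
    # Request path: peel off the routing prefix before the request is
    # forwarded, so the backend sees only the original value.
    server_cookie, _, app_value = client_cookie_value.partition("~")
    return server_cookie, app_value

stored = prefix_cookie("ABC123", "c1")   # what the client ends up storing
sid, app_value = strip_prefix(stored)    # what s1 sees on the next request
print(stored, sid, app_value)
```

Since both directions of every exchange must be rewritten, the docs' requirement of HTTP close mode is about making sure HAProxy actually inspects every request on a keep-alive connection, which is the behaviour difference observed between 1.4 and 1.5.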