Re: V1.9 SSL engine and ssl-mode-async is unstable
Hi.

On 25-01-2019 08:55, Kevin Zhu wrote:
> Hi HAProxy Team,
>
> I am trying to use Intel QAT with HAProxy 1.9.0, but it is very unstable.
> I also tried HAProxy 1.8.16 and it works well. How can I find out what is
> wrong? 1.8.16 and 1.9.0 are compiled and run on the same hardware and
> system, and use the same config file. The config file is attached.

Can you please explain "very unstable" a little bit more?
Can you try 1.9.2/3? Do you have any errors or warnings in the logs? Maybe you can use loglevel debug?

> Thanks for any help.
>
> Best regards

Regards
Aleks
Re: h1-client to h2-server host header / authority conversion failure.?
Hi List.

On 25-01-2019 01:01, PiBa-NL wrote:
> Hi List,
>
> Attached a regtest which i 'think' should pass.
>
> ** s1 0.0 === expect tbl.dec[1].key == ":authority"
> s1 0.0 EXPECT tbl.dec[1].key (host) == ":authority" failed
>
> It seems to me the Host <> Authority conversion isn't happening properly.?
> But maybe i'm just making a mistake in the test case...
>
> I was using HA-Proxy version 2.0-dev0-f7a259d 2019/01/24 with this test.
>
> The test was inspired by the attempt to connect to mail.google.com , as
> discussed in the "haproxy 1.9.2 with boringssl" mail thread.. Not sure if
> this is the main problem, but it seems suspicious to me..

That's one of the reasons why I love this community ;-)

As I'm just one member of this community, I want to say thanks to everyone on the list for being part of HAProxy ;-).

> Regards,
> PiBa-NL (Pieter)

Regards
Aleks
1.9.3 delayed
Hi guys,

I know I said I'd issue 1.9.3 this week, but after having worked on addressing ugly H2 issues, I've now spent the whole day on a severe crash bug affecting server-side idle connections. At the end of the day, with numerous traces and code changes, I'm still back at the same point, making no progress and starting to think I'm the stupid one not capable of reading the code.

Given that this bug manifests itself as memory corruption followed by crashes, I would really like to at least understand it before issuing 1.9.3, because at this point I trust neither the code I was about to release nor any bug report that could come from it. I've checked the queue and there is no emergency requiring a release now, so it's better to get the dirty things fixed and have a cleaner release than to force everyone to upgrade every week.

Thanks,
Willy
Re: h1-client to h2-server host header / authority conversion failure.?
Hi Pieter,

On Fri, Jan 25, 2019 at 01:01:19AM +0100, PiBa-NL wrote:
> Hi List,
>
> Attached a regtest which i 'think' should pass.
>
> ** s1 0.0 === expect tbl.dec[1].key == ":authority"
> s1 0.0 EXPECT tbl.dec[1].key (host) == ":authority" failed
>
> It seems to me the Host <> Authority conversion isn't happening properly.?
> But maybe i'm just making a mistake in the test case...
>
> I was using HA-Proxy version 2.0-dev0-f7a259d 2019/01/24 with this test.
>
> The test was inspired by the attempt to connect to mail.google.com , as
> discussed in the "haproxy 1.9.2 with boringssl" mail thread.. Not sure if
> this is the main problem, but it seems suspicious to me..

It's not that simple. The :authority pseudo-header is only required for CONNECT and is optional for other methods, with Host as a fallback. Clients are encouraged to use it instead of the Host header field, according to paragraph 8.1.2.3, but there is nothing indicating that a gateway may or should build one from scratch when translating HTTP/1.1 to HTTP/2. In fact the authority part is generally not present in the URIs we receive as a gateway, so what we'd put there would be completely reconstructed from the Host header field. I don't even know if all servers are fine with authority only instead of Host.

Please note, I'm not against changing this, I just want to be sure we actually fix something and that we don't break anything. Thus if you have any info indicating there is an issue with this one missing, it could definitely help.

Thanks!
Willy
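To make the fallback rule above concrete, here is a minimal sketch of how a receiver might pick the target host from a decoded HTTP/2 header list: ":authority" when the client sent it, otherwise "host". The `struct hdr` type and the `effective_authority()` function are hypothetical names used only for illustration; this is not HAProxy's internal representation or code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical decoded header entry (illustration only, not HAProxy's). */
struct hdr {
    const char *name;   /* lowercase, as HTTP/2 requires */
    const char *value;
};

/* Return the value identifying the target host: ":authority" when
 * present, otherwise the first "host" header as a fallback.
 * Returns NULL when neither is present.
 */
const char *effective_authority(const struct hdr *hdrs, size_t count)
{
    const char *host = NULL;
    size_t i;

    assert(hdrs != NULL || count == 0);

    for (i = 0; i < count; i++) {
        if (strcmp(hdrs[i].name, ":authority") == 0)
            return hdrs[i].value;        /* the pseudo-header wins */
        if (host == NULL && strcmp(hdrs[i].name, "host") == 0)
            host = hdrs[i].value;        /* remember the fallback */
    }
    return host;
}
```

With such a rule on the server side, a request translated from HTTP/1.1 that carries only "host" still resolves to the right target, which is why a missing ":authority" is not necessarily a bug.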
Re: [PATCH] runtime do-resolve http action
On Fri, Jan 25, 2019 at 03:09:52PM +0100, Baptiste wrote:
> Hi Willy,
>
> Thanks for the review!!!
> I fixed most of the problems, but I have 3 points I'd like to discuss:
>
> > > + If an IP address can be found, it is stored into . If any kind of
> > > + error occurs, then is not set.
> >
> > Just to be sure, it is not set or not modified ? I guess the latter, which
> > is fine.
>
> Yes, not set. So '-m found' can be used.

So you actually *remove* the variable if you don't get a response, is that it? I would have possibly found it more convenient to just stay on the "not modified" approach so that you could chain multiple do-resolve actions and hope that at least one of them could pick the response. Think about environments where you have multiple sets of resolvers (internal, admin, internet) and for unknown names you don't know which one to ask, so you ask all of them with 3 different rules.

> > > + struct sample *smp;
> > > +
> > > + conn_get_from_addr(cli_conn);
> > > +
> > > + smp = sample_fetch_as_type(px, sess, s, SMP_OPT_DIR_REQ|SMP_OPT_FINAL, rule->arg.dns.expr, SMP_T_STR);
> > > + if (smp) {
> > > +     char *fqdn;
> > > +
> > > +     fqdn = smp->data.u.str.area;
> > > +     if (action_prepare_for_resolution(s, fqdn) == -1) {
> > > +         ha_alert("Can't create DNS resolution for server 'http request action'\n");
> >
> > Please don't send runtime alerts. We've tried hard to clean them up so
> > that they remain only during startup.
>
> In this function, I have a proxy structure. Should I use send_log() on it
> to report the error?

You could, but then it'd be better to perform some form of rate-limiting. It is possible that the same reason causes the function to fail in loops for all requests, and it's not very cool to spam logs with info that is already present in the request's failure anyway. In general an alert log is made so that someone can do something about it.
What could be done however is to emit this error once if it's a matter of config, and to increment a counter reported in "show info". We already do this in some places, I just don't remember which ones :-)

> > > + case ACT_HTTP_DO_RESOLVE:
> > > case ACT_CUSTOM:
> > > if ((s->req.flags & CF_READ_ERROR) ||
> > > ((s->req.flags & (CF_SHUTR|CF_READ_NULL)) &&
> >
> > Suddenly that makes me wonder : why is it needed to have a dedicated
> > action since it uses the generic infrastructure with ACT_CUSTOM ?
>
> I think this must have been one of the first things I did during my
> development phase so I would be able to "isolate" my code when needed.
> Now that you said it, and I step back a bit, I also consider there is no
> value in this action, apart from being clear on the action name and giving
> us the ability to be very cautious if we update the behavior of ACT_CUSTOM
> in the future.
> I can remove ACT_HTTP_DO_RESOLVE and add a comment in ACT_CUSTOM saying
> that the do-resolve action relies on this code, just in case.

Normally the vast majority of actions are already in ACT_CUSTOM nowadays. The other ones are just historical exceptions. Please have a look at http_req_actions to see how to declare yours. In short you'll have to add something like this to dns.c (please excuse the copy-paste which will not work, but you'll get the idea) :

static struct action_kw_list http_req_dns_actions = {
	.kw = {
		{ "do-resolve", parse_http_req_do_resolve },
		{ NULL, NULL }
	}
};

INITCALL1(STG_REGISTER, http_req_keywords_register, &http_req_dns_actions);

And you're done, more or less a few includes of course :-)

Cheers,
Willy
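The "emit once, then only count" idea can be sketched as follows. This is a standalone illustration under stated assumptions, not HAProxy code: `report_resolve_failure()`, `resolve_failure_count()` and both static variables are hypothetical names, the message goes to a plain `FILE *` rather than through send_log(), and the sketch is not thread-safe.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counter; the suggestion is that HAProxy would expose
 * something like it through "show info". */
static unsigned long dns_resolve_failures;
/* Set once the message has been written, so later failures stay quiet. */
static int dns_resolve_warned;

/* Record one failed resolution attempt. The message is written only on
 * the first failure; every failure increments the counter.
 * Returns 1 when a message was emitted, 0 otherwise.
 */
int report_resolve_failure(FILE *log, const char *fqdn)
{
    assert(log != NULL && fqdn != NULL);

    dns_resolve_failures++;
    if (dns_resolve_warned)
        return 0;

    dns_resolve_warned = 1;
    fprintf(log, "do-resolve: can't create DNS resolution for '%s'\n", fqdn);
    return 1;
}

/* Number of failures seen so far, including the silent ones. */
unsigned long resolve_failure_count(void)
{
    return dns_resolve_failures;
}
```

The point of the design is that a misconfiguration producing a failure on every request yields a single log line, while the counter still shows how often it happens.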
Re: [PATCH] runtime do-resolve http action
Hi Willy,

Thanks for the review!!!
I fixed most of the problems, but I have 3 points I'd like to discuss:

> > + If an IP address can be found, it is stored into . If any kind of
> > + error occurs, then is not set.
>
> Just to be sure, it is not set or not modified ? I guess the latter, which
> is fine.

Yes, not set. So '-m found' can be used.

> > + struct sample *smp;
> > +
> > + conn_get_from_addr(cli_conn);
> > +
> > + smp = sample_fetch_as_type(px, sess, s, SMP_OPT_DIR_REQ|SMP_OPT_FINAL, rule->arg.dns.expr, SMP_T_STR);
> > + if (smp) {
> > +     char *fqdn;
> > +
> > +     fqdn = smp->data.u.str.area;
> > +     if (action_prepare_for_resolution(s, fqdn) == -1) {
> > +         ha_alert("Can't create DNS resolution for server 'http request action'\n");
>
> Please don't send runtime alerts. We've tried hard to clean them up so
> that they remain only during startup.

In this function, I have a proxy structure. Should I use send_log() on it to report the error?

> > + case ACT_HTTP_DO_RESOLVE:
> > case ACT_CUSTOM:
> > if ((s->req.flags & CF_READ_ERROR) ||
> > ((s->req.flags & (CF_SHUTR|CF_READ_NULL)) &&
>
> Suddenly that makes me wonder : why is it needed to have a dedicated
> action since it uses the generic infrastructure with ACT_CUSTOM ?

I think this must have been one of the first things I did during my development phase so I would be able to "isolate" my code when needed.
Now that you said it, and I step back a bit, I also consider there is no value in this action, apart from being clear on the action name and giving us the ability to be very cautious if we update the behavior of ACT_CUSTOM in the future.
I can remove ACT_HTTP_DO_RESOLVE and add a comment in ACT_CUSTOM saying that the do-resolve action relies on this code, just in case.

Baptiste
Re: H2 Server Connection Resets (1.9.2)
Hi Luke,

On Fri, Jan 25, 2019 at 08:08:22AM +, Luke Seelenbinder wrote:
> Hi Willy,
>
> > OK so instead of sending you a boring series, I can propose you to run
> > a test on 2.0-dev, which contains all the fixes I had to go through
> > because of tiny issues everywhere related to this. If you're using git,
> > just clone the master and checkout commit f7a259d46f8. Otherwise
> > you can simply wait for the next nightly snapshot.
>
> Sounds good. My compilation playbook uses tarballs, so I'll just use the last
> nightly. I assume I should wait for these fixes to be backported (1.9.3?)
> before trying anything in production?

As you like. My first rule is never to make people take risks they're not willing to take. It's perfectly OK with me if you don't feel confident with 2.0-dev in prod. I'm going to perform the 1.9 backports. If you're interested in testing them from the branch before I release it today, just let me know.

> > But now you have a new server parameter called
> > "max-reuse". This allows to limit the number of times a server connection
> > is reused. For example you can set it to 990 when you know that the
> > server limits to 1000.
>
> That's great! I didn't expect to get a new configuration option. I'll
> definitely make sure these are in sync across our infrastructure.

Even without the option it will work better than before, but the option is there to completely avoid any risk of hitting the limit too late.

> > Regarding the fact that in your case the client's close seems to cause
> > the server-side issue, I couldn't yet reproduce it though I have a few
> > theories about it. One of them would be an unexpected response from
> > the server causing the connection to turn to an error state. The other
> > one would be that we'd incorrectly abort our stream and/or session and
> > bring the connection down with us. I'll submit these theories to Olivier
> > once he's back so that he can tell me I'm saying crap regarding some of
> > them and we can focus on what remains :-)
>
> Sounds good. I'll report back my results from the latest snapshot and we can
> go from there. Perhaps the client causing the issues was a red herring for
> the server-side bugs.

I hadn't thought about it but it could also be, indeed.

> Thanks again for deep-diving and resolving this! I won't ask how many hours
> it took to find all these small edge cases. . .

Usually you start from a bug report, you find a hook in the code which starts to explain it, and you walk along the thread discovering that a lot of places are wrong together and, once perfectly aligned, cause crazy things to happen. Of course there's the solution of putting some brown paper bag on top of the most visible one, but in this project we prefer to address the causes rather than the consequences ;-) So yes, it sometimes takes time and caffeine, and it often delays releases because it's always hard to accept releasing something with known unfixed issues in it.

Cheers,
Willy
Re: H2 Server Connection Resets (1.9.2)
Hi Willy,

> OK so instead of sending you a boring series, I can propose you to run
> a test on 2.0-dev, which contains all the fixes I had to go through
> because of tiny issues everywhere related to this. If you're using git,
> just clone the master and checkout commit f7a259d46f8. Otherwise
> you can simply wait for the next nightly snapshot.

Sounds good. My compilation playbook uses tarballs, so I'll just use the last nightly. I assume I should wait for these fixes to be backported (1.9.3?) before trying anything in production?

> But now you have a new server parameter called
> "max-reuse". This allows to limit the number of times a server connection
> is reused. For example you can set it to 990 when you know that the
> server limits to 1000.

That's great! I didn't expect to get a new configuration option. I'll definitely make sure these are in sync across our infrastructure.

> Regarding the fact that in your case the client's close seems to cause
> the server-side issue, I couldn't yet reproduce it though I have a few
> theories about it. One of them would be an unexpected response from
> the server causing the connection to turn to an error state. The other
> one would be that we'd incorrectly abort our stream and/or session and
> bring the connection down with us. I'll submit these theories to Olivier
> once he's back so that he can tell me I'm saying crap regarding some of
> them and we can focus on what remains :-)

Sounds good. I'll report back my results from the latest snapshot and we can go from there. Perhaps the client causing the issues was a red herring for the server-side bugs.

Thanks again for deep-diving and resolving this! I won't ask how many hours it took to find all these small edge cases...

Best,
Luke

--
Luke Seelenbinder
Stadia Maps | Founder
stadiamaps.com

‐‐‐ Original Message ‐‐‐
On Thursday, January 24, 2019 7:55 PM, Willy Tarreau wrote:

> Hi Luke,
>
> On Wed, Jan 23, 2019 at 05:16:04PM +, Luke Seelenbinder wrote:
>
> > Hi Willy,
> > This is all very good to hear. I'm glad you were able to get to the
> > bottom of it all!
> > Feel free to send along patches if you want me to test before the 1.9.3
> > release. I'm more than happy to do so.
>
> OK so instead of sending you a boring series, I can propose you to run
> a test on 2.0-dev, which contains all the fixes I had to go through
> because of tiny issues everywhere related to this. If you're using git,
> just clone the master and checkout commit f7a259d46f8. Otherwise
> you can simply wait for the next nightly snapshot.
>
> Just let me know if that's OK for you.
>
> I found a number of issues that were causing server aborts, mainly
> due to the late GOAWAY frame. Once we hit this one, the connection
> is quickly closed by the server, causing our output packets to be
> rejected and the connection to be in error. I have not yet investigated
> in details to see if the close happens after we got the last data or in
> the middle though. But now you have a new server parameter called
> "max-reuse". This allows to limit the number of times a server connection
> is reused. For example you can set it to 990 when you know that the
> server limits to 1000.
>
> On the tests I've run here, I managed to address all the problems
> related to excessive use of idle connections resulting in too many
> streams being sent. In addition most of the rare cases that still
> happen when you don't have max-reuse are properly handled as a retry.
>
> Regarding the fact that in your case the client's close seems to cause
> the server-side issue, I couldn't yet reproduce it though I have a few
> theories about it. One of them would be an unexpected response from
> the server causing the connection to turn to an error state. The other
> one would be that we'd incorrectly abort our stream and/or session and
> bring the connection down with us. I'll submit these theories to Olivier
> once he's back so that he can tell me I'm saying crap regarding some of
> them and we can focus on what remains :-)
>
> Regards,
> Willy