Re: Building HAProxy 1.8 fails on Solaris

2018-07-20 Thread Olivier Houchard
Hi,

On Sat, Jul 21, 2018 at 12:51:53AM +0200, Lukas Tribus wrote:
> Hello,
> 
> On Fri, 20 Jul 2018 at 15:58, Olivier Houchard  wrote:
> >
> > Hi LuKas,
> >
> > On Fri, Jul 20, 2018 at 01:53:35PM +0200, Lukas Tribus wrote:
> > > Hello Oliver,
> > >
> > > On Fri, 20 Jul 2018 at 11:55, Olivier Houchard 
> > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Fri, Jul 20, 2018 at 12:22:20AM +, Thrawn wrote:
> > > > >  So...is there a way to adapt this patch so it won't cause random SSL
> > > errors and is suitable to apply to the trunk? We don't really want to run 
> > > a
> > > customised build in production...
> > > >
> > > > You don't need the patch, just using USE_PTHREAD_PSHARED=yes should be
> > > enough.
> > >
> > > I don't really understand, are you saying that the spinlock introduced
> > > in cd1a526a doesn't work properly (as in: causes random SSL errors)? How
> > > does this work on FreeBSD and OpenBSD? This sounds like a supported
> > > configuration on a supported OS causes random SSL errrors when in
> > > multiprocess mode, but I imagine I got something wrong here.
> > >
> > > Please help me understand this issue.
> > >
> >
> > No, it works fine, but if you compile without USE_THREADS, the HA_ATOMIC*
> > macroes I used in my patch are in fact not atomic at all, so that may cause
> > random and unpredictable failures if the SSL cache code use them.
> 
> Ok, I guess my question is, how can we improve this so that SSL cache
> doesn't unpredictably fail?
> Threading is a huge feature, but behaving unpredictably when threading
> is disabled (or unsupported on the specific OS) is bad.
> 
> Can't the code handle the SSL cache like it did before threading was
> introduced, when threading is disabled?
> 
> 

My patch won't be applied, so it'll behave the exact same way it used to.
Reading the 1.6 code, the calls to __sync_lock* were already there, so
it probably did not compile on Solaris with a gcc older than 4.1, unless
USE_PTHREAD_PSHARED was defined.

Regards,

Olivier



Re: Building HAProxy 1.8 fails on Solaris

2018-07-20 Thread Lukas Tribus
Hello,

On Fri, 20 Jul 2018 at 15:58, Olivier Houchard  wrote:
>
> Hi LuKas,
>
> On Fri, Jul 20, 2018 at 01:53:35PM +0200, Lukas Tribus wrote:
> > Hello Oliver,
> >
> > On Fri, 20 Jul 2018 at 11:55, Olivier Houchard 
> > wrote:
> > >
> > > Hi,
> > >
> > > On Fri, Jul 20, 2018 at 12:22:20AM +, Thrawn wrote:
> > > >  So...is there a way to adapt this patch so it won't cause random SSL
> > errors and is suitable to apply to the trunk? We don't really want to run a
> > customised build in production...
> > >
> > > You don't need the patch, just using USE_PTHREAD_PSHARED=yes should be
> > enough.
> >
> > I don't really understand, are you saying that the spinlock introduced
> > in cd1a526a doesn't work properly (as in: causes random SSL errors)? How
> > does this work on FreeBSD and OpenBSD? This sounds like a supported
> > configuration on a supported OS causes random SSL errrors when in
> > multiprocess mode, but I imagine I got something wrong here.
> >
> > Please help me understand this issue.
> >
>
> No, it works fine, but if you compile without USE_THREADS, the HA_ATOMIC*
> macroes I used in my patch are in fact not atomic at all, so that may cause
> random and unpredictable failures if the SSL cache code use them.

Ok, I guess my question is, how can we improve this so that SSL cache
doesn't unpredictably fail?
Threading is a huge feature, but behaving unpredictably when threading
is disabled (or unsupported on the specific OS) is bad.

Can't the code handle the SSL cache like it did before threading was
introduced, when threading is disabled?


Thanks,
Lukas



Re: haproxy 1.8.12 / 1.9- 20180623 / stopping process hangs with threads (100% cpu) on -sf reload / FreeBSD

2018-07-20 Thread Willy Tarreau
On Fri, Jul 20, 2018 at 09:56:23PM +0200, PiBa-NL wrote:
> Thanks Christopher & Willy,
> 
> Op 20-7-2018 om 14:26 schreef Willy Tarreau:
> > > Op 20-7-2018 om 10:43 schreef Christopher Faulet:
> > OK finally I've merged it because it obviously fixes a bug.
> > Willy
> 
> Confirmed fixed with current master's:
> HA-Proxy version 1.8.12-5e100b4 2018/07/20
> HA-Proxy version 1.9-dev0-842ed9b 2018/07/20
> 
> (Well at least my reproduction doesn't work anymore.. while it did quite
> easily before.) So thats good ;)

Excellent, thanks for confirming Pieter.
Have a nice week-end!

Willy



Re: haproxy 1.8.12 / 1.9- 20180623 / stopping process hangs with threads (100% cpu) on -sf reload / FreeBSD

2018-07-20 Thread PiBa-NL

Thanks Christopher & Willy,

Op 20-7-2018 om 14:26 schreef Willy Tarreau:

Op 20-7-2018 om 10:43 schreef Christopher Faulet:

OK finally I've merged it because it obviously fixes a bug.
Willy


Confirmed fixed with current master's:
HA-Proxy version 1.8.12-5e100b4 2018/07/20
HA-Proxy version 1.9-dev0-842ed9b 2018/07/20

(Well at least my reproduction doesn't work anymore.. while it did quite 
easily before.) So thats good ;)


Regards,
PiBa-NL (Pieter)




Re: Building HAProxy 1.8 fails on Solaris

2018-07-20 Thread Olivier Houchard
Hi LuKas,

On Fri, Jul 20, 2018 at 01:53:35PM +0200, Lukas Tribus wrote:
> Hello Oliver,
> 
> On Fri, 20 Jul 2018 at 11:55, Olivier Houchard 
> wrote:
> >
> > Hi,
> >
> > On Fri, Jul 20, 2018 at 12:22:20AM +, Thrawn wrote:
> > >  So...is there a way to adapt this patch so it won't cause random SSL
> errors and is suitable to apply to the trunk? We don't really want to run a
> customised build in production...
> >
> > You don't need the patch, just using USE_PTHREAD_PSHARED=yes should be
> enough.
> 
> I don't really understand, are you saying that the spinlock introduced
> in cd1a526a doesn't work properly (as in: causes random SSL errors)? How
> does this work on FreeBSD and OpenBSD? This sounds like a supported
> configuration on a supported OS causes random SSL errrors when in
> multiprocess mode, but I imagine I got something wrong here.
> 
> Please help me understand this issue.
> 

No, it works fine, but if you compile without USE_THREADS, the HA_ATOMIC*
macroes I used in my patch are in fact not atomic at all, so that may cause
random and unpredictable failures if the SSL cache code use them.

Regards,

Olivier



Proxified TCP connections with no applicative test possible.

2018-07-20 Thread Thomas Martin
Hello,

I'm trying to setup haproxy for a kind a of weird situation.

Here is my architecture:
- Server S0 and S1 can connect to our client services (which we want
to be proxified)
- Server C0 is in a dedicated network and can't access our client FIX
servers directly. He needs to use S0 proxies (S0 and S1).
- Our client's services:
-- are not HTTP services,
-- doesn't allow more than one connection at a time: if another
simultaneous connection is done, it will be open (TCP speaking) but
none of it's applicative requests will be processed.


So my goal is to have my application connected to C0's haproxy, which
connect itself to S0 or S1 haproxy.


Here is my setup:
- On S0 and S1:
frontend f_FIX_SERVER
bind 10.10.10.{X,Y}:1
mode tcp
default_backend b_FIX_SERVER
backend b_FIX_SERVER
mode tcp
server fix_1 10.11.11.11:3129 check
server fix_2 10.11.11.11:3130 check backup

- On C0
frontend f_FIX_CLIENT
bind 127.0.0.1:2
mode tcp
default_backend b_FIX_CLIENT
backend b_FIX_CLIENT
mode tcp
#S0
server fix 10.10.10.X:1 check
#S1
server fix 10.10.10.Y:1 check backup


In case of S0 going DOWN, C0 will use S1 as expected (which is perfect).

But if my client's services goes unreachable on S0, while S0 is still
running, haproxy on C0 will NOT use S1 as haproxy on S0 is still
responding correctly (from C0 haproxy point of view).

I tried to play with "tcp-request connection reject" (and acl with
"nbsrv") on S0 and S1 but without any success.


Do you have any advice to help me?
Am I missing something obvious ?


Thanks for reading.

Regards,
Thomas



Re: Connections stuck in CLOSE_WAIT state with h2

2018-07-20 Thread Milan Petruželka
On Fri, 20 Jul 2018 at 13:37, Willy Tarreau  wrote:

> Thank you Janusz for testing. I've also merged the other patch that could
> lead to some CLOSE_WAIT that we tested a few months ago. It was different
> but maybe you were suffering from the two causes.
>

I've applied both patches to vanilla haproxy 1.8.12. I'll leave it running
and report back.

BUG/MEDIUM: h2: never leave pending data in the output buffer on close
BUG/MEDIUM: h2: make sure the last stream closes the connection after a
timeout

Milan


Re: [PATCH] MINOR: Generate sha256 checksums in publish-release

2018-07-20 Thread Willy Tarreau
On Fri, Jul 20, 2018 at 01:45:51PM +0200, Tim Düsterhus wrote:
> Perfect. Could you possibly retroactively add a .sha256 file for the
> newest version of each currently supported branch (i.e. 1.8.12, 1.7.11,
> 1.6.14 and 1.5.19)? This would allow me to update a few scripts, without
> having to special case branches with and without sha256.

OK, given that you wrote the patch and the effort of enumerating the
versions, I didn't have to think too much so I've just done it :-)

Cheers,
Willy



Re: haproxy 1.8.12 / 1.9- 20180623 / stopping process hangs with threads (100% cpu) on -sf reload / FreeBSD

2018-07-20 Thread Willy Tarreau
On Fri, Jul 20, 2018 at 10:43:13AM +0200, Christopher Faulet wrote:
> Damn! I forgot to check that. We talked about it 30 min ago though!

OK finally I've merged it because it obviously fixes a bug and William
wants to prepare another release which I think is a good idea. We can
revisit this later if it's not enough anyway.

Willy



Re: [PATCH] BUG/MINOR: http: Set brackets for the unlikely macro at the right place

2018-07-20 Thread Willy Tarreau
Hi Tim,

On Fri, Jul 20, 2018 at 01:40:28PM +0200, Tim Düsterhus wrote:
> Am 20.07.2018 um 10:42 schrieb Willy Tarreau:
> > This one is very interesting because depending on the compiler version
> > the unlikely macro differs and may never detect the failure by never
> > returning < 0! It's very unlikely to meet this case though.
> 
> In fact this already caused a crash:
> http://git.haproxy.org/?p=haproxy.git;a=commit;h=45be38c9c7ba2b20806f2b887876db4fb5b9457c
> 
> You might want to review all the `unlikely` / `likely` invocations, now
> that it happened twice.

You're right. I've just checked with coccinelle after a few trial and
errors (I'm terribly bad at it) and managed to spot only these two ones,
which is not a guarantee that it's OK but at least we don't seem to have
other occurrences using the same format.

Willy



Re: Building HAProxy 1.8 fails on Solaris

2018-07-20 Thread Lukas Tribus
Hello Oliver,

On Fri, 20 Jul 2018 at 11:55, Olivier Houchard 
wrote:
>
> Hi,
>
> On Fri, Jul 20, 2018 at 12:22:20AM +, Thrawn wrote:
> >  So...is there a way to adapt this patch so it won't cause random SSL
errors and is suitable to apply to the trunk? We don't really want to run a
customised build in production...
>
> You don't need the patch, just using USE_PTHREAD_PSHARED=yes should be
enough.

I don't really understand, are you saying that the spinlock introduced
in cd1a526a doesn't work properly (as in: causes random SSL errors)? How
does this work on FreeBSD and OpenBSD? This sounds like a supported
configuration on a supported OS causes random SSL errrors when in
multiprocess mode, but I imagine I got something wrong here.

Please help me understand this issue.


Thanks,
Lukas


Re: [PATCH] MINOR: Generate sha256 checksums in publish-release

2018-07-20 Thread Tim Düsterhus
Willy,

Am 20.07.2018 um 10:50 schrieb Willy Tarreau:
> Makes sense, now merge.
> 

Perfect. Could you possibly retroactively add a .sha256 file for the
newest version of each currently supported branch (i.e. 1.8.12, 1.7.11,
1.6.14 and 1.5.19)? This would allow me to update a few scripts, without
having to special case branches with and without sha256.

Best regards
Tim Düsterhus



Re: [PATCH] BUG/MINOR: http: Set brackets for the unlikely macro at the right place

2018-07-20 Thread Tim Düsterhus
Hi

Am 20.07.2018 um 10:42 schrieb Willy Tarreau:
> This one is very interesting because depending on the compiler version
> the unlikely macro differs and may never detect the failure by never
> returning < 0! It's very unlikely to meet this case though.

In fact this already caused a crash:
http://git.haproxy.org/?p=haproxy.git;a=commit;h=45be38c9c7ba2b20806f2b887876db4fb5b9457c

You might want to review all the `unlikely` / `likely` invocations, now
that it happened twice.

Best regards
Tim Düsterhus



Re: Connections stuck in CLOSE_WAIT state with h2

2018-07-20 Thread Willy Tarreau
On Fri, Jul 20, 2018 at 01:30:00PM +0200, Janusz Dziemidowicz wrote:
> I've been running 1.8.12 with this patch for an hour. It seems that it
> helped somewhat, but not entirely. After an hour I still see about 10
> CLOSE_WAIT sockets. The number seems to grow a lot slower, but still
> grows (and some of them have been sitting in CLOSE_WAIT for over 30
> minutes).

Thank you Janusz for testing. I've also merged the other patch that could
lead to some CLOSE_WAIT that we tested a few months ago. It was different
but maybe you were suffering from the two causes.

> Since I'm also affected by SPDY_PROTOCOL_ERROR I mentioned earlier I
> must disable h2 now.

OK. If you happen to figure that it's caused by haproxy, do not hesitate
to provide any information so that we can debug this one further.

Thanks!
Willy



Re: Connections stuck in CLOSE_WAIT state with h2

2018-07-20 Thread Janusz Dziemidowicz
czw., 19 lip 2018 o 11:14 Willy Tarreau  napisał(a):
>
> Hi Milan, Janusz,
>
> I suspect I managed to reliably trigger the issue you were facing and
> found a good explanation for it. It is caused by unprocessed bytes at
> the end of the H1 stream. I manage to reproduce it if I chain two layers
> of haproxy with the last one sending stats. It does not happen with a
> single layer, so the scheduling matters a lot (I think it's important
> that the final CRLF is present in the same response packet as the final
> chunk).
>
> Could you please try the attached patch to see if you're on the same
> issue ?

I've been running 1.8.12 with this patch for an hour. It seems that it
helped somewhat, but not entirely. After an hour I still see about 10
CLOSE_WAIT sockets. The number seems to grow a lot slower, but still
grows (and some of them have been sitting in CLOSE_WAIT for over 30
minutes).
Since I'm also affected by SPDY_PROTOCOL_ERROR I mentioned earlier I
must disable h2 now.

-- 
Janusz Dziemidowicz



Re: [PATCH] MEDIUM: proxy_protocol: Send IPv4 addresses when possible

2018-07-20 Thread Lukas Tribus
Hello,


On Wed, 18 Jul 2018 at 14:30, Willy Tarreau  wrote:
>
> Hi Tim,
>
> On Wed, Jul 18, 2018 at 01:48:01PM +0200, Tim Düsterhus wrote:
> > This would solve the issue for my use case and should not break anything
> > (a few UNKNOWNs will become TCP6 then).
>
> OK.
>
> > I can rework the patch, if technical changes look good to you. There's a
> > ton of memcpy in there to not destroy the data structures needed for the
> > actual communication: make_proxy_line() now always operates on a copy of
> > `struct connection remote`.
>
> I don't see why a copy could be needed, given that you should never have
> to modify anything in the source. Or maybe it's because it was more
> convenient to store the remapped addresses ? In this case I think that
> it's still cleaner to have two local struct sockaddr_storage that you
> can use for the operations. Another option, which I *thought* we had but
> am probably wrong on this, was to first check the source+destination
> address classes before deciding what to emit. (or maybe it's done in the
> PPv2 format) :
>
>  if (src_is_v4 && dst_is_v4)
>   /* emit_v4(src, dst) */
>  else if (src_is_v6 && dst_is_v6)
>   /* emit_v6(src, dst) */
>  else if (src_is_v6 && dst_is_v4)
>   /* emit_v6(src, 4to6(dst)) */
>  else if (src_is_v4 && dst_is_v6)
>   /* emit_v6(4to6(src), dst) */
>  else
>   /* emit_unknown()*/
>
> So in practice you always need only one temporary sockaddr_storage for
> a conversion.
>
> I'm personally fine with something roughly like this. Lukas, I'm interested
> in your opinion on this one, as I *think* it addresses the issues without
> introducing new ones.

However if I read the pseudo-code correctly, we will still transform a
v4 src to v6 src, when the destination is v6? I don't like
transforming the address family at all - but I realize the proxy
protocol needs a single AF for both source and destination, which
means we need to compromise and the proposed solution is the best
compromise as far as I can see.

Agreed.

cheers,
lukas



Re: Building HAProxy 1.8 fails on Solaris

2018-07-20 Thread Olivier Houchard
Hi,

On Fri, Jul 20, 2018 at 12:22:20AM +, Thrawn wrote:
>  So...is there a way to adapt this patch so it won't cause random SSL errors 
> and is suitable to apply to the trunk? We don't really want to run a 
> customised build in production...

You don't need the patch, just using USE_PTHREAD_PSHARED=yes should be enough.

Regards,

Olivier

> On Wednesday, 18 July 2018, 10:45:38 pm AEST, Olivier Houchard 
>  wrote:  
>  
>  Hi,
> 
> On Wed, Jul 18, 2018 at 02:17:59AM +, Thrawn wrote:
> > Mea culpa, I applied the patch incorrectly. After fixing that, I can 
> > successfully build with 'USE_THREAD=' but without 'USE_PTHREAD_PSHARED=yes' 
> > (although from what Olivier said, I probably shouldn't do that).  On 
> > Wednesday, 18 July 2018, 12:10:57 pm AEST, Thrawn 
> >  wrote:  
> 
> Yeah the patch may get you to experience random errors if you share a SSL
> cache across multiple process, USE_PTHREAD_PSHARED=yes should build as well,
> and should do the right thing.
> 
> Regards,
> 
> Olivier
>   



Re: Empty reply from server

2018-07-20 Thread Serge Reynier

Hi,

Yes i have setuped logging :

 listen serveurs_ssp-dispatcher
  bind x.x.x.x:80
  log x.x.x.x local0
  no option log-separate-errors
  option httplog
  option logasap
  no option log-separate-errors
  capture request  header X-Forwarded-For len 50
  capture request  header Host len 50
  capture request  header Content-Length len 10
  capture request  header Referer len 100
  capture response header Server len 20
  capture response header Content-Length len 10
  capture response header Cache-Control len 8
  capture response header Via len 20
  capture response header Location len 100
  cookie DISPATCHER insert nocache
  capture cookie phpAds_capAd_c3 len 50
  option httpchk get /alive HTTP/1.0\r\nHost:\ dispatcher.adxcore.com
  maxconn 10
  timeout client 20s
  server dispatcher1.prod.ssp.adxcore.com 149.202.69.147:80 cookie DIS1 
check inter 10s weight 30


I captured this log :

Jul 19 15:59:44 testvm.adthink.com haproxy[5904]: 82.64.34.197 
1532015983 dispatcher1.prod.ssp.adxcore.com 200 +615 
{|dispatcher.adxcore.com||} {Apache|5870|must-rev||} "GET 
/a/hb/prebid/v1/?zoneid=181948=1=15=1_flash=0=0czliw0ln73p=0=www.chefsimon.com=0.9610747275344138_start=1531471091736_capzone3=_capzoneformat3=_blockzone3==tututoto20bffceda9209c3ec636e8b37898061004aa97188=0=node_consent=& 
HTTP/1.1"


But with my browser or curl i have "Empty reply from server"

Best regards,
Serge.

Le 19/07/2018 à 23:55, Aleksandar Lazic a écrit :

On 18/07/2018 15:56, Serge Reynier wrote:

Hi,

We are having some issues recently when calling a specific url, we 
are getting the following error on the client side :


* "Empty reply from server"


I don't understand this statement:

The problem is that Haproxy is not logging any of these errors, and 
more of that it shows a html 200 code.


Do you have setuped logging in haproxy?
Please can you share the config, thanks.

Normally you should see 0 length in the haproxy http log.

After some research, it seems that this error pops out because of bad 
headers returned by our code.


So we would like to know if there's any way to log the "Empty reply 
from server" error on the haproxy side ?


_/We are using Haproxy 1.8.12./_

Best regards,

Serge.


Best regards
Aleks





Re: Suppression de l’extension

2018-07-20 Thread --
Hello,
 
In fact I just want the display of the .html extension on my site no longer 
displayed
 
I use haproxy with Nginx, I can make url rewrite with Nginx that works well:
 
server {
  rewrite ^(/.*)\.html(\?.*)?$ $1$2 permanent;
 
  index index.html;
  try_files $uri.html $uri/ $uri =404;
}
 
 
But since Nginx runs on port 8889 (on the same machine as Haproxy) it redirects 
me to this port and I lose the connection (haproxy is listening on port 443)
 
I wish to do this with Haproxy, is it possible?
 
example:
 
https://site.com/index.html -> https://site.com/index (the resource without the 
.html does not exist but i want it to be displayed like this in the browser)
 
 
Thank you

Envoyé de mon iPhone

> Le 19 juil. 2018 à 23:48, Aleksandar Lazic  a écrit :
> 
> Hi.
> 
>> On 19/07/2018 15:09, -- wrote:
>> Hello,
>> 
>> Je souhaite supprimer l’extension présentée par mon serveur nginx mais
>> depuis Haproxy
>> 
>> Type
>> 
>> A.com/index.html en A.com/index
>> 
>> Est ce possible ?
> 
> Maybe, but please can you ask in English, thanks.
> 
> But let me try to interpret you question.
> 
> reqrep ^([^\ :]*)\ /index.html \1\ /index
> 
> https://cbonte.github.io/haproxy-dconv/1.8/configuration.html#4.2-reqrep
> 
>> Merci
>> 
>> Envoyé de mon iPhone
> 
> Best regards
> Aleksg


Re: [PATCH] MINOR: Generate sha256 checksums in publish-release

2018-07-20 Thread Willy Tarreau
Hi Tim,

On Thu, Jul 19, 2018 at 11:57:56PM +0200, Tim Duesterhus wrote:
> Currently only md5 signatures are generated. While md5
> still is not broken with regard to preimage attacks, sha256
> clearly is the current secure solution.
> 
> This patch should be backported to all supported branches.

Makes sense, now merge.

Thank you!
Willy



Re: [PATCH] BUG/MINOR: build: Fix compilation with debug mode enabled

2018-07-20 Thread Willy Tarreau
On Fri, Jul 20, 2018 at 10:31:33AM +0200, Christopher Faulet wrote:
> Hi Willy,
> 
> Here is a patch to fix the compilation of HAProxy with the debug mode
> enabled. Some debug messages were still using the old buffers api.

It was left as an API conversion exercise for the reader :-)

Bah as you and Cyril regularly notice, I don't build with DEBUG_ALL
which is too verbose for me, and I regularly break it :-/

Sorry about this, patch merged, thanks!
Willy



Re: haproxy 1.8.12 / 1.9- 20180623 / stopping process hangs with threads (100% cpu) on -sf reload / FreeBSD

2018-07-20 Thread Christopher Faulet

Le 20/07/2018 à 10:38, Willy Tarreau a écrit :

On Fri, Jul 20, 2018 at 10:27:42AM +0200, Christopher Faulet wrote:

In thread_sync_barrier, we exit when all threads have set their own bit in the
barrier mask. It is done by comparing it to all_threads_mask. But we must not
use a simple equality to do so, becaue all_threads_mask may change. Since commit
ba86c6c25 ("MINOR: threads: Be sure to remove threads from all_threads_mask on
exit"), when a thread exit, its bit is removed from all_threads_mask. Instead,
we must use a bitwise AND to test is all bits of all_threads_mask are set.


I agree on the principle, however it may still not work because
all_threads_mask is not marked volatile, so the compiler will most of
the time fetch the value into a register and will loop on it, keeping
the wrong value all the time. The case where it matters is when the
all_threads_mask is fetched and cached when entering the function,
then loses the value while barrier still has this bit set.

Thus I think that adding only this to your patch will definitely kill
this bug :


Damn! I forgot to check that. We talked about it 30 min ago though!

--
Christopher Faulet



Re: [PATCH] BUG/MINOR: http: Set brackets for the unlikely macro at the right place

2018-07-20 Thread Willy Tarreau
On Fri, Jul 20, 2018 at 10:33:52AM +0200, Christopher Faulet wrote:
> Hi Willy,
> 
> Here is a little patch to fix brackets place of the unlikely macro in http
> code. It must be backported in 1.8.

This one is very interesting because depending on the compiler version
the unlikely macro differs and may never detect the failure by never
returning < 0! It's very unlikely to meet this case though.

thanks!
Willy



Re: haproxy 1.8.12 / 1.9- 20180623 / stopping process hangs with threads (100% cpu) on -sf reload / FreeBSD

2018-07-20 Thread Willy Tarreau
On Fri, Jul 20, 2018 at 10:27:42AM +0200, Christopher Faulet wrote:
> In thread_sync_barrier, we exit when all threads have set their own bit in the
> barrier mask. It is done by comparing it to all_threads_mask. But we must not
> use a simple equality to do so, becaue all_threads_mask may change. Since 
> commit
> ba86c6c25 ("MINOR: threads: Be sure to remove threads from all_threads_mask on
> exit"), when a thread exit, its bit is removed from all_threads_mask. Instead,
> we must use a bitwise AND to test is all bits of all_threads_mask are set.

I agree on the principle, however it may still not work because
all_threads_mask is not marked volatile, so the compiler will most of
the time fetch the value into a register and will loop on it, keeping
the wrong value all the time. The case where it matters is when the
all_threads_mask is fetched and cached when entering the function,
then loses the value while barrier still has this bit set.

Thus I think that adding only this to your patch will definitely kill
this bug :

diff --git a/include/common/hathreads.h b/include/common/hathreads.h
index 5c4ceca..274f988 100644
--- a/include/common/hathreads.h
+++ b/include/common/hathreads.h
@@ -260,7 +260,7 @@ void thread_exit_sync(void);
 int  thread_no_sync(void);
 int  thread_need_sync(void);
 
-extern unsigned long all_threads_mask;
+extern volatile unsigned long all_threads_mask;
 
 #define ha_sigmask(how, set, oldset)  pthread_sigmask(how, set, oldset)
 
diff --git a/src/hathreads.c b/src/hathreads.c
index 5db3c21..aada8ed 100644
--- a/src/hathreads.c
+++ b/src/hathreads.c
@@ -31,7 +31,7 @@ void thread_sync_io_handler(int fd)
 static HA_SPINLOCK_T sync_lock;
 static int   threads_sync_pipe[2];
 static unsigned long threads_want_sync = 0;
-unsigned long all_threads_mask  = 0;
+volatile unsigned long all_threads_mask  = 0;
 
 #if defined(DEBUG_THREAD) || defined(DEBUG_FULL)
 struct lock_stat lock_stats[LOCK_LABELS];

Pieter, your feedback is welcome, as usual :-)

Don't bother respinning the patch Christopher, I can add this myself
after Pieter's test.

Cheers,
Willy



[PATCH] BUG/MINOR: http: Set brackets for the unlikely macro at the right place

2018-07-20 Thread Christopher Faulet

Hi Willy,

Here is a little patch to fix brackets place of the unlikely macro in 
http code. It must be backported in 1.8.


--
Christopher Faulet
>From fc53818025c8681c800e807960af2a6859006014 Mon Sep 17 00:00:00 2001
From: Christopher Faulet 
Date: Fri, 20 Jul 2018 09:54:26 +0200
Subject: [PATCH] BUG/MINOR: http: Set brackets for the unlikely macro at the
 right place

When test on the header "Early-Data" is made, the unlikely macro must encompass
the condition.

This patch must be backported in 1.8.
---
 src/proto_http.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/proto_http.c b/src/proto_http.c
index 117d27404..56035b9cb 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -3526,7 +3526,7 @@ int http_process_req_common(struct stream *s, struct channel *req, int an_bit, s
 		ci_head(>req), >hdr_idx, )) {
 			if (unlikely(http_header_add_tail2(>req,
 			>hdr_idx, "Early-Data: 1",
-			strlen("Early-Data: 1"))) < 0) {
+			strlen("Early-Data: 1")) < 0)) {
 goto return_bad_req;
 			 }
 		}
-- 
2.17.1



[PATCH] BUG/MINOR: build: Fix compilation with debug mode enabled

2018-07-20 Thread Christopher Faulet

Hi Willy,

Here is a patch to fix the compilation of HAProxy with the debug mode 
enabled. Some debug messages were still using the old buffers api.


--
Christopher Faulet
>From deb61c7822acbfcc7fe0ff611eee51fd57773e72 Mon Sep 17 00:00:00 2001
From: Christopher Faulet 
Date: Fri, 20 Jul 2018 10:16:29 +0200
Subject: [PATCH] BUG/MINOR: build: Fix compilation with debug mode enabled

It remained some fragments of the old buffers API in debug messages, here and
there.
---
 src/backend.c|  4 ++--
 src/cli.c|  4 ++--
 src/flt_spoe.c   |  2 +-
 src/proto_http.c | 24 
 src/stream.c | 28 ++--
 src/tcp_rules.c  |  8 
 6 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/src/backend.c b/src/backend.c
index e94c4c9b3..b2b7ba580 100644
--- a/src/backend.c
+++ b/src/backend.c
@@ -1399,13 +1399,13 @@ int tcp_persist_rdp_cookie(struct stream *s, struct channel *req, int an_bit)
 	uint32_t addr;
 	char *p;
 
-	DPRINTF(stderr,"[%u] %s: stream=%p b=%p, exp(r,w)=%u,%u bf=%08x bh=%d analysers=%02x\n",
+	DPRINTF(stderr,"[%u] %s: stream=%p b=%p, exp(r,w)=%u,%u bf=%08x bh=%lu analysers=%02x\n",
 		now_ms, __FUNCTION__,
 		s,
 		req,
 		req->rex, req->wex,
 		req->flags,
-		req->buf->i,
+		ci_data(req),
 		req->analysers);
 
 	if (s->flags & SF_ASSIGNED)
diff --git a/src/cli.c b/src/cli.c
index 0d3c95c22..d02842960 100644
--- a/src/cli.c
+++ b/src/cli.c
@@ -764,9 +764,9 @@ static void cli_io_handler(struct appctx *appctx)
 	}
 
  out:
-	DPRINTF(stderr, "%s@%d: st=%d, rqf=%x, rpf=%x, rqh=%d, rqs=%d, rh=%d, rs=%d\n",
+	DPRINTF(stderr, "%s@%d: st=%d, rqf=%x, rpf=%x, rqh=%lu, rqs=%lu, rh=%lu, rs=%lu\n",
 		__FUNCTION__, __LINE__,
-		si->state, req->flags, res->flags, req->buf->i, req->buf->o, res->buf->i, res->buf->o);
+		si->state, req->flags, res->flags, ci_data(req), co_data(req), ci_data(res), co_data(res));
 }
 
 /* This is called when the stream interface is closed. For instance, upon an
diff --git a/src/flt_spoe.c b/src/flt_spoe.c
index 63482f570..73e8f623a 100644
--- a/src/flt_spoe.c
+++ b/src/flt_spoe.c
@@ -2264,7 +2264,7 @@ spoe_encode_messages(struct stream *s, struct spoe_context *ctx,
 		agent->id, __FUNCTION__, s,
 		((ctx->flags & SPOE_CTX_FL_FRAGMENTED) ? "last fragment of" : "unfragmented"),
 		ctx->spoe_appctx, (agent->rt[tid].frame_size - FRAME_HDR_SIZE),
-		p - ctx->buffer->p);
+		p - b_head(>buffer));
 
 	b_set_data(>buffer, p - b_head(>buffer));
 	ctx->frag_ctx.curmsg = NULL;
diff --git a/src/proto_http.c b/src/proto_http.c
index 56035b9cb..6fa38dd23 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -1613,13 +1613,13 @@ int http_wait_for_request(struct stream *s, struct channel *req, int an_bit)
 	struct http_msg *msg = >req;
 	struct hdr_ctx ctx;
 
-	DPRINTF(stderr,"[%u] %s: stream=%p b=%p, exp(r,w)=%u,%u bf=%08x bh=%d analysers=%02x\n",
+	DPRINTF(stderr,"[%u] %s: stream=%p b=%p, exp(r,w)=%u,%u bf=%08x bh=%lu analysers=%02x\n",
 		now_ms, __FUNCTION__,
 		s,
 		req,
 		req->rex, req->wex,
 		req->flags,
-		req->buf.i,
+		ci_data(req),
 		req->analysers);
 
 	/* we're speaking HTTP here, so let's speak HTTP to the client */
@@ -3477,13 +3477,13 @@ int http_process_req_common(struct stream *s, struct channel *req, int an_bit, s
 		goto return_prx_yield;
 	}
 
-	DPRINTF(stderr,"[%u] %s: stream=%p b=%p, exp(r,w)=%u,%u bf=%08x bh=%d analysers=%02x\n",
+	DPRINTF(stderr,"[%u] %s: stream=%p b=%p, exp(r,w)=%u,%u bf=%08x bh=%lu analysers=%02x\n",
 		now_ms, __FUNCTION__,
 		s,
 		req,
 		req->rex, req->wex,
 		req->flags,
-		req->buf.i,
+		ci_data(req),
 		req->analysers);
 
 	/* just in case we have some per-backend tracking */
@@ -3749,13 +3749,13 @@ int http_process_request(struct stream *s, struct channel *req, int an_bit)
 		return 0;
 	}
 
-	DPRINTF(stderr,"[%u] %s: stream=%p b=%p, exp(r,w)=%u,%u bf=%08x bh=%d analysers=%02x\n",
+	DPRINTF(stderr,"[%u] %s: stream=%p b=%p, exp(r,w)=%u,%u bf=%08x bh=%lu analysers=%02x\n",
 		now_ms, __FUNCTION__,
 		s,
 		req,
 		req->rex, req->wex,
 		req->flags,
-		req->buf.i,
+		ci_data(req),
 		req->analysers);
 
 	/*
@@ -4909,13 +4909,13 @@ int http_request_forward_body(struct stream *s, struct channel *req, int an_bit)
 	struct http_msg *msg = >txn->req;
 	int ret;
 
-	DPRINTF(stderr,"[%u] %s: stream=%p b=%p, exp(r,w)=%u,%u bf=%08x bh=%d analysers=%02x\n",
+	DPRINTF(stderr,"[%u] %s: stream=%p b=%p, exp(r,w)=%u,%u bf=%08x bh=%lu analysers=%02x\n",
 		now_ms, __FUNCTION__,
 		s,
 		req,
 		req->rex, req->wex,
 		req->flags,
-		req->buf.i,
+		ci_data(req),
 		req->analysers);
 
 	if (unlikely(msg->msg_state < HTTP_MSG_BODY))
@@ -5145,7 +5145,7 @@ int http_wait_for_response(struct stream *s, struct channel *rep, int an_bit)
 	int cur_idx;
 	int n;
 
-	DPRINTF(stderr,"[%u] %s: stream=%p b=%p, exp(r,w)=%u,%u bf=%08x bh=%d analysers=%02x\n",
+	DPRINTF(stderr,"[%u] %s: stream=%p b=%p, exp(r,w)=%u,%u bf=%08x bh=%lu analysers=%02x\n",
 		now_ms, __FUNCTION__,
 		s,
 		rep,
@@ 

Re: haproxy 1.8.12 / 1.9- 20180623 / stopping process hangs with threads (100% cpu) on -sf reload / FreeBSD

2018-07-20 Thread Christopher Faulet

Le 17/07/2018 à 19:39, PiBa-NL a écrit :

Hi Christopher,

Op 17-7-2018 om 10:09 schreef Christopher Faulet:

Could you try to revert the following commit please ?

  * ba86c6c25 MINOR: threads: Be sure to remove threads from
all_threads_mask on exit


Without this specific commit the termination of the old process works
'properly'.
That is.. for testing i used 1.9 snapshot of 20180714 and included a
little patch to remove the 'atomic and'.. which is basically what that
commit added..
   #ifdef USE_THREAD
-    HA_ATOMIC_AND(_threads_mask, ~tid_bit);
   if (tid > 0)
       pthread_exit(NULL);
   #endif

Also snapshot of 20180622 +
'0461-BUG-MEDIUM-threads-Use-the-sync-point-to-che-1.9-dev0.patch' works
okay.

Though i guess just reverting that line is not the right fix ;).



Hi,

Thanks Pieter. Sorry for the delay, I was busy on something else...

The bug is really in the sync point. Reverting the patch was just an 
easy way to spot it. Here is the right fix. It should be backported in 1.8.


Thanks,
--
Christopher Faulet
>From 7da071dfe4cff3cc4cf7a195d4f5e056c7281803 Mon Sep 17 00:00:00 2001
From: Christopher Faulet 
Date: Fri, 20 Jul 2018 09:31:53 +0200
Subject: [PATCH] BUG/MEDIUM: threads: Fix the exit condition of the thread
 barrier

In thread_sync_barrier, we exit when all threads have set their own bit in the
barrier mask. It is done by comparing it to all_threads_mask. But we must not
use a simple equality to do so, becaue all_threads_mask may change. Since commit
ba86c6c25 ("MINOR: threads: Be sure to remove threads from all_threads_mask on
exit"), when a thread exit, its bit is removed from all_threads_mask. Instead,
we must use a bitwise AND to test is all bits of all_threads_mask are set.

This patch must be backported in 1.8.
---
 src/hathreads.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/hathreads.c b/src/hathreads.c
index 5db3c2197..19a787fb8 100644
--- a/src/hathreads.c
+++ b/src/hathreads.c
@@ -106,7 +106,7 @@ static inline void thread_sync_barrier(volatile unsigned long *barrier)
 
 	HA_ATOMIC_CAS(barrier, , 0);
 	HA_ATOMIC_OR(barrier, tid_bit);
-	while (*barrier != all_threads_mask)
+	while ((*barrier & all_threads_mask) != all_threads_mask)
 		pl_cpu_relax();
 }
 
-- 
2.17.1