Re: 1.6.0 Error: Cannot Create Listening Socket for Frontend and Stats,Proxies

2015-10-20 Thread Willy Tarreau
On Tue, Oct 20, 2015 at 12:54:48AM +0530, Susheel Jalali wrote:
> Dear HAProxy Developers:
> 
> The following error message appears with HAProxy 1.6.0 after start and
> then the load balancer stops.  No haproxy.pid is getting created.  The
> same configuration works seamlessly with HAProxy 1.5.14 on the same
> server.  We are seeking insights into what we could be missing in our
> configuration?
> 
> The port numbers below are dedicated to this HAProxy instance and only
> one HAProxy instance is running.
> 
> /var/log/messages
> 
> Frontend:  Cannot create listening socket (0.0.0.0:)
> Frontend:  Cannot create listening socket (0.0.0.0:)
> Proxy for stats:  Cannot create listening socket ()

This sounds like either another process is listening on the same ports,
or that these are privileged ports and you're not starting it as root.

Try to start it by hand in foreground with "-db", you'll see all the
messages, maybe you'll see some warnings that you're missing here.

Willy




Re: [call to comment] HAProxy's DNS resolution default query type

2015-10-20 Thread Baptiste
Hi all,

Thanks a lot for your feedback, it is really valuable.
I'll discuss with Willy the best approach for the change.

Baptiste


On Mon, Oct 19, 2015 at 11:50 PM, Andrew Hayworth wrote:
> Hi all -
>
> Just to chime in, we just got bit by this in production. Our dns
> resolver (unbound) does not follow CNAMES -> A records when you send
> an ANY query type. This is by design, so I can't just configure it
> differently (and ripping out our DNS resolver is not immediately
> feasible).
>
> I therefore vote to stop sending the ANY query type, and instead rely
> on A and AAAA queries. I don't have any comments on behavior regarding
> NX behavior.
>
> NB: There is also support amongst some bigger internet companies to
> fully deprecate this query type:
> https://blog.cloudflare.com/deprecating-dns-any-meta-query-type/
>
> On Thu, Oct 15, 2015 at 12:49 PM, Lukas Tribus  wrote:
>>> I second this opinion. Removing ANY altogether would be the best case.
>>>
>>> In reality, I think it should use the OS's resolver libraries which
>>> in turn will honor whatever the admin has configured for preference
>>> order at the base OS level.
>>>
>>>
>>> As a sysadmin, one should reasonably expect that tweaking the
>>> preference knob at the OS level should affect most (and ideally, all)
>>> applications they are running rather than having to manually fiddle
>>> knobs at the OS and various application levels.
>>> If there is some discussion and *good* reasons to ignore the OS
>>> defaults, I feel this should likely be an *optional* config option
>>> in haproxy.cfg, i.e. "use OS resolver, unless specifically told not to
>>> for $reason".
>>
>> It's exactly as you are saying.
>>
>> I don't think there is any doubt that HAproxy will bypass OS level
>> resolvers, since you are statically configuring DNS server IPs in the
>> haproxy configuration file.
>>
>> When you don't configure any resolvers, HAproxy does use libc's
>> gethostbyname() or getaddrinfo(), but both are fundamentally broken.
>>
>> That's why some applications have to implement their own resolvers
>> (including nginx).
>>
>> First of all the OS resolver doesn't provide the TTL value. So you would
>> have to guess or use fixed TTL values. Second, both calls are blocking,
>> which is a big no-go for any event-loop based application (for this
>> reason, it can only be queried at startup, not while the application
>> is running).
>>
>> Just configure a hostname without resolver parameters, and haproxy
>> will resolve your hostnames at startup via OS (and then maintain those
>> IPs).
>>
>>
>> Applications either have to implement a resolver on their own (haproxy,
>> nginx), or use yet another external library, like getdnsapi [1].
>>
>>
>> The point is: there is a reason for this implementation, and you can
>> fall back to OS resolvers without any problems (just with their drawbacks).
>>
>>
>>
>>
>> Regards,
>>
>> Lukas
>>
>>
>> [1] https://getdnsapi.net/
>>
>
>
>
> --
> - Andrew Hayworth



Re: [PATCH] MINOR: cli: ability to set per-server maxconn

2015-10-20 Thread Willy Tarreau
Hi Andrew,

On Mon, Oct 19, 2015 at 02:23:39PM -0500, Andrew Hayworth wrote:
> In another thread "Dynamically change server maxconn possible?",
> someone raised the possibility of setting a per-server maxconn via the
> stats socket. I believe the below patch implements this functionality.
> 
> I'd appreciate any feedback, since I'm not really familiar with this
> part of the code. However, I've tested it by curling slow endpoints
> (the nginx echo_sleep module, specifically) and can confirm that NOSRV
> is returned appropriately according to whatever maxconn settings are
> set via the socket.

Thanks. Normally you also need to try to dequeue pending connections
when changing the value, because if you increase the limit, you need
to open the door for new connections. After changing the value, you
normally need something like this :

	if (may_dequeue_tasks(srv, srv->proxy))
		process_srv_queue(srv);
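
As a rough illustration of why the dequeue step matters, here is a toy model in C. It does not use HAProxy's real structures: `toy_server`, `toy_dequeue()` and `toy_set_maxconn()` are hypothetical stand-ins for the server, for process_srv_queue() and for the CLI handler. It only shows that raising the limit without draining the queue would leave pending connections waiting.

```c
#include <assert.h>

/* Toy stand-in for a server; NOT HAProxy's struct server. */
struct toy_server {
    int maxconn;   /* current connection limit */
    int served;    /* connections currently being served */
    int queued;    /* connections waiting for a free slot */
};

/* Hypothetical counterpart of process_srv_queue(): move queued
 * connections to the server while free slots remain.
 * Returns the number of connections that were dequeued. */
static int toy_dequeue(struct toy_server *s)
{
    int moved = 0;

    while (s->queued > 0 && s->served < s->maxconn) {
        s->queued--;
        s->served++;
        moved++;
    }
    return moved;
}

/* Hypothetical CLI handler: update the limit, then "open the door"
 * for connections that were queued under the old limit. */
static int toy_set_maxconn(struct toy_server *s, int maxconn)
{
    assert(maxconn >= 0);
    s->maxconn = maxconn;
    return toy_dequeue(s);
}
```

With a server at maxconn 2, serving 2 and queueing 3, calling `toy_set_maxconn(&s, 4)` moves two queued connections to the server and leaves one in the queue.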

Regards,
Willy




Re: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Robin Geuze

Hey Willy,

Recursors are not required to recurse when serving an ANY query. ANY 
query means that you ask a server (either recursor or auth) for 
everything it has on label x. If it has a CNAME on that label just 
returning that is a valid response (just like would happen if you 
queried for the CNAME type at label x). However when you ask for an A or 
AAAA record a recursor is required to follow the CNAME. Welcome to the 
wonderful world of DNS which doesn't really make sense anymore to anyone ;).


As said in the other mail thread, ANY queries are just a very 
unreliable way to get the records/types you want. Just asking for the 
actual types, if necessary in multiple queries, is the way to go. DNS is 
(usually) fast enough that the one extra query really shouldn't matter 
that much.
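
To make "asking for the actual types" concrete, here is a minimal sketch in C of building a DNS question with an explicit query type. It is not HAProxy's resolver code: `toy_encode_name()` and `toy_build_query()` are hypothetical helpers. The QTYPE values (A = 1, AAAA = 28, ANY = 255) and the 12-byte header layout come from the DNS wire format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { QTYPE_A = 1, QTYPE_AAAA = 28, QTYPE_ANY = 255, QCLASS_IN = 1 };
enum { DNS_HDR_LEN = 12 };

/* Encode "a.b.c" as length-prefixed labels followed by a zero byte.
 * Returns the number of bytes written, or 0 on error. */
static size_t toy_encode_name(const char *name, uint8_t *out, size_t outlen)
{
    size_t pos = 0;

    while (*name) {
        const char *dot = strchr(name, '.');
        size_t len = dot ? (size_t)(dot - name) : strlen(name);

        /* need room for the label, its length byte and the final zero */
        if (len == 0 || len > 63 || pos + 1 + len + 1 > outlen)
            return 0;
        out[pos++] = (uint8_t)len;
        memcpy(out + pos, name, len);
        pos += len;
        name += len;
        if (*name == '.')
            name++;
    }
    if (pos + 1 > outlen)
        return 0;
    out[pos++] = 0; /* root label */
    return pos;
}

/* Build a single-question query with the recursion-desired bit set.
 * Returns the packet length, or 0 if the buffer is too small. */
static size_t toy_build_query(uint16_t id, const char *name, uint16_t qtype,
                              uint8_t *buf, size_t buflen)
{
    size_t n;

    assert(name != NULL);
    if (buf == NULL || buflen < DNS_HDR_LEN)
        return 0;
    memset(buf, 0, DNS_HDR_LEN);
    buf[0] = (uint8_t)(id >> 8);
    buf[1] = (uint8_t)(id & 0xff);
    buf[2] = 0x01;                 /* RD = 1 */
    buf[5] = 1;                    /* QDCOUNT = 1 */

    n = toy_encode_name(name, buf + DNS_HDR_LEN, buflen - DNS_HDR_LEN);
    if (n == 0 || buflen - DNS_HDR_LEN - n < 4)
        return 0;
    buf[DNS_HDR_LEN + n]     = (uint8_t)(qtype >> 8);
    buf[DNS_HDR_LEN + n + 1] = (uint8_t)(qtype & 0xff);
    buf[DNS_HDR_LEN + n + 2] = 0;
    buf[DNS_HDR_LEN + n + 3] = QCLASS_IN;
    return DNS_HDR_LEN + n + 4;
}
```

A client following the advice above would call `toy_build_query()` once with `QTYPE_AAAA` and, if needed, once more with `QTYPE_A`, rather than sending a single `QTYPE_ANY` question.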


-Robin-

On 10/20/2015 8:49 AM, Willy Tarreau wrote:

Hi Andrew,

On Mon, Oct 19, 2015 at 05:39:58PM -0500, Andrew Hayworth wrote:

The ANY query type is weird, and some resolvers don't 'do the legwork'
of resolving useful things like CNAMEs. Given that upstream resolver
behavior is not always under the control of the HAProxy administrator,
we should not use the ANY query type. Rather, we should use A or AAAA
according to either the explicit preferences of the operator, or the
implicit default (AAAA/IPv6).

But how does that fix the problem for you ? In your example below,
the server clearly doesn't provide any A nor AAAA in the response
so asking it for A or AAAA should not work either if it doesn't
recurse, am I wrong ?


   PRODUCTION! ahaywo...@secret-hostname.com:~$
   dig @10.11.12.53 ANY api.somestartup.io

   ; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> @10.11.12.53 ANY api.somestartup.io
   ; (1 server found)
   ;; global options: +cmd
   ;; Got answer:
   ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62454
   ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0

   ;; QUESTION SECTION:
   ;api.somestartup.io.IN  ANY

   ;; ANSWER SECTION:
   api.somestartup.io. 20  IN  CNAME api-somestartup-production.ap-southeast-2.elb.amazonaws.com.

(...)

I fear that such a change will prevent CNAMEs from working for many
users where the DNS servers work fine, and will not necessarily fix
the problems for other people.

Regards,
willy







Re: 1.6.0 Error: Cannot Create Listening Socket for Frontend and Stats,Proxies

2015-10-20 Thread Susheel Jalali

Dear Willy,

Thank you for your insights.  As you advised, below is the output of
haproxy -f …cfg -db -V.

We are starting HAProxy as root.  There is no other application running
on this server, which is dedicated to the load balancer.  ‘netstat -apon’ suggests
that these ports are not used by any other system process.

HAProxy 1.6.0 was compiled from source in the server environment:
Centos 7.1, and dynamic loading of (Lua 5.3.1, PCRE 8.32, OpenSSL
1.0.1e, zlib 1.2.7)

HAProxy 1.5.14 runs smoothly on this same server and with the same
configuration.  HAProxy configuration (same for both 1.5.14 and 1.6.0)
is given below.  Any insights / pointers for us to further investigate
and resolve this issue would be appreciated.

+++
Debug output of HAProxy 1.6.0
+++
Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result FAILED
Total: 3 (2 usable), will use epoll.

[ALERT] 292/025305 (4402) : Starting frontend webapps-frontend: cannot
create listening socket [0.0.0.0:80]
[ALERT] 292/025305 (4402) : Starting frontend webapps-frontend: cannot
create listening socket [0.0.0.0:443]
[ALERT] 292/025305 (4402) : Starting proxy haproxystats: cannot create
listening socket [Server_IP:]
Using epoll() as the polling mechanism.

++
Haproxy configuration
++
global
 log 127.0.0.1 local2
 pidfile /var/run/haproxy.pid
 user    haproxy
 group   haproxy
 #daemon
 debug
 chroot  /var/log/haproxy/
 stats socket  /var/log/haproxy/haproxy.stats
defaults
 mode    http
 option  abortonclose
 option  http-server-close
[….]
frontend webapps-frontend
 bind  *:80 name xxx
 bind  *:443 name yyy ssl crt /path/to/server.pem
[….]
listen haproxystats
 bind  Server_IP:
[….]

Thank you.

Sincerely,
-- --
Susheel Jalali
Coscend Communications Solutions
susheel.jal...@coscend.com

www.Coscend.com



On 10/20/15 12:29, Willy Tarreau wrote:
> On Tue, Oct 20, 2015 at 12:54:48AM +0530, Susheel Jalali wrote:
>> Dear HAProxy Developers:
>>
>> The following error message appears with HAProxy 1.6.0 after start and
>> then the load balancer stops.  No haproxy.pid is getting created.  The
>> same configuration works seamlessly with HAProxy 1.5.14 on the same
>> server.  We are seeking insights into what we could be missing in our
>> configuration?
>>
>> The port numbers below are dedicated to this HAProxy instance and only
>> one HAProxy instance is running.
>>
>> /var/log/messages
>>
>> Frontend:  Cannot create listening socket (0.0.0.0:)
>> Frontend:  Cannot create listening socket (0.0.0.0:)
>> Proxy for stats:  Cannot create listening socket ()
>
> This sounds like either another process is listening on the same ports,
> or that these are privileged ports and you're not starting it as root.
>
> Try to start it by hand in foreground with "-db", you'll see all the
> messages, maybe you'll see some warnings that you're missing here.
>
> Willy
>





RE: 1.6.0 Error: Cannot Create Listening Socket for Frontend and Stats,Proxies

2015-10-20 Thread Lukas Tribus
> Dear Willy,
>
> Thank you for your insights. As you advised, below is the output of
> haproxy -f …cfg -db -V.

Can you run this through strace (strace haproxy -f …cfg -db -V) and
provide the output.

Also, if you have the strace output of a successful startup of 1.5.14 for
comparison, that would be very helpful as well.


Regards,

Lukas

  


Re: Re: haproxy 1.6.0 crashes

2015-10-20 Thread Willy Tarreau
Hi Rémi,

On Tue, Oct 20, 2015 at 10:39:16AM +0200, Remi Gacogne wrote:
> Hi,
> 
> On 10/19/2015 05:01 PM, Willy Tarreau wrote:
> >> [1] https://www.mail-archive.com/haproxy@formilux.org/msg19962.html
> >> [2] https://www.mail-archive.com/haproxy@formilux.org/msg19995.html
> > 
> > Regarding the second one, maybe Rémi's review could help. I noticed that
> > you used gen_ssl_ctx_ptr_index = -1 which is the same value used for
> > dh_params. Based on its name it makes me think it's a position in an
> > array, so I'm not sure whether we can make them collide for example. I
> > really don't know this API at all.
> 
> I am not familiar with the generated certificate cache, so I will just
> comment on the use of SSL_CTX_set_ex_data() and SSL_CTX_get_ex_data() in
> the second patch, at least for now.

Thank you, that was the part giving me some doubts.

> The value of gen_ssl_ctx_ptr_index is correctly initialized to -1, and a
> valid index is then obtained by calling SSL_CTX_get_ex_new_index() in
> the __ssl_sock_init() constructor, so there should not be any collision
> with other indexes.

OK great then!

> My only minor remark is that, though unlikely,
> SSL_CTX_get_ex_new_index() might return -1 in case of error. In the DH
> code we handle this by checking that ssl_dh_ptr_index is != -1 before
> using it, and I think it would be better to do the same check before
> using gen_ssl_ctx_ptr_index.

Good catch indeed. Thanks for your insights.

Best regards,
Willy




Re: 1.6.0 Error: Cannot Create Listening Socket for Frontend and Stats,Proxies

2015-10-20 Thread Willy Tarreau
On Tue, Oct 20, 2015 at 10:54:58AM +0200, Lukas Tribus wrote:
> > Dear Willy,
> >
> > Thank you for your insights. As you advised, below is the output of
> > haproxy -f …cfg -db -V.
> 
> Can you run this through strace (strace haproxy -f …cfg -db -V) and
> provide the output.
> 
> Also, if you have the strace output of a successful startup of 1.5.14 for
> comparison, that would be very helpful as well.

Yes definitely. Actually I'm seeing one difference between the two versions,
it's the introduction of namespaces in 1.6.0. If it was built with support
for namespaces and they are not supported in the operating system, I'm not
seeing how my_socketat() can recover in case setns() returns -1, which
happens when default_namespace = -1, which is the default case before
initialization :

#ifdef CONFIG_HAP_NS
	if (default_namespace < 0 ||
	    (ns && setns(ns->fd, CLONE_NEWNET) == -1))
		return -1;
#endif

That said we're not seeing any error there so I'm having doubts. Let's wait
for strace output then.

Willy




Re: Re: haproxy 1.6.0 crashes

2015-10-20 Thread Remi Gacogne
Hi,

On 10/19/2015 05:01 PM, Willy Tarreau wrote:
>> [1] https://www.mail-archive.com/haproxy@formilux.org/msg19962.html
>> [2] https://www.mail-archive.com/haproxy@formilux.org/msg19995.html
> 
> Regarding the second one, maybe Rémi's review could help. I noticed that
> you used gen_ssl_ctx_ptr_index = -1 which is the same value used for
> dh_params. Based on its name it makes me think it's a position in an
> array, so I'm not sure whether we can make them collide for example. I
> really don't know this API at all.

I am not familiar with the generated certificate cache, so I will just
comment on the use of SSL_CTX_set_ex_data() and SSL_CTX_get_ex_data() in
the second patch, at least for now.

The value of gen_ssl_ctx_ptr_index is correctly initialized to -1, and a
valid index is then obtained by calling SSL_CTX_get_ex_new_index() in
the __ssl_sock_init() constructor, so there should not be any collision
with other indexes.

My only minor remark is that, though unlikely,
SSL_CTX_get_ex_new_index() might return -1 in case of error. In the DH
code we handle this by checking that ssl_dh_ptr_index is != -1 before
using it, and I think it would be better to do the same check before
using gen_ssl_ctx_ptr_index.


Kind regards,

Remi








Re: haproxy 1.6.0 crashes

2015-10-20 Thread Christopher Faulet

On 19/10/2015 17:01, Willy Tarreau wrote:

On Mon, Oct 19, 2015 at 03:06:44PM +0200, Christopher Faulet wrote:

OK so the unused objects in the tree have a refcount of 1 while the used
ones have 2 or more, thus the refcount is always valid. Good that also
means we must not test if the tree is null or not in ssl_sock_close(),
we must always free the ssl_ctx as long as it was dynamically created,
so that its refcount decreases, otherwise it keeps increasing upon every
reuse.


No. Maybe my explanation was not really clear. The SSL_CTX refcount is
not exposed. It is an internal parameter. So, it is not incremented when
the SSL_CTX is pushed in the cache tree.

The call to SSL_set_SSL_CTX increases the refcount and the call to
SSL_free decrements it (when the connection is closed). And, of course,
the call to SSL_CTX_free decrements it too. The SSL_CTX object is
released when the refcount reaches 0.

For a SSL_CTX object, SSL_CTX_free must be called once. When it is
evicted from the cache tree (or when the tree is destroyed) _OR_ when
the connection is closed if there is no cache tree. If we always release
SSL_CTX objects when the SSL connection is closed, we will have
undefined references for cached objects, leading to a segfault.


OK, I understood the opposite, which is that we kept a refcount for each
user (cache and/or sessions).

But then how do we know that an SSL_CTX is still in use when we want to
evict it from the cache and that we must not free it ? Is it just the
fact that between the moment it's picked from the cache using
ssl_sock_get_generated_cert() and the moment it's associated to a session
using SSL_set_SSL_CTX() it's not possible to yield and destroy the cached
object so no race is possible here ? If so I'm fine with it for now (though
it will become "fun" when we start to play with threads), I just want to
be certain we're not overlooking this part as well.


This is not an issue because when we get (or create) a SSL_CTX object 
then it is associated to a session, without any interruption. So it 
cannot be evicted from the cache in the middle.
After this step, the refcount is >= 2, so, if the SSL_CTX object is 
evicted from the cache, the refcount is decremented and the SSL_CTX is 
not released. It will be automatically released with the closure of the 
last SSL connection using it.


But, now this works for a non-threaded environment. Is there any plan to 
add thread support? If yes, this feature will not work.




Also that raises another point : if the issue is related to SSL_CTX_free()
being called on static contexts, then to me it means that these contexts
were not properly refcounted when assigned to the SSL. Don't you think
that we shouldn't instead do something like the following to properly
refcount any context attached to an SSL and ensure that the SSL_CTX_free()
can always be performed regardless of parallel activities in the LRU tree
or anything else ?

        /* Alloc a new SSL session ctx */
        conn->xprt_ctx = SSL_new(objt_server(conn->target)->ssl_ctx.ctx);
+       SSL_set_SSL_CTX(conn->xprt_ctx, objt_server(conn->target)->ssl_ctx.ctx);


This last call will have no effect. Because the SSL_CTX is the same, 
this function returns immediately. Note that if the context changes, the 
refcount of the old one is decremented.


But there is no issue here, because the static contexts are only 
released when HAProxy is stopped.


Here is the life cycle of static contexts:

  - HAProxy is started, static contexts are initialized by calling 
SSL_CTX_new (refcount is set to 1).


  - SSL connections use these contexts. SSL_new or SSL_set_SSL_CTX are 
called to assign a context to a SSL object. The refcount is incremented 
by 1 each time. When a SSL connection is closed, a call to SSL_free is 
done to release the SSL object and the refcount of the associated 
context is decremented. So the refcount is always greater or equal to 1.


  - HAProxy is stopped, all connections are closed, and finally, static 
contexts are freed by calling SSL_CTX_free. The refcount is equal to 1, 
so when SSL_CTX_free is called, it reaches 0 and the contexts are freed.



The refcount is not incremented when a SSL_CTX object is pushed in the
cache. There is no way to manually increment or decrement it. So, we
must really know if the SSL_CTX object was cached or not when the SSL
connection is closed.


I'm having an issue here as well since the LRU's destroy callback is set
to SSL_CTX_free. Thus we start with a non-null refcount. I'm sorry if I am
not clear, but the problem I'm having could be described like this :

   - two sets of entities can use a shared resource at any instant : cache
 and SSL sessions ;
   - each of them uses SSL_CTX_free() at release time to release the object ;
   - SSL_CTX_free() takes care of the refcount to know if it must free or not,
 which means that these two entities above are each responsible for one
 refcount point ;
   - the 

Re: 1.6.0 Error: Cannot Create Listening Socket for Frontend and Stats,Proxies

2015-10-20 Thread Willy Tarreau
On Tue, Oct 20, 2015 at 11:20:12AM +0200, Willy Tarreau wrote:
> On Tue, Oct 20, 2015 at 10:54:58AM +0200, Lukas Tribus wrote:
> > > Dear Willy,
> > >
> > > Thank you for your insights. As you advised, below is the output of
> > > haproxy -f …cfg -db -V.
> > 
> > Can you run this through strace (strace haproxy -f …cfg -db -V) and
> > provide the output.
> > 
> > Also, if you have the strace output of a successful startup of 1.5.14 for
> > comparison, that would be very helpful as well.
> 
> Yes definitely. Actually I'm seeing one difference between the two versions,
> it's the introduction of namespaces in 1.6.0. If it was built with support
> for namespaces and they are not supported in the operating system, I'm not
> seeing how my_socketat() can recover in case setns() returns -1, which
> happens when default_namespace = -1, which is the default case before
> initialization :
> 
> #ifdef CONFIG_HAP_NS
> 	if (default_namespace < 0 ||
> 	    (ns && setns(ns->fd, CLONE_NEWNET) == -1))
> 		return -1;
> #endif

OK it's clear there's a bug here in my opinion because default_namespace
is *only* initialized if there are explicit namespaces. I could reproduce
the issue here, you simply need to build with USE_NS=1 and to declare no
namespace anywhere. Here's a proposed fix which works for me. Please
confirm.

Willy


diff --git a/src/namespace.c b/src/namespace.c
index a22f1a5..f1e81df 100644
--- a/src/namespace.c
+++ b/src/namespace.c
@@ -97,14 +97,13 @@ int my_socketat(const struct netns_entry *ns, int domain, int type, int protocol
 	int sock;
 
 #ifdef CONFIG_HAP_NS
-	if (default_namespace < 0 ||
-	    (ns && setns(ns->fd, CLONE_NEWNET) == -1))
+	if (default_namespace >= 0 && ns && setns(ns->fd, CLONE_NEWNET) == -1)
 		return -1;
 #endif
 	sock = socket(domain, type, protocol);
 
 #ifdef CONFIG_HAP_NS
-	if (ns && setns(default_namespace, CLONE_NEWNET) == -1) {
+	if (default_namespace >= 0 && ns && setns(default_namespace, CLONE_NEWNET) == -1) {
 		close(sock);
 		return -1;
 	}
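
The difference between the old and the new guard can be checked in isolation with a small sketch. This is a toy model, not the real my_socketat(): `toy_ns`, `toy_setns()` and the two `toy_socketat_*()` functions are hypothetical, and the setns(2) call is replaced by a stub that always succeeds and only counts its calls.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct netns_entry. */
struct toy_ns { int fd; };

static int toy_setns_calls;

/* Stub standing in for setns(2): always succeeds, counts calls. */
static int toy_setns(int fd)
{
    (void)fd;
    toy_setns_calls++;
    return 0;
}

/* Old guard: bails out whenever default_namespace is uninitialized,
 * even if the caller did not ask for any namespace (ns == NULL).
 * Returns 0 if a socket would be created, -1 otherwise. */
static int toy_socketat_old(int default_namespace, const struct toy_ns *ns)
{
    if (default_namespace < 0 || (ns && toy_setns(ns->fd) == -1))
        return -1;
    return 0;
}

/* New guard: only switches namespace when namespaces are in use. */
static int toy_socketat_new(int default_namespace, const struct toy_ns *ns)
{
    if (default_namespace >= 0 && ns && toy_setns(ns->fd) == -1)
        return -1;
    return 0;
}

/* Built with namespace support, but no namespace declared anywhere. */
static void toy_compare(void)
{
    assert(toy_socketat_old(-1, NULL) == -1); /* the reported failure */
    assert(toy_socketat_new(-1, NULL) == 0);  /* the socket is created */
}
```

In this model the old guard fails for every plain bind as soon as default_namespace is still -1, which matches the "cannot create listening socket" alerts, while the new guard never reaches setns() in that case.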


Re: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Willy Tarreau
Hi Robin,

[merging your reply and Lukas']

On Tue, Oct 20, 2015 at 08:59:27AM +0200, Robin Geuze wrote:
> Hey Willy,
>
> Recursors are not required to recurse when serving an ANY query. ANY
> query means that you ask a server (either recursor or auth) for
> everything it has on label x. If it has a CNAME on that label just
> returning that is a valid response (just like would happen if you
> queried for the CNAME type at label x). However when you ask for an A or
> AAAA record a recursor is required to follow the CNAME. Welcome to the
> wonderful world of DNS which doesn't really make sense anymore to anyone ;).

I didn't know the server was required to follow the CNAME, that's the
info I was missing. Then of course it definitely makes sense.

> Like said in the other mailthread, ANY queries are just a very
> unreliable way to get the records/types you want. Just asking for the
> actual types, if necessary in multiple queries, is the way to go. DNS is
> (usually) fast enough that the one extra query really shouldn't matter
> that much.

Lukas:
> I don't think this is CNAME specific. ANY will just return what the
> recursor has in the cache. If it isn't in the cache, ANY won't make
> the recursor ask upstream DNS servers, only A and AAAA (or MX or
> any other real qtype) will.

OK so you can get a random response based on whatever someone else asked
this server in the past. This is very useful info. We'll discuss this
with Baptiste, and very likely Andrew's patch will be taken as-is.

Willy




Re: New 1.6 features overview?

2015-10-20 Thread Pavlos Parissis
On 20/10/2015 12:49 PM, SL wrote:
> Hi, 
> 
> New 1.6 features look interesting from the news item.  Is there a
> comprehensive description of the new features anywhere?
> 
> Thanks
> 
> S

There is this:
http://blog.haproxy.com/2015/10/14/whats-new-in-haproxy-1-6/


Cheers,
Pavlos





New 1.6 features overview?

2015-10-20 Thread SL
Hi,

New 1.6 features look interesting from the news item.  Is there a
comprehensive description of the new features anywhere?

Thanks

S


Re: haproxy 1.6.0 crashes

2015-10-20 Thread Willy Tarreau
On Tue, Oct 20, 2015 at 02:14:37PM +0200, Christopher Faulet wrote:
> On 20/10/2015 14:07, Willy Tarreau wrote:
> >On Tue, Oct 20, 2015 at 01:59:52PM +0200, Willy Tarreau wrote:
> >>Then my understanding is that we should instead proceed differently :
> >>   - the cert is generated. It gets a refcount = 1.
> >>   - we assign it to the SSL. Its refcount becomes two.
> >>   - we try to insert it into the tree. The tree will handle its freeing
> >> using SSL_CTX_free() during eviction.
> >>   - if we can't insert into the tree because the tree is disabled, then
> >> we have to call SSL_CTX_free() ourselves, then we'd rather do it
> >> immediately. It will more closely mimic the case where the cert
> >> is added to the tree and immediately evicted by concurrent activity
> >> on the cache.
> >>   - we never have to call SSL_CTX_free() during ssl_sock_close() because
> >> the SSL session only relies on openssl doing the right thing based on
> >> the refcount only.
> >>   - thus we never need to know how the cert was created since the
> >> SSL_CTX_free() is either guaranteed or already done for generated
> >> certs, and this protects other ones against any accidental call to
> >> SSL_CTX_free() without having to track where the cert comes from.
> >
> >This patch does this, and based on my understanding of your explanations,
> >it should do the right thing and be safe all the time. What's your opinion ?
> >
> 
> Yes, it should work and it avoids keeping extra info on generated 
> certificates. Good idea !

Thanks. Do you have an easy reproducer for the issue with the certs ?
I tried a little bit but probably didn't test the proper sequence.

Willy




Re: haproxy 1.6.0 crashes

2015-10-20 Thread Willy Tarreau
On Tue, Oct 20, 2015 at 01:59:52PM +0200, Willy Tarreau wrote:
> Then my understanding is that we should instead proceed differently :
>   - the cert is generated. It gets a refcount = 1.
>   - we assign it to the SSL. Its refcount becomes two.
>   - we try to insert it into the tree. The tree will handle its freeing
> using SSL_CTX_free() during eviction.
>   - if we can't insert into the tree because the tree is disabled, then
> we have to call SSL_CTX_free() ourselves, then we'd rather do it
> immediately. It will more closely mimic the case where the cert
> is added to the tree and immediately evicted by concurrent activity
> on the cache.
>   - we never have to call SSL_CTX_free() during ssl_sock_close() because
> the SSL session only relies on openssl doing the right thing based on
> the refcount only.
>   - thus we never need to know how the cert was created since the
> SSL_CTX_free() is either guaranteed or already done for generated
> certs, and this protects other ones against any accidental call to
> SSL_CTX_free() without having to track where the cert comes from.

This patch does this, and based on my understanding of your explanations,
it should do the right thing and be safe all the time. What's your opinion ?

Thanks,
Willy

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 5319532..4eed2ea 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -1201,9 +1201,13 @@ ssl_sock_generate_certificate(const char *servername, struct bind_conf *bind_con
 			ssl_ctx = ssl_sock_do_create_cert(servername, serial, bind_conf, ssl);
 			lru64_commit(lru, ssl_ctx, cacert, 0, (void (*)(void *))SSL_CTX_free);
 		}
+		SSL_set_SSL_CTX(ssl, ssl_ctx);
 	}
-	else
+	else {
 		ssl_ctx = ssl_sock_do_create_cert(servername, serial, bind_conf, ssl);
+		SSL_set_SSL_CTX(ssl, ssl_ctx);
+		SSL_CTX_free(ssl_ctx);
+	}
 	return ssl_ctx;
 }
 
@@ -1271,7 +1275,6 @@ static int ssl_sock_switchctx_cbk(SSL *ssl, int *al, struct bind_conf *s)
 	if (s->generate_certs &&
 	    (ctx = ssl_sock_generate_certificate(servername, s, ssl))) {
 		/* switch ctx */
-		SSL_set_SSL_CTX(ssl, ctx);
 		return SSL_TLSEXT_ERR_OK;
 	}
 	return (s->strict_sni ?
@@ -3123,13 +3126,6 @@ static int ssl_sock_from_buf(struct connection *conn, struct buffer *buf, int fl
 static void ssl_sock_close(struct connection *conn) {
 
 	if (conn->xprt_ctx) {
-#ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME
-		if (!ssl_ctx_lru_tree && objt_listener(conn->target)) {
-			SSL_CTX *ctx = SSL_get_SSL_CTX(conn->xprt_ctx);
-			if (ctx != objt_listener(conn->target)->bind_conf->default_ctx)
-				SSL_CTX_free(ctx);
-		}
-#endif
 		SSL_free(conn->xprt_ctx);
 		conn->xprt_ctx = NULL;
 		sslconns--;

Re: haproxy 1.6.0 crashes

2015-10-20 Thread Christopher Faulet

On 20/10/2015 14:07, Willy Tarreau wrote:

On Tue, Oct 20, 2015 at 01:59:52PM +0200, Willy Tarreau wrote:

Then my understanding is that we should instead proceed differently :
   - the cert is generated. It gets a refcount = 1.
   - we assign it to the SSL. Its refcount becomes two.
   - we try to insert it into the tree. The tree will handle its freeing
 using SSL_CTX_free() during eviction.
   - if we can't insert into the tree because the tree is disabled, then
 we have to call SSL_CTX_free() ourselves, then we'd rather do it
 immediately. It will more closely mimic the case where the cert
 is added to the tree and immediately evicted by concurrent activity
 on the cache.
   - we never have to call SSL_CTX_free() during ssl_sock_close() because
 the SSL session only relies on openssl doing the right thing based on
 the refcount only.
   - thus we never need to know how the cert was created since the
 SSL_CTX_free() is either guaranteed or already done for generated
 certs, and this protects other ones against any accidental call to
 SSL_CTX_free() without having to track where the cert comes from.


This patch does this, and based on my understanding of your explanations,
it should do the right thing and be safe all the time. What's your opinion ?



Yes, it should work and it avoids keeping extra info on generated 
certificates. Good idea !


--
Christopher Faulet



Re: haproxy 1.6.0 crashes

2015-10-20 Thread Willy Tarreau
Hi Christopher,

On Tue, Oct 20, 2015 at 01:32:57PM +0200, Christopher Faulet wrote:
> >But then how do we know that an SSL_CTX is still in use when we want to
> >evict it from the cache and that we must not free it ? Is it just the
> >fact that between the moment it's picked from the cache using
> >ssl_sock_get_generated_cert() and the moment it's associated to a session
> >using SSL_set_SSL_CTX() it's not possible to yield and destroy the cached
> >object so no race is possible here ? If so I'm fine with it for now (though
> >it will become "fun" when we start to play with threads), I just want to
> >be certain we're not overlooking this part as well.
> 
> This is not an issue because when we get (or create) a SSL_CTX object 
> then it is associated to a session, without any interruption. So it 
> cannot be evicted from the cache in the middle.
> After this step, the refcount is >= 2, so, if the SSL_CTX object is 
> evicted from the cache, the refcount is decremented and the SSL_CTX is 
> not released. It will be automatically released with the closure of the 
> last SSL connection using it.

OK.

> But, now this works for a non-threaded environment. Is there any plan to 
> add thread support? If yes, this feature will not work.

Sure, we expected to be able to make some progress towards this in 1.6,
so now this is postponed to 1.7. We're at least trying not to make the
situation worse than it currently is :-)

> >Also that raises another point : if the issue is related to SSL_CTX_free()
> >being called on static contexts, then to me it means that these contexts
> >were not properly refcounted when assigned to the SSL. Don't you think
> >that we shouldn't instead do something like the following to properly
> >refcount any context attached to an SSL and ensure that the SSL_CTX_free()
> >can always be performed regardless of parallel activities in the LRU tree
> >or anything else ?
> >
> > /* Alloc a new SSL session ctx */
> > conn->xprt_ctx = 
> > SSL_new(objt_server(conn->target)->ssl_ctx.ctx);
> >+SSL_set_SSL_CTX(conn->xprt_ctx, 
> >objt_server(conn->target)->ssl_ctx.ctx);
> 
> This last call will have no effect. Because the SSL_CTX is the same, 
> this function returns immediately. Note that if the context changes, the 
> refcount of the old one is decremented.

OK so the cert's refcount is already incremented by SSL_new(), then
I don't understand why the SSL_CTX_free() fails in ssl_sock_close()
since it should only decrease the refcount from what I understand. Or
maybe it is just because we're not allowed to call it twice (which
makes sense to me) ?

> But there is no issue here, because the static contexts are only 
> released when HAProxy is stopped.

Sure but I thought we should ensure that all SSL_CTX are properly refcounted
instead of handling some of them one way and the other ones another way. I
mean, if openssl provides refcounting for us, better use it globally than
just for certain certs.

> Here is the life cycle of static contexts:
> 
>   - HAProxy is started, static contexts are initialized by calling 
> SSL_CTX_new (refcount is set to 1).

OK.

>   - SSL connections use these contexts. SSL_new or SSL_set_SSL_CTX are 
> called to assign a context to a SSL object. The refcount is incremented 
> by 1 each time. When a SSL connection is closed, a call to SSL_free is 
> done to release the SSL object and the refcount of the associated 
> context is decremented. So the refcount is always greater or equal to 1.

OK.

>   - HAProxy is stopped, all connections are closed, and finally, static 
> contexts are freed by calling SSL_CTX_free. The refcount is equal to 1, 
> so when SSL_CTX_free is called, it reaches 0 and the context is freed.

OK. My understanding here (and what you already explained) is that it's
really freed once it reaches zero, either upon SSL_CTX_free() or upon
SSL_free(). So for certs belonging to the cache, in fact we have one
call to SSL_CTX_free() then one to many calls to SSL_free().
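
To make the counting above concrete, here is a minimal sketch in C. It does
not call OpenSSL at all: struct toy_ctx and the toy_ctx_* functions are
hypothetical stand-ins that only model the reference count, under the
assumption (taken from the explanation above) that creation sets the count
to 1, each attached SSL object adds one, and the object is really freed
when the count reaches zero.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an SSL_CTX: only the reference count is modelled. */
struct toy_ctx {
    int refcount; /* number of owners (cache entry and/or SSL objects) */
    int freed;    /* set to 1 once the last reference has been dropped */
};

/* Models SSL_CTX_new(): the creator (config or cache) owns one reference. */
void toy_ctx_new(struct toy_ctx *ctx)
{
    ctx->refcount = 1;
    ctx->freed = 0;
}

/* Models SSL_new() / SSL_set_SSL_CTX(): each SSL object takes one reference. */
void toy_ctx_attach(struct toy_ctx *ctx)
{
    assert(ctx != NULL && !ctx->freed);
    ctx->refcount++;
}

/* Models SSL_CTX_free() and the release done on behalf of SSL_free():
 * drop one reference and really free only when the count reaches zero. */
void toy_ctx_release(struct toy_ctx *ctx)
{
    assert(ctx != NULL && ctx->refcount > 0);
    if (--ctx->refcount == 0)
        ctx->freed = 1;
}
```

With two connections attached, evicting the entry from the cache (one
release) leaves the count at 2, and the object only goes away when the last
connection releases it.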

> >I'm having an issue here as well since the LRU's destroy callback is set
> >to SSL_CTX_free. Thus we start with a non-null refcount. I'm sorry if I am
> >not clear, but the problem I'm having could be described like this :
> >
> >   - two sets of entities can use a shared resource at any instant : cache
> > and SSL sessions ;
> >   - each of them uses SSL_CTX_free() at release time to release the 
> >   object ;
> >   - SSL_CTX_free() takes care of the refcount to know if it must free or 
> >   not,
> > which means that these two entities above are each responsible for one
> > refcount point ;
> >   - the SSL_CTX_free() called by the cache is unconditional when the 
> >   object
> > is evicted from the cache ;
> >   - the SSL_CTX_free() is only done if the cache is enabled ;
> 
> This last step is wrong. SSL_CTX_free is only done if the cache is 
> _DISABLED_ or NULL if you prefer. SSL_CTX_free is called when 

Re: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Baptiste
Hi Andrew,

There is a bug repeated twice in your code.
In both dns_reset_resolution() and trigger_resolution(), you use
"resolution->resolver_family_priority" before it is positioned. This
may lead to using the last resolution->resolver_family_priority, which
may be different than the server one.
Please move the line "resolution->resolver_family_priority =
s->resolver_family_priority;" before using the value stored in it.
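
A minimal sketch of the ordering being asked for, assuming simplified
stand-in types: struct toy_server, struct toy_resolution and
toy_trigger_resolution() are hypothetical and only carry the fields
discussed here, they are not the HAProxy structures. The numeric values 1
and 28 are the standard DNS type codes for A and AAAA.

```c
#include <assert.h>
#include <sys/socket.h> /* AF_INET, AF_INET6 */

/* Standard DNS record type codes. */
enum qtype { QTYPE_A = 1, QTYPE_AAAA = 28 };

/* Hypothetical, reduced versions of the server and resolution objects. */
struct toy_server {
    int resolver_family_priority;
};

struct toy_resolution {
    int resolver_family_priority;
    int query_type;
};

void toy_trigger_resolution(struct toy_resolution *res,
                            const struct toy_server *s)
{
    /* Position the priority first ... */
    res->resolver_family_priority = s->resolver_family_priority;

    /* ... and only then derive the query type from it, so that a value
     * left over from a previous resolution is never consulted. */
    if (res->resolver_family_priority == AF_INET)
        res->query_type = QTYPE_A;
    else
        res->query_type = QTYPE_AAAA;
}
```

If the two steps are swapped, a resolution object that last served an IPv6
preference would send AAAA for a server that prefers IPv4.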

Apart from this, it looks good.

Baptiste


On Tue, Oct 20, 2015 at 12:39 AM, Andrew Hayworth
 wrote:
> The ANY query type is weird, and some resolvers don't 'do the legwork'
> of resolving useful things like CNAMEs. Given that upstream resolver
> behavior is not always under the control of the HAProxy administrator,
> we should not use the ANY query type. Rather, we should use A or AAAA
> according to either the explicit preferences of the operator, or the
> implicit default (AAAA/IPv6).
>
> - Andrew Hayworth
>
> From 8ed172424cbd79197aacacd1fd89ddcfa46e213d Mon Sep 17 00:00:00 2001
> From: Andrew Hayworth 
> Date: Mon, 19 Oct 2015 22:29:51 +
> Subject: [PATCH] MEDIUM: dns: Don't use the ANY query type
>
> Basically, it's ill-defined and shouldn't really be used going forward.
> We can't guarantee that resolvers will do the 'legwork' for us and
> actually resolve CNAMES when we request the ANY query-type. Case in point
> (obfuscated, clearly):
>
>   PRODUCTION! ahaywo...@secret-hostname.com:~$
>   dig @10.11.12.53 ANY api.somestartup.io
>
>   ; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> @10.11.12.53 ANY api.somestartup.io
>   ; (1 server found)
>   ;; global options: +cmd
>   ;; Got answer:
>   ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62454
>   ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0
>
>   ;; QUESTION SECTION:
>   ;api.somestartup.io.IN  ANY
>
>   ;; ANSWER SECTION:
>   api.somestartup.io. 20  IN  CNAME
> api-somestartup-production.ap-southeast-2.elb.amazonaws.com.
>
>   ;; AUTHORITY SECTION:
>   somestartup.io.   166687  IN  NS  ns-1254.awsdns-28.org.
>   somestartup.io.   166687  IN  NS  
> ns-1884.awsdns-43.co.uk.
>   somestartup.io.   166687  IN  NS  ns-440.awsdns-55.com.
>   somestartup.io.   166687  IN  NS  ns-577.awsdns-08.net.
>
>   ;; Query time: 1 msec
>   ;; SERVER: 10.11.12.53#53(10.11.12.53)
>   ;; WHEN: Mon Oct 19 22:02:29 2015
>   ;; MSG SIZE  rcvd: 242
>
> HAProxy can't handle that response correctly.
>
> Rather than try to build in support for resolving CNAMEs presented
> without an A record in an answer section (which may be a valid
> improvement further on), this change just skips ANY record types
> altogether. A and AAAA are much more well-defined and predictable.
>
> Notably, this commit preserves the implicit "Prefer IPV6 behavior."
> ---
>  include/types/dns.h |  3 ++-
>  src/checks.c|  6 +-
>  src/dns.c   |  6 +-
>  src/server.c| 18 +++---
>  4 files changed, 19 insertions(+), 14 deletions(-)
>
> diff --git a/include/types/dns.h b/include/types/dns.h
> index f8edb73..ea1a9f9 100644
> --- a/include/types/dns.h
> +++ b/include/types/dns.h
> @@ -161,7 +161,8 @@ struct dns_resolution {
>   unsigned int last_status_change; /* time of the latest DNS
> resolution status change */
>   int query_id; /* DNS query ID dedicated for this resolution */
>   struct eb32_node qid; /* ebtree query id */
> - int query_type; /* query type to send. By default DNS_RTYPE_ANY */
> + int query_type;
> + /* query type to send. By default DNS_RTYPE_A or DNS_RTYPE_AAAA
> depending on resolver_family_priority */
>   int status; /* status of the resolution being processed RSLV_STATUS_* */
>   int step; /* */
>   int try; /* current resolution try */
> diff --git a/src/checks.c b/src/checks.c
> index ade2428..d3cd567 100644
> --- a/src/checks.c
> +++ b/src/checks.c
> @@ -2214,7 +2214,11 @@ int trigger_resolution(struct server *s)
>   resolution->query_id = query_id;
>   resolution->qid.key = query_id;
>   resolution->step = RSLV_STEP_RUNNING;
> - resolution->query_type = DNS_RTYPE_ANY;
> + if (resolution->resolver_family_priority == AF_INET) {
> + resolution->query_type = DNS_RTYPE_A;
> + } else {
> + resolution->query_type = DNS_RTYPE_AAAA;
> + }
>   resolution->try = resolvers->resolve_retries;
>   resolution->try_cname = 0;
>   resolution->nb_responses = 0;
> diff --git a/src/dns.c b/src/dns.c
> index 7f71ac7..53b65ab 100644
> --- a/src/dns.c
> +++ b/src/dns.c
> @@ -102,7 +102,11 @@ void dns_reset_resolution(struct dns_resolution
> *resolution)
>   resolution->qid.key = 0;
>
>   /* default values */
> - resolution->query_type = DNS_RTYPE_ANY;
> + if (resolution->resolver_family_priority == AF_INET) {
> + resolution->query_type = DNS_RTYPE_A;
> + } else {
> + resolution->query_type = DNS_RTYPE_AAAA;
> + }
>
>   /* the second 

Re: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Willy Tarreau
Hi Andrew,

On Mon, Oct 19, 2015 at 05:39:58PM -0500, Andrew Hayworth wrote:
> The ANY query type is weird, and some resolvers don't 'do the legwork'
> of resolving useful things like CNAMEs. Given that upstream resolver
> behavior is not always under the control of the HAProxy administrator,
> we should not use the ANY query type. Rather, we should use A or AAAA
> according to either the explicit preferences of the operator, or the
> implicit default (AAAA/IPv6).

But how does that fix the problem for you ? In your example below,
the server clearly doesn't provide any A nor AAAA in the response
so asking it for A or AAAA should not work either if it doesn't
recurse, am I wrong ?

>   PRODUCTION! ahaywo...@secret-hostname.com:~$
>   dig @10.11.12.53 ANY api.somestartup.io
> 
>   ; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> @10.11.12.53 ANY api.somestartup.io
>   ; (1 server found)
>   ;; global options: +cmd
>   ;; Got answer:
>   ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62454
>   ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0
> 
>   ;; QUESTION SECTION:
>   ;api.somestartup.io.IN  ANY
> 
>   ;; ANSWER SECTION:
>   api.somestartup.io. 20  IN  CNAME 
> api-somestartup-production.ap-southeast-2.elb.amazonaws.com.

(...)

I fear that such a change will prevent CNAMEs from working for many
users where the DNS servers work fine, and will not necessarily fix
the problems for other people.

Regards,
willy




RE: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Lukas Tribus
> Hi Andrew,
>
> On Mon, Oct 19, 2015 at 05:39:58PM -0500, Andrew Hayworth wrote:
>> The ANY query type is weird, and some resolvers don't 'do the legwork'
>> of resolving useful things like CNAMEs. Given that upstream resolver
>> behavior is not always under the control of the HAProxy administrator,
>> we should not use the ANY query type. Rather, we should use A or AAAA
>> according to either the explicit preferences of the operator, or the
>> implicit default (AAAA/IPv6).
>
> But how does that fix the problem for you ? In your example below,
> the server clearly doesn't provide any A nor AAAA in the response
> so asking it for A or AAAA should not work either if it doesn't
> recurse, am I wrong ?

I don't think this is CNAME specific. ANY will just return what the
recursor has in the cache. If it isn't in the cache, ANY won't make
the recursor ask upstream DNS servers, only A and AAAA (or MX or
any other real qtype) will.

Just switching away from ANY is not enough, we still need to fall back
from AAAA to A and vice versa on NX responses for single-homed
nodes.
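
As a rough illustration of that fallback, here is a minimal sketch in C.
toy_next_type_on_nx() is a hypothetical helper, not an existing HAProxy
function; it assumes only two candidate families and that the caller
remembers which type it tried first.

```c
#include <assert.h>

/* NX_NONE means "nothing left to try"; 1 and 28 are the DNS type codes
 * for A and AAAA. */
enum nx_qtype { NX_NONE = 0, NX_A = 1, NX_AAAA = 28 };

/* Called when a query of type failed_type got an NX response.
 * first_type is the type that was tried first for this resolution.
 * Returns the next type to query, or NX_NONE when both were tried. */
int toy_next_type_on_nx(int failed_type, int first_type)
{
    int other = (failed_type == NX_A) ? NX_AAAA : NX_A;

    /* If the other family is the one we started with, both have failed. */
    if (other == first_type)
        return NX_NONE;
    return other;
}
```

A node that only publishes A records and is queried for AAAA first would
thus get one more attempt with A before the resolution is declared failed.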



Lukas

  


Re: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Willy Tarreau
On Tue, Oct 20, 2015 at 07:36:20PM +0200, Lukas Tribus wrote:
> Hi,
> 
> 
> >> A simple option in the resolvers section to instruct HAProxy to not
> >> give up on NX and fail over to the next family:
> >> option on-nx-try-next-family
> >
> > I personally find this confusing from the user's point of view.
> 
> Agreed, we should have good and safe defaults, and address corner
> cases with additional options, not the other way around.

Definitely. We used to know this situation till 1.4 included with
http working in tunnel mode by default and not doing the right thing
by default. Many of the problem reports were caused by the fact that
newcomers didn't know such specificities.

> > When you know that you can only use IPv4 to join the next server, I
> > think this :
> >
> > server remote1 remote1.mydomain check v4only
> >
> > is more obvious than this :
> >
> > option on-nx-try-next-family
> > server remote1 remote1.mydomain check prefer-ipv4
> 
> Actually I think "v4only" would be "prefer-ipv4" without
> on-nx-try-next-family, right? Anyway, I agree.

Yes that's it.

> Without automatic AF fallback and without ANY queries, the
> "prefer" keyword actually is restricting, and not preferring.

I would simply ignore prefer when v[46]only is set.

> > Also, it covers the case where some servers are known to support both
> > protocols while others are limited. This allows for example to join
> > the same remote server over two possible families behind a DSL line
> > which uses a random IP address after each reconnection :
> >
> > server home-v4 home-v4.mydomain check v4only
> > server home-v6 home-v6.mydomain check v6only
> >
> > And since we already have v4only/v6only on bind lines, the analogy
> > seems easy to remember.
> 
> The behavior with v4only or v6only is quite obvious, we just query that
> particular address family, but let me clarify: you are implying that
> without v4only/v6only keyword, we query one address family and then
> fallback to the other address family in case we get a NX response, right?

Yes that's it. And in this case it's prefer that gives the ordering.

> I think that's a good solution.

Thanks :-)

> Question: are we still talking about 1.6 here? It seems we have to
> make some intrusive changes that may break configurations (but they
> seem mandatory to get consistent and predictable behavior).

I don't know. I'm always only focused on the combination of user-visible
changes and risks of bugs (which are user-visible changes btw). So if we
can do it without breaking too much code, then it can be backported. What
we have now is something which is apparently insufficient to some users
so we can improve the situation. I wouldn't want to remove prefer-* or
change the options behaviour or whatever for example.

> By the amount of people that already hit the ANY issue (3 or more?),
> I would say we better break a small number of configurations between
> 1.6 and 1.6.1,

Normally we don't break them. Currently prefer can pick either of the two families
after a response to an ANY request, which is what is still currently being
done. It only doesn't retry after it tries a specific family. The only
difference will be that if a config reports NX for, say, A, then today it
doesn't retry and will cause the server to fail while after the change it
will allow the server to continue in v6 or to fail as well. So with this
change we will be very close to the current behaviour, and offer everyone
the option to fix their preference and make them restrictions.

> than having to deal with the fallout of the ANY issue
> (because the ANY removal changes resolve-prefer behavior as well)
> for the time that 1.6 is supported.

Absolutely. We had such related discussions with Baptiste during the
design and for many such choices, it was hard to get responses from the
users asking for the feature (and several of them were disagreeing on a
number of possibilities). It's common in fact, people want something
"very simple" and overlook the hidden complexities since corner cases are
just for others, they don't happen for them. So we decided to go the
modest route and see if anything required to be enforced. I still think
it was the best option.

Thanks very much for sharing your opinion, that definitely helps!

Willy




RE: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Lukas Tribus
> I don't know. I'm always only focused on the combination of user-visible
> changes and risks of bugs (which are user-visible changes btw). So if we
> can do it without breaking too much code, then it can be backported. What
> we have now is something which is apparently insufficient to some users
> so we can improve the situation. I wouldn't want to remove prefer-* or
> change the options behavior or whatever for example.

Ok, if we don't remove existing prefer-* keywords a 1.6 backport sounds
possible without user visible breakage, great.


lukas

  


RE: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Lukas Tribus
Hi,


>> A simple option in the resolvers section to instruct HAProxy to not
>> give up on NX and fail over to the next family:
>> option on-nx-try-next-family
>
> I personally find this confusing from the user's point of view.

Agreed, we should have good and safe defaults, and address corner
cases with additional options, not the other way around.



> When you know that you can only use IPv4 to join the next server, I
> think this :
>
> server remote1 remote1.mydomain check v4only
>
> is more obvious than this :
>
> option on-nx-try-next-family
> server remote1 remote1.mydomain check prefer-ipv4

Actually I think "v4only" would be "prefer-ipv4" without
on-nx-try-next-family, right? Anyway, I agree.

Without automatic AF fallback and without ANY queries, the
"prefer" keyword actually is restricting, and not preferring.



> Also, it covers the case where some servers are known to support both
> protocols while others are limited. This allows for example to join
> the same remote server over two possible families behind a DSL line
> which uses a random IP address after each reconnection :
>
> server home-v4 home-v4.mydomain check v4only
> server home-v6 home-v6.mydomain check v6only
>
> And since we already have v4only/v6only on bind lines, the analogy
> seems easy to remember.

The behavior with v4only or v6only is quite obvious, we just query that
particular address family, but let me clarify: you are implying that
without v4only/v6only keyword, we query one address family and then
fallback to the other address family in case we get a NX response, right?

I think that's a good solution.


Question: are we still talking about 1.6 here? It seems we have to
make some intrusive changes that may break configurations (but they
seem mandatory to get consistent and predictable behavior).

By the amount of people that already hit the ANY issue (3 or more?),
I would say we better break a small number of configurations between
1.6 and 1.6.1, than having to deal with the fallout of the ANY issue
(because the ANY removal changes resolve-prefer behavior as well)
for the time that 1.6 is supported.



Regards,

Lukas

  


TCP raw socket data compress with haproxy

2015-10-20 Thread Tufan Gürsu

Hi,

We want to use zlib to compress/uncompress TCP data between TCP sessions.
There is only compression code for HTTP, not for TCP. I did some
research and ran into the problem of the missing chunk size.

Is there any sample or development for this scenario?
We are currently using stunnel. I think it uses a higher-level protocol to
compress and uncompress on top of TCP.


Thanks for all your help.


Re: TCP raw socket data compress with haproxy

2015-10-20 Thread Willy Tarreau
Hi,

On Tue, Oct 20, 2015 at 06:39:17PM +0300, Tufan Gürsu wrote:
> Hi,
> 
> We want to use zlib to compress/uncompress TCP data between TCP sessions.
> There is only compression code for HTTP, not for TCP. I did some
> research and ran into the problem of the missing chunk size.
> Is there any sample or development for this scenario?
> We are currently using stunnel. I think it uses a higher-level protocol to
> compress and uncompress on top of TCP.

There's nothing planned regarding this and no standard way to achieve
it either. Also, using zlib to compress live streams is really not a
good idea considering how slow it is compared to more suited algorithms
such as LZO, LZ4, zstd, snappy, etc that can be up to 20 times faster.

Regards,
Willy




Re: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Willy Tarreau
On Tue, Oct 20, 2015 at 06:26:38PM +0200, Baptiste wrote:
> > Also, we will have to address the issue that a server may just use
> > a single address-family, therefore we have to fall back between A
> > and AAAA, because an NX on an AAAA query doesn't mean there are no
> > A records.
> 
> Hi Lukas,
> 
> I do agree on this point.
> A simple option in the resolvers section to instruct HAProxy to not
> give up on NX and fail over to the next family:
>  option on-nx-try-next-family

I personally find this confusing from the user's point of view. My
translation of what it does is to allow cross-family requests instead
of limiting requests to the preferred family.

Thus it seems to me that users will have to carefully configure both
this option and the prefer field to select the required family. In
practice I guess most people will simply want "v4only" or "v6only"
as alternatives to "prefer-ipv4" or "prefer-ipv6".

When you know that you can only use IPv4 to join the next server, I
think this :

server remote1 remote1.mydomain check v4only

is more obvious than this :

option on-nx-try-next-family
server remote1 remote1.mydomain check prefer-ipv4

Also, it covers the case where some servers are known to support both
protocols while others are limited. This allows for example to join
the same remote server over two possible families behind a DSL line
which uses a random IP address after each reconnection :

server home-v4 home-v4.mydomain check v4only
server home-v6 home-v6.mydomain check v6only

And since we already have v4only/v6only on bind lines, the analogy
seems easy to remember.

Willy




Re: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Baptiste
> Also, we will have to address the issue that a server may just use
> a single address-family, therefore we have to fall back between A
> and AAAA, because an NX on an AAAA query doesn't mean there are no
> A records.

Hi Lukas,

I do agree on this point.
A simple option in the resolvers section to instruct HAProxy to not
give up on NX and fail over to the next family:
 option on-nx-try-next-family

The magic should happen in snr_resolution_error_cb().

Baptiste



Re: 1.6.0 Error: Cannot Create Listening Socket for Frontend and Stats,Proxies

2015-10-20 Thread Susheel Jalali

Dear Willy and Lukas,

Thank you for your guidance.  Upon implementing your insights, here is
the summed up result:

(1) With Willy’s patch HAProxy starts

(2) But we have to remove listen haproxystats block, as it still cannot
create listening socket for this listen proxy.

Detailed information you have requested is below without and with
Willy’s patch.  Your further guidance to fix the item (2) above would be 
appreciated.  Thank you.


---

@Willy:  HAProxy is running on two of our servers that have CentOS 7.1
(Server1) and CentOS 7 (Server2), which support namespace.  Checked
with: ip netns add namespace1 and then ip netns list -->This listed
namespace1


WITHOUT Willy’s patch

@Lukas and Willy
We installed HAProxy 1.5.14 and 1.6.0 on both Server1 and Server2.
1.5.14 starts and functions normally on both.
1.6.0 fails after start without creating pid.

ATTACHED are strace output from running the following on the same server 
with the same configuration (including port numbers):

HAProxy 1.5.14
HAProxy 1.6.0


WITH Willy’s Patch

@Willy
We patched your fixns.diff to namespace.c

Running haproxy in debug mode gives: Oct 20 14:03:35 localhost haproxy:
Starting haproxy: [ALERT] 292/140335 (7337) : Starting proxy
haproxystats: cannot bind socket []

Upon removing ‘listen haproxystats’ block from HAProxy configuration
(see below), it starts normally

++
Haproxy configuration
++
global
   log 127.0.0.1 local2
   pidfile /var/run/haproxy.pid
   user    haproxy
   group   haproxy
   #daemon
   debug
   chroot  /var/log/haproxy/
   stats socket  /var/log/haproxy/haproxy.stats
defaults
   mode    http
   option  abortonclose
   option  http-server-close
[….]
frontend webapps-frontend
   bind  *:80 name xxx
   bind  *:443 name yyy ssl crt /path/to/server.pem
[….]
listen haproxystats
   bind  Server_IP:
[….]

Thank you.

Sincerely,
-- --
Susheel Jalali
Coscend Communications Solutions
susheel.jal...@coscend.com

www.Coscend.com
-

On 10/20/15 15:02, Willy Tarreau wrote:
> On Tue, Oct 20, 2015 at 11:20:12AM +0200, Willy Tarreau wrote:
>> On Tue, Oct 20, 2015 at 10:54:58AM +0200, Lukas Tribus wrote:
>>>> Dear Willy,
>>>>
>>>> Thank you for your insights. As you advised, below is the output of
>>>> haproxy -f ?cfg -db -V.
>>>
>>> Can you run this through strace (strace haproxy -f ?cfg -db -V) and
>>> provide the output.
>>>
>>> Also, if you have the strace output of a successful startup of 1.5.14 for
>>> comparison, that would be very helpful as well.
>>
>> Yes definitely. Actually I'm seeing one difference between the two versions,
>> it's the introduction of namespaces in 1.6.0. If it was built with support
>> for namespaces and they are not supported in the operating system, I'm not
>> seeing how my_socketat() can recover in case setns() returns -1, which
>> happens when default_namespace = -1, which is the default case before
>> initialization :
>>
>> #ifdef CONFIG_HAP_NS
>> if (default_namespace < 0 ||
>> (ns && setns(ns->fd, CLONE_NEWNET) == -1))
>> return -1;
>> #endif
>
> OK it's clear there's a bug here in my opinion because default_namespace
> is *only* initialized if there are explicit namespaces. I could reproduce
> the issue here, you simply need to build with USE_NS=1 and to declare no
> namespace anywhere. Here's a proposed fix which works for me. Please
> confirm.
>
> Willy
>
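
The patch itself is not reproduced in this thread, so the following is only
a sketch of the idea, under the assumption that a namespace switch should be
attempted solely when a namespace was both initialized and requested. The
function names and parameters are hypothetical; setns() is replaced by a
plain integer so the logic can be read in isolation.

```c
#include <assert.h>

/* The quoted condition, restated with plain integers:
 *   default_ns_fd  fd of the default namespace, -1 when none was set up
 *   has_ns         non-zero when the bind/server line names a namespace
 *   setns_ret      what setns() would return (-1 on error)
 * It reports failure as soon as default_ns_fd is negative, even when no
 * namespace was requested at all. */
int toy_quoted_check(int default_ns_fd, int has_ns, int setns_ret)
{
    if (default_ns_fd < 0 || (has_ns && setns_ret == -1))
        return -1;
    return 0;
}

/* One possible alternative: only consider switching namespaces when they
 * were initialized AND one was requested. Returns 1 when setns() must be
 * called before socket(), 0 when a plain socket() is enough. */
int toy_needs_setns(int default_ns_fd, int has_ns)
{
    return default_ns_fd >= 0 && has_ns;
}
```

With no namespace declared anywhere (default_ns_fd == -1, has_ns == 0), the
first function fails unconditionally, which matches the symptom reported
here, while the second one simply skips the namespace handling.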






strace.haproxy-1.15.4
Description: Binary data


strace.haproxy-1.6.0
Description: Binary data


Re: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Baptiste
On Tue, Oct 20, 2015 at 9:09 PM, Lukas Tribus  wrote:
>> I don't know. I'm always only focused on the combination of user-visible
>> changes and risks of bugs (which are user-visible changes btw). So if we
>> can do it without breaking too much code, then it can be backported. What
>> we have now is something which is apparently insufficient to some users
>> so we can improve the situation. I wouldn't want to remove prefer-* or
>> change the options behavior or whatever for example.
>
> Ok, if we don't remove existing prefer-* keywords a 1.6 backport sounds
> possible without user visible breakage, great.
>
> lukas

Ok, just to make it clear, let me write a few conf examples:
- server home-v4 home-v4.mydomain check resolve-prefer ipv4
 => A then AAAA (failover on NX)
- server home-v4 home-v4.mydomain check v4only
 => A only (stop on NX)

If both 'resolve-prefer ipv[46]' and 'v[46]only' are set, whatever
combination, then, v[46]only applies, but configuration parsing may
return a warning.
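
The precedence described above can be sketched as follows. struct
toy_srv_opts and toy_query_plan() are hypothetical names used only for this
illustration, and "v4only"/"v6only" are the keywords proposed in this
thread, not existing server options.

```c
#include <assert.h>

/* DNS type codes for A and AAAA. */
enum plan_qtype { PLAN_A = 1, PLAN_AAAA = 28 };

/* Hypothetical summary of the relevant server settings. */
struct toy_srv_opts {
    int prefer_v4; /* 1: "resolve-prefer ipv4", 0: "resolve-prefer ipv6" */
    int only;      /* 0: no restriction, 4: "v4only", 6: "v6only" */
};

/* Fills plan[] with the query types to try, in order, moving to the next
 * entry on NX. Returns the number of entries (1 or 2). */
int toy_query_plan(const struct toy_srv_opts *o, int plan[2])
{
    /* A v[46]only restriction wins over any preference: one family only. */
    if (o->only == 4) {
        plan[0] = PLAN_A;
        return 1;
    }
    if (o->only == 6) {
        plan[0] = PLAN_AAAA;
        return 1;
    }

    /* Otherwise the preference only gives the ordering. */
    plan[0] = o->prefer_v4 ? PLAN_A : PLAN_AAAA;
    plan[1] = o->prefer_v4 ? PLAN_AAAA : PLAN_A;
    return 2;
}
```

So "resolve-prefer ipv4" alone yields A then AAAA, while adding "v4only"
yields A only, whatever the preference says.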

So we don't break compatibility with current code and way of working!
Brilliant guys :)

Baptiste



Re: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Baptiste
Hi Andrew,

I've updated your patch quickly so Willy can integrate it.
I've also updated the commit message to follow Lukas recommendations.

Baptiste

On Tue, Oct 20, 2015 at 2:26 PM, Baptiste  wrote:
> Hi Andrew,
>
> There is a bug repeated twice in your code.
> In both dns_reset_resolution() and trigger_resolution(), you use
> "resolution->resolver_family_priority" before it is positioned. This
> may lead to using the last resolution->resolver_family_priority, which
> may be different than the server one.
> Please move the line "resolution->resolver_family_priority =
> s->resolver_family_priority;" before using the value stored in it.
>
> Apart from this, it looks good.
>
> Baptiste
>
>
> On Tue, Oct 20, 2015 at 12:39 AM, Andrew Hayworth
>  wrote:
>> The ANY query type is weird, and some resolvers don't 'do the legwork'
>> of resolving useful things like CNAMEs. Given that upstream resolver
>> behavior is not always under the control of the HAProxy administrator,
>> we should not use the ANY query type. Rather, we should use A or AAAA
>> according to either the explicit preferences of the operator, or the
>> implicit default (AAAA/IPv6).
>>
>> - Andrew Hayworth
>>
>> From 8ed172424cbd79197aacacd1fd89ddcfa46e213d Mon Sep 17 00:00:00 2001
>> From: Andrew Hayworth 
>> Date: Mon, 19 Oct 2015 22:29:51 +
>> Subject: [PATCH] MEDIUM: dns: Don't use the ANY query type
>>
>> Basically, it's ill-defined and shouldn't really be used going forward.
>> We can't guarantee that resolvers will do the 'legwork' for us and
>> actually resolve CNAMES when we request the ANY query-type. Case in point
>> (obfuscated, clearly):
>>
>>   PRODUCTION! ahaywo...@secret-hostname.com:~$
>>   dig @10.11.12.53 ANY api.somestartup.io
>>
>>   ; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> @10.11.12.53 ANY api.somestartup.io
>>   ; (1 server found)
>>   ;; global options: +cmd
>>   ;; Got answer:
>>   ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62454
>>   ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0
>>
>>   ;; QUESTION SECTION:
>>   ;api.somestartup.io.IN  ANY
>>
>>   ;; ANSWER SECTION:
>>   api.somestartup.io. 20  IN  CNAME
>> api-somestartup-production.ap-southeast-2.elb.amazonaws.com.
>>
>>   ;; AUTHORITY SECTION:
>>   somestartup.io.   166687  IN  NS  
>> ns-1254.awsdns-28.org.
>>   somestartup.io.   166687  IN  NS  
>> ns-1884.awsdns-43.co.uk.
>>   somestartup.io.   166687  IN  NS  ns-440.awsdns-55.com.
>>   somestartup.io.   166687  IN  NS  ns-577.awsdns-08.net.
>>
>>   ;; Query time: 1 msec
>>   ;; SERVER: 10.11.12.53#53(10.11.12.53)
>>   ;; WHEN: Mon Oct 19 22:02:29 2015
>>   ;; MSG SIZE  rcvd: 242
>>
>> HAProxy can't handle that response correctly.
>>
>> Rather than try to build in support for resolving CNAMEs presented
>> without an A record in an answer section (which may be a valid
>> improvement further on), this change just skips ANY record types
>> altogether. A and AAAA are much more well-defined and predictable.
>>
>> Notably, this commit preserves the implicit "Prefer IPV6 behavior."
>> ---
>>  include/types/dns.h |  3 ++-
>>  src/checks.c|  6 +-
>>  src/dns.c   |  6 +-
>>  src/server.c| 18 +++---
>>  4 files changed, 19 insertions(+), 14 deletions(-)
>>
>> diff --git a/include/types/dns.h b/include/types/dns.h
>> index f8edb73..ea1a9f9 100644
>> --- a/include/types/dns.h
>> +++ b/include/types/dns.h
>> @@ -161,7 +161,8 @@ struct dns_resolution {
>>   unsigned int last_status_change; /* time of the latest DNS
>> resolution status change */
>>   int query_id; /* DNS query ID dedicated for this resolution */
>>   struct eb32_node qid; /* ebtree query id */
>> - int query_type; /* query type to send. By default DNS_RTYPE_ANY */
>> + int query_type;
>> + /* query type to send. By default DNS_RTYPE_A or DNS_RTYPE_AAAA
>> depending on resolver_family_priority */
>>   int status; /* status of the resolution being processed RSLV_STATUS_* */
>>   int step; /* */
>>   int try; /* current resolution try */
>> diff --git a/src/checks.c b/src/checks.c
>> index ade2428..d3cd567 100644
>> --- a/src/checks.c
>> +++ b/src/checks.c
>> @@ -2214,7 +2214,11 @@ int trigger_resolution(struct server *s)
>>   resolution->query_id = query_id;
>>   resolution->qid.key = query_id;
>>   resolution->step = RSLV_STEP_RUNNING;
>> - resolution->query_type = DNS_RTYPE_ANY;
>> + if (resolution->resolver_family_priority == AF_INET) {
>> + resolution->query_type = DNS_RTYPE_A;
>> + } else {
>> + resolution->query_type = DNS_RTYPE_AAAA;
>> + }
>>   resolution->try = resolvers->resolve_retries;
>>   resolution->try_cname = 0;
>>   resolution->nb_responses = 0;
>> diff --git a/src/dns.c b/src/dns.c
>> index 7f71ac7..53b65ab 100644
>> --- a/src/dns.c
>> +++ b/src/dns.c
>> @@ -102,7 

Re: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Willy Tarreau
On Tue, Oct 20, 2015 at 10:07:16PM +0200, Baptiste wrote:
> Hi Andrew,
> 
> I've updated your patch quickly so Willy can integrate it.
> I've also updated the commit message to follow Lukas recommendations.

Thanks Baptiste, I've merged it and backported it to 1.6.
I'm tempted to issue 1.6.1 right now as we have a number of
fixes pending already. Susheel's remaining issue is quite
strange and I'm actually not convinced we'll get rid of it
that quickly.

Willy




Re: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Willy Tarreau
On Tue, Oct 20, 2015 at 10:20:50PM +0200, Baptiste wrote:
> On Tue, Oct 20, 2015 at 9:09 PM, Lukas Tribus  wrote:
> >> I don't know. I'm always only focused on the combination of user-visible
> >> changes and risks of bugs (which are user-visible changes btw). So if we
> >> can do it without breaking too much code, then it can be backported. What
> >> we have now is something which is apparently insufficient to some users
> >> so we can improve the situation. I wouldn't want to remove prefer-* or
> >> change the options behavior or whatever for example.
> >
> > Ok, if we don't remove existing prefer-* keywords a 1.6 backport sounds
> > possible without user visible breakage, great.
> >
> > lukas
> 
> Ok, just to make it clear, let me write a few conf examples:
> - server home-v4 home-v4.mydomain check resolve-prefer ipv4
>  => A then AAAA (failover on NX)
> - server home-v4 home-v4.mydomain check v4only
>  => A only (stop on NX)
> 
> If both 'resolve-prefer ipv[46]' and 'v[46]only' are set, whatever
> combination, then, v[46]only applies, but configuration parsing may
> return a warning.

Yes, but please avoid the warning, it makes it inconvenient to edit
configs. You may for example have "resolve-prefer ipv4" in the
default-server directive, and having it warn because one of your
servers has v4only is annoying. BTW, the v4only and resolve-prefer
should also be used during the initial resolving phase performed
by getaddrinfo() but that's for a future patch :-)

Willy
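
The precedence discussed above can be sketched in C. This is only an
illustration under stated assumptions: "v4only"/"v6only" are keywords
proposed in this thread, and every name in the block (make_plan,
next_on_nx, the enums) is hypothetical rather than HAProxy code. It
assumes that "only" silently overrides "prefer" (no warning) and that the
implicit default remains IPv6 first.

```c
#include <assert.h>

/* All names below are hypothetical; none of them exist in HAProxy. */
enum qtype { Q_A = 1, Q_AAAA = 28 };           /* standard DNS type codes */
enum family_opt { OPT_NONE, OPT_V4, OPT_V6 };  /* unset, ipv4, ipv6 */

struct query_plan {
    enum qtype first;  /* query type to send first */
    int failover;      /* 1: try the other family on NX, 0: stop on NX */
};

/* v[46]only silently overrides resolve-prefer: no warning is emitted. */
static struct query_plan make_plan(enum family_opt prefer, enum family_opt only)
{
    struct query_plan p;

    if (only != OPT_NONE) {
        p.first = (only == OPT_V4) ? Q_A : Q_AAAA;
        p.failover = 0;
    } else {
        /* no preference given: keep the implicit IPv6-first default */
        p.first = (prefer == OPT_V4) ? Q_A : Q_AAAA;
        p.failover = 1;
    }
    assert(p.first == Q_A || p.first == Q_AAAA);
    return p;
}

/* Query type to try after an NX answer, or 0 when the plan says stop. */
static int next_on_nx(struct query_plan p)
{
    if (!p.failover)
        return 0;
    return (p.first == Q_A) ? Q_AAAA : Q_A;
}
```

With "resolve-prefer ipv4" alone, make_plan(OPT_V4, OPT_NONE) sends A first
and next_on_nx() yields AAAA. With "v4only" on the server and
"resolve-prefer ipv6" inherited from default-server, make_plan(OPT_V6,
OPT_V4) still sends A and stops on NX.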




Re: 1.6.0 Error: Cannot Create Listening Socket for Frontend and Stats,Proxies

2015-10-20 Thread Willy Tarreau
Hi Susheel,

On Wed, Oct 21, 2015 at 01:29:33AM +0530, Susheel Jalali wrote:
> Dear Willy and Lukas,
> 
> Thank you for your guidance.  Upon implementing your insights, here is
> the summed up result:
> 
> (1) With Willy's patch HAProxy starts

OK thanks for confirming this.

> (2) But we have to remove listen haproxystats block, as it still cannot
> create listening socket for this listen proxy.

This is *really* strange.

> Detailed information you have requested is below without and with
> Willy's patch.  Your further guidance to fix the item (2) above would be
> appreciated.  Thank you.
> 
> ---
> 
> @Willy:  HAProxy is running on two of our servers that have CentOS 7.1
> (Server1) and CentOS 7 (Server2), which support namespace.  Checked
> with: ip netns add namespace1 and then ip netns list -->This listed
> namespace1
> 
> 
> WITHOUT Willy's patch
> 
> @Lukas and Willy
> We installed HAProxy 1.5.14 and 1.6.0 on both Server1 and Server2.
> 1.5.14 starts and functions normally on both.
> 1.6.0 fails after start without creating pid.
> 
> ATTACHED are strace output from running the following on the same server 
> with the same configuration (including port numbers):
> HAProxy 1.5.14
> HAProxy 1.6.0

Could you please also do the same on the patched version ? The unpatched
one clearly shows the bug I fixed (ie: the socket syscall is not even
called). But with the patch we must see it and I have no idea why it
could fail at all, since it does exactly the same thing as the original
socket() call did. Do you have any option on the "bind" line of the
haproxystats listener ?

Thanks,
Willy
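
When reading such an strace, it helps to know which call in the
socket/bind/listen sequence failed and with which errno. The helper below
is a minimal sketch of that sequence for an IPv4 TCP listener. It is not
HAProxy's code, and try_listen and its "step" output are hypothetical
names introduced only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper, not HAProxy code. Returns a listening fd, or -1
 * with *step naming the call that failed and errno describing why. */
static int try_listen(const char *ip, unsigned short port, const char **step)
{
    struct sockaddr_in sa;
    int one = 1;
    int fd, saved;

    assert(ip && step);
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);

    *step = "inet_pton";
    if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1) {
        errno = EINVAL;
        return -1;
    }

    *step = "socket";
    fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0)
        return -1;

    *step = "setsockopt";
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0)
        goto fail;

    /* EADDRINUSE: another process holds the port;
     * EACCES: privileged port without sufficient rights */
    *step = "bind";
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
        goto fail;

    *step = "listen";
    if (listen(fd, 16) < 0)
        goto fail;

    *step = "ok";
    return fd;

fail:
    saved = errno;      /* close() may overwrite errno */
    close(fd);
    errno = saved;
    return -1;
}
```

A caller can print step and strerror(errno) on failure. A failure reported
before the "socket" step is reached corresponds to the unpatched case
described above, where the socket syscall never shows up in the trace.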




[ANNOUNCE] haproxy-1.6.1

2015-10-20 Thread Willy Tarreau
Hi all,

we've got rid of all the reported bugs since 1.6.0 so it's the right
timing for a new release so that those who got burnt by these bugs
can play with fire again... just kidding, it should be much better now.

The changelog is very small, which is a very good thing for one week
after a dot-zero release, really! In 1.5, we fixed 7 bugs in 5 days,
here it's 3 in 7 days.

The most impacting bugs were the segfault when 2 crts were on the same
bind line, and the bug with namespaces preventing from binding if no
namespace was declared at all. The rest concerns DNS adjustments to
better query servers and avoid the ANY query type, and a few build
fixes. We still have Susheel's report under investigation, as nothing
obvious could cause his isolated binding error, but we don't yet have
all the elements to evaluate it. It could also be a side effect of an
unclean rebuild or something like this, and I couldn't manage to
reproduce it.

The full changelog for 1.6.1 is here :
- DOC: specify that stats socket doc (section 9.2) is in management
- BUILD: install only relevant and existing documentation
- CLEANUP: don't ignore debian/ directory if present
- BUG/MINOR: dns: parsing error of some DNS response
- BUG/MEDIUM: namespaces: don't fail if no namespace is used
- BUG/MAJOR: ssl: free the generated SSL_CTX if the LRU cache is disabled
- MEDIUM: dns: Don't use the ANY query type

Usual URLs below :
Site index   : http://www.haproxy.org/
Sources  : http://www.haproxy.org/download/1.6/src/
Git repository   : http://git.haproxy.org/git/haproxy-1.6.git/
Git Web browsing : http://git.haproxy.org/?p=haproxy-1.6.git
Changelog: http://www.haproxy.org/download/1.6/src/CHANGELOG
Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Continue to deeply test and to carefully deploy, observe and enjoy.

Willy




Re: [PATCH] MEDIUM: dns: Don't use the ANY query type

2015-10-20 Thread Andrew Hayworth
Oh wonderful - something's come up that would have blocked me from
working on this until next week, so thank you very much for updating
it for me!

On Tue, Oct 20, 2015 at 3:07 PM, Baptiste  wrote:
> Hi Andrew,
>
> I've updated your patch quickly so Willy can integrate it.
> I've also updated the commit message to follow Lukas recommendations.
>
> Baptiste
>
> On Tue, Oct 20, 2015 at 2:26 PM, Baptiste  wrote:
>> Hi Andrew,
>>
>> There is a bug repeated twice in your code.
>> In both dns_reset_resolution() and trigger_resolution(), you use
>> "resolution->resolver_family_priority" before it is positioned. This
>> may lead to using the last resolution->resolver_family_priority, which
>> may be different than the server one.
>> Please move the line "resolution->resolver_family_priority =
>> s->resolver_family_priority;" before using the value stored in it.
>>
>> Apart from this, it looks good.
>>
>> Baptiste
>>
>>
>> On Tue, Oct 20, 2015 at 12:39 AM, Andrew Hayworth
>>  wrote:
>>> The ANY query type is weird, and some resolvers don't 'do the legwork'
>>> of resolving useful things like CNAMEs. Given that upstream resolver
>>> behavior is not always under the control of the HAProxy administrator,
>>> we should not use the ANY query type. Rather, we should use A or AAAA
>>> according to either the explicit preferences of the operator, or the
>>> implicit default (AAAA/IPv6).
>>>
>>> - Andrew Hayworth
>>>
>>> From 8ed172424cbd79197aacacd1fd89ddcfa46e213d Mon Sep 17 00:00:00 2001
>>> From: Andrew Hayworth 
>>> Date: Mon, 19 Oct 2015 22:29:51 +0000
>>> Subject: [PATCH] MEDIUM: dns: Don't use the ANY query type
>>>
>>> Basically, it's ill-defined and shouldn't really be used going forward.
>>> We can't guarantee that resolvers will do the 'legwork' for us and
>>> actually resolve CNAMES when we request the ANY query-type. Case in point
>>> (obfuscated, clearly):
>>>
>>>   PRODUCTION! ahaywo...@secret-hostname.com:~$
>>>   dig @10.11.12.53 ANY api.somestartup.io
>>>
>>>   ; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> @10.11.12.53 ANY api.somestartup.io
>>>   ; (1 server found)
>>>   ;; global options: +cmd
>>>   ;; Got answer:
>>>   ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62454
>>>   ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0
>>>
>>>   ;; QUESTION SECTION:
>>>   ;api.somestartup.io.IN  ANY
>>>
>>>   ;; ANSWER SECTION:
>>>   api.somestartup.io. 20  IN  CNAME
>>> api-somestartup-production.ap-southeast-2.elb.amazonaws.com.
>>>
>>>   ;; AUTHORITY SECTION:
>>>   somestartup.io.   166687  IN  NS  
>>> ns-1254.awsdns-28.org.
>>>   somestartup.io.   166687  IN  NS  
>>> ns-1884.awsdns-43.co.uk.
>>>   somestartup.io.   166687  IN  NS  
>>> ns-440.awsdns-55.com.
>>>   somestartup.io.   166687  IN  NS  
>>> ns-577.awsdns-08.net.
>>>
>>>   ;; Query time: 1 msec
>>>   ;; SERVER: 10.11.12.53#53(10.11.12.53)
>>>   ;; WHEN: Mon Oct 19 22:02:29 2015
>>>   ;; MSG SIZE  rcvd: 242
>>>
>>> HAProxy can't handle that response correctly.
>>>
>>> Rather than try to build in support for resolving CNAMEs presented
>>> without an A record in an answer section (which may be a valid
>>> improvement further on), this change just skips ANY record types
>>> altogether. A and AAAA are much more well-defined and predictable.
>>>
>>> Notably, this commit preserves the implicit "Prefer IPV6 behavior."
>>> ---
>>>  include/types/dns.h |  3 ++-
>>>  src/checks.c|  6 +-
>>>  src/dns.c   |  6 +-
>>>  src/server.c| 18 +++---
>>>  4 files changed, 19 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/include/types/dns.h b/include/types/dns.h
>>> index f8edb73..ea1a9f9 100644
>>> --- a/include/types/dns.h
>>> +++ b/include/types/dns.h
>>> @@ -161,7 +161,8 @@ struct dns_resolution {
>>>   unsigned int last_status_change; /* time of the latest DNS
>>> resolution status change */
>>>   int query_id; /* DNS query ID dedicated for this resolution */
>>>   struct eb32_node qid; /* ebtree query id */
>>> - int query_type; /* query type to send. By default DNS_RTYPE_ANY */
>>> + int query_type;
>>> + /* query type to send. By default DNS_RTYPE_A or DNS_RTYPE_AAAA
>>> depending on resolver_family_priority */
>>>   int status; /* status of the resolution being processed RSLV_STATUS_* */
>>>   int step; /* */
>>>   int try; /* current resolution try */
>>> diff --git a/src/checks.c b/src/checks.c
>>> index ade2428..d3cd567 100644
>>> --- a/src/checks.c
>>> +++ b/src/checks.c
>>> @@ -2214,7 +2214,11 @@ int trigger_resolution(struct server *s)
>>>   resolution->query_id = query_id;
>>>   resolution->qid.key = query_id;
>>>   resolution->step = RSLV_STEP_RUNNING;
>>> - resolution->query_type = DNS_RTYPE_ANY;
>>> + if (resolution->resolver_family_priority == AF_INET) {
>>> + 
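
The ordering problem Baptiste points out in this patch (reading
resolution->resolver_family_priority before it has been copied from the
server) can be sketched as follows. The structures are simplified
stand-ins with only the fields needed here, and prepare_resolution is a
hypothetical name, so this is an illustration of the fix rather than the
merged code.

```c
#include <assert.h>
#include <sys/socket.h>   /* AF_INET, AF_INET6 */

/* Simplified stand-ins for the HAProxy structures (assumption: only the
 * fields used here). The numeric values are the standard DNS type codes. */
enum { RTYPE_A = 1, RTYPE_AAAA = 28 };

struct server_sketch { int resolver_family_priority; };
struct resolution_sketch { int resolver_family_priority; int query_type; };

static void prepare_resolution(struct resolution_sketch *res,
                               const struct server_sketch *srv)
{
    assert(res && srv);

    /* copy the server's preference first ... */
    res->resolver_family_priority = srv->resolver_family_priority;

    /* ... and only then derive the query type from it, so a stale value
     * left over from a previous resolution is never used */
    if (res->resolver_family_priority == AF_INET)
        res->query_type = RTYPE_A;
    else
        res->query_type = RTYPE_AAAA;
}
```

If the two steps were swapped, a resolution object last used with an
IPv6-preferring server would send AAAA first even for a server configured
to prefer IPv4.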

Re: haproxy 1.6.0 crashes

2015-10-20 Thread Willy Tarreau
On Tue, Oct 20, 2015 at 03:00:42PM +0200, Christopher Faulet wrote:
> Le 20/10/2015 14:41, Willy Tarreau a écrit :
> >On Tue, Oct 20, 2015 at 02:14:37PM +0200, Christopher Faulet wrote:
> >>Le 20/10/2015 14:07, Willy Tarreau a écrit :
> >>>On Tue, Oct 20, 2015 at 01:59:52PM +0200, Willy Tarreau wrote:
> Then my understanding is that we should instead proceed differently :
>    - the cert is generated. It gets a refcount = 1.
>    - we assign it to the SSL. Its refcount becomes two.
>    - we try to insert it into the tree. The tree will handle its freeing
>  using SSL_CTX_free() during eviction.
>    - if we can't insert into the tree because the tree is disabled, then
>  we have to call SSL_CTX_free() ourselves, then we'd rather do it
>  immediately. It will more closely mimic the case where the cert
>  is added to the tree and immediately evicted by concurrent activity
>  on the cache.
>    - we never have to call SSL_CTX_free() during ssl_sock_close() because
>  the SSL session only relies on openssl doing the right thing based on
>  the refcount only.
>    - thus we never need to know how the cert was created since the
>  SSL_CTX_free() is either guaranteed or already done for generated
>  certs, and this protects other ones against any accidental call to
>  SSL_CTX_free() without having to track where the cert comes from.
> >>>
> >>>This patch does this, and based on my understanding of your explanations,
> >>>it should do the right thing and be safe all the time. What's your opinion?
> >>>
> >>
> >>Yes, it should work and it avoids keeping extra info on generated
> >>certificates. Good idea !
> >
> >Thanks. Do you have an easy reproducer for the issue with the certs ?
> >I tried a little bit but probably didn't test the proper sequence.
> >
> 
> Of course. Here is a little config file:
> 
> 
> global
> tune.ssl.default-dh-param   2048
> daemon
> 
> listen ssl_server
> mode tcp
> bind 127.0.0.1:4443 ssl crt srv1.test.com.pem crt srv2.test.com.pem
> 
> timeout connect 5000
> timeout client  30000
> timeout server  30000
> 
> server srv A.B.C.D:80
> 
> 
> 
> You just need to generate 2 SSL certificates with 2 CN (here 
> srv1.test.com and srv2.test.com).
> 
> Then, by doing SSL requests with the first CN, there is no problem. But 
> with the second CN, it should segfault on the 2nd request.
> 
> openssl s_client -connect 127.0.0.1:4443 -servername srv1.test.com // OK
> openssl s_client -connect 127.0.0.1:4443 -servername srv1.test.com // OK
> 
> But,
> 
> openssl s_client -connect 127.0.0.1:4443 -servername srv2.test.com // OK
> openssl s_client -connect 127.0.0.1:4443 -servername srv2.test.com // KO

Marvellous, thank you :-)

Willy
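
The reference-counting sequence quoted above can be modelled with a toy
counter. This is a sketch only: ctx_model stands in for an SSL_CTX and
its internal refcount, and none of the functions below are OpenSSL or
HAProxy API. The "freed" flag exists purely so the lifetime can be
observed from outside.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model: not OpenSSL, not HAProxy. */
struct ctx_model { int refcount; int *freed; };

static struct ctx_model *ctx_new(int *freed)
{
    struct ctx_model *c = malloc(sizeof(*c));

    assert(c && freed);
    c->refcount = 1;        /* freshly generated cert: one reference */
    c->freed = freed;
    *freed = 0;
    return c;
}

static void ctx_ref(struct ctx_model *c) { c->refcount++; }

/* Drops one reference; releases the object when the last one goes. */
static void ctx_free(struct ctx_model *c)
{
    if (--c->refcount > 0)
        return;
    *c->freed = 1;
    free(c);
}

/* One connection using a generated cert. Returns the refcount observed
 * while the connection is still open. */
static int serve_generated(int cache_enabled, int *freed)
{
    struct ctx_model *ctx = ctx_new(freed);  /* refcount 1 */
    struct ctx_model *cached = NULL;
    int during;

    ctx_ref(ctx);             /* assigned to the SSL: refcount 2 */
    if (cache_enabled)
        cached = ctx;         /* the tree now owns the initial reference */
    else
        ctx_free(ctx);        /* cache disabled: drop ours at once, 1 left */

    during = ctx->refcount;
    ctx_free(ctx);            /* connection ends: the SSL's reference goes */
    if (cached)
        ctx_free(cached);     /* later eviction from the tree */
    return during;
}
```

In both paths the close step never needs to know how the cert was
created: with the cache enabled the tree's eviction performs the final
free, and with it disabled the SSL's own reference is the last one.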




Re: haproxy 1.6.0 crashes

2015-10-20 Thread Willy Tarreau
So I can confirm with your reproducer that it's OK now. I've merged
the proposed fix with copies of your long detailed analysis. Thanks
for being so patient to explain me :-)

We'll have to wait for the last pending DNS fixes and I'll emit 1.6.1
so that we get rid of these annoying early bugs.

See you tomorrow,
Willy




Re: haproxy 1.6.0 crashes

2015-10-20 Thread Christopher Faulet

Le 20/10/2015 14:41, Willy Tarreau a écrit :

On Tue, Oct 20, 2015 at 02:14:37PM +0200, Christopher Faulet wrote:

Le 20/10/2015 14:07, Willy Tarreau a écrit :

On Tue, Oct 20, 2015 at 01:59:52PM +0200, Willy Tarreau wrote:

Then my understanding is that we should instead proceed differently :
   - the cert is generated. It gets a refcount = 1.
   - we assign it to the SSL. Its refcount becomes two.
   - we try to insert it into the tree. The tree will handle its freeing
 using SSL_CTX_free() during eviction.
   - if we can't insert into the tree because the tree is disabled, then
 we have to call SSL_CTX_free() ourselves, then we'd rather do it
 immediately. It will more closely mimic the case where the cert
 is added to the tree and immediately evicted by concurrent activity
 on the cache.
   - we never have to call SSL_CTX_free() during ssl_sock_close() because
 the SSL session only relies on openssl doing the right thing based on
 the refcount only.
   - thus we never need to know how the cert was created since the
 SSL_CTX_free() is either guaranteed or already done for generated
 certs, and this protects other ones against any accidental call to
 SSL_CTX_free() without having to track where the cert comes from.


This patch does this, and based on my understanding of your explanations,
it should do the right thing and be safe all the time. What's your opinion?



Yes, it should work and it avoids keeping extra info on generated
certificates. Good idea !


Thanks. Do you have an easy reproducer for the issue with the certs ?
I tried a little bit but probably didn't test the proper sequence.



Of course. Here is a little config file:


global
tune.ssl.default-dh-param   2048
daemon

listen ssl_server
mode tcp
bind 127.0.0.1:4443 ssl crt srv1.test.com.pem crt srv2.test.com.pem

timeout connect 5000
timeout client  30000
timeout server  30000

server srv A.B.C.D:80



You just need to generate 2 SSL certificates with 2 CN (here 
srv1.test.com and srv2.test.com).


Then, by doing SSL requests with the first CN, there is no problem. But 
with the second CN, it should segfault on the 2nd request.


openssl s_client -connect 127.0.0.1:4443 -servername srv1.test.com // OK
openssl s_client -connect 127.0.0.1:4443 -servername srv1.test.com // OK

But,

openssl s_client -connect 127.0.0.1:4443 -servername srv2.test.com // OK
openssl s_client -connect 127.0.0.1:4443 -servername srv2.test.com // KO


--
Christopher Faulet