Re: Fix for rare EADDRNOTAVAIL error

2014-02-05 Thread Willy Tarreau
Hi Denis,

On Thu, Feb 06, 2014 at 12:36:05PM +0700, Denis Malyshkin wrote:
> Hello Willy,
> 
> Thank you for the explanation and suggestions.
> I've re-checked logs and connections.
> 
> 1. There are no TIME_WAIT connections on our server. They may appear for 
> a very short time, but there are no long-waiting ones. So in that respect 
> our system works well.

OK. When you say "the server", you mean the machine running haproxy, can
you confirm?

> 2. What is the connection retry mechanism you mentioned? Is it a haproxy 
> or a system mechanism?

Haproxy supports a retry mechanism for each connect. It defaults to 3
retries after a failed connect. It's set in the backend (or defaults)
using the "retries" directive.
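
As a hedged illustration (the backend and server names below are invented),
the directive is placed like this:

```
backend bk_app
    mode http
    retries 3                    # up to 3 connect retries after a failure
    server srv1 192.168.0.10:80
```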

> 3. With my re-connect loop, the second try was always successful. Does it 
> mean that without my loop the connection retry mechanism will also 
> successfully re-connect, and that such log errors may be completely ignored?

Yes, and you'll have a non-null retry count on the log lines (the value
after the connection counts, also the last field of the second block of
5 counters).
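
For illustration only (every address, name and timer below is invented), a
retried request would show up in an HTTP log line like this, the retry count
being the last field of the "4/4/2/1/1" block
(actconn/feconn/beconn/srv_conn/retries):

```
haproxy[12874]: 10.0.0.5:41230 [06/Feb/2014:12:36:05.123] ft_web bk_app/srv1 2/0/1002/12/1020 200 512 - - ---- 4/4/2/1/1 0/0 "GET / HTTP/1.1"
```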

> 4. The main question. If the above is right and such error messages are 
> completely harmless, why are such errors logged here while all other 
> connect errors aren't?

I have no idea, and that's what we need to sort out.

> Such logging worries our admins (and me), so we started to investigate 
> and try to fix them. Might it be better to remove these log messages, or 
> move them up to the point where the connection retry mechanism decides 
> that all reconnect tries were unsuccessful? Do you have any reason to 
> leave them just here?

This is a very serious incident: it means that the system is about to
collapse. It's totally abnormal that connect() cannot find a spare
port, so I'd rather keep the message.

> 5. We see "Connect() failed...: no free ports." errors 20-70 times per 
> day (depending on server load). Can you think of any reason why such 
> errors may occur? haproxy has only about 500-700 open connections; there 
> are no "dead" ones, all are in the ESTABLISHED state.

Do you have any listening ports in the same range as the outgoing port
range? That could be one reason for the system occasionally failing to
allocate a port. It's also very possible that you're facing a kernel
bug; we've had many changes to the source port allocation mechanism
in various kernels in order to work around such issues. Maybe just
upgrading the kernel will get rid of the issue.

Alternatively, you can use the "source" parameter, either on each server
or in the backend, to fix a port range. Haproxy will then use an explicit
bind. This is normally used when you want more than 64k connections to
multiple servers. But here you could try this :

source 0.0.0.0:32678-61000

Regards,
Willy




Re: optimizing TLS time to first byte

2014-02-05 Thread Willy Tarreau
Hi Ilya,

On Wed, Feb 05, 2014 at 05:01:03PM -0800, Ilya Grigorik wrote:
> This is looking very promising! I created a simple page which loads a large
> image (~1.5MB), then onload fires, and after about 5s of wait, another
> image is fetched. All the assets are fetched over the same TCP connection.

Cool!

> - Sample WPT run:
> http://www.webpagetest.org/result/140206_R2_0eab5be9abebd600c17f199158782114/3/details/
> - tcpdump trace:
> http://cloudshark.org/captures/5092d680b992?filter=tcp.stream%3D%3D4

Thanks for the links.

> All requests begin with 1440-byte records (configured
> as tune.ssl.maxrecord=1400), and then get bumped to 16KB - awesome.

In my opinion you could even double this in order to fill 2 MSS at once,
since each client will accept at least 2 MSS in slow start. It will also
avoid some systems delaying the ACK of a single segment.
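
Concretely (an illustrative value only, assuming roughly 1400 bytes of
payload per MSS), doubling would mean something like:

```
global
    tune.ssl.maxrecord 2800    # ~2 MSS worth of payload in early records
```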

> A couple of questions:
> 
> (a) It's not clear to me how the threshold upgrade is determined? What
> triggers the record size bump internally?

The forwarding mechanism does two things :

  - the read side counts the number of consecutive iterations during which
    read() filled the whole receive buffer. After 3 consecutive times, it
    considers that it's a streaming transfer and sets the flag CF_STREAMER
    on the communication channel. After 2 incomplete reads, the flag
    disappears.

  - the send side counts the number of times it can send the whole buffer
    at once. It sets CF_STREAMER_FAST if it can flush the whole buffer
    3 times in a row. After 2 incomplete writes, the flag disappears.
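
Schematically, the read-side heuristic can be sketched as below. This is an
independent illustration of the counting logic described above, not haproxy's
actual code; the struct layout and the name note_read() are invented:

```c
#define CF_STREAMER 0x1

struct channel {
    unsigned flags;
    int full_reads;   /* consecutive reads that filled the buffer */
    int short_reads;  /* consecutive incomplete reads             */
};

/* Record one read: 'filled' is non-zero when read() filled the buffer. */
static void note_read(struct channel *c, int filled)
{
    if (filled) {
        c->short_reads = 0;
        if (++c->full_reads >= 3)      /* 3 full reads in a row...      */
            c->flags |= CF_STREAMER;   /* ...means a streaming transfer */
    } else {
        c->full_reads = 0;
        if (++c->short_reads >= 2)     /* 2 incomplete reads in a row   */
            c->flags &= ~CF_STREAMER;  /* drop back to small records    */
    }
}
```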

I preferred to only rely on CF_STREAMER and ignore the _FAST variant
because it would only favor high bandwidth clients (it's used to
enable splice() in fact). But I thought that CF_STREAMER alone would
do the right job. And your WPT test seems to confirm this, when we
look at the bandwidth usage!

> (b) If I understood your earlier comment correctly, HAProxy will
> automatically begin each new request with small record size... when it
> detects that it's a new request.

Indeed. In HTTP mode, it processes transactions (request+response), not
connections, and each new transaction starts in a fresh state where these
flags are cleared.

> This works great if we're talking to a
> backend in "http" mode: we parse the HTTP/1.x protocol and detect when a
> new request is being processed, etc. However, what if I'm using HAProxy to
> terminate TLS (+alpn negotiate) and then route the data to a "tcp" mode
> backend.. which is my spdy / http/2 server talking over a non-encrypted
> channel.

Ah good point. I *suspect* that in practice it will work because :

  - the last segment of the first transfer will almost always be incomplete
(you don't always transfer exact multiples of the buffer size) ;
  - the first response for the next request will almost always be incomplete
(headers and not all data)

So if we're in this situation, this will be enough to reset the CF_STREAMER
flag (2 consecutive incomplete reads). I think it would be worth testing it.
A very simple way to test it in your environment would be to chain two
instances, one in TCP mode deciphering, and one in HTTP mode.
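
A hedged sketch of such a chained test setup (all names, ports and addresses
are invented):

```
# instance 1: tcp mode, terminates TLS and forwards clear text locally
frontend ft_tls
    mode tcp
    bind :443 ssl crt /etc/haproxy/site.pem
    default_backend bk_chain

backend bk_chain
    mode tcp
    server next 127.0.0.1:8080

# instance 2: http mode, resets the per-transaction state
frontend ft_http
    mode http
    bind 127.0.0.1:8080
    default_backend bk_app

backend bk_app
    mode http
    server srv1 192.168.0.10:80
```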

> In this instance this logic wouldn't work, since HAProxy doesn't
> have any knowledge or understanding of spdy / http/2 streams -- we'd start
> the entire connection with small records, but then eventually upgrade it to
> 16KB and keep it there, correct?

It's not kept, it really depends on the transfer sizes all along. It matches
more or less what you explained at the beginning of this thread, but based
on transfer sizes at the lower layers.

> Any clever solutions for this? And on that note, are there future plans to
> add "http/2" smarts to HAProxy, such that we can pick apart different
> streams within a session, etc?

Yes, I absolutely want to implement HTTP/2 but it will be time consuming and
we won't have this for 1.5 at all. I also don't want to implement SPDY or
too-early draft releases of 2.0, just because whatever we do will take a lot
of time. Haproxy is a low-level component, and each protocol adaptation is
expensive to do. Not as expensive as what people have to do with ASICs,
but still harder than what some other products can do by using a small lib
to perform the abstraction.

One of the huge difficulties we'll face will be to manage multiple streams
over one connection. I think it will change the current paradigm of how
requests are instantiated (which has already started to change). From the
very first version, we instantiated one "session" upon accept(), and this
session contains buffers on which analyzers are plugged. The HTTP parsers
are such analyzers. All the states and counters are stored at the session
level. In 1.5, we started to change a few things. A connection is
instantiated upon accept, then the session is allocated after the connection
is initialized (eg: SSL handshake complete). But splitting the sessions
between multiple requests will be quite complex. For example, I

Hot reconfiguration of HAProxy still leads to failed requests, any suggestions?

2014-02-05 Thread 周登朋
I found there are still failed requests when the traffic is high, using a
command like "haproxy -f /etc/haproxy.cfg -p /var/run/haproxy.pid -sf
$(cat /var/run/haproxy.pid)" to hot-reload the updated config file.

Below is the pressure-testing result using webbench :
/usr/local/bin/webbench -c 10 -t 30
http://targetHAProxyIP:1080/
Webbench - Simple Web Benchmark 1.5
Copyright (c) Radim Kolar 1997-2004, GPL Open Source Software.

Benchmarking: GET http://targetHAProxyIP:1080/
10 clients, running 30 sec.

Speed=70586 pages/min, 13372974 bytes/sec.
Requests: 35289 susceed, 4 failed.

I ran the command "haproxy -f /etc/haproxy.cfg -p /var/run/haproxy.pid -sf
$(cat /var/run/haproxy.pid)" several times during the pressure testing.

In the haproxy documentation, it mentions "They will receive the SIGTTOU
signal to ask them to temporarily stop listening to the ports so that the
new process can grab them", so there might be a time period during which
the old process is no longer listening on the port (say 80) and the new
process hasn't started listening on it yet, and during that specific
window new connections will fail. Does that make sense?


Re: Fix for rare EADDRNOTAVAIL error

2014-02-05 Thread Denis Malyshkin

Hello Willy,

Thank you for the explanation and suggestions.
I've re-checked logs and connections.

1. There are no TIME_WAIT connections on our server. They may appear for 
a very short time, but there are no long-waiting ones. So in that respect 
our system works well.


2. What is the connection retry mechanism you mentioned? Is it a haproxy 
or a system mechanism?


3. With my re-connect loop, the second try was always successful. Does it 
mean that without my loop the connection retry mechanism will also 
successfully re-connect, and that such log errors may be completely ignored?


4. The main question. If the above is right and such error messages are 
completely harmless, why are such errors logged here while all other 
connect errors aren't? Such logging worries our admins (and me), so 
we started to investigate and try to fix them. Might it be better to 
remove these log messages, or move them up to the point where the 
connection retry mechanism decides that all reconnect tries were 
unsuccessful? Do you have any reason to leave them just here?


5. We see "Connect() failed...: no free ports." errors 20-70 times per 
day (depending on server load). Can you think of any reason why such 
errors may occur? haproxy has only about 500-700 open connections; there 
are no "dead" ones, all are in the ESTABLISHED state.


Thank you very much for your help!


Hello Denis,

On Tue, Feb 04, 2014 at 12:10:05PM +0700, Denis Malyshkin wrote:
  

Hello all,

We have used haproxy for several months, and periodically see the following 
error messages in the log:


Sep 27 16:17:06 localhost haproxy[12874]: Connect() failed for backend 
https: no free ports.



I've investigated this issue and found that the EADDRNOTAVAIL error is 
returned sometimes.
It is probably caused by the fact that we are using one port from the 
ephemeral range for our internal needs.
According to http://en.wikipedia.org/wiki/Ephemeral_port, the 'connect' 
function usually just uses a round-robin algorithm to choose the next 
ephemeral port, so when it encounters an already-used port it just 
produces the above error.


The solution for this issue is simple -- add a loop around connect. We have 
implemented it and tested it in our environment. It works for us. Maybe it 
will be good enough to include in core haproxy...


The logic of the solution is simple -- try to connect 3 times in case of 
EAGAIN, EADDRINUSE or EADDRNOTAVAIL errors:
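
The patch itself is not reproduced in the archive. As an independent sketch
of the described idea (the function names, the stub, and the callback shape
are all invented for illustration, and the stub merely simulates two
transient failures):

```c
#include <errno.h>

/* Invented stand-in for the real connect attempt: fails twice with
 * EADDRNOTAVAIL before succeeding, to demonstrate the retry loop. */
static int fake_calls;
static int fake_connect(void *ctx)
{
    (void)ctx;
    if (++fake_calls < 3) {
        errno = EADDRNOTAVAIL;
        return -1;
    }
    return 0;
}

/* Retry a connect-like call up to 'tries' times on transient
 * address/port allocation errors; return the last result. */
static int retry_connect(int (*connect_fn)(void *), void *ctx, int tries)
{
    int ret = -1;

    while (tries-- > 0) {
        ret = connect_fn(ctx);
        if (ret == 0)
            break;                      /* connected */
        if (errno != EAGAIN && errno != EADDRINUSE &&
            errno != EADDRNOTAVAIL)
            break;                      /* not a transient error */
    }
    return ret;
}
```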



You should not need to do this; it will naturally be handled by the
connection retry mechanism for the configured number of retries. Also,
your method will not work with explicit source port ranges, because it
will insist on reusing the same source port, while the retries mechanism
will automatically pick another one.

BTW, if you're seeing this problem, I suspect you're running a bogus
protocol such as Redis where the client closes first, causing the local
ports to remain in TIME_WAIT state for some time and not be reusable.
If this is the case, you should put "option nolinger" in the backend
section (don't put it in the frontend!). That way it will tell the system
to flush whatever data may remain upon close and will get rid of the
TIME_WAIT. Otherwise, under moderate load, you can end up with no
free ports at all and your workaround will not work anymore.
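
For placement (a hedged sketch; the section name, server name and address
are invented):

```
backend bk_redis
    mode tcp
    option nolinger            # backend only -- not in the frontend!
    server redis1 192.168.0.20:6379
```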

Best regards,
Willy
  



--
Best regards,
 Denis Malyshkin,
Senior C++ Developer
of ISS Art, Ltd., Omsk, Russia.
Mobile Phone: +7 913 669 2896
Office tel/fax +7 3812 396959
Yahoo Messenger: dmalyshkin
Web: http://www.issart.com
E-mail: dmalysh...@issart.com



Re: optimizing TLS time to first byte

2014-02-05 Thread Ilya Grigorik
This is looking very promising! I created a simple page which loads a large
image (~1.5MB), then onload fires, and after about 5s of wait, another
image is fetched. All the assets are fetched over the same TCP connection.

- Sample WPT run:
http://www.webpagetest.org/result/140206_R2_0eab5be9abebd600c17f199158782114/3/details/
- tcpdump trace:
http://cloudshark.org/captures/5092d680b992?filter=tcp.stream%3D%3D4

All requests begin with 1440-byte records (configured
as tune.ssl.maxrecord=1400), and then get bumped to 16KB - awesome.

A couple of questions:

(a) It's not clear to me how the threshold upgrade is determined? What
triggers the record size bump internally?
(b) If I understood your earlier comment correctly, HAProxy will
automatically begin each new request with small record size... when it
detects that it's a new request. This works great if we're talking to a
backend in "http" mode: we parse the HTTP/1.x protocol and detect when a
new request is being processed, etc. However, what if I'm using HAProxy to
terminate TLS (+alpn negotiate) and then route the data to a "tcp" mode
backend.. which is my spdy / http/2 server talking over a non-encrypted
channel. In this instance this logic wouldn't work, since HAProxy doesn't
have any knowledge or understanding of spdy / http/2 streams -- we'd start
the entire connection with small records, but then eventually upgrade it to
16KB and keep it there, correct?

Any clever solutions for this? And on that note, are there future plans to
add "http/2" smarts to HAProxy, such that we can pick apart different
streams within a session, etc?

ig


On Sun, Feb 2, 2014 at 12:32 AM, Willy Tarreau  wrote:

> Hi Ilya,
>
> On Sat, Feb 01, 2014 at 11:33:50AM -0800, Ilya Grigorik wrote:
> > Hi Eric.
> >
> > 0001-MINOR-ssl-handshake-optimz-for-long-certificate-chai: works great!
> > After applying this patch the full cert is sent in one RTT and without
> any
> > extra pauses. [1]
>
> Cool, I'm impressed to see that the SSL time has been divided by 3! Great
> suggestion from you on this, thank you! I'll merge this one now.
>
> > 0002-MINOR-ssl-Set-openssl-max_send_fragment-using-tune.s: I'm testing
> with
> > / against openssl 1.0.1e, and it seems to work. Looking at the tcpdump,
> the
> > packets look identical to previous runs without this patch. [2]
> >
> > Any thoughts on dynamic sizing? ;)
>
> OK I've implemented it and tested it with good success. I'm seeing several
> small packets at the beginning, then large ones.
>
> However in order to do this I am not using Emeric's 0002 patch, because we
> certainly don't want to change the fragment size from the SSL stack upon
> every ssl_write() in dynamic mode, so I'm back to the initial principle of
> just moderating the buffer size. By using tune.ssl.maxrecord 2859, I'm
> seeing a few series of two segments of 1448 and 1437 bytes respectively,
> then larger ones up to 14-15kB that are coalesced by TSO on the NIC.
>
> It seems to do what we want :-)
>
> I'm attaching the patches if you're interested in trying it. However you'll
> have to revert patch 0002.
>
> Thanks for your tests and suggestions!
> Willy
>
>


HAProxy Question

2014-02-05 Thread Rem Fox
Hello,

I am trying to troubleshoot a technical issue with haproxy.  We are using
a round-robin algorithm for both HTTP and TCP 443.  The thing we notice is
that the 443 connections in the logs show multiple TCP ports opening for
the same source IP, as most clients are behind some type of firewall so the
source IP is the same.  haproxy uses multiple ports as well towards the real
servers to distinguish sessions.  The question I have is: what is the
default behavior of the load balancer for these sessions?  Is each port
its own session (same IP, different port), closed only by a TCP FIN?
Is there a way to look at active sessions in real time so we can
determine that round robin is working properly per client, with and without
a sticky mode?

If there is a quick guide in the manual for troubleshooting basics, that
would be great.  I could not locate one.

Thanks,
Rem


Re: SSL front and backend

2014-02-05 Thread Kobus Bensch
Thank you Lukas. I think I got it sorted. I will post my config as soon as I 
can for reference. Kobus

Sent from my iPhone

> On 5 Feb 2014, at 16:52, Lukas Tribus  wrote:
> 
> Hi,
> 
> 
>> Excellent. Having looked at the documentation, I can't clearly see the 
>> configuration options I need to use. Can you point me to a doc that 
>> will explain how to set it up and which options to use, please? 
> 
> 
> examples/ssl.cfg is a (very) simplified configuration of what you would
> like to do.
> 
> Add "option forwardfor" in the frontend/backend according to your needs.
> 
> 
> Use dev22 or newer (so you don't get the default tunnel mode, but full
> keep-alive by default).
> 
> 
> 
> Regards,
> 
> Lukas 

-- 


Trustpay Global Limited is an authorised Electronic Money Institution 
regulated by the Financial Conduct Authority registration number 900043. 
Company No 07427913 Registered in England and Wales with registered address 
130 Wood Street, London, EC2V 6DL, United Kingdom.

For further details please visit our website at www.trustpayglobal.com.

The information in this email and any attachments are confidential and 
remain the property of Trustpay Global Ltd unless agreed by contract. It is 
intended solely for the person to whom or the entity to which it is 
addressed. If you are not the intended recipient you may not use, disclose, 
copy, distribute, print or rely on the content of this email or its 
attachments. If this email has been received by you in error please advise 
the sender and delete the email from your system. Trustpay Global Ltd does 
not accept any liability for any personal view expressed in this message.



[Patch V2 1/1] [MINOR] Enhancement to stats page to provide information of last session time.

2014-02-05 Thread Bhaskar Maddala
Hello,

  Resubmitting the patch with changes to address concerns from the previous
attempt.

  Updates from the previous submission: (a) using an unsigned long for the
last session timestamp instead of a timeval struct; (b) avoiding tv_now and
using now.


Thanks
Bhaskar


0001_last_session_date_stats.patch
Description: Binary data


RE: SSL front and backend

2014-02-05 Thread Lukas Tribus
Hi,


> Excellent. Having looked at the documentation, I can't clearly see the 
> configuration options I need to use. Can you point me to a doc that 
> will explain how to set it up and which options to use, please? 


examples/ssl.cfg is a (very) simplified configuration of what you would
like to do.

Add "option forwardfor" in the frontend/backend according to your needs.


Use dev22 or newer (so you don't get the default tunnel mode, but full
keep-alive by default).



Regards,

Lukas

Last days to take advantage of the "Never Cold Again in Winter" Offer

2014-02-05 Thread Boutique


	
		
			
	
		
			
View this email on the web

TAKE ADVANTAGE OF THE "NEVER COLD AGAIN IN WINTER" OFFER UNTIL
FEBRUARY 10, 2014

To be sure to receive our newsletters, please add our email address
boutique@cgrgolf.fr to your contacts.
www.cgrgolf.fr
Join us on

Please remove me from your mailing list
Unsubscribe here



Re: SSL front and backend

2014-02-05 Thread Kobus Bensch
Excellent. Having looked at the documentation, I can't clearly see the 
configuration options I need to use. Can you point me to a doc that will 
explain how to set it up and which options to use, please?


Kobus


On 05/02/2014 15:00, Lukas Tribus wrote:

Hi,




Can you tell me if the following is possible with HA proxy please:

LB-Prim-Node---LB-Backup-Node
HTTPS VIP
|___Heart Beat___|
| | |
| | |
| | |
Real-Srv1 Real-Srv2 Real-Srv3
HTTPS HTTPS HTTPS

I need an HTTPS entry and the backend servers in the farm also need to be
HTTPS for PCI-DSS requirements.

I further need the X-Forwarded-For header to be passed for fraud prevention,
so I can't use the TCP service.

Yes it is. You will need to install the official certificate on both
load-balancers and terminate SSL on both front and backends.


Regards,

Lukas   


--
Kobus Bensch
Senior Systems Administrator
Address:  22 & 24 | Frederick Sanger Road | Guildford | Surrey | GU2 7YD
DDI:  0207 871 3958
Tel:  0207 871 3890
Email: kobus.ben...@trustpayglobal.com 




RE: SSL front and backend

2014-02-05 Thread Lukas Tribus
Hi,



> Can you tell me if the following is possible with HA proxy please:
>
> LB-Prim-Node---LB-Backup-Node
> HTTPS VIP
> |___Heart Beat___|
> | | |
> | | |
> | | |
> Real-Srv1 Real-Srv2 Real-Srv3
> HTTPS HTTPS HTTPS
>
> I need an HTTPS entry and the backend servers in the farm also need to be
> HTTPS for PCI-DSS requirements.
>
> I further need the X-Forwarded-For header to be passed for fraud prevention,
> so I can't use the TCP service.

Yes it is. You will need to install the official certificate on both
load-balancers and terminate SSL on both front and backends.
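
A hedged sketch of what that could look like (certificate path, names and
addresses are all invented; SSL on the server lines requires a 1.5-dev
build compiled with OpenSSL support):

```
frontend ft_https
    mode http
    bind :443 ssl crt /etc/ssl/private/site.pem
    option forwardfor              # pass X-Forwarded-For for fraud checks
    default_backend bk_https

backend bk_https
    mode http
    balance roundrobin
    server srv1 10.0.0.11:443 ssl
    server srv2 10.0.0.12:443 ssl
    server srv3 10.0.0.13:443 ssl
```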


Regards,

Lukas 


SSL front and backend

2014-02-05 Thread Kobus Bensch

Hi

Can you tell me if the following is possible with HA proxy please:

LB-Prim-Node---LB-Backup-Node
HTTPS VIP
|___Heart Beat___|
    |         |         |
    |         |         |
    |         |         |
Real-Srv1  Real-Srv2  Real-Srv3
  HTTPS      HTTPS      HTTPS

I need an HTTPS entry and the backend servers in the farm also need to be 
HTTPS for PCI-DSS requirements.


I further need the X-Forwarded-For header to be passed for fraud prevention, 
so I can't use the TCP service.


Thanks

Kobus





Re: [Patch V1 1/1] Enhancement to stats page to provide information of last session time.

2014-02-05 Thread Willy Tarreau
Hi Bhaskar,

On Tue, Feb 04, 2014 at 04:11:58PM -0500, Bhaskar Maddala wrote:
> Hello,
> 
>   I took a stab at implementing
> 
>   - add a last activity date for each server (req/resp) that will be
> displayed in the stats. It will be useful with soft stop.
> 
>   from the ROADMAP document.

Cool!

>   I digressed from the "last activity date (req/resp)" by logging a timestamp
> for the last session. I am not certain if this is an adequate replacement,
> however in my testing (http mode) with keep alive it seemed to suffice.

It will work for short sessions such as HTTP. But some people also have to
deal with long sessions (websocket, RDP, ...) and for these, it's not
really easy to know whether things continue to move or not.

I know it will be more difficult to update these counters from the lower
layers. I suspect we could do that from the stream interfaces, because we
have the target information in the connection and if the target is a server
then we can update the counter. But I also think that what you did is
already a significant improvement over what we currently have, and we
could start with this and avoid over-complicating things as a first step.

I have some comments about the patch itself :


+/* set the time of last session on the backend */
+static void inline be_set_sess_last(struct proxy *be)
+{
+   tv_zero(&be->be_counters.last_sess);
+   tv_now(&be->be_counters.last_sess);
+}

You should just do the following above :

be->be_counters.last_sess = now;

The reason is that tv_now() uses gettimeofday() (which costs a syscall),
but more importantly it is not corrected for time drift and is not
monotonic. "now" is corrected. This is especially important on VMs,
which tend to experience huge time drifts, because you don't want to
see negative or very large time offsets.

+/* set the time of last session on the designated server */
+static void inline srv_set_sess_last(struct server *s)
+{
+   tv_zero(&s->counters.last_sess);
+   tv_now(&s->counters.last_sess);
+}
+

So the same here, of course. You could even just save the seconds, since
the microseconds are not used.

> I included testing of soft stop and start since that was explicitly mentioned.
> 
> Let me know if this works or if there is a better alternative that you would
> like me to pursue.
> 
> [1] http://tinyurl.com/odlnvza

It's looking nice and natural. I would have expected to put this on the right,
close to the checks, but in fact it's much more logical where you put it,
especially since it allows shrinking the column title to 4 letters :-)

That's a nice work. If you could address the comments above, I'd happily
merge it.

Thanks Bhaskar!

Willy




Re: Change Request (slash bug report)

2014-02-05 Thread Willy Tarreau
On Tue, Feb 04, 2014 at 10:41:57PM -0800, Tyler Stobbe wrote:
> HAProxy is quite nice, don't get me wrong in all of this, but I have a very
> basic change request that would bring it more in line with the modern era...
> 
> The problem stems from the fact that, the way the daemon works now, a
> "reload" request dumps the initial process group completely, which is
> entirely antithetical to how daemons have traditionally worked. This has
> led to Fedora shipping completely broken systemd service files, and I
> wouldn't be surprised if upstart services are non-working as well.
> 
> The current HAProxy daemonization process:
> 
>1. Launch, fork, all the standard daemon things
>2. Do your stuff
>3. Someone runs haproxy with -sf (a completely different process group)
>4. Agree, soft-shutdown, and exit after the other process group says,
>"Good to go"
>5. Profit? No. Because haproxy is gone as far as any service manager is
>concerned.

Note that this became a problem since those crappy "service managers"
tried to replace what has been working on all unix systems for 40 years.

> Indeed, in this scenario, an entirely new process group was started,
> without any oversight from anyone, and the other process group is dead. Not
> so good.
> 
> I would like to see the following instead (very standard practice):
> 
>1. Launch, fork, all the standard daemon things
>2. On SIGUSR2 (frequently this is SIGHUP, but I read your docs and
>that's in use), fork a "fresh" process that reads the config.
>3. Just before the sub-process starts binding listening ports report to
>the parent, "Okay, I'm going to listen now, please stop listening"
>4. The parent then shuts down listening sockets (but not active
>connections) and starts a "graceful" shutdown process.
>5. The new child (soon to be main/only process in the group) listens and
>continues on as a normal "main" process would.
>6. Rinse and repeat as necessary.

That's exactly what you get with the provided systemd wrapper, so I
suspect you're not using it maybe.

> This, being not only the traditional way to do things (HAProxy was the
> first I came across that does it the completely opposite way), solves tons
> of issues with modern service managers and process control in general.

  s/modern/of-the-day :-)

One feature we lose with such methods is the ability to perform process
upgrades, because in your sequence it's the same process that forks itself,
so upgrades require a service outage. Our reload system allows you to
seamlessly upgrade the daemon, and this is used a lot. Most very large
deployments I know of have totally automated this, and it's both efficient
and reliable.

That said, with the systemd wrapper, you have the best of both worlds:
systemd works, and the wrapper really execs the haproxy binary (either
itself or a hard-coded path for now), so you can remove the binary,
put the new one in its place and have it upgraded (using symlinks would
be more efficient).

Regards,
Willy