Re: SSL farm
If an SSL server dies, LVS can direct the traffic to another server. Alternatively you can store SSL sessions in memcached, for example, to share them between the servers in the SSL farm. I once stumbled upon a patch for nginx that can do that. Regards, Bar. On Tue, May 22, 2012 at 9:16 PM, Allan Wind allan_w...@lifeintegrity.com wrote: On 2012-05-22 19:46:45, Vincent Bernat wrote: Yes. And solve the session problem by using some kind of persistence, for example the source-hashing load balancing algorithm. Persistence here meaning SSL packets for a given session go to the same SSL server? If so, what happens if that SSL server dies? /Allan -- Allan Wind Life Integrity, LLC http://lifeintegrity.com
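For the source-hashing persistence Vincent mentions, a minimal LVS sketch with ipvsadm might look like the following. The VIP and real-server addresses are placeholders, not from the thread; `-s sh` selects the source-hashing scheduler so a given client IP keeps landing on the same SSL server.

```shell
# Hypothetical addresses: 192.0.2.10 is the VIP, .21/.22 are the SSL servers.
# -s sh = source-hashing scheduler (persistence without cookies);
# -m = NAT forwarding (use -g for direct routing instead).
ipvsadm -A -t 192.0.2.10:443 -s sh
ipvsadm -a -t 192.0.2.10:443 -r 192.0.2.21:443 -m
ipvsadm -a -t 192.0.2.10:443 -r 192.0.2.22:443 -m
```

If one real server is removed from the table (e.g. by a health-check daemon), new connections hash to the remaining servers; established SSL sessions are lost unless they are shared, which is where the memcached approach comes in.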
CD state in error log
Hi, I'm seeing lots of these errors lately: May 17 19:01:32 lb-01 haproxy[25531]: 2.51.83.90:38410 [17/May/2012:19:01:31.687] public static/web-01 0/0/0/1/576 200 58381 - - CD-- 3201/3201/26/10/0 0/0 "GET /js/all.min.js?v=163bb6a7e70d HTTP/1.1" This happens on static files. I don't think that my clients changed somehow, and as I understand it, CD means that the client aborted the connection before it should have. Any idea what can cause this? Thanks, Bar.
Re: BADREQ on production haproxy
Willy, Thank you, I will follow up on your suggestions soon. But I just had production downtime with the haproxy machine: after posting something to our Facebook wall (it happened twice, yesterday and 3 days ago), which usually brings more traffic (but not more than we could handle before haproxy was deployed), the haproxy machine went into swap, all the memory was taken (1GB), and keepalived kept bouncing to the backup machine (I believe because the master was so unresponsive). How can I check that further? Should I just increase the machine's RAM? Thanks, Bar. On Sat, May 12, 2012 at 10:39 AM, Willy Tarreau w...@1wt.eu wrote: Hi Bar, On Thu, May 10, 2012 at 07:02:58PM +0300, Bar Ziony wrote: Hey, We're running haproxy 1.4.20 as our LB; nginx is listening on the same machine on port 443, terminating SSL and proxying the unencrypted requests to haproxy on localhost:80. I see many of these errors in the haproxy log: May 10 15:54:06 lb-01 haproxy[6563]: 1.1.1.1:50929 [10/May/2012:15:54:01.113] public public/NOSRV -1/-1/-1/-1/5519 400 187 - - CR-- 3045/3045/0/0/0 0/0 BADREQ * I changed the source IP for the sake of this example. We get around 5-15 of these per second, and I checked some of the IPs and it seems at least some of them are IPs that users registered with (maybe it's a very big proxy or something, so it's not actually those users). As you can see, the client took 5.5 seconds to send an incomplete request, then closed the connection ('CR'). It is possible that some users have developed monitoring scripts which are targeting your site. I sometimes get a number of these on the haproxy web site too. While the ones sending valid requests are just a bandwidth annoyance, the ones sending invalid requests are much more annoying.
If the requests are completely invalid, you can find a capture of them on the stats socket using show errors: echo "show errors" | socat stdio /var/run/haproxy.stat (or wherever you put it; check "stats socket" in your global section). We're running on a pretty fast Linode VPS (1GB RAM); it handled 5000 requests per second in testing (which is low, I know, but it is still a VPS). We are doing much less than 5000 req/sec... The CPU usage is 10-20% for haproxy alone (10% more for nginx), and 10-20% RAM usage for haproxy (~150MB RES, ~180MB VIRT). Does that make sense? Yes, nothing sounds strange here. What are these requests? Is it possible these are regular users trying to somehow get to our web app and not succeeding? That's really unlikely, because such invalid requests happen at a layer which is only controlled by the browser. A normal browser cannot emit invalid requests; only bots do. It's possible that some of your users are running crappy site-sucking plugins, or home-made search engines which emit invalid requests. Some of the invalid requests I'm used to observing are those where the user forgets to send the last CR/LF, so the request is incomplete. Well, if you have 15 of these a second, just run tcpdump for a few seconds to capture some of them and you'll know what they are. Regards, Willy
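Willy's tcpdump suggestion could be done along these lines; the interface name, packet count and output file are my assumptions, not from the thread:

```shell
# Capture a short burst of port-80 traffic to a file (run as root,
# adjust the interface to your setup):
tcpdump -i eth0 -s 0 -nn -c 500 -w badreq.pcap 'tcp port 80'
# Then read it back with payloads printed in ASCII to spot the
# incomplete/invalid requests:
tcpdump -r badreq.pcap -A | less
```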
Re: BADREQ on production haproxy
I have no problem increasing the RAM if needed, but how do I know if it's needed? Where can I see the number of connections per second, to see if I somehow reached 20k? I don't think I reached 20k concurrent connections, because the global maxconn is 20k. This is my TCP tuning config for the LB:

# TCP stack tuning
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_max_syn_backlog = 1
net.ipv4.tcp_max_tw_buckets = 40
net.ipv4.tcp_max_orphans = 6
net.ipv4.tcp_synack_retries = 3
net.core.somaxconn = 2
# netfilter/iptables tuning
net.netfilter.nf_conntrack_max = 524288
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 30
# Allow binding to non-local IP addresses
net.ipv4.ip_nonlocal_bind = 0

This is my haproxy.cfg:

global
    daemon
    user haproxy
    group proxy
    log 127.0.0.1 local0
    log-send-hostname
    maxconn 20000

defaults
    mode http
    log global
    retries 2
    timeout client 90s        # Client and server timeout must match the longest.
    timeout server 90s        # Time we may wait for a response from the server.
    timeout queue 90s         # Don't queue requests too long if saturated.
    timeout connect 4s        # There's no reason to change this one.
    option abortonclose       # Close aborted connections if they still didn't reach a backend (e.g. still in a queue).
    option http-server-close  # Enable HTTP connection closing on the server (backend) side.
    option log-health-checks
    option tcp-smart-accept
    option tcp-smart-connect

frontend public
    bind :80
    maxconn 19500
    option httplog
    # Add the backend server ID as a response header
    rspadd X-Backend:\ 0 if { srv_id 1 }
    rspadd X-Backend:\ 1 if { srv_id 2 }
    # Use the dynamic backend if the request path ends with .php, fall back to the default static backend otherwise
    acl url_dynamic path_end .php
    use_backend dynamic if url_dynamic
    default_backend static

backend dynamic
    balance roundrobin
    option forwardfor except 127.0.0.1  # Set the client IP in X-Forwarded-For, except when the client IP is loopback (nginx SSL termination).
    option httpchk GET /dynamic_health_check
    default-server inter 4000
    server web-01 web-01:80 maxconn 80 check
    server web-02 web-02:80 maxconn 80 check

backend static
    balance roundrobin
    option httpchk GET /static_health_check
    server web-01 web-01:80 check
    server web-02 web-02:80 check

# Enable the stats page on a dedicated port
listen stats
    # Uncomment 'disabled' below to disable the stats page
    # disabled
    bind :
    stats uri /
    stats realm HAProxy\ Statistics
    stats auth admin:my-secret-password

Any help would be much appreciated, we're experiencing issues with less traffic than before haproxy... Thanks, Bar. On Sat, May 12, 2012 at 2:31 PM, Willy Tarreau w...@1wt.eu wrote: On Sat, May 12, 2012 at 01:23:17PM +0200, Baptiste wrote: On Sat, May 12, 2012 at 1:01 PM, Bar Ziony bar...@gmail.com wrote: Willy, Thank you, I will follow up on your suggestions soon. But I just had production downtime with the haproxy machine: after posting something to our Facebook wall (it happened twice, yesterday and 3 days ago), which usually brings more traffic (but not more than we could handle before haproxy was deployed), the haproxy machine went into swap, all the memory was taken (1GB) and keepalived kept bouncing to the backup machine (I believe because it was so unresponsive). How can I check that further? Should I just increase the machine's RAM? Thanks, Bar. Maybe you can share with us some sysctls (mainly the ones related to TCP buffers), as well as your HAProxy configuration (hiding private information). Are there any other processes which may eat memory on the machine? tcp_mem is often quite sensitive, you need to limit it if you don't have enough RAM. You can also reduce haproxy's buffer size.
I run all machines at slightly less than 8kB, which is more than enough, holds 5.5 TCP segments, and limits copies in the kernel:

global
    tune.bufsize 8030
    tune.maxrewrite 1030

The numbers I'm used to seeing with these settings are 1/3 of the RAM used by haproxy, 1/3 used by socket buffers in the kernel, and the last 1/3 for the rest of the system. With such numbers, each connection takes around 17kB in haproxy, which theoretically allows up to around 40k concurrent conns on a 1GB machine. Warning: it's very tricky to reach 40k conns per GB. Better stay safe and aim at 20k per GB. Regards, Willy
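Willy's figures above can be sanity-checked with quick arithmetic. This is a back-of-envelope sketch, not a measurement; only the ~17kB per connection and the one-third RAM split come from his message, the rest is my arithmetic.

```python
# Rough capacity estimate from the figures above: ~17 kB of haproxy memory
# per connection (two ~8 kB buffers plus bookkeeping), with roughly 2/3 of
# RAM usable for haproxy + kernel socket buffers combined.
PER_CONN_KB = 17
RAM_KB = 1 * 1024 * 1024          # a 1 GB machine

theoretical = (RAM_KB * 2 // 3) // PER_CONN_KB
print(theoretical)                # 41120, i.e. "around 40k" concurrent conns

# Per the advice above, aim at roughly half the theoretical ceiling:
safe_target = 20000
```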
Stats for backend queue
Hey, I have a dynamic backend with maxconn 80, with multiple servers. Many times I can see on the haproxy stats page that servers on this backend are reaching their maximum of 80, but I don't see the number of requests currently in queue. The maximum number I ever see is 80. Why is that? Can I somehow see the number of requests in the queue? Also, with a munin plugin that checks the HTTP stats page with ;csv, I see that the graphs sometimes show 400+ req/sec for this backend, which is not possible since the maximum is 80... Last, what is the difference between Sessions and Session rate? How can I tell when I need another dynamic backend server? Thanks! Bar.
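The counters the munin plugin graphs can also be pulled by hand. The socket path and stats port below are assumptions (the real values are whatever the config's global and listen sections set), and `show stat` is the stats-socket counterpart of the `show errors` command used elsewhere in this thread:

```shell
# Per-proxy/per-server counters in CSV form, via the stats socket
# (socket path is an assumption, match it to your "stats socket" line):
echo "show stat" | socat stdio /var/run/haproxy.stat
# Or the same CSV over HTTP from the stats page (port is a placeholder):
curl -s 'http://lb-01:8080/;csv'
```

In that CSV, the `scur` column is the current number of concurrent sessions and `rate` is sessions per second, which is exactly the Sessions vs Session-rate distinction asked about here.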
Re: Stats for backend queue
Willy, thanks for your answer. On Sat, May 12, 2012 at 7:21 PM, Willy Tarreau w...@1wt.eu wrote: On Sat, May 12, 2012 at 07:01:19PM +0300, Bar Ziony wrote: Hey, I have a dynamic backend with maxconn 80, with multiple servers. Many times I can see on the haproxy stats page that servers on this backend are reaching their maximum of 80, but I don't see the number of requests currently in queue. The maximum number I ever see is 80. Why is that? Can I somehow see the number of requests in the queue? The queue is split between servers and backend. In the servers' queues, you only see the requests which absolutely need to be served by a given server (due to a persistence cookie or stick-tables). Otherwise the request lies in the backend's queue so that it will be served by the first available server. It's very normal not to have too many requests in the servers' queues and to have more in the backend's queue. I now see the Queue part in the backend line, and indeed I can see the numbers rising under load! Thanks :) I have no persistence and my backend servers are totally agnostic to which user they're serving (user sessions are stored in a centralized memcached). Also, with a munin plugin that checks the HTTP stats page with ;csv, I see that the graphs sometimes show 400+ req/sec for this backend, which is not possible since the maximum is 80... Last, what is the difference between Sessions and Session rate? You seem to be really confusing concurrency and rate, I'm afraid. Imagine a highway, it's the same. Session rate is the number of cars you see pass an observation point each second. Session concurrency is the number of parallel lanes that are occupied at a given instant. If the traffic slows down, you need more lanes to drain the same number of cars without slowing the rate down. If your cars drive faster, you need fewer lanes for the same car rate. So session rate is the number of requests per second? Why is it called session then, if it's really requests?
And Sessions is just the plain number of sessions, without caring how many of them happened in 1 sec? Regards, Willy How can I tell when I need another dynamic backend server? It's simple: observe the total queue size in a backend (backend + sum of servers). Divide that number by the maxconn and it will tell you the number of servers that would allow the requests to be processed without queuing. Note that it's fine to have a bit of queuing; it saves you from buying more hardware, at the expense of slightly delayed processing. You just need to ensure the queue is not too deep. The average time spent in the queue is the average queue size divided by the maxconn and multiplied by the average response time. So, to get an idea: srv1 has maxconn 80 and a queue around 10; srv2 has maxconn 80 and a queue around 10; the backend has a queue around 100. The total queue is 120, which is the equivalent of 1.5 servers. Say you add a single server: you'll then have around 80 requests spread over the new server, and 40 requests still in the queues. If your servers exhibit an average response time of 50 ms, the average time spent in the queue will be 40/80*50 ms = 25 ms, so the total response time will increase from 50 ms to 75 ms due to the queue. For many sites this will not be noticeable and is probably acceptable. Now if your site is already slow (e.g. 2 seconds response time), adding 50% more will give you 3 seconds and your users will clearly notice the difference. How can I know the average response time of my servers? Does haproxy provide that data somewhere? I have a max of 800 requests in the backend queue (none in the servers' queues since there is no persistence). Is that a lot? :| I also see 3,400 sessions in the frontend, and only ~100 in the dynamic backend and 15 in the static backend (in the cur column). How is that possible? Are so many requests not valid, or are sessions kept and not for 1 request only? :\ I don't understand that.. Thanks, Bar.
That's why you first need to maintain the response times as low as possible by limiting the maxconn, and only then estimate the number of servers needed to keep the response time low. Hoping this helps, Willy
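Willy's sizing example above, turned into code so the arithmetic is explicit (all the numbers are from his example, not measurements):

```python
# Average time a request spends queued: queue depth divided by the
# server's maxconn, multiplied by the average response time.
def queue_delay_ms(total_queue, maxconn_per_server, avg_response_ms):
    return total_queue / maxconn_per_server * avg_response_ms

# srv1 queue ~10 + srv2 queue ~10 + backend queue ~100:
total_queue = 10 + 10 + 100
maxconn = 80
print(total_queue / maxconn)             # 1.5 -> about 1.5 servers' worth of queue

# After adding one server, ~40 requests remain queued; with a 50 ms
# average response time, the queue adds:
print(queue_delay_ms(40, maxconn, 50))   # 25.0 ms, so 50 ms becomes 75 ms total
```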
Re: BADREQ on production haproxy
Hi Willy, On Sat, May 12, 2012 at 7:08 PM, Willy Tarreau w...@1wt.eu wrote: On Sat, May 12, 2012 at 06:54:06PM +0300, Bar Ziony wrote: I have no problem increasing the RAM if needed, but how do I know if it's needed? Where can I see the number of connections per second, to see if I somehow reached 20k? I have not said 20k conns/s, but 20k concurrent conns. Concurrency is the connection rate times the response time. If the site slows down, concurrency increases. I'm sorry but I didn't quite get what concurrency means. Connections/sec * response time? Why does that equal concurrency? I don't think I reached 20k, because the global maxconn is 20k. OK, so most likely you need some tuning on the system. This is my TCP tuning config for the LB: # TCP stack tuning (...) Nothing wrong here, but please check: tcp_mem, tcp_rmem, tcp_wmem. Here they are: net.ipv4.tcp_mem = 24372 32496 48744 net.ipv4.tcp_wmem = 4096 16384 1039872 net.ipv4.tcp_rmem = 4096 87380 1039872 Are those valid? This is my haproxy.cfg: global daemon user haproxy group proxy log 127.0.0.1 local0 log-send-hostname maxconn 20000 You can reduce haproxy's memory usage by reducing buffer sizes this way: tune.bufsize 8030 tune.maxrewrite 1030 But would this hurt somehow? I can increase the RAM if that will solve the problem! I just wonder how it is possible that haproxy is using so much RAM, when I didn't see so much RAM usage from my old single web server (nginx). I don't want to configure stuff for a low-RAM machine if I actually need more RAM. We have no problem paying for a bigger VPS (but unfortunately we must stay on this VPS infrastructure). Any help would be much appreciated, we're experiencing issues with less traffic than before haproxy... Which is quite the opposite of the sought goal, I can imagine! You should really check what consumes memory on your system.
I once hit an issue on a machine which was centralizing logs each night; the problem was that these logs were completely buffered before being transferred, causing the system to swap. Only nginx is running on this machine as well, to terminate SSL, but it seems like haproxy is the one that consumes all the memory. Nothing else is running on the machines besides syslog for haproxy, the machine's own regular processes, and munin plugins every 5 minutes (which are not causing any RAM issues)... Ideally you should disable swap. Any component which swaps in web environments is going to definitely kill performance and make things worse. Better have the component die and switch over to the redundant one than have it swap and stay alive! Very good idea, especially for our LBs. I will do that. BTW, I'm just thinking about one point, since you're virtualized: are you *certain* that you have the 1GB memory dedicated to you and that you've not enabled any form of ballooning? Ballooning is a clever technique some hypervisors use to allow vendors to sell the memory multiple times to their customers: the principle is that the hypervisor allocates the RAM from the VMs when it needs some! It easily allows a 64G machine to be split into 256*1G VMs by stealing the unused memory from the VMs. But when these VMs try to use their memory, they can only swap :-) Yes, I'm certain that our VPS provider is not doing this. We have the entire 1GB for ourselves. Cheers, Willy I just wonder what would be the best approach? Should I try the 2GB RAM machine? Thanks! Bar.
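The "conns rate times the response time" relation Willy describes is Little's law. A tiny illustration; the 15/sec and 90 s figures reuse numbers from this thread (the BADREQ rate and the client timeout), while the 2000 req/s case is a made-up contrast:

```python
# Little's law: average concurrency = arrival rate x average time in system.
def concurrency(rate_per_sec, response_time_sec):
    return rate_per_sec * response_time_sec

# 15 bad requests/sec, each holding a connection for the full 90 s client
# timeout, would alone keep this many connections open:
print(concurrency(15, 90))        # 1350

# Whereas 2000 fast requests/sec at 50 ms each need far fewer:
print(concurrency(2000, 0.050))   # 100.0
```

This is why a site that slows down (larger response time) can blow past its maxconn without any increase in request rate.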
Re: Stats for backend queue
Hi Willy :) On Sat, May 12, 2012 at 10:06 PM, Willy Tarreau w...@1wt.eu wrote: On Sat, May 12, 2012 at 08:43:43PM +0300, Bar Ziony wrote: So session rate is the number of requests per second? Why is it called session then, if it's really requests? You have the two. Initially in haproxy you had no keepalive, so 1 req = 1 session. Now you have the numbers in the session column, and if you pass your mouse over the number, you'll see the requests too. Great, just saw that. And that's only for the frontend, because only it enables keepalive, and then req/sec > session/sec? And Sessions is just the plain number of sessions, without caring how many of them happened in 1 sec? sessions cur is the number of concurrent sessions observed at the moment it is reported. It is not related to any timing, it's what is observed at one instant. To keep the analogy with the highway, it's how many lanes are occupied at the precise instant you're taking the snapshot. How can I know the average response time of my servers? Does haproxy provide that data somewhere? Yes, you have each response time value in your logs :-) But can I get the average response time? Also, is this the response time of the backend, or the full response time? I have a max of 800 requests in the backend queue (none in the servers' queues since there is no persistence). Is that a lot? :| It depends. If you're doing 800 reqs/s, you know that on average it will take one second to drain these 800 requests, so it can be a lot. But if you're facing an exceptional event, maybe you accept to delay requests by up to 1s instead of seeing your servers die or stop responding. I also see 3,400 sessions in the frontend, and only ~100 in the dynamic backend and 15 in the static backend (in the cur column). How is that possible? Are so many requests not valid, or are sessions kept and not for 1 request only? :\ I don't understand that.. It's because a connection is only forwarded to the backend once the client has sent a request.
And for some clients, sending a request takes some time (e.g. large requests, or simply because of poor network connectivity), so you always have more connections on the frontend than on the backend. Also, haproxy closes the connection to the server as soon as it has the last byte of the response, but it still forwards those data to the client (so it acts as a TCP buffer). During this response buffering, the clients are still connected to the frontend but the backend is already released. This behaviour enables some multiplexing of the server connections, because they never remain idle, even if the clients are slow to read the responses. OK, got why there are more frontend sessions than backend sessions. But is it usual to see so many more? I'm sorry but I didn't quite get what concurrency means. It is the number of parallel sessions you have at one instant. When you do netstat -an | grep -c ESTAB, you get a number of concurrent connections. Connections/sec * response time? Why does that equal concurrency? If you're not used to this, you need to draw it on paper to understand. Imagine a road passing over a bridge. Your bridge is designed to support 100 cars. This is its concurrency limit. The response time is the time a car spends on the bridge. The cars enter the bridge at a rate of 4 per second. If the cars last more than 25 seconds on the bridge, you'll have more than 100 cars on your bridge and it might break. If you make your cars run faster on the bridge, they will spend less time there and there will be fewer cars on it. In a holiday season you'll get a higher car rate at the entrance of the bridge, and if they don't run faster, you'll break the limit again. Cool, thanks :) Here they are: net.ipv4.tcp_mem = 24372 32496 48744 net.ipv4.tcp_wmem = 4096 16384 1039872 net.ipv4.tcp_rmem = 4096 87380 1039872 Are those valid? So you have between 24372*4096 and 48744*4096 = 100..200 MB of RAM assigned to the TCP stack, which is fine considering your VM size.
Your read and write buffers are correct too (the kernel automatically adjusts them depending on the available memory). However you have to be aware that a socket buffer needs at least 4kB in each direction (hence the 4kB min limit), so 200 MB limits you to 200/2/4 = 25k sockets, 20k of which can be on the frontend side and 5k of which may be on the backend side. Didn't get much of that, I'll read about tcp_mem/tcp_rmem/tcp_wmem some more :) You can reduce haproxy's memory usage by reducing buffer sizes this way: tune.bufsize 8030 tune.maxrewrite 1030 But would this hurt somehow? It would only hurt if your average object size is larger than 7kB. And it would not hurt that much; it would only eat a bit more CPU, because haproxy would have to perform twice the number of recv/send calls to forward data between sockets. If you're using mostly large
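To make the "response time value in your logs" answer concrete: in httplog lines the timers appear as Tq/Tw/Tc/Tr/Tt, and Tr is the server response time, so averaging it is a few lines of scripting. A sketch, assuming the log format quoted earlier in the thread; the regex is mine, not from the list:

```python
# Average the server response time (Tr, the 4th timer in the
# Tq/Tw/Tc/Tr/Tt field) over a batch of haproxy httplog lines.
import re

TIMERS = re.compile(r'\s(-?\d+)/(-?\d+)/(-?\d+)/(-?\d+)/(-?\d+)\s')

def avg_response_ms(lines):
    samples = []
    for line in lines:
        m = TIMERS.search(line)
        if m:
            tr = int(m.group(4))
            if tr >= 0:        # -1 means the request never reached a server
                samples.append(tr)
    return sum(samples) / len(samples) if samples else None

# A sample line mimicking the format quoted in this thread (not a real log):
logs = [
    'haproxy[25531]: 2.51.83.90:38410 [17/May/2012:19:01:31.687] public '
    'static/web-01 0/0/0/1/576 200 58381 - - CD-- 3201/3201/26/10/0 0/0 '
    '"GET /js/all.min.js HTTP/1.1"',
]
print(avg_response_ms(logs))   # 1.0
```

Tr covers only the backend side (time for the server to send its response headers); Tt is the full session duration including the client, which answers the "backend or full response time" question above.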
Re: Stats for backend queue
Oh, thanks. Small value = 10 sec for example? :| What is an optimal keepalive timeout? Thanks, Bar. On Sat, May 12, 2012 at 10:51 PM, Cyril Bonté cyril.bo...@free.fr wrote: Hi, On 12/05/2012 21:42, Bar Ziony wrote: OK, got why there are more frontend sessions than backend sessions. But is it usual to see so many more? In the configuration you provided, you didn't set any timeout http-keep-alive. It means that on your frontend, the keep-alive timeout is equal to your client timeout: 90 seconds is maybe too long for your server. Try to add a timeout with a small value. -- Cyril Bonté
Re: Stats for backend queue
Is there a benefit in allowing a larger keepalive timeout, so that resources from more than one page will be downloaded over the same connection, or is it best to create a new connection for subsequent pages? Willy, did you see my previous email in this correspondence? :) Thanks, Bar. On Sat, May 12, 2012 at 11:18 PM, Willy Tarreau w...@1wt.eu wrote: On Sat, May 12, 2012 at 11:04:49PM +0300, Bar Ziony wrote: Oh, thanks. Small value = 10 sec for example? :| What is an optimal keepalive timeout? I like to use just a few seconds, so that all objects from the same page are fetched at once and the connection automatically closes after this. But your mileage may vary. Willy
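Cyril's fix from the previous message, as a config fragment; the 5s value is only an example reflecting Willy's "just a few seconds" advice, not a setting quoted in the thread:

```
defaults
    # Close idle keep-alive connections after a few seconds instead of
    # holding them for the full 90s client timeout (value is an example):
    timeout http-keep-alive 5s
```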
Re: Stats for backend queue
Thank you Willy! I increased the RAM to 2GB and now I don't see the problem. I will change the buffer sizes and report back as well. Why did you recommend a bufsize of 8030 and not 8192? Also, why do I need to change tune.maxrewrite? Thanks, Bar. On Sat, May 12, 2012 at 11:42 PM, Willy Tarreau w...@1wt.eu wrote: On Sat, May 12, 2012 at 11:31:03PM +0300, Bar Ziony wrote: Is there a benefit in allowing a larger keepalive timeout, so that resources from more than one page will be downloaded over the same connection, or is it best to create a new connection for subsequent pages? It depends on your available memory. Ideally you'd keep connections open long enough for two consecutive clicks to be performed in the same connection, but if this means having 10x the amount of memory, it probably isn't worth it. Willy, did you see my previous email in this correspondence? :) Yes, just replied. Willy
BADREQ on production haproxy
Hey, We're running haproxy 1.4.20 as our LB; nginx is listening on the same machine on port 443, terminating SSL and proxying the unencrypted requests to haproxy on localhost:80. I see many of these errors in the haproxy log: May 10 15:54:06 lb-01 haproxy[6563]: 1.1.1.1:50929 [10/May/2012:15:54:01.113] public public/NOSRV -1/-1/-1/-1/5519 400 187 - - CR-- 3045/3045/0/0/0 0/0 BADREQ * I changed the source IP for the sake of this example. We get around 5-15 of these per second, and I checked some of the IPs and it seems at least some of them are IPs that users registered with (maybe it's a very big proxy or something, so it's not actually those users). We're running on a pretty fast Linode VPS (1GB RAM); it handled 5000 requests per second in testing (which is low, I know, but it is still a VPS). We are doing much less than 5000 req/sec... The CPU usage is 10-20% for haproxy alone (10% more for nginx), and 10-20% RAM usage for haproxy (~150MB RES, ~180MB VIRT). Does that make sense? What are these requests? Is it possible these are regular users trying to somehow get to our web app and not succeeding? Thanks, Bar.
Re: HAProxy and SSL traffic termination
Alexander, I just implemented such a setup, with nginx listening on the LB for HTTPS requests (port 443), proxying via HTTP to haproxy on the same machine. Plain HTTP requests come straight to haproxy and from there to our app servers. There is a 2nd LB that is a replica of the first, and a keepalived daemon keeping a floating IP on one of them. This way you don't have any SPOF. As for performance, I did a small benchmark for our use case; stud was a bit faster than nginx (900 requests/sec vs 800 requests/sec, with no keepalive, so this is checking SSL performance). Using 64-bit gives MUCH better SSL performance for some reason, more than 2x the request rate. Please note that this setup doesn't scale on the SSL tier. We are planning to increase the LB's capacity vertically with more powerful hardware, if it is needed. If you need full scaling capabilities on the SSL tier, you're better off using some kind of IP load balancer such as LVS in front, forwarding SSL traffic to an SSL farm which is scalable, and regular HTTP traffic to haproxy (scalable as well). Don't take my experiments for granted, I'm new to this game. I hope this helps. P.S. Willy - Putting your help and information to use! ;) Regards, Bar. On Thu, May 3, 2012 at 9:56 AM, Alexander Kamardash alexander.kamard...@trusteer.com wrote: Hi, I am pretty sure that terminating traffic on Pound, Apache or Nginx will do the work. My question is more about the performance of such a solution. It will be an entrance point and I don't want to create a single point of failure. Splitting it into 2 LB layers (HAProxy -> SSL termination -> backend servers) creates additional complexity. Alexander Kamardash From: adu...@fireitup.net [mailto:adu...@fireitup.net] On Behalf Of Vikram Adukia Sent: Thursday, May 03, 2012 1:38 AM To: Alexander Kamardash Cc: haproxy@formilux.org Subject: Re: HAProxy and SSL traffic termination A fairly easy configuration is to have Pound SSL sitting in front of HAProxy.
I don't have benchmark numbers, but the configuration is fairly simple: Pound:443 -> Haproxy:80 (or really any TCP port that haproxy is listening on). Here's most of my pound.cfg file:

ListenHTTPS
    Address 0.0.0.0
    Port 443
    # Obviously, adjust this to point to wherever your ssl cert is
    Cert /etc/ssl/yourssl.pem
End

Service
    BackEnd
        # in this configuration, haproxy is sitting on the same server as pound
        Address 127.0.0.1
        Port 80
    End
End

On Wed, May 2, 2012 at 3:00 PM, Baptiste bed...@gmail.com wrote: On Wed, May 2, 2012 at 3:46 PM, Alexander Kamardash alexander.kamard...@trusteer.com wrote: Hi, We want to perform LB, SSL termination and L7 on HAProxy. Could you please advise the best approach for it? We are interested in max performance and not a complicated configuration. If you are already running such a configuration, please share what max connection rate you reach. - Alexander Hi, If you can wait a bit, HAProxy will do SSL endpoint for you. Waiting for that, either nginx or stud looks to perform quite well. cheers
Re: HAProxy and SSL traffic termination
Adding the list. On Thu, May 3, 2012 at 11:09 AM, Bar Ziony bar...@gmail.com wrote: Alexander, Yes, we're using Linode servers. I chose the 1024 Linode. Since it's very easy to change that, choose something and test :) I've reached around 800 req/sec with SSL and ~5000 req/sec with plain HTTP. This is actually very low for haproxy, and is because of the virtualization overhead. It is much more than we need anyway, so it's fine by us. On Thu, May 3, 2012 at 10:44 AM, Alexander Kamardash alexander.kamard...@trusteer.com wrote: Thank you Bar. Are you planning to use Linode servers? What are the HW specs of the node that you chose? You reached a few thousand req/s? Is the bottleneck CPU, I/O or network? Alexander Kamardash
If you need full scaling capabilities on the SSL tier, you're better of using some kind of IP load balancer such as LVS in front, forwarding SSL stuff to a SSL farm which is scalable and regular HTTP traffic to haproxy (scalable as well). ** ** Don't take my experiments for granted, I'm new to this game. I hope this helps. ** ** P.S. Willy - Putting your help and information to use ! ;) ** ** Regards, Bar. ** ** On Thu, May 3, 2012 at 9:56 AM, Alexander Kamardash alexander.kamard...@trusteer.com wrote: Hi, I am pretty sure that termination traffic on Pound, Apache or Nginx will do a work. My question is more about performance of such solution. It will eb a entrance point and I don't want to create a single point of failure. In case of splitting it to 2 LB layers HAProxy- SSL termination-backend servers - create additional complexity. Alexander Kamardash *From:* adu...@fireitup.net [mailto:adu...@fireitup.net] *On Behalf Of *Vikram Adukia *Sent:* Thursday, May 03, 2012 1:38 AM *To:* Alexander Kamardash *Cc:* haproxy@formilux.org *Subject:* Re: HAProxy and SSL traffic termination A fairly easy configuration is to have Pound SSL sitting in front of HAProxy. I don't have benchmark numbers, but the configuration is fairly simple: Pound:443 - Haproxy:80 (or really any tcp port that haproxy is listening on) Here's most of my pound.cfg file: ListenHTTPS Address 0.0.0.0 Port443 # Obviously, adjust this to point to wherever your ssl cert is Cert/etc/ssl/yourssl.pem End Service Backend # in this configuration, haproxy is sitting on the same server as pound Address 127.0.0.1 Port 80 End End On Wed, May 2, 2012 at 3:00 PM, Baptiste bed...@gmail.com wrote: On Wed, May 2, 2012 at 3:46 PM, Alexander Kamardash alexander.kamard...@trusteer.com wrote: Hi, We want to perform LB, SSL termination and L7 on HAProxy. Could you please advise the best approach for it? We are interested in max performance and not complicated configuration. 
If you are already running such a configuration, please share the maximum connection rate you reach. - Alexander

Hi, If you can wait a bit, HAProxy will do the SSL endpoint for you. In the meantime, either nginx or stud seems to perform quite well. Cheers
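The nginx-terminates-SSL-then-haproxy setup described in this thread can be sketched roughly as below. This is a minimal illustration, not anyone's actual configuration from the thread: the server name, certificate paths and the haproxy port are assumptions.

```nginx
# Hypothetical sketch: nginx terminating SSL on the LB and handing
# decrypted traffic to haproxy on the same machine (port assumed).
server {
    listen 443 ssl;
    server_name example.com;

    ssl_certificate     /etc/ssl/example.com.pem;
    ssl_certificate_key /etc/ssl/example.com.key;

    location / {
        # haproxy listens on localhost:80 in this sketch
        proxy_pass http://127.0.0.1:80;
        # preserve the real client address for the backends
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header Host $host;
    }
}
```

Because haproxy then sees these connections arriving from 127.0.0.1, a rule such as the `option forwardfor except 127.0.0.1` used later in the thread avoids overwriting the X-Forwarded-For header that nginx already set.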
nginx alone performs x2 than haproxy-nginx
Hi, I have 2 questions about a haproxy setup I configured. This is the setup: LB server (haproxy 1.4.20, Debian squeeze 64-bit) in http mode, forwarding requests to a single nginx web server that resides on a different machine. I'll paste the haproxy config at the end of this message.

1. Benchmarking: When doing some benchmarking with 'ab' or 'siege', for a small (2 bytes, single char) file:

ab -n 1 -c 40 http://lb/test.html
vs
ab -n 1 -c 40 http://web-01/test.html

web-01 directly gets 6000-6500 requests/sec; haproxy-nginx gets 3000 requests/sec. When using ab -k to enable keepalives, nginx gets 12,000 requests/sec, and haproxy gets around 6000-7000 requests/sec. I wanted to ask if the 2x difference is normal? I tried to remove the ACL that checks whether the path ends with .php; the results were no different.

2. As you can see, I separate the dynamic (PHP) requests from the other (static) requests.
a. Is this the way to do it (path_end .php)?
b. I limit the number of connections to the dynamic backend server(s). I just set it according to the number of FastCGI PHP processes available on that machine. How do I check/benchmark what the limit for the static backend is? Or is it not needed?

My nginx config is quite trivial. Here is my haproxy config:

global
    daemon
    user haproxy
    group proxy
    log 127.0.0.1 local0
    log-send-hostname
    maxconn 2

defaults
    mode http
    log global
    retries 2
    timeout client  90s  # Client and server timeout must match the longest.
    timeout server  90s  # Time we may wait for a response from the server.
    timeout queue   90s  # Don't queue requests too long if saturated.
    timeout connect 4s   # There's no reason to change this one.
    option abortonclose      # Close aborted connections if they still didn't reach a backend (e.g. still in a queue).
    option http-server-close # Enable HTTP connection closing on the server (backend) side.
frontend public
    bind :80
    maxconn 19500
    option httplog
    # Add the backend server ID as a response header
    rspadd X-Backend:\ 0 if { srv_id 1 }
    rspadd X-Backend:\ 1 if { srv_id 2 }
    # Use the dynamic backend if the request path ends with .php, fall back to the default static backend otherwise
    acl url_dynamic path_end .php
    use_backend dynamic if url_dynamic
    default_backend static

backend dynamic
    balance roundrobin
    option forwardfor except 127.0.0.1 # Set the client IP in X-Forwarded-For, except when the client IP is loopback (nginx SSL termination).
    server web web:80 maxconn 50 check disabled
    server web-01 web-01:80 maxconn 50 check

backend static
    balance roundrobin
    server web web:80 check disabled
    server web-01 web-01:80 check

# Enable the stats page on a dedicated port ()
listen stats
    # Uncomment 'disabled' below to disable the stats page
    # disabled
    bind :
    stats uri /
    stats realm HAProxy\ Statistics
    stats auth admin:mypassword

Thanks! Bar.
Re: nginx alone performs x2 than haproxy-nginx
Hi Willy,

Thanks for your time. I really didn't know these were such low results. I ran 'ab' from a different machine than haproxy and nginx (which are on different machines too). I also tried to run 'ab' from multiple machines (not haproxy or nginx) and each result is roughly a third of the single 'ab' result...

I'm using VPS machines from Linode.com; they are quite powerful. They're based on Xen. I don't see the network card saturated.

As for nf_conntrack, I have iptables enabled with rules as a firewall on each machine. I stopped it on all involved machines and I still get those results. nf_conntrack is compiled into the kernel (it's a kernel provided by Linode) so I don't think I can disable it completely, just not use it (and not use any firewall between them).

Even if 6-7K is very low (for nginx directly), why is haproxy doing half of that?

About the nginx static backend maxconn - what is a high maxconn number? Just the limit I can see with 'ab'?

Thanks, Bar.

On Sun, Apr 29, 2012 at 4:27 PM, Willy Tarreau w...@1wt.eu wrote:

Hi Bar,

On Sun, Apr 29, 2012 at 02:09:42PM +0300, Bar Ziony wrote:

Hi, I have 2 questions about a haproxy setup I configured. This is the setup: LB server (haproxy 1.4.20, Debian squeeze 64-bit) in http mode, forwarding requests to a single nginx web server that resides on a different machine. I'll paste the haproxy config at the end of this message. 1. Benchmarking: When doing some benchmarking with 'ab' or 'siege', for a small (2 bytes, single char) file: ab -n 1 -c 40 http://lb/test.html vs ab -n 1 -c 40 http://web-01/test.html web-01 directly gets 6000-6500 requests/sec; haproxy-nginx gets 3000 requests/sec.

This is extremely low; it's approximately what I achieve on a sub-1-watt 500 MHz Geode LX, and I guess you're running on much larger hardware since you're saying it's 64-bit.

When using ab -k to enable keepalives, nginx is getting 12,000 requests/sec, and haproxy gets around 6000-7000 requests/sec.

Even this is very low.
Note that the 6-7k here relates to what nginx supports above without keep-alive, so it might make sense, but all these numbers seem very low in general.

I wanted to ask if the 2x difference is normal? I tried to remove the ACL that checks whether the path ends with .php; the results were no different.

Is ab running on the same machine as haproxy? Do you have nf_conntrack loaded on any of the systems? Do you observe any process reaching 100% CPU somewhere? Aren't you injecting over a 100 Mbps NIC?

2. As you can see, I separate the dynamic (PHP) requests from the other (static) requests. a. Is this the way to do it (path_end .php)?

It looks fine. Other people like to store all their statics in a small set of directories and use path_beg with these prefixes instead. But it depends on how you classify your URLs, in fact.

b. I limit the number of connections to the dynamic backend server(s). I just set it according to the number of FastCGI PHP processes available on that machine. How do I check/benchmark what the limit for the static backend is? Or is it not needed?

Nginx performs quite well in general, especially as a static file server. You may well set a high maxconn, or none at all, on the static backend; you won't harm it. Otherwise I found nothing suspect in your config.

Regards, Willy
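When nf_conntrack cannot be unloaded (as in Bar's Linode kernel), the usual mitigation for the "default settings" problem discussed here is to enlarge the conntrack table and shorten its timeouts. The values below are purely illustrative assumptions, not recommendations from this thread:

```
# /etc/sysctl.d/conntrack.conf - illustrative values only, tune to taste
# Default nf_conntrack_max is sized from RAM and is often far too small
# for a load balancer; a full table silently drops new connections.
net.netfilter.nf_conntrack_max = 1048576
# Established-connection entries linger for days by default.
net.netfilter.nf_conntrack_tcp_timeout_established = 3600
```

The hash table size (the `hashsize` module parameter, or `/sys/module/nf_conntrack/parameters/hashsize`) should generally be raised along with nf_conntrack_max, since lookups degrade when many entries share a bucket.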
Re: nginx alone performs x2 than haproxy-nginx
Willy,

Thanks as always for the very detailed and helpful answer. I'll reply in-line, like you ;-)

On Sun, Apr 29, 2012 at 7:18 PM, Willy Tarreau w...@1wt.eu wrote:

On Sun, Apr 29, 2012 at 05:25:01PM +0300, Bar Ziony wrote:

Hi Willy, Thanks for your time. I really didn't know these were such low results. I ran 'ab' from a different machine than haproxy and nginx (which are on different machines too). I also tried to run 'ab' from multiple machines (not haproxy or nginx) and each result is roughly a third of the single 'ab' result...

OK, so this clearly means that the limitation comes from the tested components and not from the machine running ab.

I'm using VPS machines from Linode.com; they are quite powerful. They're based on Xen. I don't see the network card saturated.

OK, I see now. There's no point searching anywhere else. Once again you're a victim of the high overhead of virtualization that vendors like to pretend is almost unnoticeable :-(

The overhead is really that huge?

As for nf_conntrack, I have iptables enabled with rules as a firewall on each machine. I stopped it on all involved machines and I still get those results. nf_conntrack is compiled into the kernel (it's a kernel provided by Linode) so I don't think I can disable it completely, just not use it (and not use any firewall between them).

It's having the module loaded with default settings which is harmful, so even unloading the rules will not change anything. Anyway, now I'm pretty sure that the overhead caused by the default conntrack settings is nothing compared with the overhead of Xen.

Why is it harmful that it's loaded with default settings? Can it be disabled?

Even if 6-7K is very low (for nginx directly), why is haproxy doing half of that?

That's quite simple: it has two sides, so it must process twice the number of packets. Since you're virtualized, you're packet-bound.
Most of the time is spent communicating with the host and with the network, so the more packets there are, the less performance you get. That's why you're seeing a 2x increase even with nginx when enabling keep-alive.

1. Can you explain what it means that I'm packet-bound, and why this happens because I'm using virtualization?
2. When you say twice the number of packets, you mean: the client sends a request (as 1 or more packets) to haproxy, which intercepts it, acts upon it and sends a new request (1 or more packets) to the server, which in turn sends the response back the same way - that's why it's twice the number of packets? It's not twice the bandwidth of using the web server directly, right?

I'd say that your numbers are more or less in line with a recent benchmark we conducted at Exceliance, which is summarized here (each time the hardware was running a single VM): http://blog.exceliance.fr/2012/04/24/hypervisors-virtual-network-performance-comparison-from-a-virtualized-load-balancer-point-of-view/ (BTW, you'll note that Xen was the worst performer here, with 80% loss compared to native performance). In your case it's very unlikely that you'd have dedicated hardware, and since you don't have access to the host, you don't know what its settings are, so I'd say that what you managed to reach is not that bad for such an environment.

You should be able to slightly increase performance by adding the following options in your defaults section:

option tcp-smart-accept
option tcp-smart-connect

Thanks! I think it did help, and now I get 3700 req/sec without -k, and almost 5000 req/sec with -k.

I do have a small issue (it was there before I added these options): when doing 'ab -n 1 -c 60 http://lb-01/test.html', 'ab' gets stuck for a second or two at the end, causing the req/sec to drop to around 2000 req/sec. If I Ctrl+C before the end, I see the numbers above. Is this happening because of 'ab' or because of something in my setup? With -k it doesn't happen.
And I also think it doesn't always happen with the second, passive LB (when I tested it).

Each of them will save one packet during the TCP handshake, which may slightly compensate for the losses caused by virtualization.

Note that I have also encountered a situation once where conntrack was loaded on the hypervisor and not tuned at all, resulting in extremely low performance. The effect is that the performance continuously drops as you add requests, until your source ports roll over and the performance becomes stable again. In your case, you run with only 10k reqs, which is not enough to measure the performance under such conditions. You should have one injecter running a constant load (e.g. 1M requests in loops) and another one running the 10k reqs several times in a row to observe whether the results are stable or not.

What do you mean by "until your source ports roll over"? I'm sorry, but I didn't quite understand the meaning of your proposed check?

about nginx static backend maxconn - what is a high maxconn number
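For reference, Willy's tcp-smart-accept/tcp-smart-connect suggestion from earlier in the thread would land in the defaults section of the configuration Bar posted. A sketch of how it might look, under the assumption that the rest of the defaults section stays as posted:

```
defaults
    mode http
    log global
    option http-server-close
    # Each of these saves one packet per connection during the TCP
    # handshake (on accept and on connect respectively), which matters
    # most when, as discussed above, the setup is packet-bound under Xen.
    option tcp-smart-accept
    option tcp-smart-connect
```

In haproxy 1.4, tcp-smart-accept only affects frontends and tcp-smart-connect only affects backends, so placing both in defaults applies each where it is relevant.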