On Apr 17, 2014, at 5:49 PM, Stephen Balukoff 
<sbaluk...@bluebox.net>
 wrote:

Heyas, y'all!

So, given both the prioritization and usage info on HA functionality for 
Neutron LBaaS here:  
https://docs.google.com/spreadsheet/ccc?key=0Ar1FuMFYRhgadDVXZ25NM2NfbGtLTkR0TDFNUWJQUWc&usp=sharing

It's clear that:

A. HA seems to be a top priority for most operators
B. Almost all load balancer functionality deployed is done so in an 
Active/Standby HA configuration

I know there's been some round-about discussion about this on the list in the 
past (which usually got stymied in "implementation details" disagreements), but 
it seems to me that with so many players putting a high priority on HA 
functionality, this is something we need to discuss and address.

This is also apropos, as we're talking about doing a major revision of the API, 
and it probably makes sense to seriously consider if or how HA-related stuff 
should make it into the API. I'm of the opinion that almost all the HA stuff 
should be hidden from the user/tenant, but that the admin/operator at the very 
least is going to need to have some visibility into HA-related functionality. 
The hope here is to discover what things make sense to have as a "least common 
denominator" and what will have to be hidden behind a driver-specific 
implementation.

I certainly have a pretty good idea how HA stuff works at our organization, but 
I have almost no visibility into how this is done elsewhere, leastwise not 
enough detail to know what makes sense to write API controls for.

So! Since gathering data about actual usage seems to have worked pretty well 
before, I'd like to try that again. Yes, I'm going to be asking about 
implementation details, but this is with the hope of discovering any "least 
common denominator" factors which make sense to build API around.

For the purposes of this document, when I say "load balancer devices" I mean 
either physical or virtual appliances, or software executing on a host 
somewhere that actually does the load balancing. It need not directly 
correspond with anything physical... but probably does. :P

And... all of these questions are meant to be interpreted from the perspective 
of the cloud operator.

Here's what I'm looking to learn from those of you who are allowed to share 
this data:

1. Are your load balancer devices shared between customers / tenants, not 
shared, or some of both?
     If by "shared" you mean the ability to add and delete load balancers: our 
load balancers are not shared between different customers (which we call accounts). 
If you're referring to networking, then yes, they are on the same VLAN. Our clusters 
are basically a physical grouping of 4 or 5 Stingray devices that share IPs on 
the VIP side. Configs are created on all Stingray nodes in a cluster, so if a 
Stingray load balancer goes down, all of its VIPs are taken over by one of the 
other 4 or 5 machines. We also achieve HA by moving noisy customers' IPs to another 
Stingray node. The machine taking over an IP sends a gratuitous ARP 
response so the router can retrain its ARP table. Usually we keep 2 Stingray 
nodes available for failover. We could have spread the load evenly across all boxes, 
but we felt that, if we were near the capacity limit for a given 
cluster and one of the nodes tanked, performance would have degraded because the 
other nodes were already nearing capacity.

    We also have the usual dual-switch, dual-router configuration, in case one 
of them dies.
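
To make the takeover mechanics above concrete, here is a minimal sketch (not our actual tooling) of the N+1 VIP failover logic: when a node dies, its VIPs are handed to the least-loaded surviving node, which then announces them upstream. The node names, the data structures, and the exact arping invocation are assumptions for illustration only.

import subprocess

# Hypothetical cluster state: node name -> set of VIPs it currently owns.
cluster = {
    "stingray-01": {"203.0.113.10", "203.0.113.11"},
    "stingray-02": {"203.0.113.12"},
    "standby-01": set(),   # hot standby, normally idle (the "+1" in N+1)
    "standby-02": set(),
}

def fail_over(dead_node, iface="eth0"):
    """Hand a dead node's VIPs to the least-loaded remaining node and announce them."""
    vips = cluster.pop(dead_node)
    target = min(cluster, key=lambda n: len(cluster[n]))
    cluster[target] |= vips
    for vip in vips:
        # Gratuitous ARP so the upstream router retrains its ARP table.
        # Command is an assumption; iputils arping's -U flag sends unsolicited ARP.
        subprocess.run(["arping", "-U", "-c", "3", "-I", iface, vip], check=False)

fail_over("stingray-01")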

1a. If shared, what is your strategy to avoid or deal with collisions of 
customer rfc1918 address space on back-end networks? (For example, I know of no 
load balancer device that can balance traffic for both customer A and customer 
B if both are using the 10.0.0.0/24<http://10.0.0.0/24> subnet for their 
back-end networks containing the nodes to be balanced, unless an extra layer of 
NATing is happening somewhere.)

     We order a set of CIDR blocks from our backbone and route them to our 
cluster via a 10 Gb/s link, which in our bigger clusters can be upgraded via 
link bonding. Downstream we have two routes: one for our own internal ServiceNet 
10.0.0.0/8 space, and one to the public Internet for everything not on ServiceNet. 
Our pool members are specified by CIDR block only, with no association to a 
layer-2 network. When customers create their cloud servers they are 
assigned an IP within the 10.0.0.0/24 address space and also get a publicly 
routable IP address. At that point the customer can achieve isolation via 
iptables or whatever tools their VM supports. In theory a user could mistakenly 
punch in the IP address of a node that doesn't belong to them, but that just 
means the LB will route to only one machine, and the load balancer would be 
useless at that point. We don't charge our users for bandwidth going across 
ServiceNet, since each DC has its own ServiceNet and our customers want to 
have the load balancer close to their servers anyway. If they want to host back-end 
servers on, say, Amazon or HostGator or whatever, then the load balancer 
will unfortunately route over the public Internet for those. I'm not sure why 
customers would want to do this, but we were flexible enough to support it. In 
short, HA is achieved through shared IPs between our Stingray nodes.

We have 2 failover nodes that basically do nothing on standby, just in case an 
active node suddenly dies. So I guess you could call this HA N+1. We also 
divide the cluster into two cabinets, with a failover node in each one, heaven 
forbid a whole cabinet should suddenly fail. We've never seen this happen. Knock on wood.


2. What kinds of metrics do you use in determining load balancing capacity?

    So far we've been measuring bandwidth for the most part, as it usually caps 
out before CPU does. Our newest Stingray nodes have 24 cores. We of course also 
gather metrics for the IP space remaining (so we can order more ahead of time). We have 
noticed that we are limited to horizontal scaling of around 6 Stingray nodes: 
CPU load goes up beyond 6 nodes, which we have determined is because rapidly 
changing configs must be synced across all the Stingray nodes. Stingray 
has a pretty nasty flaw in how it sends its configs to the other Stingray nodes 
in a cluster.

    In the case of SSL, if a user runs SSL in mixed mode (meaning both HTTP and 
HTTPS; not sure why they'd do that), we actually set up two virtual servers, 
transparent to the customer, so we can track the SSL bandwidth separately, using 
the same SNMP call.
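
For what it's worth, that kind of bandwidth poll can be sketched as a thin wrapper around net-snmp's snmpget. The OIDs below are placeholders rather than Stingray's actual MIB values, and the helper name is mine:

import subprocess

# Placeholder OIDs -- substitute the real values from the vendor MIB.
OIDS = {
    "bytes_in": "1.3.6.1.4.1.99999.1.1",
    "bytes_out": "1.3.6.1.4.1.99999.1.2",
    "concurrent_conns": "1.3.6.1.4.1.99999.1.3",
}

def poll_node(host, community="public"):
    """Fetch the counters we care about from one load balancer node via SNMP."""
    results = {}
    for name, oid in OIDS.items():
        out = subprocess.run(
            ["snmpget", "-v2c", "-c", community, "-Oqv", host, oid],
            capture_output=True, text=True, check=True,
        )
        results[name] = int(out.stdout.strip())
    return results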

3. Do you operate with a pool of unused load balancer device capacity (which a 
cloud OS would need to keep track of), or do you spin up new capacity (in the 
form of virtual servers, presumably) on the fly?

    Kind of answered in question 1. This doesn't apply much to us, as we use 
physical load balancers behind our API. For CLB 2.0 we would like to see how we 
could achieve the same level of HA in the virtual world.

3a. If you're operating with an availability pool, can you describe how new load 
balancer devices are added to your availability pool?  Specifically, are there 
any steps in the process that must be manually performed (ie. so no API could 
help with this)?

    The API could help with some aspects of this. For example we have and are 
advocating a separate management API thats separate from the public one that 
can do things like tell the provisioner(What your calling a scheduler) when new 
capacity is available how to route to it and store this in the database for the 
public API to use in determing how to allocate resources. Our management API in 
particular is used to add IPv4 address space to our database once backbone 
routes them to us. So its like our current process involves the classic Hey 
Back bone I'd like to order a new /22 were running low on ips. Then the 
management interface could then be called to add the CIDR block so that it can 
track the ips in its database.
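
As a rough sketch of that workflow (the function name and schema are mine, not the actual management API): a management-only call validates the newly routed block and records it so the provisioner can allocate from it later.

import ipaddress
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE ip_blocks (cidr TEXT PRIMARY KEY, cluster TEXT)")

def register_cidr_block(cidr, cluster):
    """Management-API-style call: record a newly routed CIDR block for a cluster."""
    block = ipaddress.ip_network(cidr, strict=True)   # raises on a malformed block
    db.execute("INSERT INTO ip_blocks VALUES (?, ?)", (str(block), cluster))
    db.commit()

# Backbone has routed a fresh /22 to this cluster; make it allocatable.
register_cidr_block("10.42.0.0/22", "dfw-1")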


4. How are new devices 'registered' with the cloud OS? How are they removed or 
replaced?

5. What kind of visibility do you (or would you) allow your user base to see 
into the HA-related aspects of your load balancing services?
    We don't. We view HA in terms of redundant hardware and floating IPs, and 
since end users don't control those, it's not visible to them. We do state our 
four-nines uptime (which hasn't been broken), as well as compensation for violations 
of our end of the SLA.

http://www.rackspace.com/information/legal/cloud/sla
https://status.rackspace.com/

6. What kind of functionality and visibility do you need into the operations of 
your load balancer devices in order to maintain your services, troubleshoot, 
etc.? Specifically, are you managing the infrastructure outside the purview of 
the cloud OS? Are there certain aspects which would be easier to manage if done 
within the purview of the cloud OS?

     We wrote SNMP tools to monitor the Stingray nodes, which are executed by 
our API nodes. Stingray offers a rich OID MIB that allows us to track pretty 
much anything, but we only look at bandwidth in, bandwidth out, and the number 
of concurrent connections. I'm actually considering adding CPU statistics now.


7. What kind of network topology is used when deploying load balancing 
functionality? (ie. do your load balancer devices live inside or outside 
customer firewalls, directly on tenant networks? Are you using layer-3 routing? 
etc.)

Just pure layer 3. This limitation has left us wanting a private networking 
solution, and during that investigation we arrived here at Neutron LBaaS.



8. Is there any other data you can share which would be useful in considering 
features of the API that only cloud operators would be able to perform?

   Shared and failover-capable IPs are desired, but much of the HA stuff will come 
from the driver/provider. I just think floating IPs need to be supported 
by the API, or at least queryable, so you can see whether a provider supports floating IPs.

And since we're one of these operators, here are my responses:

1. We have both shared load balancer devices and private load balancer devices.

1a. Our shared load balancers live outside customer firewalls, and we use IPv6 
to reach individual servers behind the firewalls "directly." We have followed a 
careful deployment strategy across all our networks so that IPv6 addresses 
between tenants do not overlap.

Yeah, us too. We hash the tenant_id into 32 bits and use it in bits 64-96, leaving 
the customer with 32 bits to play with for their hosts. If they need more than 4 
billion hosts, we have bigger problems. Our cluster is a /48, so we're wasting 16 
bits on nothing in the middle.
Cluster, tenant_id, host_id:
CCCC:CCCC:CCCC:0000:TTTT:TTTT:HHHH:HHHH
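
A minimal sketch of that addressing scheme (the choice of hash function and the example /48 prefix are assumptions; the layout matches the diagram above):

import hashlib
import ipaddress

def tenant_host_v6(cluster_prefix, tenant_id, host_id):
    """Derive a host address from a /48 cluster prefix, a tenant id, and a host id."""
    net = ipaddress.IPv6Network(cluster_prefix)           # e.g. "2001:db8:1::/48"
    assert net.prefixlen == 48                            # 16 unused bits follow the prefix
    # Hash the tenant id down to 32 bits -> bits 64-95 of the address.
    tenant_bits = int(hashlib.sha256(tenant_id.encode()).hexdigest(), 16) & 0xFFFFFFFF
    host_bits = host_id & 0xFFFFFFFF                      # bits 96-127
    return net.network_address + (tenant_bits << 32) + host_bits

print(tenant_host_v6("2001:db8:1::/48", "tenant-42", 7))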



2. The most useful ones for us are "number of appliances deployed" and "number 
and type of load balancing services deployed" though we also pay attention to:
* Load average per "active" appliance
* Per appliance number and type of load balancing services deployed
* Per appliance bandwidth consumption
* Per appliance connections / sec
* Per appliance SSL connections / sec

Since our devices are software appliances running on Linux, we track OS-level 
metrics as well, though these aren't used directly in the load balancing features 
of our cloud OS.

3. We operate with an availability pool that our current cloud OS pays 
attention to.

3a. Since the devices we use correspond to physical hardware this must of 
course be rack-and-stacked by a datacenter technician, who also does initial 
configuration of these devices.

4. All of our load balancers are deployed in an active / standby configuration. 
Two machines which make up an active / standby pair are registered with the 
cloud OS as a single unit that we call a "load balancer cluster." Our 
availability pool consists of a whole bunch of these load balancer clusters. 
(The devices themselves are registered individually at the time the cluster 
object is created in our database.) There are a couple manual steps in this 
process (currently handled by the datacenter techs who do the racking and 
stacking), but these could be automated via API. In fact, as we move to virtual 
appliances with these, we expect the entire process to become automated via API 
(first cluster primitive is created, and then "load balancer device objects" 
get attached to it, then the cluster gets added to our availability pool.)
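
A rough sketch of the data model described above, with hypothetical names: two devices are registered, attached to a "load balancer cluster" object, and the cluster is added to the availability pool.

from dataclasses import dataclass, field
from typing import List

@dataclass
class LoadBalancerDevice:
    name: str
    role: str          # "active" or "standby"

@dataclass
class LoadBalancerCluster:
    name: str
    devices: List[LoadBalancerDevice] = field(default_factory=list)

    def attach(self, device):
        self.devices.append(device)

availability_pool: List[LoadBalancerCluster] = []

# Create the cluster primitive, attach the active/standby pair, then pool it.
cluster = LoadBalancerCluster("lb-cluster-17")
cluster.attach(LoadBalancerDevice("lb-17a", "active"))
cluster.attach(LoadBalancerDevice("lb-17b", "standby"))
availability_pool.append(cluster)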

Removal of a "cluster" object is handled by first evacuating any customer 
services off the cluster, then destroying the load balancer device objects, 
then the cluster object. Replacement of a single load balancer device entails 
removing the dead device, adding the new one, synchronizing configuration data 
to it, and starting services.

5. At the present time, all our load balancing services are deployed in an 
active / standby HA configuration, so the user has no choice or visibility into 
any HA details. As we move to Neutron LBaaS, we would like to give users the 
option of deploying non-HA load balancing capacity. Therefore, the only 
visibility we want the user to get is:

* Choose whether a given load balancing service should be deployed in an HA 
configuration ("flavor" functionality could handle this)
* See whether a running load balancing service is deployed in an HA 
configuration (and see the "hint" for which physical or virtual device(s) it's 
deployed on)
* Give a "hint" as to which device(s) a new load balancing service should be 
deployed on (ie. for customers looking to deploy a bunch of test / QA / etc. 
environments on the same device(s) to reduce costs).

Note that the "hint" above corresponds to the "load balancing cluster" alluded 
to above, not necessarily any specific physical or virtual device. This means 
we retain the ability to switch out the underlying hardware powering a given 
service at any time.
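
For illustration only (this is not a proposed or existing Neutron LBaaS schema), the user-visible knobs above could surface as little more than a flavor choice and an optional placement hint on the create call:

# Hypothetical request body -- field names are illustrative, not an actual API.
create_loadbalancer_request = {
    "name": "qa-environment-3",
    "flavor": "standalone",              # or "ha-active-standby"
    "placement_hint": "lb-cluster-17",   # optional: co-locate with earlier QA LBs
}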

Users may also see usage data, of course, but that's more of a generic stats / 
billing function (which doesn't have to do with HA at all, really).

6. We need to see the status of all our load balancing devices, including 
availability, current role (active or standby), and all the metrics listed 
under 2 above. Some of this data is used for creating trend graphs and business 
metrics, so being able to query the current metrics at any time via API is 
important. It would also be very handy to query specific device info (like 
revision of software on it, etc.) Our current cloud OS does all this for us, 
and having Neutron LBaaS provide visibility into all of this as well would be 
ideal. We do almost no management of our load balancing services outside the 
purview of our current cloud OS.

7. Shared load balancers must live outside customer firewalls, private load 
balancers typically live within customer firewalls (sometimes in a DMZ). In any 
case, we use layer-3 routing (distributed using routing protocols on our core 
networking gear and static routes on customer firewalls) to route requests for 
"service IPs" to the "highly available routing IPs" which live on the load 
balancers themselves. (When a fail-over happens, at a low level, what's really 
going on is the "highly available routing IPs" shift from the active to standby 
load balancer.)

We have contemplated using layer-2 topology (ie. directly connected on the same 
vlan / broadcast domain) and are building a version of our appliance which can 
operate in this way, potentially reducing the reliance on layer-3 routes (and 
making things more friendly for the OpenStack environment, which we understand 
probably isn't ready for layer-3 routing just yet).

8. I wrote this survey, so none come to mind for me. :)

Stephen

--
Stephen Balukoff
Blue Box Group, LLC
(800)613-4305 x807
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev