> On Jun 30, 2016, at 7:15 AM, Ihar Hrachyshka <[email protected]> wrote:
>
>>
>> On 30 Jun 2016, at 01:16, Brandon Logan <[email protected]> wrote:
>>
>> Hi Ihar, thanks for starting this discussion. Comments in-line.
>>
>> After writing my comments in line, I might now realize that you're just
>> talking about documenting a way for a user to do this, and not have
>> Octavia handle it at all. If that's the case I apologize for my reading
>> comprehension, but I'll keep my comments in case I'm wrong. My brain is
>> not working well today, sorry :(
>
> Right. All the mechanisms needed to apply the approach are already in place
> in both Octavia and Neutron as of Mitaka. The question is mostly about
> whether the team behind the project may endorse the alternative approach in
> addition to whatever is in the implementation in regards to failovers by
> giving space to describe it in the official docs. I don’t suggest that the
> approach is the sole documented one, or that the Octavia team needs to
> implement anything. [That said, it may be wise to look at providing some
> smart scripts on top of neutron/octavia API that would realize the approach
> without putting the burden of multiple API calls onto users.]

I don’t have a problem documenting it, but I also wouldn’t personally want to
recommend it. We’re adding a layer of NAT, which has performance and HA
implications of its own. We’re adding FIPs, when the neutron advice for
“simple nova-net like deployment” is provider nets and linuxbridge, which
don’t support them.

Thanks,
doug

>
>>
>> Thanks,
>> Brandon
>>
>> On Wed, 2016-06-29 at 18:14 +0200, Ihar Hrachyshka wrote:
>>> Hi all,
>>>
>>> I was looking lately at upgrades for octavia images. This includes using
>>> new images for new loadbalancers, as well as for existing balancers.
>>>
>>> For the first problem, the amp_image_tag option that I added in Mitaka
>>> seems to do the job: all new balancers are created with the latest image
>>> that is tagged properly.
>>>
>>> As for balancers that already exist, the only way to get them to use a new
>>> image is to trigger an instance failure, which should rebuild the failed
>>> nova instance using the new image. AFAIU the failover process is not
>>> currently automated, requiring the user to set the corresponding port to
>>> DOWN and wait for the failover to be detected. I’ve heard there are plans
>>> to introduce a specific command to trigger a quick-failover, that would
>>> streamline the process and reduce the time needed for it because the
>>> failover would be immediately detected and processed instead of waiting
>>> for keepalived failure mode to occur. Is it on the horizon? Patches to
>>> review?
>>
>> Not that I know of and with all the work slated for Newton, I'm 99% sure
>> it won't be done in Newton. Perhaps Ocata.
>
> I see. Do we maybe want to provide a smart script that would help to trigger
> a failover with neutron API? [detect the port id, set it to DOWN, …]
>
>>>
>>> While the approach seems rather promising and may be applicable for some
>>> environments, I have several concerns about the failover approach that we
>>> may want to address.
>>>
>>> 1. HA assumption. The approach assumes there is another node running and
>>> available to serve requests while the instance is rebuilding. For non-HA
>>> amphoras, that’s not the case, meaning the image upgrade process has a
>>> significant downtime.
>>>
>>> 2. Even if we have HA, for the time of instance rebuilding, the balancer
>>> cluster is degraded to a single node.
>>>
>>> 3. (minor) during the upgrade phase, instances that belong to the same HA
>>> amphora may run different versions of the image.
>>>
>>> What’s the alternative?
>>>
>>> One idea I was running with for some time is moving the upgrade complexity
>>> one level up. Instead of making Octavia aware of upgrade intricacies, allow
>>> it to do its job (load balance), while using the neutron floating IP
>>> resource to flip a switch from an old image to a new one. Let me elaborate.
>> I'm not sure I like the idea of tying this to floating IP as there are
>> deployers who do not use floating IPs. Then again, we are currently
>> depending on allowed address pairs which is also an extension, but I
>> suspect it's probably deployed in more places. I have no proof of this
>> though.
>
> I guess you already deduced that, but just for the sake of completeness: no,
> I don’t suggest that octavia ties its backend to FIPs. I merely suggest
> documenting the proposed approach as ‘yet another way of doing it’, at least
> until we tackle the first two concerns raised.
>
>>>
>>> Let’s say we have a load balancer LB1 that is running Image1. In this
>>> scenario, we assume that access to LB1 VIP is proxied through a floating ip
>>> FIP that points to LB1 VIP. Now, the operator uploaded a new Image2 to
>>> glance registry and tagged it for octavia usage. The user now wants to
>>> migrate the load balancer function to using the new image. To achieve this,
>>> the user follows the steps:
>>>
>>> 1. create an independent clone of LB1 (let’s call it LB2) that has the
>>> exact same attributes (members) as LB1.
>>> 2. once LB2 is up and ready to process requests incoming to its VIP,
>>> redirect FIP to the LB2 VIP.
>>> 3. now all new flows are immediately redirected to LB2 VIP, no downtime
>>> (for new flows) due to atomic nature of FIP update on the backend (we use
>>> iptables-save/iptables-restore to update FIP rules on the router).
>> Will this sever any existing connections? Is there a way to drain
>> connections? Or is that already done?
>
> Not sure. Hopefully conntrack entries still apply until you shut down the
> node or close all current sessions. I don’t know of a way to detect if there
> are active sessions running. The safe fallback would be giving the load
> balancer enough time for any connections to die (a day?) before
> deprovisioning the old balancer.
>
>>> 4. since LB1 is no longer handling any flows, we can deprovision it. LB2 is
>>> now the only balancer handling members.
>>>
>>> With that approach, 1) we provide for consistent downtime expectations
>>> irrespective of the amphora architecture chosen (HA or not); 2) we flip the
>>> switch when the clone is up and ready, so no degraded state for the
>>> balancer function; 3) all instances in an HA amphora run the same image.
>>>
>>> Of course, it won’t avoid downtime for existing flows that may already
>>> be handled by the balancer function. That’s a limitation that I believe is
>>> shared by all approaches currently on the table.
>>>
>>> As a side note, the approach would work for other lbaas drivers, like
>>> namespaces, e.g. in case we want to update haproxy.
>>>
>>> Several questions in regards to the topic:
>>>
>>> 1. are there any drawbacks with the approach? can we consider it an
>>> alternative way of doing image upgrades that could find its way into
>>> official documentation?
>>
>> Echoing my comment above, being tightly coupled with floating IPs is a
>> drawback.
>>
>> Another way would be to make use of the allowed address pairs:
>> 1) spin up a clone of the amp cluster for a loadbalancer but don't bring
>> up the VIP IP interface and don't start keepalived (or just prevent
>> garping)
>> 2) update the allowed address pairs for the clones to accept the VIP IP
>> 3) bring the VIP IP interface up and start keepalived (or do a garp)
>> 4) stop keepalived on the old cluster, take the interface down
>> 5) deprovision old cluster.
>>
>> I feel bad things can happen between 3 and 4 though. This is just a
>> thought to play around with, I'm sure I'm not realizing some minute
>> details that may cause this to not work. Plus, it's a bit more involved
>> than the FIP solution you proposed.
>
> I think there is benefit in discussing how to make upgrades more atomic.
> Pairs are indeed something to consider, that would allow us to proceed
> without introducing port replug in neutron.
>
> Anyway, that’s a lot more involved than either the FIP or the failover
> approach, and would take a lot of time to properly plan for.
>
>>>
>>> 2. if the answer is yes, then how can I contribute the piece? should I sync
>>> with some other doc related work that I know is currently ongoing in the
>>> team?
>>>
>>> Ihar
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: [email protected]?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
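
For anyone who wants to play with the ordering of the FIP-based steps (clone LB1 as LB2, repoint the FIP once LB2 is ready, then deprovision LB1), here is a minimal sketch in Python. It does not talk to neutron or octavia at all: FakeCloud, LoadBalancer, upgrade_via_fip and the drain hook are hypothetical names made up for illustration, and the addresses are placeholders. A real script would replace the fake with actual API calls and would have to decide how long to wait before removing the old balancer, which the thread leaves open.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class LoadBalancer:
    name: str
    image: str
    vip: str
    members: List[str]
    ready: bool = False


@dataclass
class FakeCloud:
    """In-memory stand-in for the neutron/octavia APIs (hypothetical)."""
    latest_image: str
    balancers: Dict[str, LoadBalancer] = field(default_factory=dict)
    fips: Dict[str, str] = field(default_factory=dict)  # FIP -> VIP it targets

    def create_lb(self, name: str, vip: str, members: List[str]) -> LoadBalancer:
        # Assumption: new balancers always boot from the latest tagged image.
        lb = LoadBalancer(name, self.latest_image, vip, list(members), ready=True)
        self.balancers[name] = lb
        return lb

    def delete_lb(self, name: str) -> None:
        del self.balancers[name]


def upgrade_via_fip(
    cloud: FakeCloud,
    fip: str,
    old_name: str,
    new_name: str,
    new_vip: str,
    drain: Callable[[LoadBalancer], None] = lambda lb: None,
) -> LoadBalancer:
    """Move a FIP from an old balancer to a clone built from the new image."""
    old = cloud.balancers[old_name]
    if cloud.fips.get(fip) != old.vip:
        raise ValueError("FIP does not currently point at the old balancer's VIP")

    # Step 1: create an independent clone with the same members.
    new = cloud.create_lb(new_name, new_vip, old.members)

    # Step 2: only flip the FIP once the clone is ready to take traffic.
    if not new.ready:
        raise RuntimeError("clone is not ready; leaving the FIP untouched")
    cloud.fips[fip] = new.vip

    # Step 3: new flows now reach the clone; give existing flows a chance
    # to finish (the drain hook is a placeholder for "wait long enough").
    drain(old)

    # Step 4: the old balancer no longer receives new flows, so remove it.
    cloud.delete_lb(old_name)
    return new
```

Usage would look like building a FakeCloud with one balancer and one FIP entry, then calling upgrade_via_fip(cloud, fip, "LB1", "LB2", new_vip). If the clone never becomes ready, the function raises before touching the FIP, so the old balancer keeps serving traffic.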
