> On Jun 30, 2016, at 7:15 AM, Ihar Hrachyshka <[email protected]> wrote:
> 
>> 
>> On 30 Jun 2016, at 01:16, Brandon Logan <[email protected]> wrote:
>> 
>> Hi Ihar, thanks for starting this discussion.  Comments in-line.
>> 
>> After writing my comments in-line, I now realize you might just be
>> talking about documenting a way for a user to do this, and not having
>> Octavia handle it at all.  If that's the case I apologize for my reading
>> comprehension, but I'll keep my comments in case I'm wrong.  My brain is
>> not working well today, sorry :(
> 
> Right. All the mechanisms needed to apply the approach are already in place 
> in both Octavia and Neutron as of Mitaka. The question is mostly whether the 
> team behind the project would endorse the alternative approach, in addition 
> to whatever the implementation does for failovers, by giving it space in the 
> official docs. I don’t suggest that the approach be the only one documented, 
> or that the Octavia team needs to implement anything. [That said, it may be 
> wise to look at providing some smart scripts on top of the neutron/octavia 
> APIs that would realize the approach without putting the burden of multiple 
> API calls onto users.]

I don’t have a problem documenting it, but I also wouldn’t personally want to 
recommend it.

We’re adding a layer of NAT, which has performance and HA implications of its 
own.

We’re adding FIPs, while the neutron advice for a “simple nova-net-like 
deployment” is provider nets and linuxbridge, which don’t support them.

Thanks,
doug


> 
>> 
>> Thanks,
>> Brandon
>> 
>> On Wed, 2016-06-29 at 18:14 +0200, Ihar Hrachyshka wrote:
>>> Hi all,
>>> 
>>> I was looking lately at upgrades for octavia images. This includes using 
>>> new images for new loadbalancers, as well as for existing balancers.
>>> 
>>> For the first problem, the amp_image_tag option that I added in Mitaka 
>>> seems to do the job: all new balancers are created with the latest image 
>>> that is tagged properly.
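A minimal sketch of that setup, assuming the Mitaka option name and section (`amp_image_tag` under `[controller_worker]`); the tag value itself is arbitrary:

```ini
# octavia.conf -- new amphorae boot from the newest glance image
# carrying this tag (option added in Mitaka)
[controller_worker]
amp_image_tag = amphora
```

The operator then tags each freshly uploaded image with the same value (e.g. via `openstack image set --tag amphora <image-id>`), and new balancers pick it up automatically.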
>>> 
>>> As for balancers that already exist, the only way to get them to use a new 
>>> image is to trigger an instance failure, which rebuilds the failed nova 
>>> instance using the new image. AFAIU the failover process is not currently 
>>> automated: the user must set the corresponding port to DOWN and wait for 
>>> the failover to be detected. I’ve heard there are plans to introduce a 
>>> specific command to trigger a quick failover, which would streamline the 
>>> process and reduce the time it takes, because the failover would be 
>>> detected and processed immediately instead of waiting for the keepalived 
>>> failure mode to occur. Is it on the horizon? Patches to review?
>> 
>> Not that I know of and with all the work slated for Newton, I'm 99% sure
>> it won't be done in Newton.  Perhaps Ocata.
> 
> I see. Do we maybe want to provide a smart script that would help trigger a 
> failover via the neutron API? [detect the port id, set it to DOWN, …]
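Such a script could look roughly like the sketch below. The client is injected so the lookup logic is visible; with python-neutronclient you would pass a `neutronclient.v2_0.client.Client` instance. The lookup-by-address heuristic is an assumption for illustration, not Octavia's actual bookkeeping.

```python
# Hypothetical "smart script" for triggering an amphora failover: find the
# port carrying the amphora's address and flip it to admin DOWN so that
# keepalived on the peer takes over and the health manager rebuilds the
# failed instance from the newly tagged image.

def trigger_failover(neutron, amphora_ip):
    """Set the port owning amphora_ip to admin DOWN; return its id."""
    # list_ports() returns {'ports': [...]} as in python-neutronclient.
    ports = neutron.list_ports()['ports']
    victim = next(
        (p for p in ports
         if any(ip['ip_address'] == amphora_ip
                for ip in p.get('fixed_ips', []))),
        None)
    if victim is None:
        raise LookupError('no port with address %s' % amphora_ip)
    # Setting admin_state_up=False is what the user does by hand today.
    neutron.update_port(victim['id'], {'port': {'admin_state_up': False}})
    return victim['id']
```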
> 
>>> 
>>> While the approach seems rather promising and may be applicable for some 
>>> environments, I have several concerns about the failover approach that we 
>>> may want to address.
>>> 
>>> 1. HA assumption. The approach assumes there is another node available to 
>>> serve requests while the instance is rebuilding. For non-HA amphoras, 
>>> that’s not the case, meaning the image upgrade process has a significant 
>>> downtime.
>>> 
>>> 2. Even if we have HA, the balancer cluster is degraded to a single node 
>>> while the instance rebuilds.
>>> 
>>> 3. (minor) during the upgrade phase, instances that belong to the same HA 
>>> amphora may run different versions of the image.
>>> 
>>> What’s the alternative?
>>> 
>>> One idea I have been toying with for some time is moving the upgrade 
>>> complexity one level up. Instead of making Octavia aware of upgrade 
>>> intricacies, let it do its job (load balancing), and use a neutron 
>>> floating IP resource to flip the switch from an old image to a new one. 
>>> Let me elaborate.
>> I'm not sure I like the idea of tying this to floating IPs, as there are
>> deployers who do not use them.  Then again, we currently depend on allowed
>> address pairs, which is also an extension, but I suspect it's probably
>> deployed in more places.  I have no proof of this though.
> 
> I guess you already deduced that, but just for the sake of completeness: no, 
> I don’t suggest that octavia tie its backend to FIPs. I merely suggest 
> documenting the proposed approach as ‘yet another way of doing it’, at least 
> until we tackle the first two concerns raised.
> 
>>> 
>>> Let’s say we have a load balancer LB1 running Image1. In this scenario we 
>>> assume that access to the LB1 VIP is proxied through a floating IP (FIP) 
>>> that points to the LB1 VIP. Now the operator uploads a new Image2 to the 
>>> glance registry and tags it for octavia usage. The user wants to migrate 
>>> the load balancer function to the new image. To achieve this, the user 
>>> follows these steps:
>>> 
>>> 1. create an independent clone of LB1 (let’s call it LB2) with exactly 
>>> the same attributes (members) as LB1.
>>> 2. once LB2 is up and ready to process requests arriving at its VIP, 
>>> redirect the FIP to the LB2 VIP.
>>> 3. now all new flows are immediately redirected to the LB2 VIP with no 
>>> downtime (for new flows), due to the atomic nature of the FIP update on 
>>> the backend (we use iptables-save/iptables-restore to update FIP rules on 
>>> the router).
>> Will this sever any existing connections? Is there a way to drain
>> connections? Or is that already done?
> 
> Not sure. Hopefully conntrack entries still apply until you shut down the 
> node or close all current sessions. I don’t know of a way to detect whether 
> there are active sessions running. The safe fallback would be giving the 
> load balancer enough time for any connections to die (a day?) before 
> deprovisioning the old balancer.
> 
>>> 4. since LB1 is no longer handling any flows, we can deprovision it. LB2 is 
>>> now the only balancer handling members.
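A rough sketch of the four-step flip, with the orchestration behind a small injected `cloud` wrapper whose method names are purely illustrative; only the floating-IP call mirrors python-neutronclient's real `update_floatingip(fip_id, {'floatingip': {'port_id': ...}})` body:

```python
# Illustrative blue/green flip for an octavia load balancer.  `cloud` is a
# hypothetical thin wrapper around the lbaas API; `neutron` follows the
# python-neutronclient shape for the FIP update.

def flip_to_new_image(cloud, neutron, fip_id, old_lb_id, new_lb_spec):
    # 1. Clone LB1: same listeners, pools and members, but built from the
    #    newly tagged image.
    new_lb = cloud.create_loadbalancer(new_lb_spec)
    # 2. Wait until the clone is ACTIVE and can serve its VIP.
    cloud.wait_active(new_lb['id'])
    # 3. Atomically repoint the FIP at LB2's VIP port; from here on, all
    #    new flows land on LB2.
    neutron.update_floatingip(
        fip_id, {'floatingip': {'port_id': new_lb['vip_port_id']}})
    # 4. LB1 no longer receives new flows; deprovision it once existing
    #    connections have drained.
    cloud.delete_loadbalancer(old_lb_id)
    return new_lb['id']
```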
>>> 
>>> With that approach, 1) we provide consistent downtime expectations 
>>> regardless of the amphora architecture chosen (HA or not); 2) we flip the 
>>> switch only when the clone is up and ready, so there is no degraded state 
>>> for the balancer function; 3) all instances in an HA amphora run the same 
>>> image.
>>> 
>>> Of course, it won’t avoid downtime for existing flows that may already be 
>>> handled by the balancer function. That’s a limitation that I believe is 
>>> shared by all the approaches currently on the table.
>>> 
>>> As a side note, the approach would work for other lbaas drivers, such as 
>>> namespaces, e.g. in case we want to update haproxy.
>>> 
>>> Several questions in regards to the topic:
>>> 
>>> 1. are there any drawbacks to the approach? Can we consider it an 
>>> alternative way of doing image upgrades that could find its way into the 
>>> official documentation?
>> 
>> Echoing my comment above: being tightly coupled to floating IPs is a
>> drawback.
>> 
>> Another way would be to make use of allowed address pairs:
>> 1) spin up a clone of the amp cluster for a loadbalancer, but don't bring
>> up the VIP IP interface and don't start keepalived (or just prevent it
>> from garping)
>> 2) update the allowed address pairs on the clones to accept the VIP IP
>> 3) bring the VIP IP interface up and start keepalived (or do a garp)
>> 4) stop keepalived on the old cluster and take the interface down
>> 5) deprovision the old cluster.
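A rough sketch of those five steps; the `amps` cluster-management calls are hypothetical placeholders for whatever tooling would drive the amphorae, but the allowed-address-pairs update uses the real neutron port body:

```python
# Illustrative cutover via allowed address pairs.  `amps` is a hypothetical
# cluster manager; `neutron` follows the python-neutronclient shape for
# update_port with an allowed_address_pairs body.

def aap_cutover(amps, neutron, vip_ip, old_cluster, clone_spec):
    # 1. Spin up a clone of the amp cluster with the VIP interface down
    #    and keepalived stopped (no GARP yet).
    clone = amps.create_cluster(clone_spec, vip_up=False)
    # 2. Let neutron accept the VIP on the clone's ports.
    for port_id in clone['port_ids']:
        neutron.update_port(
            port_id,
            {'port': {'allowed_address_pairs': [{'ip_address': vip_ip}]}})
    # 3. Bring the VIP interface up / start keepalived; the GARP moves
    #    traffic over to the clone.
    amps.start_vip(clone['id'])
    # 4. Stop keepalived on the old cluster and take its interface down.
    amps.stop_vip(old_cluster['id'])
    # 5. Deprovision the old cluster.
    amps.delete_cluster(old_cluster['id'])
    return clone['id']
```

As the thread notes, steps 3 and 4 are not atomic, so this sequencing is only a starting point for discussion.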
>> 
>> I feel bad things can happen between 3 and 4 though.  This is just a
>> thought to play around with; I'm sure I'm not realizing some minute
>> details that may cause this to not work.  Plus, it's a bit more involved
>> than the FIP solution you proposed.
> 
> I think there is benefit in discussing how to make upgrades more atomic. 
> Pairs are indeed something to consider; they would allow us to proceed 
> without introducing port replug in neutron.
> 
> Anyway, that’s a lot more involved than either the FIP or the failover 
> approach, and would take a lot of time to properly plan for.
> 
>>> 
>>> 2. if the answer is yes, then how can I contribute the piece? Should I 
>>> sync with some other doc-related work that I know is currently ongoing in 
>>> the team?
>>> 
>>> Ihar
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: [email protected]?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 


