Re: [openstack-dev] [nova][neutron] Migration from nova-network to Neutron for large production clouds

2014-08-29 Thread Joe Harrison



On 27/08/14 12:59, Tim Bell wrote:
 -Original Message- From: Michael Still
 [mailto:mi...@stillhq.com] Sent: 26 August 2014 22:20 To:
 OpenStack Development Mailing List (not for usage questions) 
 Subject: Re: [openstack-dev] [nova][neutron] Migration from
 nova-network to Neutron for large production clouds
 ...
 
 Mark and I finally got a chance to sit down and write out a basic
 proposal. It looks like this:
 
 
 Thanks... I've put a few questions inline and I'll ask the experts
 to review the steps when they're back from holidays
 
 == neutron step 0 ==
 Configure neutron to reverse proxy calls to Nova (part to be written)
 
 == nova-compute restart one ==
 Freeze nova's network state (probably by stopping nova-api, but we
   could be smarter than that if required)
 Update all nova-compute nodes to point to Neutron and remove
   nova-net agent for Neutron Nova aware L2 agent
 Enable Neutron Layer 2 agent on each node; this might have the side
   effect of causing the network configuration to be rebuilt for
   some instances
 API can be unfrozen at this time until ready for step 2
 
 
 - Would it be possible to only update some of the compute nodes ?
 We'd like to stage the upgrade if we can in view of scaling risks.
 Worst case, we'd look to do it cell by cell but those are quite
 large already (200+ hypervisors)

I have a few what-ifs when it comes to this:

- What if the migration fails halfway through? How do we administer
nova in this situation?

Unfortunately, Tim, the last time I checked, Neutron had no awareness of
Nova's cells (and only recently became aware of Nova regions), so I
don't see how this would be taken into account for a migration.

 
 == neutron restart two ==
 Freeze nova's network state (probably by stopping nova-api, but we
   could be smarter than that if required)
 Dump/translate/restore data from Nova-Net to Neutron
 Configure Neutron to point to its own database
 Unfreeze Nova API
 

I think it's a good idea to be smarter.

 
 - Linked with the point above, we'd like to do the nova-net to
 neutron in stages if we can

Again, this sounds like a nightmare if it fails. This sounds like it's
meant to be one big transaction, but it is anything but.

For this to be done safely in a production cloud (which is one of the
few reasons to actually do a replacement instead of just swapping out
the component), we need to be able to run Neutron and Nova-net at the
same time or it *does* have to become a transactional migration.

If the migration fails at some stage, you're left in limbo. Does Nova
work? Does Neutron work?

If you're going down the all-or-nothing route, there needs to be some
sort of fault tolerance or rollback feature to stop a cloud being left
in an inconsistent state that is impossible to administer or operate
via the APIs.
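
As a rough illustration of what "rollback" could mean here (this is not
part of the proposal, and every name below is hypothetical), each
migration step could be paired with an undo action, so that a failure
unwinds the steps already completed instead of leaving things halfway:

```python
from typing import Callable, List, Tuple

# A step is (name, apply, undo). The step names and callables are
# placeholders for illustration, not real Nova or Neutron operations.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_rollback(steps: List[Step]) -> bool:
    """Apply steps in order; on failure, undo completed steps in reverse."""
    done: List[Step] = []
    for step in steps:
        name, apply, _undo = step
        try:
            apply()
        except Exception as exc:
            print(f"step {name!r} failed: {exc}; rolling back")
            # Only steps that completed are undone, newest first.
            for _prev_name, _prev_apply, prev_undo in reversed(done):
                prev_undo()
            return False
        done.append(step)
    return True
```

This only helps if every step really has a safe undo, which is exactly
the part I'd like to see spelled out for the dump/translate/restore stage.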

If the two of them (Nova-network and Neutron) could both exist and
operate at the same time in a cloud, it wouldn't have to be a one-shot
migration. If some nodes fail, that's fine as you could just let them
fall back to Nova-net and fix them whilst your cloud still works and
more importantly nova-api is up and running.
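
To sketch what I mean, assuming (and this is the big assumption, since
it isn't possible today) that both backends can coexist, a per-node
migration could look something like the following. The function and the
`switch_to_neutron` callable are made up for illustration:

```python
from typing import Callable, Dict, Iterable


def migrate_nodes(
    nodes: Iterable[str],
    switch_to_neutron: Callable[[str], None],
) -> Dict[str, str]:
    """Try to move each node to Neutron; failed nodes stay on nova-network.

    `switch_to_neutron` is a hypothetical callable that reconfigures one
    compute node and raises an exception if anything goes wrong.
    """
    backend: Dict[str, str] = {}
    for node in nodes:
        try:
            switch_to_neutron(node)
            backend[node] = "neutron"
        except Exception:
            # The node is left on nova-network and can be retried later,
            # while the rest of the cloud keeps working.
            backend[node] = "nova-network"
    return backend
```

The returned mapping doubles as a record of which nodes still need
attention, which would also allow the staged, cell-by-cell rollout Tim
asked about.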

 
 *** Stopping point for linuxbridge to linuxbridge translation, or
 continue for rollout of new tech
 
 == nova-compute restart two ==
 Configure OVS or new technology, ensure that proper ML2 driver is
   installed
 Restart Layer2 agent on each hypervisor where next gen networking
   should be enabled
 
 
 So, I want to stop using the word cold to describe this. It's
 more of a rolling upgrade than a cold migration. So... Would two
 shorter nova API outages be acceptable?
 
 
 Two Nova API outages would be OK for us.

I think the Nova API outages are the least of our concerns compared to
being left in a halfway state in a production environment. Hopefully
these concerns can be addressed.

 
 Michael
 
 -- Rackspace Australia

Whilst I wholeheartedly agree that this migration plan seems like a
good idea (and reminds me of a Raiders of the Lost Ark-esque scene),
I'm afraid of what would happen if something went wrong in the middle
of this swap.

It wouldn't be a good idea to bring nova-api back up to fix this, as
users and services would be able to use it again.

Perhaps we should change the policy on nova-api during this migration
to only allow access to a special migration role or the like? With the
special policy applied, services and users would be prevented from
accessing Nova's API, but administrators could continue monitoring via
the API and fix any problems. This seems like a currently absent
must-have.
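
As a minimal sketch of the idea, assuming Nova's policy is a JSON
mapping of rule names to check strings and that a check like
`role:<name>` restricts a rule to holders of that role: the role name
`migration_admin` and the sample rules below are only illustrative, and
a real policy file also contains alias rules that would need more care.

```python
import json
from typing import Dict

# Hypothetical role that would be granted only to the operators
# running the migration.
MIGRATION_ROLE = "migration_admin"


def lock_down_policy(
    policy: Dict[str, str], role: str = MIGRATION_ROLE
) -> Dict[str, str]:
    """Return a copy of `policy` in which every rule requires `role`."""
    return {rule: f"role:{role}" for rule in policy}


if __name__ == "__main__":
    # Illustrative excerpt, not a complete or authoritative policy file.
    normal_policy = {
        "compute:create": "",
        "compute:get_all": "",
        "default": "rule:admin_or_owner",
    }
    print(json.dumps(lock_down_policy(normal_policy), indent=4, sort_keys=True))
```

Keeping the original policy around unchanged means that lifting the
restriction after the migration is just a matter of putting the old file
back.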

I like the idea of the migration, but I hope that any and all "what
if?" questions have been addressed and the problems are mitigated.

I wish you and Mark lots of luck with this migration, but please make
sure it's not fragile and ensure it's fault tolerant!

Cheers,
Joe

[openstack-dev] [Neutron][Extending] Binding/Restricting subnets to specific hosts more

2014-02-06 Thread Joe Harrison
Hi,

(Scroll down for tl;dr)

Unfortunately, due to networking constraints, I don't have the luxury
of a large and flat layer two network.

As such, different compute nodes and network nodes will be in separate,
distinct subnets on the same network.

There will be hundreds if not thousands of subnets, and it does not
seem very user friendly to create a one-to-one mapping between these
subnets and neutron network objects.

Is there a resilient way to restrict and map subnets to compute nodes
and network nodes (or nodes running neutron plugin agents) without
having to hack the IP allocation code to bits or extend/modify
the existing code?
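
To show the kind of mapping I have in mind, here is a minimal
standalone sketch. It is not Neutron code; the host names, CIDRs and
function names are all made up, and where such a check would hook into
IP allocation is exactly the open question.

```python
import ipaddress
from typing import Dict, List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Hypothetical mapping of host name to the subnets reachable from it.
HOST_SUBNETS: Dict[str, List[str]] = {
    "compute-01": ["10.0.1.0/24"],
    "compute-02": ["10.0.2.0/24", "10.0.3.0/24"],
}


def allowed_subnets(host: str) -> List[Network]:
    """Return the subnets that may be used on `host` (empty if unknown)."""
    return [ipaddress.ip_network(cidr) for cidr in HOST_SUBNETS.get(host, [])]


def ip_allowed_on_host(ip: str, host: str) -> bool:
    """Check whether `ip` falls inside one of the subnets mapped to `host`."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in allowed_subnets(host))
```

With hundreds or thousands of subnets the mapping would obviously have
to live in a database rather than a dict, but the lookup itself stays
this simple.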

Further to this, for auditing and network configuration purposes,
information such as MAC address, IP address and hostname needs to be
forwarded to an external system by means of a proprietary API.

To do this, my idea was to create a separate special agent which
attaches to the messaging server and manages this workflow for us,
hooking in with a few RPC calls here and there and subscribing to the
needed messaging queues and exchanges, whilst also creating my own API
extension to manage this workflow.
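
Roughly, the agent's handler would do something like the sketch below.
It leaves out the messaging layer entirely, and it assumes a port
notification shaped like Neutron's `port.create.end` event (with
`mac_address`, `fixed_ips` and `binding:host_id` in the payload). The
`forward` callable stands in for the proprietary API, and using
`binding:host_id` as the hostname is also an assumption.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def extract_port_records(notification: Dict[str, Any]) -> List[Record]:
    """Build one record per fixed IP from a port-creation notification."""
    if notification.get("event_type") != "port.create.end":
        return []
    port = notification.get("payload", {}).get("port", {})
    return [
        {
            "mac_address": port.get("mac_address"),
            "ip_address": fixed_ip.get("ip_address"),
            # Assumption: the binding host is the hostname we care about.
            "hostname": port.get("binding:host_id"),
        }
        for fixed_ip in port.get("fixed_ips", [])
    ]


def handle_notification(
    notification: Dict[str, Any], forward: Callable[[Record], None]
) -> int:
    """Forward each extracted record and return how many were sent."""
    records = extract_port_records(notification)
    for record in records:
        forward(record)
    return len(records)
```

Keeping the extraction separate from the forwarding makes it possible to
test the record format without the external system being available.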

Does anyone have any advice, pointers or (hopefully) solutions to this
issue beyond what I'm already doing?

tl;dr: I need to restrict subnets to specific hosts. I also need to
manage an external networking workflow with an API extension and a
special agent.

Thanks in advance,
Joe

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev