Hi Carl,

I know you said at the end of your message below to ping you on IRC, but there are details here that I'm not sure about, and suggestions that I'd like to make, which I think will be clearer to discuss in context. I hope that's OK.

On 11/06/15 17:26, Carl Baldwin wrote:
> Neil,
>
> I'm very glad to hear of your interest.  I have been talking with Kyle
> Mestery about the rfe you mention [1] since the day he filed it.  It
> relates to a blueprint that I have been trying to get traction on [2]
> in various forms for a while [*].
[...]
> [*] You're not the only one having trouble getting traction.
> Sometimes it takes a while to realize that we're interested in similar
> things and to find the commonalities and then to get people excited
> about something.  It has been an uphill battle for me until recently.

What you wrote for [*] is so true. I previously thought that I was trying to introduce fundamentally new ideas into Neutron, but in fact it appears that similar ideas have been batted around for some time by various folk, including yourself, who are already more involved in and experienced with OpenStack than I am. It has been difficult for me to discover those existing conversations - but I hope I have found them all now.

> The rfe talks about attaching VMs directly to the L3 routed
> network.  This will require some coordination between IP address
> assignment and scheduling of the instance to a compatible physical
> server.

Could you describe in more detail the kind of attachment that you have in mind, and why it requires the IP address coordination that you mention?

By way of a counter-example: for the kind of attachment that my project Calico provides, there is no restriction on where IP addresses may be used. The attachment in this case looks like:

          +----------------------+          +----------------+
          |        Host          |          |        VM      |
          |                      |          |                |
   -------------  routing  -----------------------           |
          | eth0          tap123 |          | eth0           |
          | 172.19.8.239         |          | 10.65.0.2      |
          |                      |          |                |
          +----------------------+          +----------------+

Even though the 10.65.0.2 address comes from an OpenStack-defined subnet such as 10.65.0.0/24, Calico can assign IP addresses from that subnet to VMs on any host, and provide L3 connectivity between them. It does this by making the host respond to ARP requests on the TAP interfaces, which forces the host to be the first IP hop for all data from the VMs.
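
To illustrate, here's a minimal sketch (Python, illustrative only - not Calico's actual agent code) of that per-port host plumbing, using the TAP name and VM address from the diagram:

    # Illustrative only: enable proxy ARP on the TAP interface, so the
    # host answers the VM's ARP requests and becomes its first IP hop,
    # then route the VM's /32 address via that interface.
    import subprocess

    def plumb_port(tap, vm_ip):
        with open('/proc/sys/net/ipv4/conf/%s/proxy_arp' % tap, 'w') as f:
            f.write('1')
        subprocess.check_call(
            ['ip', 'route', 'replace', '%s/32' % vm_ip, 'dev', tap])

    plumb_port('tap123', '10.65.0.2')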

So - and assuming that I've correctly understood what you meant - I don't think it's true that the concept of an L3 routed network necessarily implies restrictions on IP address allocation; hence I'd suggest that we treat those as separate concepts in the API.

> My blueprint, on the other hand, tries to maintain IP mobility across
> the network by relying on the BGP speaker work: another BP we've been
> trying to get traction on for a while.

I think it's important here to clearly separate API from implementation. For the API, I think the concept that you are expressing is that a VM's IP address should be routable from outside the immediate network, without involving a floating IP. Is that correct?

If so, there are then multiple possible implementations of that. When the immediate network is a traditional Neutron L2 network, the implementation is as per your BP, i.e. for the virtual router to export that network's IP addresses by acting as a BGP speaker.

On the other hand, if the immediate network is a Calico-style L3 network, there is already a BGP speaker running on each host, because that is part of how Calico implements connectivity within the immediate network: each host's speaker exports local TAP interface routes such as '10.65.0.2/32 dev tap123'. So nothing further is needed.
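
In case it helps, here is a small illustrative sketch (in Python; not Calico's actual code, and the 'tap' name prefix is just the convention from the diagram above) of gathering the local TAP routes that such a per-host speaker announces:

    # Illustrative sketch only (Calico actually delegates this to its
    # BGP speaker): gather the local /32 TAP routes that a per-host
    # speaker would announce.  'ip route' prints a host route like the
    # one above as '10.65.0.2 dev tap123 scope link'.
    import subprocess

    def local_vm_routes():
        routes = []
        output = subprocess.check_output(['ip', 'route', 'show']).decode()
        for line in output.splitlines():
            fields = line.split()
            if ('dev' in fields and '/' not in fields[0]
                    and fields[fields.index('dev') + 1].startswith('tap')):
                routes.append(fields[0] + '/32')
        return routes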

Does that make sense?

(In both cases, of course, there must be BGP peerings to the other networks to which it is desired to export VM IP addresses. I'm not yet sure if it makes sense to aim to specify such peerings on the Neutron API, or if such details should be regarded as individual deployment configuration.)
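
If we did want the former, I'd imagine something like the following purely hypothetical sketch, in which every name is invented just to illustrate the idea of a Neutron-level peering resource:

    # Purely hypothetical: a sketch of what a Neutron-level BGP peering
    # resource might contain.  None of these names come from any agreed
    # or existing API.
    bgp_peer = {
        'name': 'border-router-1',
        'peer_ip': '172.19.8.1',   # external router to peer with
        'remote_as': 64512,        # peer's AS number
        'networks': ['net-uuid'],  # networks whose VM addresses to export
    }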

> I also limit the connections
> to the L3 routed network to virtual routers for now.

Right - I think you mean here that, for your work, the immediate network to which a VM is connected is still a traditional Neutron L2 network. Is that correct?

For my interests - and I think for those of some other commenters on [1] - I'd certainly like that to be generalized, to allow the immediate network to be 'L3' rather than 'L2'.

[1] https://bugs.launchpad.net/neutron/+bug/1458890

Again, it would be good to be as clear as possible on the API concept here, i.e. about what we really mean. Specifically, I'm not sure 'L3' is the right concept, because an L2 network is also L3-capable. The real point, I think, is that network ports are not (necessarily) on an L2 broadcast domain. In the Calico case, there are no L2 broadcast domains anywhere (or, equivalently, each TAP-VM interface is a broadcast domain of its own). In the case of other commenters at [1], the desire (I believe) is to specify that some subset of a network's ports are on one L2 segment, some other subset on a different L2 segment, and so on.

As a strawman, a possible API representation of this would involve:

- a Network-level attribute indicating that 'the ports on this network are generally not on an L2 segment'

- a Port-level attribute indicating the L2 segment ID (if any) that the port is on.

L2 broadcast capability would then be taken to exist between Ports with the same L2 segment ID.
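
To make the strawman more concrete, here's roughly how that might look as a Neutron API extension attribute map. This is only a sketch: the attribute names 'l3_only' and 'l2_segment_id' are invented here for illustration, not anything that's been agreed:

    # Sketch only; the attribute names are invented for illustration.
    from neutron.api.v2 import attributes

    EXTENDED_ATTRIBUTES_2_0 = {
        'networks': {
            # True => ports on this network are generally not on an L2
            # segment.
            'l3_only': {'allow_post': True, 'allow_put': False,
                        'default': False,
                        'convert_to': attributes.convert_to_boolean,
                        'is_visible': True},
        },
        'ports': {
            # The L2 segment (if any) that this port is on; L2 broadcast
            # capability would be assumed between ports with the same
            # segment ID.
            'l2_segment_id': {'allow_post': True, 'allow_put': True,
                              'default': None, 'is_visible': True},
        },
    }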

Moving on to implementation - we'd then have to consider whether and how Neutron's in-tree implementation components would need tweaking to support such networks. I'm familiar already with the DHCP agent, because we've modified that for Calico so as to provide DHCP service to unbridged TAP interfaces (as in the abandoned spec at [3]). But there are probably other components to consider, too.

[3] https://review.openstack.org/#/c/130736/4

> The two have network segments in common.  So, as I proceed on the
> implementation of my blueprint [2], I will keep in mind the needs of
> the rfe [1] and build network segments in a way which can be utilized
> by both.  However, I will leave the coordination of VM scheduling and
> IP address assignment to someone else.  Does this all make sense?

Yes, and thanks. I'm happy to help out with any of the work here, and I hope that the points above are useful in synthesizing our various objectives.

Please do let me know what you think.

Thanks,
        Neil
