Hi Carl,

I know you said at the end of your message below to ping you on IRC, but there are details here that I'm not sure about, and suggestions that I'd like to make, which I think will be clearer to discuss in context. I hope that's OK.

On 11/06/15 17:26, Carl Baldwin wrote:
> Neil,
>
> I'm very glad to hear of your interest.  I have been talking with Kyle
> Mestery about the rfe you mention [1] since the day he filed it.  It
> relates to a blueprint that I have been trying to get traction on [2]
> in various forms for a while [*].
[...]
> [*] You're not the only one having trouble getting traction.
> Sometimes it takes a while to realize that we're interested in similar
> things and to find the commonalities and then to get people excited
> about something.  It has been an uphill battle for me until recently.

What you wrote for [*] is so true. I previously thought that I was trying to introduce fundamentally new ideas into Neutron, but in fact it appears that similar ideas have been batted around for some time by various folk, including yourself, who are already more involved in and experienced with OpenStack than I am. It has been difficult for me to discover those existing conversations - but I hope I have found them all now.

> The rfe talks about attaching VMs directly to the L3 routed
> network.  This will require some coordination between IP address
> assignment and scheduling of the instance to a compatible physical
> server.

Could you describe in more detail the kind of attachment that you have in mind, and why it requires the IP address coordination that you mention?

By way of a counter-example: for the kind of attachment that my project Calico provides, there is no restriction on where IP addresses may be used. The attachment in this case looks like:

          +----------------------+          +----------------+
          |        Host          |          |        VM      |
          |                      |          |                |
   -------------  routing  -----------------------           |
          | eth0          tap123 |          | eth0           |
          | 172.19.8.239         |          | 10.65.0.2      |
          |                      |          |                |
          +----------------------+          +----------------+

Even though the 10.65.0.2 address comes from an OpenStack-defined subnet such as 10.65.0.0/24, Calico can assign IP addresses from that subnet to VMs on any host, and provide L3 connectivity between them. It does this by making the host respond to ARP requests on the TAP interfaces, which forces the host to be the first IP hop for all data from the VMs.
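
To illustrate, here's a minimal sketch (Python, illustrative only - not Calico's actual agent code) of that per-port host plumbing, using the TAP name and VM address from the diagram:

    # Illustrative only: enable proxy ARP on the TAP interface, so the
    # host answers the VM's ARP requests and becomes its first IP hop,
    # then route the VM's /32 address via that interface.
    import subprocess

    def plumb_port(tap, vm_ip):
        with open('/proc/sys/net/ipv4/conf/%s/proxy_arp' % tap, 'w') as f:
            f.write('1')
        subprocess.check_call(
            ['ip', 'route', 'replace', '%s/32' % vm_ip, 'dev', tap])

    plumb_port('tap123', '10.65.0.2')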

So - and assuming that I've correctly understood what you meant - I don't think it's true that the concept of an L3 routed network necessarily implies restrictions on IP address allocation; hence I'd suggest that we treat those as separate concepts in the API.

> My blueprint, on the other hand, tries to maintain IP mobility across
> the network by relying on the BGP speaker work: another BP we've been
> trying to get traction on for a while.

I think it's important here to clearly separate API from implementation. For the API, I think the concept that you are expressing is that a VM's IP address should be routable from outside the immediate network, without involving a floating IP. Is that correct?

If so, there are then multiple possible implementations of that. When the immediate network is a traditional Neutron L2 network, the implementation is as per your BP, i.e. for the virtual router to export that network's IP addresses by acting as a BGP speaker.

On the other hand, if the immediate network is a Calico-style L3 network, there is already a BGP speaker running on each host, because that is part of how Calico implements connectivity within the immediate network: each host's speaker exports local TAP interface routes such as '10.65.0.2/32 dev tap123'. So nothing further is needed.
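
In case it helps, here is a small illustrative sketch (in Python; not Calico's actual code, and the 'tap' name prefix is just the convention from the diagram above) of gathering the local TAP routes that such a per-host speaker announces:

    # Illustrative sketch only (Calico actually delegates this to its
    # BGP speaker): gather the local /32 TAP routes that a per-host
    # speaker would announce.  'ip route' prints a host route like the
    # one above as '10.65.0.2 dev tap123 scope link'.
    import subprocess

    def local_vm_routes():
        routes = []
        output = subprocess.check_output(['ip', 'route', 'show']).decode()
        for line in output.splitlines():
            fields = line.split()
            if ('dev' in fields and '/' not in fields[0]
                    and fields[fields.index('dev') + 1].startswith('tap')):
                routes.append(fields[0] + '/32')
        return routes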

Does that make sense?

(In both cases, of course, there must be BGP peerings to the other networks to which it is desired to export VM IP addresses. I'm not yet sure if it makes sense to aim to specify such peerings on the Neutron API, or if such details should be regarded as individual deployment configuration.)
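
If we did want the former, I'd imagine something like the following purely hypothetical sketch, in which every name is invented just to illustrate the idea of a Neutron-level peering resource:

    # Purely hypothetical: a sketch of what a Neutron-level BGP peering
    # resource might contain.  None of these names come from any agreed
    # or existing API.
    bgp_peer = {
        'name': 'border-router-1',
        'peer_ip': '172.19.8.1',   # external router to peer with
        'remote_as': 64512,        # peer's AS number
        'networks': ['net-uuid'],  # networks whose VM addresses to export
    }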

> I also limit the connections
> to the L3 routed network to virtual routers for now.

Right - I think you mean here that, for your work, the immediate network to which a VM is connected is still a traditional Neutron L2 network. Is that correct?

For my interests - and I think for those of some other commenters on [1] - I'd certainly like that to be generalized, to allow the immediate network to be 'L3' rather than 'L2'.

[1] https://bugs.launchpad.net/neutron/+bug/1458890

Again, it would be good to be as clear as possible on the API concept here, i.e. about what we really mean. Specifically, I'm not sure 'L3' is the right concept, because an L2 network is also L3-capable. The real point, I think, is that network ports are not (necessarily) on an L2 broadcast domain. In the Calico case, there are no L2 broadcast domains anywhere (or, equivalently, each TAP-VM interface is a broadcast domain of its own). In the case of other commenters at [1], the desire (I believe) is to specify that some subset of a network's ports are on one L2 segment, some other subset on a different L2 segment, and so on.

As a strawman, a possible API representation of this would involve:

- a Network-level attribute indicating that 'the ports on this network are generally not on an L2 segment'

- a Port-level attribute indicating the L2 segment ID (if any) that the port is on.

L2 broadcast capability would then be taken to exist between Ports with the same L2 segment ID.
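
To make the strawman more concrete, here's roughly how that might look as a Neutron API extension attribute map. This is only a sketch: the attribute names 'l3_only' and 'l2_segment_id' are invented here for illustration, not anything that's been agreed:

    # Sketch only; the attribute names are invented for illustration.
    from neutron.api.v2 import attributes

    EXTENDED_ATTRIBUTES_2_0 = {
        'networks': {
            # True => ports on this network are generally not on an L2
            # segment.
            'l3_only': {'allow_post': True, 'allow_put': False,
                        'default': False,
                        'convert_to': attributes.convert_to_boolean,
                        'is_visible': True},
        },
        'ports': {
            # The L2 segment (if any) that this port is on; L2 broadcast
            # capability would be assumed between ports with the same
            # segment ID.
            'l2_segment_id': {'allow_post': True, 'allow_put': True,
                              'default': None, 'is_visible': True},
        },
    }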

Moving on to implementation - we'd then have to consider whether and how Neutron's in-tree implementation components would need tweaking to support such networks. I'm familiar already with the DHCP agent, because we've modified that for Calico so as to provide DHCP service to unbridged TAP interfaces (as in the abandoned spec at [3]). But there are probably other components to consider, too.

[3] https://review.openstack.org/#/c/130736/4

> The two have network segments in common.  So, as I proceed on the
> implementation of my blueprint [2], I will keep in mind the needs of
> the rfe [1] and build network segments in a way which can be utilized
> by both.  However, I will leave the coordination of VM scheduling and
> IP address assignment to someone else.  Does this all make sense?

Yes, and thanks. I'm happy to help out with any of the work here, and I hope that the points above are useful in synthesizing our various objectives.

Please do let me know what you think.

Thanks,
        Neil
