> Re: [nvo3] Multi-subnet VNs [was Re: FW: New Version Notification 
> for draft-yong-nvo3-frwk-dpreq-addition-00.txt]
> 
> Hi Aldrin,

> On Tue, Dec 18, 2012 at 8:29 PM, Aldrin Isaac <[email protected]> wrote:
> Kireeti,
> 
> I'm not clear what difference it makes whether a packet is unicast
> forwarded using MAC address or IP address within a subnet 
> 
> Two important differences:
> a) you don't have to know the MAC address if you forward on IP.  I.e.,
> you don't have to propagate the ARP to the destination (flood), 
> get the reply, bind IP to MAC (ARP table), and maintain ARP binding 
> (timeout, validate, etc.). The first is a real problem; the rest 
> are annoyances that become problems at scale.
[Lizhong] The first issue is not a problem for E-VPN or any other 
control-plane learning solution, as Aldrin also pointed out. The benefit I 
see in routing intra-subnet traffic is that it saves the Ethernet header in 
packets crossing the underlay. But then how do you do broadcast? You have 
to discover the broadcast domain through the control plane, right? That 
also brings additional complexity to the control plane.
An additional issue is that this kind of forwarding may break the 
expectations of application developers. They will reasonably assume that a 
packet with a broadcast MAC address will be received by all nodes within 
the subnet, yet such a packet may carry a unicast IP address. I am not 
sure whether "route IP, bridge non-IP" would forward such a packet 
correctly.
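
To make the corner case concrete, here is a minimal sketch of the two 
forwarding decisions discussed in this thread. It is only an illustration 
under assumptions: the Frame type, the function names, the GATEWAY_MAC 
value and the example addresses are all hypothetical, and a real NVE does 
far more than this.

```python
from dataclasses import dataclass

# Ethertypes treated as "IP" in this sketch (IPv4 and IPv6).
IP_ETHERTYPES = {0x0800, 0x86DD}
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"
GATEWAY_MAC = "00:00:5e:00:01:01"  # hypothetical gateway MAC owned by the NVE


@dataclass
class Frame:
    """Hypothetical, stripped-down view of a frame arriving from a TS."""
    dst_mac: str
    ethertype: int
    dst_ip: str = ""  # empty for non-IP frames


def route_ip_bridge_non_ip(frame: Frame) -> str:
    """'Route if IP, bridge if not': the decision looks only at the Ethertype."""
    return "route" if frame.ethertype in IP_ETHERTYPES else "bridge"


def irb(frame: Frame) -> str:
    """'Bridge when you can, route otherwise': only IP frames addressed to
    the gateway MAC are routed; everything else is forwarded on dest MAC."""
    if frame.ethertype in IP_ETHERTYPES and frame.dst_mac == GATEWAY_MAC:
        return "route"
    return "bridge"


# The case in question: broadcast dest MAC, but a unicast dest IP.
odd = Frame(dst_mac=BROADCAST_MAC, ethertype=0x0800, dst_ip="10.0.0.7")
# An ARP request: non-IP Ethertype, broadcast dest MAC.
arp = Frame(dst_mac=BROADCAST_MAC, ethertype=0x0806)

for name, frame in (("odd", odd), ("arp", arp)):
    print(name, route_ip_bridge_non_ip(frame), irb(frame))
```

In this sketch the "odd" frame is routed on its unicast IP under "route IP, 
bridge non-IP" (so it reaches one node), while IRB bridges it on the 
broadcast MAC (so it is flooded to the subnet). That difference is the 
behavior I am asking about.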

Regards
Lizhong



> 
> (Note that the ARMD WG was created to address this issue, and you 
> know where that ended.)
> 
> (Note further that this may be hard to do in general, but in the 
> case of an orchestrated data center, you have the information about 
> where a given IP lives, and you have a control plane (ORACLE) to 
> inform all relevant NVEs.  And of course, an overlay to shield the 
> infrastructure from poking its nose into your forwarding behavior --
> i.e., the infra doesn't care whether you route or switch TS traffic.)


> 
> b) In the quite common case where all traffic from a TS is IP, you 
> don't have to maintain two tables and two forwarding paradigms at 
> the NVE (one for IPs and one for MACs).  This is common enough to 
> warrant optimization.
> 
> A third difference is that if you have only unicast traffic, you 
> don't have to maintain a multicast tree (for flooding).  For some, 
> this is a nice bonus, but I know you have a multicast packet or two 
> in your network :-)
> 
> as long as
> it gets to the intended destination along the most optimal path,
> particularly when the price to pay is non-standard behavior
> (intra-subnet ARP manglers ;}, etc).  I understand the argument about
> the sub-optimal routing from a third site, but when the primary sites
> end up aggregating prefixes for scaling reasons that argument falls
> off the table.  One way or other the piper gets paid.
> 
> One way, the piper gets paid a fair bit more than the other!
> 
> In terms of the real world issue of getting there from here --
> personally I haven't seen any vendor working towards a standards-based
> solution that will allow intra-subnet routing for subnets over
> HW/TOR-based PE, let alone intra-subnet routing for subnets that span
> across both hypervisor-based PE and TOR-based PE.  This makes me leery
> of solutions that can only take us half way there, particularly during
> the transition phase.  So if we're talking about network
> virtualization based purely on hypervisors, "route IP, bridge non-IP"
> may be realistic if you're willing to accept the caveats, but does not
> seem to be otherwise.
> 
> Good point.  Clearly, this is not a local decision: "route IP, 
> bridge non-IP" means that intra-subnet routes are propagated the 
> same way as inter-subnet routes, and thus every NVE, h/w or s/w, 
> must be on the same page.
> 
> To make this concrete using BGP VPNs, "route IP, bridge non-IP" 
> means all routes, intra- and inter-subnet, are propagated as IP VPN 
> routes, and E-VPN routes contain MACs without IPs.  "Bridge intra-
> subnet IP and non-IP, route inter-subnet" means inter-subnet routes 
> are propagated as IP VPN routes, and intra-VPN routes as E-VPN MAC+IP
> routes.
> 
> We can have a chat off-list on h/w vendors working towards this. 
>  Hopefully, others will weigh the above arguments, and support this.
>  Deployers (like you) have a say in this too :-)
> 
> Btw, I understand how multicast may be less than efficient when
> building both inter and intra subnet trees for the same IP mcast group
> that end up overlapping links (maybe even more than twice) -- but I'd
> like to hear your take on any other *insolvable* issues with regard to
> multicast.
> 
> Isn't that enough?  :-)  I am not a multicast expert, but I can try 
> to dig up IRB multicast horror stories.
> 
> Cheers,
> Kireeti.
> 
> Best regards -- aldrin
> 
> 
> 
> On Tue, Dec 18, 2012 at 6:06 PM, Kireeti Kompella
> <[email protected]> wrote:
> > Hi Thomas,
> >
> > On Dec 18, 2012, at 09:03 , Thomas Narten <[email protected]> wrote:
> >
> >> Kireeti Kompella <[email protected]> writes:
> >>
> >>> The solution is simple: route if IP, bridge if not.  Yes, one could
> >>> do IRB, but why?  IRB brings in complications, especially for
> >>> multicast.  I'm sure someone suggested this already, so put me down
> >>> as supporting this view.
> >>
> >> I'm not sure I understand the difference.
> >>
> >> From an *NVE* perspective, when it receives a packet (which will have
> >> an L2 header), it can look at the Ethertype, and if it's IP, it can
> >> route it. Otherwise, it can provide normal L2 service. So, in this
> >> sense, "route if IP, bridge if not" is straightforward. And more to
> >> the point, I assume that if the packet gets L2 service, the entire VN
> >> is treated as a *single* broadcast domain. All nodes can reach all
> >> other nodes. Right?
> >
> > Right.
> >
> >> Just so I understand, how is this different than IRB?  What does IRB
> >> imply that the above does not?
> >
> > IRB follows the principle of "bridge when you can, route otherwise".
> > So, an IP packet with dest IP in the same subnet actually gets
> > bridged; the originator (e.g., the VM) is responsible for ARPing the
> > IP address, slapping the right dest MAC on the packet and sending
> > that to the NVE which simply forwards based on dest MAC address
> > *without* decrementing the TTL.
> >
> > If the dest IP is in another subnet, the packet is sent to the
> > gateway (which for IRB would be the same NVE), which this time does
> > an IP address lookup, decrements TTL and routes the packet.
> >
> > For multicast, there are even more differences.
> >
> >> But this is different than what (I believe) Lucy is arguing for. In
> >> the case of a multi-subnet VN, you have one VN, but it contains
> >> different subnets. Each subnet is intended to be one broadcast domain
> >> (i.e., equivalent of a VLAN), so that when sending LL multicast and
> >> the like on a specific subnet, such packets are *not* delivered to all
> >> nodes in the VN, but only those that are part of subnet.
> >
> > If one were to configure multiple subnets on a VLAN, I wonder if LL
> > traffic goes to all members of the VLAN, or just those in the same
> > subnet as the sender.  I suspect the former (but don't know).
> >
> >> This is a more complex type of service to provide. And I'm not sure we
> >> need this type of service to be provided by one VN.
> >
> > Agree.
> >
> >> A (seemingly
> >> simpler) alternative would be to put each subnet in its own VN and
> >> allow inter-subnet traffic to be handled as inter-VN traffic. So long
> >> as that case is optimized (i.e., the ingress NVE can tunnel directly
> >> to the egress NVE without adding triangular routing), this would seem
> >> to be a cleaner way to implement this.
> >
> > Can be done.  However, we're on Lucy's topic; mine was "route if IP,
> > bridge otherwise"; the goal was to rationalize the need for Layer 2
> > forwarding for non-IP traffic, and inter- and intra-subnet routing.
> >
> > Kireeti.
> >
> >> Thomas
> >>
> >> _______________________________________________
> >> nvo3 mailing list
> >> [email protected]
> >> https://www.ietf.org/mailman/listinfo/nvo3
> >
> 

> 
> -- 
> Kireeti