On Thu, Mar 6, 2014 at 7:24 PM, Lizhong Jin <lizho....@gmail.com> wrote:
> Hi Tom, see inline below.
>
> Regards
> Lizhong
>
>> -----Original Message-----
>> From: Tom Herbert [mailto:therb...@google.com]
>> Sent: March 7, 2014 0:42
>> To: Lizhong Jin
>> Cc: nvo3@ietf.org; mls.i...@gmail.com
>> Subject: Re: [nvo3] Fwd: New Version Notification for
>> draft-herbert-gue-01.txt
>>
>> Hi Lizhong, thanks for the comments!
>>
>> On Wed, Mar 5, 2014 at 11:53 PM, Lizhong Jin <lizho....@gmail.com> wrote:
>> > Hi Tom,
>> > In section 2.3, the 16bit 0s is redundant for a packet. I prefer the
>> > format like below:
>> >  0                   1                   2                   3
>> >  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>> > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> > | 0x0 |  Hlen   |V|SEC|R|R|P|P|E|            Protocol           |
>> > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> >
>> > When Ebit = 0, protocol = 8bit IP protocol number.
>> > When Ebit = 1, protocol = 16bit EtherType.
>> > Then the ASIC could always use the combined 17bit (1bit Ebit + 16bit
>> > Protocol) as an index to parse the payload.
>> >
>> We considered combining Ethertype and IP protocol into one field, but came
>> to the conclusion this isn't such a good idea. The points are:
>>
>> 1) Encapsulation of IPv4, IPv6, and EtherIP covers the vast majority of use
>> cases, so encapsulating L3 protocols is the point to optimize
>> 2) As you can see in your modified header format, EtherType needs an
>> additional 8 bits of core header. This only leaves 2 bits for future
>> extensions; based on the rate at which fields were added to GRE, that does
>> not seem like enough.
> [Lizhong] The number of R-bits is indeed an issue. But following the
> definition rule of the security and private options in the current draft, it
> is not extensible enough. It seems every new option will occupy some R-bits.
> Why not change the options to TLV like Geneve?
> The current definition of VNI is OK, but the security and private options
> could be TLV. If we change these options to TLV, then we could have more
> R-bits in the GUE header.
>
Like I said in previous comments, using flag-fields versus TLVs is an explicit
tradeoff (probably the most interesting discussion needed regarding metadata).
While TLVs allow open-ended extensibility, flag-fields are limited but
extremely simple, more compact, and trivial to parse. Given the low-level
nature of the protocol and its probable deployment in high-PPS environments
within the data center, I still opt for flag-fields. Adding new fields
requires a lot of diligence, and we want to ensure that any new field allows
multiple uses; for instance, we have a security field instead of a cookie
field. I believe a good indicator of the rate at which options are added is
GRE: in 20 years, it looks like 2 new options were added (although I believe
there were a handful of non-standardized ones also).

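To make the parsing side of this tradeoff concrete, here is a minimal C sketch. It is not taken from the GUE or Geneve drafts: the flag bits, field sizes, and the 1-byte-type/1-byte-length TLV layout are all hypothetical, chosen only to show why a flag-field header permits computing a field's offset directly while a TLV list has to be walked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits and field sizes, for illustration only.
 * They are NOT the layout defined in the GUE draft. */
#define F_VNID 0x4u /* 4-byte field */
#define F_SEC  0x2u /* 8-byte field */
#define F_PRIV 0x1u /* 4-byte field */

/* Flag-fields: fields appear in a fixed order (VNID, SEC, PRIV) and at most
 * once, so a field's offset is the sum of the sizes of the present fields
 * that precede it. Returns -1 if the wanted field is absent or unknown. */
static int flag_field_offset(uint8_t flags, uint8_t want)
{
    static const struct { uint8_t bit; uint8_t len; } order[] = {
        { F_VNID, 4 }, { F_SEC, 8 }, { F_PRIV, 4 },
    };
    int off = 0;

    for (size_t i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
        if (order[i].bit == want)
            return (flags & want) ? off : -1;
        if (flags & order[i].bit)
            off += order[i].len;
    }
    return -1;
}

/* TLVs (assumed layout: 1-byte type, 1-byte value length, then the value):
 * options may come in any order, so the list must be walked and every
 * length must be bounds-checked. Returns the offset of the value, or -1. */
static int tlv_find(const uint8_t *opts, size_t len, uint8_t type)
{
    size_t off = 0;

    assert(opts != NULL);
    while (off + 2 <= len) {
        size_t vlen = opts[off + 1];

        if (off + 2 + vlen > len)
            return -1; /* truncated option */
        if (opts[off] == type)
            return (int)(off + 2);
        off += 2 + vlen;
    }
    return -1;
}
```

With these assumed sizes, `flag_field_offset(F_VNID | F_PRIV, F_PRIV)` yields offset 4 after a bounded, data-independent number of steps, whereas `tlv_find` does work proportional to the number of options present.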
>> 3) Lookup of Ethertype is more complex than IP protocol because of the
>> larger space. A 64K lookup table (or in your proposal a 128K lookup
>> table) is prohibitive in many environments. For instance, on a 64 bit host
>> this becomes 1/2M (and 1M) of memory. The 256 entry IP protocol table is
>> only 2K of memory. In fact, in Linux the EtherType lookup uses a hash table
>> smaller than 64K, while IP protocol is directly indexed in a 256 entry
>> table. Note, this will also be an issue with NSH.
> [Lizhong] as you said, 64k/128k table will not be implemented, and hash table
> with supported type number will be used instead. The 17bit I suggested is the
> hash key, and does not implicitly indicate the table size.
>
Hash tables require relatively more logic than direct indexing:

    proto_ctx = table[proto];

vs.

    proto_ctx = hash_table[proto & 0xff...]
    /* ...then deal with collisions... */

>> 4) I want to encourage that IP protocol = ESP becomes a common case :-)
>> 5) Allowing encap of GRE header for other L2 protocols doesn't seem
>> unreasonable.
> [Lizhong] my concern is the 16 redundant bits. At the current header
> definition stage, it would be better if we could optimize the header. If GUE
> has been widely deployed, then using GRE would be a compatible method.
>
It's still only 4 bytes of overhead. A single TLV in Geneve has four bytes of
overhead, for instance.

>> 6) If you remove the UDP part of the encapsulation it looks a lot like an IP
>> extension header. This is not a coincidence!
>>
>> > In my mind, one of the potential use cases of the private fields is
>> > congestion control. Currently, only TCP has a congestion control
>> > mechanism. Some other non-TCP traffic in the DC also requires flow based
>> > congestion control.
>> >
>> On CC and encap:
>>
>> 1) It is extremely important to consider!
>> The obvious need is for non-conformant CC traffic -- basically anything
>> from an untrusted guest
>> 2) It's very likely that an untrusted guest and host will both be doing CC,
>> so we need CC that doesn't interact poorly.
>> 3) DCCP is a possible option (IP protocol = DCCP) without needing to add
>> new fields, although it would be verbose (many fields, like ports, are
>> probably unnecessary).
>> 4) So congestion control probably warrants an additional field. I suspect it
>> should be a variable length field like SEC to allow pluggable CC.
>>
>> Thanks,
>> Tom
>>
>> > Regards
>> > Lizhong
>> >
>> >> -----Original Message-----
>> >> From: Tom Herbert [mailto:therb...@google.com]
>> >> Sent: March 5, 2014 4:05
>> >> To: nvo3@ietf.org
>> >> Cc: mls.i...@gmail.com
>> >> Subject: [nvo3] Fwd: New Version Notification for
>> >> draft-herbert-gue-01.txt
>> >>
>> >> Hi,
>> >>
>> >> I posted a new version of GUE (Generic UDP Encapsulation). I
>> >> appreciate comments, however please bear in mind:
>> >>
>> >> 1) This is not just for network virtualization, we anticipate other use
>> >> cases. Network virtualization is a very important use case.
>> >> 2) I do not claim that this solves all of the problems or addresses
>> >> all requirements of encapsulation or networking virtualization. As
>> >> far as I can tell, it does satisfy most of our needs for a generic
>> >> and ubiquitous encapsulation (in one large data center environment).
>> >> Even so, we still have need for other encapsulation protocols in
>> >> different contexts.
>> >> 3) Please take this as input for some requirements and potential
>> >> solutions when contemplating standard encap protocols.
>> >> 4) We did consider many alternate encapsulation protocols (see
>> >> motivation section). GRE was the closest to what we need, and in fact
>> >> the basic concepts of GUE are derived from GRE.
>> >> GRE is very simple, generic, stateless, allows for extensions, is
>> >> amenable to efficient HW implementation, and is suitable for high PPS
>> >> applications. We can't use GRE directly because adding new fields
>> >> (extensibility) breaks middleboxes which need to parse inner headers
>> >> (header length ambiguity with new fields). So in GUE we have a header
>> >> length field that can allow a device to skip over unknown options. Also,
>> >> we chose to encapsulate by IP protocol as opposed to the EtherType,
>> >> which is more efficient and appropriate when doing L3oL3 encap (our
>> >> majority use case).
>> >> 6) GRE-like flag-fields are very limited and constrained compared to
>> >> something like TLVs which allow open ended extensibility. Their use
>> >> represents a trade-off. To their advantage, flag-fields are very
>> >> efficient and simple to parse. They are very compact, order of
>> >> fields in the packet is fixed, each field type occurs at most once in
>> >> the packet, and random access of specific fields is possible. I don't
>> >> foresee the need to add a whole bunch of new fields, and those added
>> >> will likely be generic, supporting "pluggable" semantics (like the
>> >> security field in the draft). Other similar generic fields we've
>> >> contemplated are a long inner flow identifier, QoS/classification, and
>> >> congestion control.
>> >> 7) We have deployed a variant of this protocol at scale and it is
>> >> working pretty well!
>> >> 8) I have posted patches for the initial GUE draft on Linux netdev.
>> >> These implement IPIP/GUE, SIT/GUE, and GRE/GUE (also implements
>> >> GRE/UDP draft). In testing we did demonstrate the value of UDP
>> >> encapsulation to improve load balancing and steering in the network.
>> >>
>> >> Thanks,
>> >> Tom
>> >>
>> >> ---------- Forwarded message ----------
>> >> From: <internet-dra...@ietf.org>
>> >> Date: Tue, Mar 4, 2014 at 11:02 AM
>> >> Subject: New Version Notification for draft-herbert-gue-01.txt
>> >> To: Tom Herbert <therb...@google.com>
>> >>
>> >> A new version of I-D, draft-herbert-gue-01.txt has been successfully
>> >> submitted by Tom Herbert and posted to the IETF repository.
>> >>
>> >> Name:           draft-herbert-gue
>> >> Revision:       01
>> >> Title:          Generic UDP Encapsulation
>> >> Document date:  2014-03-05
>> >> Group:          Individual Submission
>> >> Pages:          20
>> >> URL:      http://www.ietf.org/internet-drafts/draft-herbert-gue-01.txt
>> >> Status:   https://datatracker.ietf.org/doc/draft-herbert-gue/
>> >> Htmlized: http://tools.ietf.org/html/draft-herbert-gue-01
>> >> Diff:     http://www.ietf.org/rfcdiff?url2=draft-herbert-gue-01
>> >>
>> >> Abstract:
>> >> This specification describes Generic UDP Encapsulation (GUE), which
>> >> is a scheme for using UDP to encapsulate packets of arbitrary IP
>> >> protocols for transport across layer 3 networks. By encapsulating
>> >> packets in UDP, specialized capabilities in networking hardware for
>> >> efficient handling of UDP packets can be leveraged. GUE specifies
>> >> basic encapsulation methods upon which higher level constructs, such
>> >> as tunnels and overlay networks, can be constructed.
>> >>
>> >> Please note that it may take a couple of minutes from the time of
>> >> submission until the htmlized version and diff are available at
>> >> tools.ietf.org.
>> >>
>> >> The IETF Secretariat
>> >>
>> >
>
_______________________________________________
nvo3 mailing list
nvo3@ietf.org
https://www.ietf.org/mailman/listinfo/nvo3
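
The lookup-cost argument in the thread (a directly indexed 256-entry IP protocol table versus a hashed EtherType lookup) can be sketched in C as follows. This is only an illustration under assumptions: `struct proto_ctx`, the 16-bucket table size, and the chaining scheme are hypothetical and are not taken from the Linux implementation or the GUE draft. On a 64-bit host, 256 pointers occupy 256 x 8 = 2 KiB, while a fully indexed 64K-entry EtherType table would need 65536 x 8 = 512 KiB, which is why a small hash table is used instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-protocol context; stands in for whatever a real
 * stack attaches to a protocol handler. */
struct proto_ctx {
    const char *name;
};

/* Direct indexing: one slot per 8-bit IP protocol number. */
static const struct proto_ctx *ip_table[256];

static const struct proto_ctx *ip_lookup(uint8_t proto)
{
    return ip_table[proto]; /* a single load, no key comparison */
}

/* Hashing: the 16-bit EtherType is folded into a small bucket array
 * (size is an assumption), with a chain per bucket for collisions. */
#define ETH_HASH_SIZE 16u

struct eth_entry {
    uint16_t type;
    const struct proto_ctx *ctx;
    struct eth_entry *next; /* collision chain */
};

static struct eth_entry *eth_hash[ETH_HASH_SIZE];

static void eth_register(struct eth_entry *e)
{
    assert(e != NULL);
    unsigned int bucket = e->type & (ETH_HASH_SIZE - 1);

    e->next = eth_hash[bucket];
    eth_hash[bucket] = e;
}

static const struct proto_ctx *eth_lookup(uint16_t type)
{
    const struct eth_entry *e = eth_hash[type & (ETH_HASH_SIZE - 1)];

    /* Several EtherTypes can share a bucket, so the key must be
     * compared against each entry in the chain. */
    for (; e != NULL; e = e->next)
        if (e->type == type)
            return e->ctx;
    return NULL;
}
```

With this mask, EtherTypes 0x0800 (IPv4) and 0x8100 (802.1Q) land in the same bucket, so `eth_lookup` has to walk a chain and compare keys, which is the extra logic the reply refers to.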