On Mon, Mar 23, 2020 at 7:32 AM Jon Maloy <[email protected]> wrote:
>
> On 3/20/20 11:04 AM, Joseph Touch wrote:
>
> On Mar 20, 2020, at 7:09 AM, Jon Maloy <[email protected]> wrote:
>
> Adding cc to [email protected], since I forgot that in my original response.
>
> On 3/19/20 9:18 PM, Joseph Touch wrote:
>
> On Mar 19, 2020, at 4:46 PM, Jon Maloy <[email protected]> wrote:
>
> IP addresses are no good in the *user API*, because they are location bound.
> That is also why DNS was invented, I believe.
>
> DNS names are intended to be a human-rememberable alias to an IP address.
> They do not indicate a location any more than an IP address does or does not.
>
> Exactly. Read what I wrote again.
>
> IP addresses are no good in the USER API because they are location bound.
> False. DNS names are provided as an alternative for the user API because they
> are easier for people to remember and type.
>
> Then I should probably rephrase this to say that "IP addresses AND DNS
> names are no good in the user API...", although I don't quite agree with
> that. DNS names are of course much more convenient for a user to deal with
> than IP addresses.
>
> Type in www.google.com
>
> Now type in its IPv6 address.
>
> Now see if you remember google’s website DNS or its IPv6 address. That’s what
> the DNS was originally intended for.
>
> Yes. But this case also demonstrates that both DNS names and the IP
> address may be location independent. We have no clue whether a call will end
> up in a server farm in the US or Europe, let alone which server it will be
> handled on. So, even though the original purpose of DNS may have been
> something else, it has clearly followed the obvious path of becoming a tool
> for location independence. This is good, but not good enough for our purposes.
>
> DNS names are no more or less location-independent than IP addresses.
>
> This is also why DNS was invented...
>
> False.
> The reason the DNS exists has nothing to do with location. It’s simply
> string substitution for convenience, or at least was ONLY that originally.
>
> I think you just supported my case for a location independent addressing
> scheme.
>
> I am - but then I’m baffled why you want to run direct over IP. Ethernet has
> location independent addresses; IP does not* (see next part).
>
> When I am talking about location independence I am always talking about what
> the socket programmer/user sees. We don't want him to handle IP addresses,
> and we probably don't want him to hard code DNS names either.
>
> But, at some level further down in the stack we never get around having to
> translate location independent addresses to some form of location dependent
> ditto in order to transmit the packets to the right node and socket. Be it
> MAC, IPv4, IPv6 or anything else.
>
> This is what we do in TIPC:
>
> Socket Layer:      {service type, service instance}       {port number}
> ------------------                 |                            A
>                                    v                            |
> TIPC Binding Table:   {port number, node number}                |
> -------------------------          |                            |
>                                    v                            |
> TIPC Link Layer:        {UDP port, IP address}       {UDP port, IP address}
> -----------------------    or {MAC address}             or {MAC address}
>                                    |                            A
>                                    v                            |
>                                    +--------------------------->+
>
> The {UDP port, IP address} tuple (or MAC address) at the link layer is never
> visible to the user, and may change on-the-fly without him ever noticing.
> The same is true for the {port number, node number} tuple, although the user
> here has the option to use those directly, at the expense of location
> transparency.
> So, our request is simply about enabling us to use a third mapping at the
> link layer, an IP address only. This does not in any way interfere with the
> location transparency that is already provided at the socket level.
>
> This was one of the original motivations for developing TIPC in the first
> place.
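
The layered translation described above (service address, then {port number, node number}, then a link-layer address) can be illustrated with a minimal Python sketch. This is not TIPC code: the class names, the two dictionaries, the `resolve` helper and all numeric values are hypothetical and chosen only for illustration. The only details taken from the thread are the shape of the tuples and the fact that each element of a service address is a 32-bit integer.

```python
from dataclasses import dataclass

UINT32_MAX = 2**32 - 1


@dataclass(frozen=True)
class ServiceAddr:
    """Location-independent address seen by the socket user (hypothetical type)."""
    service_type: int
    instance: int

    def __post_init__(self):
        # Each element of a service address is a 32-bit integer.
        for value in (self.service_type, self.instance):
            if not 0 <= value <= UINT32_MAX:
                raise ValueError("each element must fit in 32 bits")


@dataclass(frozen=True)
class SocketAddr:
    """Location-dependent {port number, node number} tuple (hypothetical type)."""
    port: int
    node: int


# Hypothetical binding table: service address -> socket address.
binding_table = {ServiceAddr(1000, 1): SocketAddr(port=4711, node=2)}

# Hypothetical link table: node number -> link-layer address.
link_table = {2: ("udp", "192.168.100.17", 6118)}


def resolve(service):
    """Translate a service address down to a socket and link-layer address."""
    sock = binding_table[service]
    return sock, link_table[sock.node]


svc = ServiceAddr(1000, 1)
before = resolve(svc)

# The link-layer mapping changes; the address the user holds does not.
link_table[2] = ("eth", "02:00:00:00:00:02")
after = resolve(svc)
```

In this sketch the caller only ever holds `svc`; swapping the entry in `link_table` changes what `resolve` returns at the bottom layer without touching the service address, which is the location transparency being argued for.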
> A programmer using TIPC can hard code his service addresses if he
> wants to, ignoring the number or location of the corresponding endpoints,
> even as those move around or scale up/down quite fast.
>
> Anycast gives you location independent addresses at the cost of doing
> discovery “inside the network layer”.
>
> Yes, and that is what we do. But for this to be of any use, that
> discovery/translation has to be blistering fast, and that is also what we do.
>
> However, even if you have those addresses, you still need to identify the
> service types (which is what we use ports for).
>
> UDP (at the link level) has only one service type in this case: "TIPC".
> At the socket level we are using TIPC service addresses for this, i.e., a
> {service type, service instance} tuple, each element being a 32-bit integer.
>
> --
>
> I’m still stuck at why you want to run direct over IP. If you want Ethernet
> that bridges across routers, GRE does that.
>
> Yes, we could use VXLAN or Geneve or whatever. But that always comes at a
> cost both in performance and maintenance.
> We want TIPC to be both performant and really simple to use.
>
> If you want loc-independent addresses for services, UDP over IP using anycast
> does that.
>
> Again yes, but IP is normally not location independent inside clusters.
> 8.8.8.8 may be perceived as location independent, but 192.168.100.17 is
> typically not. And UDP has well-known limitations:
>
> 1) - UDP has 16-bit port numbers, a number space which has to be strictly
>      managed.
>    - TIPC has a 32-bit+32-bit service address instead. This is what we want
>      to extend to 128+128 bits, so that nobody ever needs to register a
>      well-known address for TIPC. At least not for the purpose of
>      avoiding collisions.
> 2) - UDP is best effort.
>    - Standard TIPC anycast is "better than best" effort, because packets will
>      never be lost in transport.
>      Due to lack of socket level flow control, there
>      is still a risk of seeing messages being dropped, though.
>    - Group anycast DOES have end-to-end flow control, so such messages
>      will never be lost or disordered.
> 3) Furthermore, we have reliable multicast and broadcast using the same
>    address type. There is no way you can get that with UDP.
>
> What is the specific gain of needing IP but not allowing a transport? AFAICT,
> it’s all down to GSO - which is an implementation. If GSO doesn’t do what you
> want, it would be useful to take your issues there or edit the code yourself
> and submit the patches.
>
> In that respect this is only an implementation issue, as you say, but it is
> not a TIPC only one.
> The slides referred to me by Tom Herbert describe GSO on large UDP messages,
> but they don't describe how we go one step further and do it on the inner
> messages, or how we identify those as being TIPC in the first place.
> Furthermore, we would have to re-write the host level GSO support, which I am
> highly uncertain that the Linux network community would accept, given that
> everything needed already is there (i.e., if we only have a proper protocol
> number).

I don't understand why you think you need to rewrite GSO; there has been
an enormous amount of work to make this usable and extensible. I suggest
you take this up on the netdev list, since this is about implementation.
I'd also point out that having a separate protocol number is hardly a
guarantee of acceptance in Linux; we would still be asking for a
justification and for why this wasn't done in UDP.

> GSO is only one of the reasons for our request. There are more reasons:
> - Performance. The difference is not dramatic, but clearly measurable.
>   Terminating sockets in kernel space comes at a cost.

And what exactly is the performance difference that your measurements
show?

> - The need to be able to register a new socket type, which will map down
>   to a (compatible) TIPC v3 protocol.

A new socket type does not require a new protocol number. There are many
examples of that, AF_KCM for instance.

> - Acceptance. We want to have TIPC recognized as a part of the IP protocol
>   family, controlled by IETF, like most other protocols.

Well, "most other protocols" nowadays are being defined over UDP, e.g.
QUIC and all the various encapsulation protocols. The reasons for this are:

1) There are only 256 IP protocol numbers, but 65536 port numbers, hence
   it's obviously going to be easier to get a port number assignment as
   opposed to a protocol number.
2) Network devices notoriously don't handle new protocols well. If a
   protocol number is assigned for TIPC and a packet is sent with that
   number, somewhere and sometime an intermediate device will drop the
   packet.
3) UDP has really cheap wire overhead (eight bytes), and we've put a lot
   of effort into optimizing it in implementation, at least in Linux
   (like all the aforementioned GSO/GRO work).
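
Points 1) and 3) above come down to simple arithmetic, which the following minimal Python sketch spells out. It assumes only the standard layouts: an 8-bit IP protocol field, 16-bit UDP ports, and a UDP header of four 16-bit fields. The helper name `udp_header` and the port and payload values are hypothetical, picked only to make the example concrete.

```python
import struct

# Size of each number space, derived from the field widths.
IP_PROTOCOL_NUMBERS = 2 ** 8    # 8-bit protocol field in the IP header
UDP_PORT_NUMBERS = 2 ** 16      # 16-bit UDP port field


def udp_header(src_port, dst_port, payload_len, checksum=0):
    """Pack a UDP header: source port, destination port, length, checksum.

    The length field covers the header itself plus the payload.
    The checksum is left at 0 here; computing it is out of scope.
    """
    length = 8 + payload_len
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


# Illustrative values only.
header = udp_header(src_port=49152, dst_port=6118, payload_len=100)
overhead_bytes = len(header)
ports_per_protocol_number = UDP_PORT_NUMBERS // IP_PROTOCOL_NUMBERS
```

Here `overhead_bytes` is the eight bytes of wire overhead mentioned in point 3), and `ports_per_protocol_number` shows that the port space is 256 times larger than the protocol number space from point 1).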

Tom

> Regards
> ///jon
>
> Joe
>
> _______________________________________________
> Int-area mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/int-area
