Re: [Int-area] IP Protocol number allocation request for Transparent Inter Process Communication (TIPC) protocol

Tom Herbert Mon, 23 Mar 2020 08:40:11 -0700

On Mon, Mar 23, 2020 at 7:32 AM Jon Maloy <[email protected]> wrote:
>
>
>
> On 3/20/20 11:04 AM, Joseph Touch wrote:
>
>
>
> On Mar 20, 2020, at 7:09 AM, Jon Maloy <[email protected]> wrote:
>
> Adding cc to [email protected], since I forgot that in my original response.
>
>
> On 3/19/20 9:18 PM, Joseph Touch wrote:
>
>
>
> On Mar 19, 2020, at 4:46 PM, Jon Maloy <[email protected]> wrote:
>
> IP addresses are no good in the *user API*, because they are location bound.
> That is also why DNS was invented, I believe.
>
>
> DNS names are intended to be a human-rememberable alias to an IP address. 
> They do not indicate a location any more than an IP address does or does not.
>
> Exactly. Read what I wrote again.
>
>
> IP addresses are no good in the USER API because they are location bound.
>
> False. DNS names are provided as an alternative for the user API because they
> are easier for people to remember and type.
>
> Then I should probably rephrase this to say that "IP addresses AND DNS
> names are no good in the user API...", although I don't quite agree with
> that. DNS names are of course much more convenient for a user to deal with
> than IP addresses.
>
>
> Type in www.google.com
>
> Now type in its IPv6 address.
>
> Now see if you remember google’s website DNS or its IPv6 address. That’s what 
> the DNS was originally intended for.
>
> Yes. But this case also demonstrates that both DNS names and the IP
> address may be location independent. We have no clue whether a call will end 
> up in a server farm in the US or Europe, let alone which server it will be 
> handled on. So, even though the original purpose of DNS may have been 
> something else, it has clearly followed the obvious path of becoming a tool 
> for location independence. This is good, but not good enough for our purposes.
>
>
> DNS names are no more or less location-independent than IP addresses.
>
> This is also why DNS was invented...
>
> False. The reason the DNS exists has nothing to do with location. It’s simply 
> string substitution for convenience, or at least was ONLY that originally.
>
>
> I think you just supported my case for a location independent addressing 
> scheme.
>
>
> I am - but then I’m baffled why you want to run direct over IP. Ethernet has 
> location independent addresses; IP does not* (see next part).
>
>
> When I am talking about location independence I am always talking about what 
> the socket programmer/user sees. We don't want him to handle IP addresses, 
> and we probably don't want him to hard code DNS names either.
>
> But at some level further down in the stack we can never get around
> translating location independent addresses to some form of location
> dependent ditto in order to transmit the packets to the right node and
> socket, be it MAC, IPv4, IPv6 or anything else.
>
> This is what we do in TIPC :
>
> Socket Layer:         {service type, service instance}      {port number}
> -------------                        |                            ^
>                                      v                            |
> TIPC Binding Table:     {port number, node number}                |
> -------------------                  |                            |
>                                      v                            |
> TIPC Link Layer:          {UDP port, IP address}       {UDP port, IP address}
> ----------------             or {MAC address}             or {MAC address}
>                                      |                            ^
>                                      v                            |
>                                      +--------------------------->+
>
>
> The {UDP port, IP address} tuple (or MAC address) at the link layer is never
> visible to the user, and may change on-the-fly without him ever noticing.
> The same is true for the {port number, node number} tuple, although the user 
> here has the option to use those directly, at the expense of location 
> transparency.
> So, our request is simply about enabling us to use a third mapping at the 
> link layer, an IP address only. This does not in any way interfere with the 
> location transparency that is already provided at the socket level.
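
For illustration only, the two-step translation quoted above can be modelled
in a few lines of Python. This is a toy sketch, not TIPC's actual data
structures: the names binding_table, link_table and resolve, and every sample
value except the address 192.168.100.17 (borrowed from later in this thread),
are hypothetical.

```python
# Toy model of the two-step translation in the quoted diagram.
# All names and sample values are hypothetical; real TIPC keeps this
# state inside the kernel, not in Python dictionaries.

# Binding table: {service type, service instance} -> [(port number, node number)]
binding_table = {
    (1000, 1): [(4711, 1), (4712, 2)],
}

# Link layer: node number -> bearer address, either UDP/IP or a MAC address
link_table = {
    1: ("udp", 6118, "192.168.100.17"),
    2: ("eth", "02:00:00:00:00:02"),
}

def resolve(service_type, service_instance, pick=0):
    """Translate a service address into (port, node, bearer address)."""
    port, node = binding_table[(service_type, service_instance)][pick]
    return port, node, link_table[node]

# The caller only ever names the service address...
before = resolve(1000, 1)
# ...so the bearer address can change underneath without the caller noticing.
link_table[1] = ("udp", 6118, "192.168.100.99")
after = resolve(1000, 1)
```

In this model, adding a third kind of link_table entry holding only an IP
address would leave resolve() and its callers unchanged, which is the point
the quoted paragraph makes about the socket-level view.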
>
>
> This was one of the original motivations for developing TIPC in the first 
> place.  A programmer using TIPC can hard code his service addresses if he 
> wants to, ignoring the number of or location of the corresponding endpoints, 
> even as those move around or scale up/down quite fast.
>
>
> Anycast gives you location independent addresses at the cost of doing 
> discovery “inside the network layer”.
>
>
> Yes, and that is what we do. But for this to be of any use, that 
> discovery/translation has to be blistering fast, and that is also what we do.
>
>
> However, even if you have those addresses, you still need to identify the 
> service types (which is what we use ports for).
>
>
> UDP (at the link level) has only one service type in this case: "TIPC"
> At the socket level we are using TIPC service addresses for this, i.e., a 
> {service type, service instance} tuple, each element being a 32-bit integer.
>
>
> ——
>
> I’m still stuck at why you want to run direct over IP. If you want Ethernet 
> that bridges across routers, GRE does that.
>
>
> Yes, we could use VxLAN or Geneve or whatever. But that always comes at a
> cost both in performance and maintenance.
> We want TIPC to be both performant and really simple to use.
>
> If you want loc-independent addresses for services, UDP over IP using anycast 
> does that.
>
>
> Again yes, but IP is normally not location independent inside clusters. 
> 8.8.8.8 may be perceived as location independent, but 192.168.100.17 is 
> typically not. And UDP has well-known limitations:
>
> 1) - UDP has 16-bit port numbers, a number space which has to be strictly 
> managed.
>     - TIPC has a 32-bit+32-bit service address instead. This is what we want
>       to extend to 128+128 bits, so that nobody ever needs to register a
>       well-known address for TIPC. At least not for the purpose of
>       avoiding collisions.
> 2) - UDP is best effort.
>     - Standard TIPC anycast is "better than best" effort, because packets will
>       never be lost in transport. Due to lack of socket level flow control,
>       there is still a risk of seeing messages being dropped, though.
>     - Group anycast DOES have end-to-end flow control, so such messages
>       will never be lost or disordered.
> 3) Furthermore, we have reliable multicast and broadcast using the same
>     address type. There is no way you can get that with UDP.
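
To put the sizes from item 1 side by side, here is a quick arithmetic sketch.
Python is used only as a calculator, and the variable names are illustrative.

```python
# Number of distinct values in each address space mentioned in item 1.
udp_ports = 2 ** 16                         # 16-bit UDP port number
tipc_service_addrs = 2 ** (32 + 32)         # {type, instance}, 32 bits each
proposed_service_addrs = 2 ** (128 + 128)   # the 128+128-bit extension

print(udp_ports)            # 65536
print(tipc_service_addrs)   # 18446744073709551616
```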
>
>
> What is the specific gain of needing IP but not allowing a transport? AFAICT, 
> it’s all down to GSO - which is an implementation. If GSO doesn’t do what you 
> want, it would be useful to take your issues there or edit the code yourself 
> and submit the patches.
>
>
> In that respect this is only an implementation issue, as you say, but it is 
> not a TIPC only one.
> The slides referred to me by Tom Herbert describe GSO on large UDP messages,
> but they don't describe how we go one step further and do it on the inner
> messages, or how we identify those as being TIPC in the first place.
> Furthermore, we would have to re-write the host level GSO support, which I am
> highly uncertain that the Linux network community would accept, given that
> everything needed already is there (i.e., if we only have a proper protocol
> number).


I don't understand why you think you need to rewrite GSO; there has
been an enormous amount of work to make this usable and extensible. I
suggest you take this up on the netdev list since this is about
implementation. I'd also point out that having a separate protocol
number is hardly a guarantee of acceptance in Linux; we would still be
asking for a justification and why this wasn't done in UDP.

>
> GSO is only one of the reasons for our request. There are more reasons:
> - Performance. The difference is not dramatic, but clearly measurable.
>   Terminating sockets in kernel space comes at a cost.

And what exactly is the performance difference that your measurements show?

> - The need to be able to register a new socket type, which will map down
>   to a (compatible) TIPC v3 protocol.

A new socket type does not require a new protocol number. There are
many examples of that. AF_KCM for instance.

> - Acceptance. We want to have TIPC recognized as a part of the IP protocol
>   family, controlled by IETF, like most other protocols.

Well, "most other protocols" nowadays are being defined over UDP, e.g.
QUIC and all the various encapsulation protocols. The reasons for this
are: 1) there are only 256 IP protocol numbers, but 65536 port numbers,
hence it's obviously going to be easier to get a port number
assignment as opposed to a protocol number. 2) Network devices
notoriously don't handle new protocols well. If a protocol number is
assigned for TIPC and a packet is sent with the number, somewhere and
sometime an intermediate device will drop the packet. 3) UDP has really
cheap wire overhead (eight bytes) and we've put a lot of effort into
optimizing it in implementation, at least in Linux (like all the
aforementioned GSO/GRO work).
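
As a rough check of the numbers in points 1 and 3, here is a small Python
sketch. The field layout is the standard UDP header (source port, destination
port, length, checksum, 16 bits each). The udp_header helper, the port values
and the payload are made up for illustration, and the checksum is simply left
at zero rather than computed.

```python
import struct

# The IPv4 Protocol / IPv6 Next Header field is 8 bits; UDP ports are 16 bits.
ip_protocol_numbers = 2 ** 8    # 256
udp_port_numbers = 2 ** 16      # 65536

def udp_header(src_port, dst_port, payload, checksum=0):
    """Pack a UDP header; the checksum is not computed in this sketch."""
    length = 8 + len(payload)   # header plus payload, in bytes
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)

header = udp_header(40000, 6118, b"hypothetical TIPC message")
print(len(header))  # 8
```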

Tom

>
>
> Regards
> ///jon
>
>
> Joe
>
>
> _______________________________________________
> Int-area mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/int-area

